.Net: Support for configuring dimensions in Google AI embeddings generation #10489

Merged: 11 commits merged into microsoft:main on Mar 18, 2025

ArieSLV (Contributor) commented Feb 11, 2025

Motivation and Context

This change addresses a limitation in the current implementation of the Google AI embeddings generation service in Semantic Kernel. Currently, users cannot configure the output dimensionality of the embeddings, even though the underlying Google AI API supports specifying the number of dimensions via the output_dimensionality parameter.

Why is this change required?
Allowing configuration of the dimensions gives users more flexibility to tailor the embeddings to their specific use cases, whether for optimizing memory usage, improving performance, or ensuring compatibility with downstream systems that expect a particular embedding size.

What problem does it solve?
It removes this limitation by exposing a dimensions parameter in the service constructors, builder methods, and API request payloads, so developers can use the Google API's output dimensionality option instead of being limited to the default embedding size.

What scenario does it contribute to?
This feature is particularly useful in scenarios where:

Description

This PR introduces support for specifying the output dimensionality in the Google AI embeddings generation workflow. The main changes include:

  • Service Constructor Update:
    The GoogleAITextEmbeddingGenerationService constructor now accepts an optional dimensions parameter, which is then forwarded to the lower-level client implementations.

  • Builder and Extension Methods:
    Extension methods such as AddGoogleAIEmbeddingGeneration have been updated to accept a dimensions parameter. This allows developers to configure the embedding dimensions using the builder pattern.

  • Request Payload Enhancement:
    The GoogleAIEmbeddingRequest class now includes a new optional property Dimensions (serialized as output_dimensionality). When provided, this value is included in the JSON payload sent to the Google AI API.

  • Metadata and Attributes Update:
    The service’s metadata now reflects the provided dimensions, ensuring consistency in configuration tracking.

  • Unit Testing:
    New unit tests have been added to confirm that:

    • When a dimensions value is provided, it is correctly included in the JSON request.
    • When not provided, the default behavior remains unchanged.

This enhancement maintains backward compatibility since the new parameter is optional. Existing implementations that do not specify a dimension will continue to work as before.
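The payload behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the connector's actual types: `EmbeddingRequestSketch`, `PayloadSketch`, the `model` field, and the `"example-model"` value are hypothetical stand-ins, and only the `output_dimensionality` name and the optional, nullable `Dimensions` property come from this description. The sketch assumes `System.Text.Json` and shows one way an optional value can be left out of the JSON when it is not set.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical stand-in for the request type; NOT the connector's actual class.
public sealed class EmbeddingRequestSketch
{
    // Illustrative field only; the real payload shape is not shown in this PR text.
    [JsonPropertyName("model")]
    public string Model { get; set; } = "";

    // Optional dimensions: written as output_dimensionality only when a value is set.
    [JsonPropertyName("output_dimensionality")]
    [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public int? Dimensions { get; set; }
}

public static class PayloadSketch
{
    // Builds the JSON for a request; dimensions defaults to null (not configured).
    public static string Serialize(string model, int? dimensions = null) =>
        JsonSerializer.Serialize(
            new EmbeddingRequestSketch { Model = model, Dimensions = dimensions });

    public static void Main()
    {
        // No dimensions: the property is omitted, so default behavior is unchanged.
        Console.WriteLine(Serialize("example-model"));
        // prints {"model":"example-model"}

        // Dimensions provided: the value is included in the payload.
        Console.WriteLine(Serialize("example-model", 128));
        // prints {"model":"example-model","output_dimensionality":128}
    }
}
```

Omitting the property when it is null, rather than sending an explicit null or zero, is what keeps the request unchanged for callers that never set a dimension.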

ArieSLV requested a review from a team as a code owner on February 11, 2025 18:35
markwallace-microsoft added the .NET, kernel, and kernel.core labels on Feb 11, 2025

ArieSLV (Contributor, Author) commented Feb 12, 2025

Hi team,
I see the Contributor License Agreement (CLA) message and understand that I need to agree to it before my submission can be processed. I've reached out to my manager for approval to sign the CLA under my employer’s name. Once I receive the necessary clearance, I will reply accordingly.
I appreciate your patience.

ArieSLV (Contributor, Author) commented Feb 16, 2025

@microsoft-github-policy-service agree

ArieSLV (Contributor, Author) commented Feb 21, 2025

Hi team,
The Contributor License Agreement issue has been resolved. Are any further actions required from me to get a review?

rogerbarreto (Member) commented Feb 28, 2025

Hi @ArieSLV, thanks for your contribution; most of it looks good so far.

  • There are some small spell check errors to fix.
  • Integration tests are missing. Please ensure we have at least one integration test (it can be marked as skipped, but it should pass in your local environment).
  • Add a sample to Concepts\Memory\Google_EmbeddingGeneration.cs for this update; it will be important for exploring this new feature, similar to what we have in OpenAI_EmbeddingGeneration.cs.

rogerbarreto added the "PR: feedback to address" label on Feb 28, 2025

ArieSLV (Contributor, Author) commented Mar 8, 2025

Hi @rogerbarreto,

Thanks for the thorough review. I've addressed all your feedback in the latest changes:

  • Added integration tests in EmbeddingGenerationTests.cs with both default and custom dimensions scenarios
  • Created a sample implementation in Concepts\Memory\Google_EmbeddingGeneration.cs showing how to use the new functionality
  • Added unit tests that verify proper request formatting with and without dimensions

Regarding the spell check errors, I believe those might have been resolved during the merge as they weren't directly related to my code changes.

ArieSLV requested a review from rogerbarreto on March 8, 2025 13:09
rogerbarreto added this pull request to the merge queue on Mar 18, 2025
Merged via the queue into microsoft:main with commit fc6c2d4 on Mar 18, 2025
20 checks passed
github-project-automation bot moved this from Community PRs to Sprint: Done in Semantic Kernel on Mar 18, 2025
Development

Successfully merging this pull request may close these issues.

.Net: New Feature: Support for configuring dimensions in Google AI embeddings generation
3 participants