Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.Net: Adding Config classes for use with IKernelBuilder and IServiceCollection #6283

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

akordowski
Copy link

@akordowski akordowski commented May 16, 2024

Motivation and Context

This PR introduce Config classes as described in the issue #6081.

Description

For this PR I was inspiered by playing with the microsoft/chat-copilot project and looking into the configuration in the microsoft/kernel-memory project. The new classes and logic are copied or inspired by the code in the kernel-memory project. Some of the properties don't match with the C# convention, but I kept them as they are to have consistent API (e.g. APIType, APIType -> by C# convention should be rather ApiType, ApiType).

With this proposed addition we will have a similar API and developer experience through the microsoft/semantic-kernel and the microsoft/kernel-memory projects.

The following classes are inspired by:

Thank you for considering of this PR. Looking forward to your feedback.

Contribution Checklist

@akordowski akordowski requested a review from a team as a code owner May 16, 2024 06:40
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel kernel.core labels May 16, 2024
@github-actions github-actions bot changed the title Adding Config classes for use with IKernelBuilder and IServiceCollection .Net: Adding Config classes for use with IKernelBuilder and IServiceCollection May 16, 2024
@akordowski
Copy link
Author

Note: I also have seen that in the OpenAIServiceCollectionExtensions class some parameters are not consistent, e.g. TokenCredential credentials and TokenCredential credential. I haven't changed it yet but I think it would be good to have consistent naming.

Copy link
Member

@dmytrostruk dmytrostruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! I left initial comments. I also think that we need some alignment between Config and Options naming, because today we have some classes that are Options:
KernelFunctionFromMethodOptions.cs

Taking into account that Options is used as .NET pattern for strongly-typed settings, I would prefer to use it for extension methods as well:

cc: @markwallace-microsoft @matthewbolanos


config.Validate();

switch (config.Auth)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid using enums for this configuration. I think we can create hierarchy of configuration classes, where base class will contain common properties like DeploymentName etc, and different derived types will contain configuration-specific properties like TokenCredential or string ApiKey. I think the main goal would be to create the same set of overloads we have today, but with specific config models. This approach will also decrease the probability of incorrect configuration when AuthType is unknown. With specific classes we won't need to throw an exception.

Otherwise, if enum value is TokenCredential, we will have to verify that the property TokenCredential was initialized and so on. The more parameters will be added in the future, the harder it will be to maintain the validation. But having specific config models should simplify the usage and the maintenance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comment below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted as requested. Please take a look.

Comment on lines 227 to 229
AzureOpenAIConfig config,
string? serviceId = null,
HttpClient? httpClient = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also want to put serviceId and httpClient to config model as nullable. If we introduce config models, I think no additional parameters should be present in the method. It will also simplify the usage, because if user already initialize config object, they won't have to pass serviceId and httpClient parameters separately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure about that but I think it is a good idea. I will move it adjust it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the ServiceId property to the config class and adjusted the methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still some places where serviceId parameter is present in extension method:
image

Also, can we do the same for HttpClient?

Comment on lines 32 to 37
public APITypes APIType { get; set; } = APITypes.ChatCompletion;

/// <summary>
/// Azure authentication type.
/// </summary>
public AuthTypes Auth { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already mentioned, it would be nice to avoid using these enums. For AuthTypes - define specific classes. For APITypes - I'm not sure we need it at all.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comment below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As requested I created specific classes for the configuration and removed the enums. Please take a look.

@akordowski
Copy link
Author

akordowski commented May 17, 2024

Thank you for your feedback!

The proposed implementation reflects the implementation of the configuration in the microsoft/kernel-memory project. As I understand it the semantic-kernel and the kernel-memory projects are in the same domain. From developers perspective I would appreciate to have a consistent way to configure things (not ‘here we do it this way and here the other way’).

I don’t know if the semantic-kernel and the kernel-memory projects are developed by the same team or different teams, but maybe the team(s) could agree onone design which could/should be used across the projects. This way we would have a consistent API and it would result in a better developer experience.

I hope I was able to explain my thoughts. What do you think?

Taking into account that Options is used as .NET pattern for strongly-typed settings, I would prefer to use it for extension methods as well:

I’m aware of that pattern (and I agree), but the OpenAIServiceCollectionExtensions already uses in the AddAzureOpenAIChatCompletion method a AzureOpenAIChatCompletionWithDataConfig config parameter. Introducing Option classes may lead to confusion and were not consistent with the current implementation.

@dmytrostruk
Copy link
Member

From developers perspective I would appreciate to have a consistent way to configure things (not ‘here we do it this way and here the other way’).

I 100% agree, but we can always improve things in both projects if needed :)
I mentioned my concerns about using enums, and I believe we can make it simpler, comparing to using a lot of switch/case statements.

I’m aware of that pattern (and I agree), but the OpenAIServiceCollectionExtensions already uses in the AddAzureOpenAIChatCompletion method a AzureOpenAIChatCompletionWithDataConfig config parameter.

This class is going to be deprecated at some point, since current version of chat completion with data functionality was added at the point when this feature wasn't available in Azure .NET SDK. Since it's available now, we are going to use it directly, which means that WithData classes will be deprecated.

Copy link
Member

@dmytrostruk dmytrostruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akordowski Thank you for contribution. A couple of notes:

  1. Based on team discussion, we would like to proceed with Options naming instead of Config.
  2. There are some issues mentioned in CI pipeline, some of them are related to missing XML documentation. Make sure that solution builds without warnings and errors locally, and all tests are passed. It would be also great to add unit testing for new option classes.

Please let us know if you feel comfortable to resolve all comments. Otherwise, it's totally fine to leave this PR as it is, and our team will continue the work to resolve the issue. Thanks again!

/// Only supported in "text-embedding-3" and later models developed with
/// MRL, see https://arxiv.org/abs/2205.13147
/// </summary>
public int? EmbeddingDimensions { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property is not applicable for chat completion endpoints. Can we organize configuration model hierarchy by modality type (e.g. chat completion, text generation, audio) instead of auth type? Then, each model could contain overload constructors which accept either apiKey or TokenCredential.

This should be more scalable, as I expect more different modalities supported with more configuration properties comparing to auth types.

/// <summary>
/// Azure OpenAI deployment name.
/// </summary>
public string Deployment { get; set; } = string.Empty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In current extension methods this parameter has a name deploymentName. I would keep the same naming in config classes and current extension methods for consistency.

/// When using Auto, the client uses OpenAI model names to detect the correct protocol.
/// Most OpenAI models use Chat. If you're using a non-OpenAI model, you might want to set this manually.
/// </summary>
public TextGenerationTypes TextGenerationType { get; set; } = TextGenerationTypes.Auto;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find a place where this property and enum are used. Would it be possible to remove it?

endpoint: config.Endpoint,
credentials: config.TokenCredential!,
serviceId: config.ServiceId,
modelId: config.Deployment,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In current extension methods deploymentName and modelId properties are separate, same as on Azure side. We should keep that logic with new configuration models as well.


namespace Microsoft.SemanticKernel;

public class AzureOpenAIApiKeyConfig : AzureOpenAIConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing XML documentation. Also, every new class and method should be added as Experimental for initial period of time.

dotnet/src/SemanticKernel.Core/SemanticKernel.Core.csproj Outdated Show resolved Hide resolved
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
@akordowski
Copy link
Author

Please let us know if you feel comfortable to resolve all comments. Otherwise, it's totally fine to leave this PR as it is, and our team will continue the work to resolve the issue.

I would try to resolve the comments but the unit tests I would maybe left for the team 😆

  1. Based on team discussion, we would like to proceed with Options naming instead of Config.

I have some questions regarding this.

If we create option classes for each modality type (as you suggested above) it would lead to initialization like this:

var options = new AzureOpenAITextGenerationOptions(ApiKey / Token)
{
	DeploymentName = "...",
    ModelId = "...",
    ...
}
  • Or should the option classes be initialized only over the constructor?
  • If yes should the properties still be get/set or only get?
  • As we are going with the Options pattern
    • should I keep the Validate() method in the option class
    • or rather add property attributes to validate the object by the ConfigurationBuilder
    • or remove the validation and use constructor attribute validation instead?

The approach would also require a check for the ApiKey/Token in the extension methods in the OpenAIServiceCollectionExtensions class.

public static IKernelBuilder AddAzureOpenAITextGeneration(
    this IKernelBuilder builder,
    AzureOpenAITextGenerationOptions options)
{
    if (!string.IsNullOrWhiteSpace(options.ApiKey))
    {
        builder.Services.AddMethod(options.ApiKey))
    }
    else if (options.Token is not null)
    {
        builder.Services.AddMethod(options.Token))
    }
    else
    {
        // This case may never occour as we would have to set the auth via the constructor
        throw new InvalidOperationException("Please provide API key or Token authentication.");
    }
}

Please provide details on how the team wish it to be implemented. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kernel.core kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants