.Net: Bug: Input name: 'token_type_ids' is not in the metadata #11526

klaus-bleck opened this issue Apr 12, 2025 · 0 comments

klaus-bleck commented Apr 12, 2025

Describe the bug
An exception is thrown when you try to use the BertOnnxTextEmbeddingGenerationService with an MPNet embedding model.

Microsoft.ML.OnnxRuntime.OnnxRuntimeException
  HResult=0x80131500
  Message=[ErrorCode:InvalidArgument] Input name: 'token_type_ids' is not in the metadata
  Source=Microsoft.ML.OnnxRuntime
  StackTrace:
    at Microsoft.ML.OnnxRuntime.InferenceSession.LookupInputMetadata(String nodeName)
    at Microsoft.ML.OnnxRuntime.InferenceSession.LookupUtf8Names[T](IReadOnlyCollection`1 values, NameExtractor`1 nameExtractor, MetadataLookup metaLookup)
    at Microsoft.ML.OnnxRuntime.InferenceSession.Run(RunOptions runOptions, IReadOnlyCollection`1 inputNames, IReadOnlyCollection`1 inputValues, IReadOnlyCollection`1 outputNames)
    at Microsoft.SemanticKernel.Connectors.Onnx.BertOnnxTextEmbeddingGenerationService.<GenerateEmbeddingsAsync>d__18.MoveNext()
    at Microsoft.SemanticKernel.Embeddings.EmbeddingGenerationExtensions.<GenerateEmbeddingAsync>d__2`2.MoveNext()
    at Program.<>c.<<<Main>$>b__0_1>d.MoveNext()

It seems that BertOnnxTextEmbeddingGenerationService always tries to fill the token_type_ids input, even when the model does not declare it, as is the case for MPNet models.
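
To illustrate the kind of check that would avoid this error, here is a minimal sketch that only feeds inputs the model actually declares. It is not the connector's code: OnnxInputFilter, SelectDeclared and BertStyleInputs are hypothetical names, and the MPNet input names in Main are assumed from the error message rather than read from the model. In a real program the declared names could be taken from InferenceSession.InputMetadata.Keys in Microsoft.ML.OnnxRuntime.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Semantic Kernel or ONNX Runtime.
public static class OnnxInputFilter
{
    // Inputs that a BERT-style tokenizer output is usually mapped to.
    public static readonly string[] BertStyleInputs =
        { "input_ids", "attention_mask", "token_type_ids" };

    // Returns only those candidate names that the model declares,
    // preserving the order of the candidates.
    public static List<string> SelectDeclared(
        IEnumerable<string> declaredInputs,
        IEnumerable<string> candidates)
    {
        var declared = new HashSet<string>(declaredInputs, StringComparer.Ordinal);
        return candidates.Where(name => declared.Contains(name)).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumption: an MPNet export declares these two inputs only.
        // With ONNX Runtime this list would come from session.InputMetadata.Keys.
        var mpnetInputs = new[] { "input_ids", "attention_mask" };

        var feed = OnnxInputFilter.SelectDeclared(mpnetInputs, OnnxInputFilter.BertStyleInputs);

        // prints "input_ids, attention_mask"
        Console.WriteLine(string.Join(", ", feed));
    }
}
```

With a filter like this, token_type_ids would simply be left out of the call to InferenceSession.Run for models that do not declare it, while BERT models that do declare it would still receive all three inputs.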

To Reproduce
Steps to reproduce the behavior:

  1. Download an ONNX model such as sentence-transformers/all-mpnet-base-v2
  2. Create a test project with a service collection and register the kernel and the embedding service (builder.Services.AddBertOnnxTextEmbeddingGeneration) for the previously downloaded model and its vocabulary
  3. Use the service (await textEmbeddingGenerationService.GenerateEmbeddingAsync("this is an example"))
  4. The exception is thrown

Expected behavior
Embedding generation should either just work or be configurable for such models, without the need to re-export the model with the additional input.

Screenshots
Image

Platform

  • Language: C#
  • Source: Microsoft.SemanticKernel.Connectors.Onnx 1.45.0-alpha
  • AI model: sentence-transformers/all-mpnet-base-v2
  • IDE: VS Community 2022
  • OS: Windows
klaus-bleck added the bug label on Apr 12, 2025
markwallace-microsoft added the .NET and triage labels on Apr 12, 2025
github-actions bot changed the title from "Bug: Input name: 'token_type_ids' is not in the metadata" to ".Net: Bug: Input name: 'token_type_ids' is not in the metadata" on Apr 12, 2025