@DonalEvans
Contributor

The name "text embedding" is used in many places where dense vector embeddings are handled, despite the type of the embedding vector not being exclusive to text embeddings. For example, image or multimodal embeddings may also produce a dense vector. To allow future reuse of classes related to dense vectors with multimodal embeddings, the naming is being changed to the more general "dense embedding". Classes which explicitly relate to text embeddings are not being renamed.

This rename is internal to the code only and does not change the name of any JSON objects which currently use "text_embedding", as doing so would be a breaking change.

  • For everything not exclusively related to text embedding, rename classes, methods and variables to use "dense embedding" instead of "text embedding"
  • Use correct class name in ElasticTextEmbeddingPayload.TextEmbeddingFloat.PARSER
  • Correct the javadoc in DenseEmbeddingBitResults

@DonalEvans DonalEvans added >refactoring :ml Machine learning Team:ML Meta label for the ML team v9.3.0 labels Oct 9, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Comment on lines -202 to +203
- private static final ConstructingObjectParser<TextEmbeddingFloatResults, Void> PARSER = new ConstructingObjectParser<>(
- TextEmbeddingByteResults.class.getSimpleName(),
+ private static final ConstructingObjectParser<DenseEmbeddingFloatResults, Void> PARSER = new ConstructingObjectParser<>(
+ DenseEmbeddingFloatResults.class.getSimpleName(),
Contributor Author

This parser was previously constructed with the wrong class name (TextEmbeddingByteResults instead of the float results class), so any error encountered while parsing would have reported the wrong class.
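To illustrate why the name passed to the parser matters, here is a minimal sketch. `NamedParser`, `FloatResults`, `ByteResults` and the `values` field are hypothetical stand-ins, not Elasticsearch's `ConstructingObjectParser` or the real results classes; the only point is that the name given at construction time is what ends up in the error message.

```java
import java.util.Map;

// Hypothetical stand-in for a named parser; NOT Elasticsearch's ConstructingObjectParser.
final class NamedParser {
    private final String name;

    NamedParser(String name) {
        this.name = name;
    }

    // Fails with a message that includes the parser's name when the required field is missing.
    float[] parse(Map<String, float[]> fields) {
        float[] values = fields.get("values");
        if (values == null) {
            throw new IllegalArgumentException("[" + name + "] failed to parse: missing field [values]");
        }
        return values;
    }
}

// Hypothetical result types used only to supply class names.
final class FloatResults {}

final class ByteResults {}

final class ParserNameDemo {
    // Builds a parser named after the given class and returns the error it reports for empty input.
    static String errorFor(Class<?> nameSource) {
        NamedParser parser = new NamedParser(nameSource.getSimpleName());
        try {
            parser.parse(Map.of());
            return "no error";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }
}
```

In this sketch, `ParserNameDemo.errorFor(ByteResults.class)` blames `ByteResults` even if the parser is actually producing float results, which is the kind of misleading message the fix avoids.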

@DonalEvans DonalEvans requested a review from davidkyle October 9, 2025 22:25
Member

@davidkyle davidkyle left a comment

LGTM

namedWriteables.add(new NamedWriteableRegistry.Entry(InferenceResults.class, TextExpansionResults.NAME, TextExpansionResults::new));
namedWriteables.add(
- new NamedWriteableRegistry.Entry(InferenceResults.class, MlTextEmbeddingResults.NAME, MlTextEmbeddingResults::new)
+ new NamedWriteableRegistry.Entry(InferenceResults.class, MlDenseEmbeddingResults.NAME, MlDenseEmbeddingResults::new)
Member

++ this is why NAME can't change
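A rough sketch of the point being made here: readers are looked up by the registered name string, so the class can be renamed freely as long as the value of `NAME` stays the same. `MiniRegistry` is a hypothetical miniature, not Elasticsearch's `NamedWriteableRegistry`, and the `NAME` value below is an assumed placeholder rather than the real constant.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical miniature registry; NOT Elasticsearch's NamedWriteableRegistry.
final class MiniRegistry {
    private final Map<String, Function<String, Object>> readers = new HashMap<>();

    void register(String name, Function<String, Object> reader) {
        readers.put(name, reader);
    }

    // Looks up the reader by the name that was written alongside the payload.
    Object read(String name, String payload) {
        Function<String, Object> reader = readers.get(name);
        if (reader == null) {
            throw new IllegalArgumentException("unknown named object [" + name + "]");
        }
        return reader.apply(payload);
    }
}

// Renamed class in this sketch; the NAME value is an assumed placeholder kept from before the rename.
final class MlDenseEmbeddingResultsSketch {
    static final String NAME = "text_embedding_result";
    final String payload;

    MlDenseEmbeddingResultsSketch(String payload) {
        this.payload = payload;
    }
}

final class RegistryDemo {
    static MiniRegistry build() {
        MiniRegistry registry = new MiniRegistry();
        registry.register(MlDenseEmbeddingResultsSketch.NAME, MlDenseEmbeddingResultsSketch::new);
        return registry;
    }
}
```

Under these assumptions, a payload written under the old name still resolves to the renamed class, while a payload written under a changed name would fail with an unknown-name error on any node that only knows the old one.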

+ embedding.embeddings().get(0).getClass().getName()
- + ". Expected TextEmbeddingFloatResults.Embedding or TextEmbeddingByteResults.Embedding."
+ + ". Expected "
+ + DenseEmbeddingFloatResults.Embedding.class.getName()
Member

getSimpleName() returns the class name without the package prefix and is a bit more readable

Suggested change
- + DenseEmbeddingFloatResults.Embedding.class.getName()
+ + DenseEmbeddingFloatResults.Embedding.class.getSimpleName()
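For reference, a small sketch of the difference between the two calls on a nested class. `DenseFloatResultsSketch` is a hypothetical stand-in for the real results class; only `Class.getName()` and `Class.getSimpleName()` are real JDK methods.

```java
// Hypothetical stand-in for a results class with a nested Embedding type.
final class DenseFloatResultsSketch {
    static final class Embedding {}
}

final class ClassNameDemo {
    // Binary name: package prefix (if any), enclosing class, '$', then the nested class.
    static String fullName() {
        return DenseFloatResultsSketch.Embedding.class.getName();
    }

    // Simple name: just the nested class's own identifier.
    static String simpleName() {
        return DenseFloatResultsSketch.Embedding.class.getSimpleName();
    }
}
```

One thing worth keeping in mind: for a nested class the simple name drops the enclosing class as well as the package, so float and byte `Embedding` types would both print as `Embedding` in a message that uses `getSimpleName()`.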

"Validation call did not return expected results type."
+ "Expected a result of type ["
- + TextEmbeddingFloatResults.NAME
+ + DenseEmbeddingFloatResults.NAME
Member

Suggested change
- + DenseEmbeddingFloatResults.NAME
+ + DenseEmbeddingResults.NAME

Use the name without the specific element type

Contributor Author

DenseEmbeddingResults is an interface and doesn't have a NAME constant, so I'll use the simple class name instead.
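A minimal sketch of that approach, assuming (as stated above) that the interface declares no `NAME` constant. `DenseResultsSketch`, `FloatResultsImpl` and `ExpectedTypeMessage` are hypothetical names, and the exact wording of the message, including the "got [...]" part, is an assumption rather than the text used in the PR.

```java
// Hypothetical interface with no NAME constant, standing in for DenseEmbeddingResults.
interface DenseResultsSketch {}

// Hypothetical implementation; the NAME value is a placeholder and is not used in the message.
final class FloatResultsImpl implements DenseResultsSketch {
    static final String NAME = "float_results_placeholder";
}

final class ExpectedTypeMessage {
    // Names the expected type via the interface's simple class name rather than a NAME constant.
    static String build(Class<?> expected, Object actual) {
        String actualName = actual == null ? "null" : actual.getClass().getSimpleName();
        return "Validation call did not return expected results type. "
            + "Expected a result of type ["
            + expected.getSimpleName()
            + "] got ["
            + actualName
            + "]";
    }
}
```

Calling `ExpectedTypeMessage.build(DenseResultsSketch.class, result)` then names the general dense results type without committing to float or byte elements.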

@DonalEvans DonalEvans requested a review from davidkyle October 13, 2025 15:30
@DonalEvans DonalEvans merged commit a0f415d into elastic:main Oct 17, 2025
35 checks passed
@DonalEvans DonalEvans deleted the rename-text-embeddings-to-dense-embeddings branch October 17, 2025 15:21