[ML] Apply windowing and chunking to long documents #104363
Conversation
Pinging @elastic/ml-core (Team:ML)
Hi @davidkyle, I've created a changelog YAML for you.
Overall changes LGTM. 👍 How will chunking options work by default? Will models be deployed with default chunking options?
Left a few questions and comments
Review threads on:
- ...plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/action/ChunkedInferenceAction.java
- ...main/java/org/elasticsearch/xpack/core/ml/inference/results/ChunkedTextExpansionResults.java
- ...main/java/org/elasticsearch/xpack/core/ml/inference/results/ChunkedTextEmbeddingResults.java
- ...ugin/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportChunkedInferenceAction.java
- ...plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/nlp/TextExpansionProcessor.java
Currently I'm not sure. There are two options to consider. One is to take the defaults from the model config, but the model config doesn't have a default value of `span`. The other is that the default span could be a function of the max sequence length.
# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
Looks good, just a few questions around switching to `readOptional*()`, and I think we need a few `return` statements after `onFailure()` calls.
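To illustrate that last point (a hedged sketch; `Request`, `Response`, and `runInference()` are hypothetical stand-ins, not names from this diff): an `ActionListener` should be completed exactly once, so a failure branch that calls `onFailure()` without returning falls through and completes the listener a second time.

```java
import org.elasticsearch.action.ActionListener;

// Hypothetical handler showing why a return is needed after onFailure().
void infer(Request request, ActionListener<Response> listener) {
    if (request.inputs().isEmpty()) {
        listener.onFailure(new IllegalArgumentException("inputs must not be empty"));
        return; // without this, execution falls through and the listener is completed twice
    }
    listener.onResponse(runInference(request));
}
```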
Review threads on:
- .../core/src/main/java/org/elasticsearch/xpack/core/ml/inference/trainedmodel/Tokenization.java (two threads)
- ...ugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestInferenceServiceExtension.java
- ...erence/src/main/java/org/elasticsearch/xpack/inference/InferenceNamedWriteablesProvider.java
- ...rence/src/main/java/org/elasticsearch/xpack/inference/services/elser/ElserMlNodeService.java (two threads)
- ...ml/src/main/java/org/elasticsearch/xpack/ml/inference/deployment/InferencePyTorchAction.java
# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
The changes in elastic#105183 clashed with elastic#104363
Adds a `chunkedInfer()` method to the `InferenceService` interface which automatically splits long text before sending the inputs to the model. Chunking is done via a sliding window of length `window_size` with an overlap of `span`.

By default the window size is equal to the model's max sequence length, and the span is 50% of that (after accounting for special tokens). One reason to choose a smaller window size is that processing time grows quadratically with the number of input tokens; reducing the window size loses some context (fewer tokens per input) but may be the fastest strategy for ingesting long text.

This change only applies to the ELSER model and Text Embedding models deployed locally in the cluster.
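For illustration, here is a minimal sketch of the sliding-window splitting described above; it is an assumption-laden outline, not the PR's implementation, and `windowSize`/`span` simply mirror the `window_size` and `span` options:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of sliding-window chunking (illustrative, not the PR's code).
// Each chunk is up to windowSize tokens and overlaps the previous chunk by
// span tokens, so consecutive windows advance by windowSize - span.
class ChunkingSketch {
    static List<List<String>> chunk(List<String> tokens, int windowSize, int span) {
        if (span >= windowSize) {
            throw new IllegalArgumentException("span must be smaller than window_size");
        }
        List<List<String>> chunks = new ArrayList<>();
        int stride = windowSize - span;
        for (int start = 0; start < tokens.size(); start += stride) {
            int end = Math.min(start + windowSize, tokens.size());
            chunks.add(tokens.subList(start, end));
            if (end == tokens.size()) {
                break; // the last window reached the end of the input
            }
        }
        return chunks;
    }
}
```

With the defaults above (window equal to the max sequence length, span at 50% of it), the stride is half a window, so each token lands in at most two chunks.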
Response Structure
(Field names subject to change)