Reduce model chunk size to 1 MB #605

jonathan-buttner · 2023-09-20T15:23:16Z

This PR reduces the chunk size of the model stored in Elasticsearch from 4 MB to 1 MB. We've seen less memory pressure by using 1 MB chunks.
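For illustration, fixed-size chunking can be sketched as below. This is a minimal sketch and not eland's actual upload code: the names `CHUNK_SIZE`, `iter_chunks`, and `expected_chunk_count` are hypothetical, and it assumes "1 MB" means 1024 * 1024 bytes.

```python
import math
from typing import Iterator

# Assumption: "1 MB" is taken to mean 1024 * 1024 bytes.
CHUNK_SIZE = 1024 * 1024


def iter_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `chunk_size` bytes."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def expected_chunk_count(total_bytes: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunks needed to hold `total_bytes` (the last may be partial)."""
    return math.ceil(total_bytes / chunk_size)
```

Every chunk except possibly the last is exactly `chunk_size` bytes, so a smaller chunk size means more documents, each of which is smaller.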

Part of issue: elastic/elasticsearch#99409

Related PR: elastic/elasticsearch#99677

Testing

I tested this by importing a model of roughly 300 MB and extracting the binary_definition field from a document, which produces a file containing the base64 contents. The file is around 1 MB:

docker run -it --rm --network host elastic/eland \
    eland_import_hub_model \
      --url http://elastic:password@host.docker.internal:9200/ \
      --hub-model-id sentence-transformers/all-distilroberta-v1 \
      --clear-previous

2023-09-20 14:56:34,232 INFO : Creating model with id 'sentence-transformers__all-distilroberta-v1'
2023-09-20 14:56:34,854 INFO : Uploading model definition
100%|███████████████████████████████████████████████████████████████████████████| 312/312 [00:12<00:00, 24.15 parts/s]
2023-09-20 14:56:47,776 INFO : Uploading model vocabulary
2023-09-20 14:56:47,957 INFO : Model successfully imported with id 'sentence-transformers__all-distilroberta-v1'
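The size check on the extracted field could be scripted roughly as follows. This is a sketch under assumptions: `definition_sizes` is a hypothetical helper, and `source` stands for the `_source` of one definition document that has already been fetched. Only the `binary_definition` field name comes from the test described above.

```python
import base64
from typing import Tuple


def definition_sizes(source: dict) -> Tuple[int, int]:
    """Return (base64 text length, decoded byte length) for one chunk document.

    `source` is assumed to be the `_source` of a definition document that
    holds its chunk as base64 text in the `binary_definition` field.
    """
    encoded = source["binary_definition"]
    decoded = base64.b64decode(encoded)
    return len(encoded), len(decoded)
```

Base64 encodes every 3 bytes as 4 characters, so the encoded text is about one third larger than the raw chunk it represents.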

When searching for the chunks there were ~300 documents, which means we are correctly storing the model in 1 MB chunks.

GET http://localhost:9200/.ml-inference-*/_search
{
    "size": 1,
    "sort": [
        {
            "_index": "desc"
        },
        {
            "doc_num": "asc"
        }
    ],
    "_source": false,
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "model_id": {
                            "value": "sentence-transformers__all-distilroberta-v1"
                        }
                    }
                },
                {
                    "term": {
                        "doc_type": {
                            "value": "trained_model_definition_doc"
                        }
                    }
                }
            ]
        }
    }
}

Result:

{
    "took": 7,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 312, <--------------
            "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": ".ml-inference-native-000001",
                "_id": "trained_model_definition_doc-sentence-transformers__all-distilroberta-v1-0",
                "_score": null,
                "sort": [
                    ".ml-inference-native-000001",
                    0
                ]
            }
        ]
    }
}
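As a sanity check on the hit count, the arithmetic can be sketched like this. The helper names are hypothetical, and sizes are in whole megabytes for simplicity. It assumes every chunk except the last is full.

```python
import math
from typing import Tuple


def size_bounds_mb(doc_count: int, chunk_mb: int) -> Tuple[int, int]:
    """Model size range implied by `doc_count` chunks of `chunk_mb` each.

    The size is greater than the first value and at most the second,
    because only the final chunk may be partially filled.
    """
    return (doc_count - 1) * chunk_mb, doc_count * chunk_mb


def docs_for(size_mb: float, chunk_mb: int) -> int:
    """Number of chunk documents needed for a model of `size_mb`."""
    return math.ceil(size_mb / chunk_mb)


low, high = size_bounds_mb(312, 1)
print(low, high)          # 311 312
print(docs_for(high, 4))  # 78
```

With 312 hits at 1 MB per chunk, the model definition is between 311 MB and 312 MB, which matches the roughly 300 MB model that was imported. At the previous 4 MB chunk size the same model would occupy 78 documents.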

droberts195

LGTM

Setting chunk size to 1mb

7b7cf04

jonathan-buttner added the enhancement and topic:NLP labels Sep 20, 2023

droberts195 approved these changes Sep 20, 2023

jonathan-buttner merged commit a8b76c3 into elastic:main Sep 20, 2023
4 checks passed

droberts195 mentioned this pull request Sep 22, 2023

[Ml] CircuitBreakingException when deploying the ELSER model elastic/elasticsearch#99409

Closed

droberts195 mentioned this pull request Nov 10, 2023

High memory pressure for Elasticsearch versions using JDK 20+ elastic/elasticsearch#99592

Open
