Description
Can we modify convert_texts_to_vector in https://github.com/harmonydata/harmony/blob/main/src/harmony/matching/default_matcher.py so that items are sent to the LLM in batches rather than all at once?
The batch size should be configurable.
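A rough sketch of what this could look like, to make the request concrete. It is not the existing Harmony code: the names convert_texts_in_batches, embed_fn and fake_embed are hypothetical, and embed_fn stands in for whatever call currently turns a list of texts into vectors. The default of 20 is taken from the laptop estimate in the rationale.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def convert_texts_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[Vector]],  # hypothetical: the existing embedding call
    batch_size: int = 20,  # assumed default, configurable by the caller
) -> List[Vector]:
    """Embed texts batch_size items at a time and return the vectors in input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    vectors: List[Vector] = []
    for start in range(0, len(texts), batch_size):
        # Only this slice is passed to the model, so the model never sees
        # more than batch_size items in a single call.
        batch = list(texts[start:start + batch_size])
        vectors.extend(embed_fn(batch))
    return vectors


def fake_embed(batch: List[str]) -> List[Vector]:
    # Stand-in for the real model, for illustration only:
    # one-dimensional "vectors" holding each text's length.
    return [[float(len(text))] for text in batch]


items = [f"item {i}" for i in range(45)]
vectors = convert_texts_in_batches(items, fake_embed, batch_size=20)
```

With 45 items and a batch size of 20, the embedding function is called three times (20, 20 and 5 items). A caller on a larger machine could pass a bigger batch_size to reduce the number of calls.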
Rationale
If a user wants to harmonise 10,000 items, they will not all fit in memory at once, even on a high-performance machine. Small laptops can probably only handle batches of about 20 items at a time. Batching will slow things down, though, so the batch size should be configurable, perhaps as a parameter.
People have reported that the website cannot cope with large harmonisations, e.g. the comment below on Discord (23 Oct 2024).
