⚡️ Speed up method _CollectionConfigCreate.validate_vector_names by 13%
#102
📄 13% (0.13x) speedup for `_CollectionConfigCreate.validate_vector_names` in `weaviate/collections/classes/config.py`
⏱️ Runtime: 81.0 microseconds → 71.5 microseconds (best of 61 runs)
📝 Explanation and details
The optimization replaces an O(n²) duplicate detection algorithm with an O(n) single-pass approach.
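As a rough illustration of the difference between the two approaches, here is a minimal sketch. The function names `find_duplicates_quadratic` and `find_duplicates_single_pass` are hypothetical and are not taken from `config.py`; only `names`, `seen`, and `dups` come from this description, and treating `seen` and `dups` as sets is an assumption.

```python
from typing import Iterable, List, Set


def find_duplicates_quadratic(names: List[str]) -> Set[str]:
    """O(n^2): list.count() rescans the whole list for every element."""
    return {name for name in names if names.count(name) > 1}


def find_duplicates_single_pass(names: Iterable[str]) -> Set[str]:
    """O(n) on average: one pass, using set lookups for membership checks."""
    seen: Set[str] = set()  # names encountered so far (assumed to be a set)
    dups: Set[str] = set()  # names encountered more than once
    for name in names:
        if name in seen:
            dups.add(name)
        else:
            seen.add(name)
    return dups


if __name__ == "__main__":
    sample = ["title", "body", "title", "summary", "body"]
    # Both approaches should report the same duplicates.
    print(sorted(find_duplicates_quadratic(sample)))
    print(sorted(find_duplicates_single_pass(sample)))
```

Both functions return the same set of duplicated names; the single-pass version avoids rescanning the list, which matters more as the number of names grows.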
Key Change:
- Original: `names.count(name) > 1` inside a set comprehension, which calls `count()` for each name, requiring a full list scan for every element
- Optimized: two auxiliary collections (`seen` and `dups`) track visited names and duplicates in one pass

Why it's faster:
- `list.count()` is O(n) and gets called for each of the n elements

Performance characteristics:
✅ Correctness verification report:
⏪ Replay Tests and Runtime
`test_pytest_testcollectiontest_batch_py_testcollectiontest_classes_generative_py_testcollectiontest_confi__replay_test_0.py::test_weaviate_collections_classes_config__CollectionConfigCreate_validate_vector_names`

To edit these changes, `git checkout codeflash/optimize-_CollectionConfigCreate.validate_vector_names-mh35zunc` and push.