@codeflash-ai codeflash-ai bot commented Oct 23, 2025

📄 62% (0.62x) speedup for _NamedVectors.none in weaviate/collections/classes/config_named_vectors.py

⏱️ Runtime: 1.79 milliseconds → 1.10 milliseconds (best of 47 runs)

📝 Explanation and details

The optimization introduces a simple but effective object creation caching strategy. Instead of creating a new _VectorizerConfigCreate(vectorizer=Vectorizers.NONE) object on every function call, the optimized version pre-creates this constant object once at module import time as _PRECREATED_NONE_VECTORIZER_CONFIG.

Key changes:

  • Added module-level constant _PRECREATED_NONE_VECTORIZER_CONFIG that's created once at import
  • Replaced inline object creation with reference to the pre-created constant

Why this speeds up the code:
The line profiler shows that creating _VectorizerConfigCreate(vectorizer=Vectorizers.NONE) took 1.77ms (44.8% of total time) in the original version. By eliminating this repeated object instantiation, the optimized version reduces this to just 188μs (7.2% of total time) - simply referencing the pre-existing object.
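A rough way to see this effect in isolation is a small `timeit` comparison. The sketch below uses a hypothetical plain-Python `VectorizerConfig` class rather than the real `_VectorizerConfigCreate`, so absolute numbers will differ from the profiler figures quoted above; it only illustrates how construction cost compares with a plain reference.

```python
import timeit


class VectorizerConfig:
    """Hypothetical stand-in for _VectorizerConfigCreate."""

    def __init__(self, vectorizer: str) -> None:
        self.vectorizer = vectorizer


# Built once, like the module-level constant described in the PR.
SHARED_CONFIG = VectorizerConfig("none")


def build_each_call() -> VectorizerConfig:
    # Pays the allocation and __init__ cost on every call.
    return VectorizerConfig("none")


def reuse_shared() -> VectorizerConfig:
    # Only returns a reference to the existing object.
    return SHARED_CONFIG


def measure(fn, number: int = 100_000, repeat: int = 5) -> float:
    # Best-of-N total time in seconds for `number` calls.
    return min(timeit.repeat(fn, number=number, repeat=repeat))


if __name__ == "__main__":
    print(f"build each call: {measure(build_each_call):.4f} s")
    print(f"reuse shared:    {measure(reuse_shared):.4f} s")
```

Taking the minimum over several repeats mirrors the "best of N runs" style of reporting used in this comment and reduces noise from other processes.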

Performance characteristics:
This optimization is particularly effective for:

  • High-frequency calls - The test with 1000 calls shows a 66.6% speedup, demonstrating the cumulative benefit
  • Simple use cases - Individual successful calls show roughly 20-54% improvements across the test scenarios
  • Calls that fail early - Calls that raise TypeError before the config is built (missing name, unexpected keyword argument) show no meaningful change (3.74% slower and 0.273% faster, respectively)

The 62% overall speedup comes from eliminating redundant object allocation. Behavior is otherwise unchanged, and sharing a single instance across calls and threads is safe as long as the vectorizer config is treated as immutable.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 1026 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from typing import Optional

# imports
import pytest  # used for our unit tests
from weaviate.collections.classes.config_named_vectors import _NamedVectors


class Vectorizers:
    NONE = "none"

class _VectorizerConfigCreate:
    def __init__(self, vectorizer):
        self.vectorizer = vectorizer

class _VectorIndexConfigCreate:
    def __init__(self, config_value=None):
        self.config_value = config_value

class _NamedVectorConfigCreate:
    def __init__(self, name, vectorizer, vector_index_config):
        self.name = name
        self.vectorizer = vectorizer
        self.vector_index_config = vector_index_config

# unit tests

# ----------- BASIC TEST CASES -----------

def test_none_basic_returns_named_vector_config():
    # Test that none returns a _NamedVectorConfigCreate object with correct attributes
    name = "my_vector"
    codeflash_output = _NamedVectors.none(name); result = codeflash_output # 12.0μs -> 9.68μs (24.2% faster)


def test_none_different_names():
    # Test with different valid names
    for n in ["a", "vector1", "123", "with_underscore", ""]:
        codeflash_output = _NamedVectors.none(n); result = codeflash_output # 19.0μs -> 14.1μs (35.2% faster)

# ----------- EDGE TEST CASES -----------

def test_none_empty_string_name():
    # Edge: name is empty string
    codeflash_output = _NamedVectors.none(""); result = codeflash_output # 5.31μs -> 3.77μs (41.0% faster)

def test_none_long_name():
    # Edge: very long name string
    long_name = "v" * 1000
    codeflash_output = _NamedVectors.none(long_name); result = codeflash_output # 5.23μs -> 3.74μs (40.0% faster)

def test_none_special_characters_in_name():
    # Edge: name contains special characters
    special_name = "!@#$%^&*()_+-=[]{}|;':,.<>/?"
    codeflash_output = _NamedVectors.none(special_name); result = codeflash_output # 4.93μs -> 3.79μs (30.3% faster)

def test_none_unicode_name():
    # Edge: name contains unicode characters
    unicode_name = "名字_🚀"
    codeflash_output = _NamedVectors.none(unicode_name); result = codeflash_output # 4.96μs -> 3.49μs (42.2% faster)

def test_none_vector_index_config_is_none_explicit():
    # Edge: vector_index_config is explicitly None
    codeflash_output = _NamedVectors.none("explicit_none", vector_index_config=None); result = codeflash_output # 5.08μs -> 3.85μs (32.2% faster)


def test_none_name_is_numeric_string():
    # Edge: name is a numeric string
    codeflash_output = _NamedVectors.none("123456"); result = codeflash_output # 11.6μs -> 9.16μs (26.3% faster)

def test_none_name_is_whitespace():
    # Edge: name is whitespace
    codeflash_output = _NamedVectors.none("   "); result = codeflash_output # 5.97μs -> 4.88μs (22.5% faster)

# ----------- LARGE SCALE TEST CASES -----------





def test_none_missing_name_argument_raises():
    # Edge: missing required argument 'name' (should raise TypeError)
    with pytest.raises(TypeError):
        _NamedVectors.none() # 3.01μs -> 3.13μs (3.74% slower)

def test_none_unexpected_kwargs_raises():
    # Edge: unexpected keyword argument
    with pytest.raises(TypeError):
        _NamedVectors.none("test", unexpected_kwarg=123) # 1.10μs -> 1.10μs (0.273% faster)

# ----------- TYPE CHECKING TEST CASES -----------



def test_none_returns_independent_objects():
    # Ensure that multiple calls return independent objects
    codeflash_output = _NamedVectors.none("name1"); result1 = codeflash_output # 11.6μs -> 9.29μs (24.9% faster)
    codeflash_output = _NamedVectors.none("name2"); result2 = codeflash_output # 2.35μs -> 1.53μs (53.6% faster)

# ----------- REPR/STR TEST CASES (optional, if implemented) -----------

def test_none_str_repr():
    # If __str__ or __repr__ is implemented for _NamedVectorConfigCreate, test it
    codeflash_output = _NamedVectors.none("repr_test"); result = codeflash_output # 5.56μs -> 3.90μs (42.6% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Optional

# imports
import pytest
from weaviate.collections.classes.config_named_vectors import _NamedVectors


class Vectorizers:
    NONE = "none"

class _VectorizerConfigCreate:
    def __init__(self, vectorizer):
        self.vectorizer = vectorizer

class _VectorIndexConfigCreate:
    def __init__(self, index_type):
        self.index_type = index_type

class _NamedVectorConfigCreate:
    def __init__(self, name, vectorizer, vector_index_config):
        self.name = name
        self.vectorizer = vectorizer
        self.vector_index_config = vector_index_config

# unit tests

# ----------- BASIC TEST CASES -----------

def test_none_basic_name_only():
    # Test with only the required name argument
    codeflash_output = _NamedVectors.none("my_vector"); result = codeflash_output # 5.36μs -> 3.82μs (40.4% faster)


def test_none_name_empty_string():
    # Test with empty string as name
    codeflash_output = _NamedVectors.none(""); result = codeflash_output # 11.4μs -> 9.27μs (22.6% faster)

def test_none_name_special_characters():
    # Test with special characters in name
    special_name = "!@#$%^&*()_+-=[]{}|;':,.<>/?"
    codeflash_output = _NamedVectors.none(special_name); result = codeflash_output # 6.05μs -> 4.61μs (31.2% faster)

# ----------- EDGE TEST CASES -----------





def test_none_name_unicode():
    # Test with unicode characters in name
    unicode_name = "名前"
    codeflash_output = _NamedVectors.none(unicode_name); result = codeflash_output # 11.3μs -> 9.38μs (20.0% faster)

def test_none_vector_index_config_is_none_explicit():
    # Test with vector_index_config explicitly set to None
    codeflash_output = _NamedVectors.none("vec", vector_index_config=None); result = codeflash_output # 6.37μs -> 5.18μs (22.9% faster)


def test_none_large_name():
    # Test with a very large name string (1000 characters)
    large_name = "v" * 1000
    codeflash_output = _NamedVectors.none(large_name); result = codeflash_output # 11.1μs -> 9.17μs (21.6% faster)



def test_none_performance_many_calls():
    # Test performance by making many calls (not exceeding 1000)
    for i in range(1000):
        codeflash_output = _NamedVectors.none(f"vec_{i}"); result = codeflash_output # 1.63ms -> 978μs (66.6% faster)

# ----------- ADDITIONAL EDGE CASES -----------

def test_none_name_whitespace():
    # Test with whitespace string as name
    codeflash_output = _NamedVectors.none("   "); result = codeflash_output # 7.15μs -> 5.54μs (29.1% faster)

def test_none_name_long_unicode():
    # Test with long unicode string
    long_unicode = "名" * 500
    codeflash_output = _NamedVectors.none(long_unicode); result = codeflash_output # 4.97μs -> 3.70μs (34.4% faster)



#------------------------------------------------
from weaviate.collections.classes.config_named_vectors import _NamedVectors
from weaviate.collections.classes.config_vector_index import VectorIndexType
from weaviate.collections.classes.config_vectorizers import Vectorizers

def test__NamedVectors_none():
    _NamedVectors.none('', vector_index_config=None)


To edit these changes, run git checkout codeflash/optimize-_NamedVectors.none-mh2vm719, commit your changes, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 23, 2025 03:42
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 23, 2025