⚡️ Speed up method `_Reranker.voyageai` by 9% #100

codeflash-ai · 2025-10-23T08:09:31Z

📄 9% (0.09x) speedup for `_Reranker.voyageai` in `weaviate/collections/classes/config.py`

⏱️ Runtime : 1.88 milliseconds → 1.72 milliseconds (best of 19 runs)

📝 Explanation and details

The optimization adds a conditional check that skips the `model` keyword argument when it is `None`. This gives a 9% overall speedup (1.88 ms → 1.72 ms) by reducing Python's keyword-argument overhead.

**Key changes:**

- Added an `if model is None:` check that calls `_RerankerVoyageAIConfig()` without arguments
- `model=model` is passed only when the value is not `None`
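The shape of the change can be sketched as follows. This is a minimal, self-contained illustration, not the code in `weaviate/collections/classes/config.py`: `ConfigStub`, `voyageai_before`, and `voyageai_after` are hypothetical names standing in for `_RerankerVoyageAIConfig` and the two versions of `_Reranker.voyageai`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigStub:
    """Hypothetical stand-in for _RerankerVoyageAIConfig (assumption)."""

    model: Optional[str] = None


def voyageai_before(model: Optional[str] = None) -> ConfigStub:
    # Original shape: the keyword argument is always forwarded.
    return ConfigStub(model=model)


def voyageai_after(model: Optional[str] = None) -> ConfigStub:
    # Optimized shape: skip the keyword when it equals the default.
    if model is None:
        return ConfigStub()
    return ConfigStub(model=model)
```

With this stub, both versions return equal objects for every input, which is the property the regression tests check against the real class.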

**Why this is faster:**
In CPython, a call with keyword arguments costs more than a call without them: the interpreter has to pass the keyword names alongside the values and match them to parameters, and a callee that collects `**kwargs` also receives a freshly populated dictionary. Omitting the `model=model` keyword when `model` is `None` (which matches the default anyway) avoids that per-call work. This is a micro-optimization that only becomes measurable when the function is called frequently.
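The effect can be checked locally with a rough micro-benchmark. The sketch below uses only the standard-library `timeit` module and a hypothetical `ConfigStub` class (an assumption, not the weaviate class), so the absolute numbers will differ from the ones in this report and may vary between Python versions.

```python
import timeit


class ConfigStub:
    """Hypothetical stand-in for a config class with a default argument."""

    def __init__(self, model=None):
        self.model = model


def with_kwarg():
    # Forwards the default value explicitly as a keyword argument.
    return ConfigStub(model=None)


def without_kwarg():
    # Relies on the default instead of passing the keyword.
    return ConfigStub()


def measure(number: int = 200_000) -> dict:
    # Taking the best of several repeats reduces noise from other processes.
    return {
        "with_kwarg": min(timeit.repeat(with_kwarg, number=number, repeat=5)),
        "without_kwarg": min(timeit.repeat(without_kwarg, number=number, repeat=5)),
    }


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name}: {seconds:.4f}s")
```

Differences of a few percent are within the noise of a single run, so the comparison is only meaningful over repeated measurements.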

**Performance benefits by test case:**

- **Default/None cases** see the biggest gains (3-23% faster), e.g. `test_default_model_is_none`, `test_model_none_explicit`, `test_voyageai_stress_with_none`
- **Non-None model cases** see minimal impact (0.4-1.2% faster): the optimization only applies when `model` is `None`
- **Mixed workloads** benefit in proportion to the ratio of `None` to non-`None` calls
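To make the mixed-workload point concrete, the sketch below blends per-call timings taken from the two stress tests in this report. It assumes the annotated times are totals over each loop (500 `None` calls and 250 non-`None` calls); `blended_speedup` is a hypothetical helper written for this illustration.

```python
# Per-call timings in microseconds, derived from the stress tests in this report
# (assumption: the annotated times are totals over the whole loop):
#   None calls:     441 us / 500 calls -> 358 us / 500 calls
#   non-None calls: 236 us / 250 calls -> 233 us / 250 calls
NONE_BEFORE, NONE_AFTER = 441 / 500, 358 / 500
OTHER_BEFORE, OTHER_AFTER = 236 / 250, 233 / 250


def blended_speedup(none_fraction: float) -> float:
    """Return the fractional speedup (old / new - 1) for a given share of None calls."""
    if not 0.0 <= none_fraction <= 1.0:
        raise ValueError("none_fraction must be between 0 and 1")
    before = none_fraction * NONE_BEFORE + (1 - none_fraction) * OTHER_BEFORE
    after = none_fraction * NONE_AFTER + (1 - none_fraction) * OTHER_AFTER
    return before / after - 1


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        print(f"{share:.0%} None calls: {blended_speedup(share):.1%} faster")
```

The result moves between the non-`None` figure and the `None` figure as the share of default calls grows, so a workload that mostly passes an explicit model sees almost no change.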

The optimization is most effective when the function is frequently called with the default `None` value. The generated tests concentrate on that case, but the replay test listed in this report ran slightly slower (-1.91%), so the gain in real workloads may be smaller.

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 2058 Passed |
| ⏪ Replay Tests | ✅ 2 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

from typing import Literal, Optional, Union

# imports
import pytest
from weaviate.collections.classes.config import _Reranker

RerankerVoyageAIModel = Literal["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]

class _RerankerProvider:
    # Minimal stub for testing, simulates config object
    def __init__(self, model):
        self.model = model

class _RerankerVoyageAIConfig(_RerankerProvider):
    # Inherits from provider, used for type checking in tests
    pass

# unit tests

# -----------------------
# Basic Test Cases
# -----------------------

def test_default_model_is_none():
    """Test default call returns config with model=None."""
    codeflash_output = _Reranker.voyageai(); config = codeflash_output # 8.14μs -> 7.26μs (12.1% faster)

@pytest.mark.parametrize("model", [
    "rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"
])
def test_valid_models(model):
    """Test all valid model names are accepted."""
    codeflash_output = _Reranker.voyageai(model=model); config = codeflash_output # 19.2μs -> 19.1μs (0.429% faster)

def test_model_none_explicit():
    """Test passing model=None explicitly."""
    codeflash_output = _Reranker.voyageai(model=None); config = codeflash_output # 3.97μs -> 3.60μs (10.5% faster)

# -----------------------
# Edge Test Cases
# -----------------------

@pytest.mark.parametrize("invalid_model", [
    "",  # empty string
    "rerank-3",  # non-existent model
    "RERANK-2",  # case sensitivity
    "rerank 2",  # space instead of dash
    "rerank-lite-2",  # similar but invalid
    "rerank_2",  # underscore instead of dash
    "rerank-2!",  # special character
    "rerank-lite-1 ",  # trailing space
    " rerank-lite-1",  # leading space
    "rerank-1\n",  # newline
    "rerank",  # partial
    "lite-1",  # partial
    "rerank-2-lite-extra",  # extra suffix
])
def test_invalid_string_models_raise_valueerror(invalid_model):
    """Test that invalid string models raise ValueError."""
    with pytest.raises(ValueError):
        _Reranker.voyageai(model=invalid_model)

@pytest.mark.parametrize("non_str_model", [
    123,  # integer
    1.23,  # float
    True,  # boolean
    False,
    [],  # list
    {},  # dict
    object(),  # generic object
    b"rerank-2",  # bytes
])
def test_non_string_model_raises_typeerror(non_str_model):
    """Test that non-string, non-None model values raise TypeError."""
    with pytest.raises(TypeError):
        _Reranker.voyageai(model=non_str_model)

def test_model_is_tuple_string():
    """Test passing a tuple containing valid string raises TypeError."""
    with pytest.raises(TypeError):
        _Reranker.voyageai(model=("rerank-2",))

def test_model_is_list_string():
    """Test passing a list containing valid string raises TypeError."""
    with pytest.raises(TypeError):
        _Reranker.voyageai(model=["rerank-2"])

def test_model_is_none_type():
    """Test passing None explicitly is accepted."""
    codeflash_output = _Reranker.voyageai(model=None); config = codeflash_output # 8.20μs -> 7.73μs (6.07% faster)

# -----------------------
# Large Scale Test Cases
# -----------------------

def test_many_valid_models_in_sequence():
    """Test calling the function many times with valid models does not leak state."""
    valid_models = ["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]
    configs = []
    for i in range(250):  # 250*4 = 1000
        model = valid_models[i % 4]
        codeflash_output = _Reranker.voyageai(model=model); config = codeflash_output # 239μs -> 237μs (0.974% faster)
        configs.append(config)
    # All configs should have the correct model
    for i, config in enumerate(configs):
        assert config.model == valid_models[i % 4]


def test_performance_many_calls():
    """Test performance: calling the function 1000 times with valid and None models."""
    # No assertion on timing, just that it doesn't crash or leak
    for i in range(1000):
        if i % 2 == 0:
            codeflash_output = _Reranker.voyageai(model="rerank-2"); config = codeflash_output
        else:
            codeflash_output = _Reranker.voyageai(); config = codeflash_output

def test_all_valid_and_invalid_models_mix():
    """Test a mix of valid and invalid models in a batch."""
    valid = ["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]
    invalid = ["rerank-3", "rerank_2", "RERANK-2", "rerank-lite-2"]
    results = []
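    for i in range(500):
        # i // 2 advances by one per valid/invalid pair, so all four entries
        # of each list are exercised (i % 4 alone only ever hits two of them).
        if i % 2 == 0:
            # Valid
            m = valid[(i // 2) % 4]
            codeflash_output = _Reranker.voyageai(model=m); config = codeflash_output
            results.append(config.model)
        else:
            # Invalid
            m = invalid[(i // 2) % 4]
            with pytest.raises(ValueError):
                _Reranker.voyageai(model=m)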
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Literal, Optional, Union

# imports
import pytest
from weaviate.collections.classes.config import _Reranker

RerankerVoyageAIModel = Literal["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]

class _RerankerProvider:
    pass

class _RerankerVoyageAIConfig(_RerankerProvider):
    def __init__(self, model: Optional[Union[RerankerVoyageAIModel, str]] = None):
        # Validate model argument
        allowed = {"rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"}
        if model is not None:
            if not isinstance(model, str):
                raise TypeError("model must be a string or None")
            if model not in allowed:
                raise ValueError(f"Invalid model: {model!r}. Allowed values: {sorted(allowed)}")
        self.model = model

# unit tests

# --------------------------
# Basic Test Cases
# --------------------------

def test_voyageai_default_model():
    # Should succeed and set model to None (default)
    codeflash_output = _Reranker.voyageai(); config = codeflash_output # 7.52μs -> 6.95μs (8.22% faster)

@pytest.mark.parametrize(
    "model",
    ["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]
)
def test_voyageai_valid_models(model):
    # Should succeed for all valid model names
    codeflash_output = _Reranker.voyageai(model=model); config = codeflash_output # 19.1μs -> 19.0μs (0.590% faster)

def test_voyageai_valid_model_with_explicit_none():
    # Should succeed if model=None is passed explicitly
    codeflash_output = _Reranker.voyageai(model=None); config = codeflash_output # 3.90μs -> 3.78μs (3.10% faster)

# --------------------------
# Edge Test Cases
# --------------------------

@pytest.mark.parametrize(
    "model",
    [
        "RERANK-2",            # uppercase
        "rerank-2 ",           # trailing space
        " rerank-2",           # leading space
        "rerank2",             # missing dash
        "rerank-1.0",          # extra dot
        "rerank-lite-2",       # invalid number
        "",                    # empty string
        "rerank-2-lite-extra", # extra suffix
        "lite-1",              # partial
        "rerank",              # incomplete
    ]
)
def test_voyageai_invalid_model_strings(model):
    # Should raise ValueError for invalid model names
    with pytest.raises(ValueError):
        _Reranker.voyageai(model=model)

@pytest.mark.parametrize(
    "model",
    [123, 0.5, True, False, [], {}, object()]
)
def test_voyageai_invalid_model_types(model):
    # Should raise TypeError for non-string, non-None types
    with pytest.raises(TypeError):
        _Reranker.voyageai(model=model)

def test_voyageai_model_is_tuple():
    # Should raise TypeError if model is a tuple
    with pytest.raises(TypeError):
        _Reranker.voyageai(model=("rerank-2",))

def test_voyageai_model_is_bytes():
    # Should raise TypeError if model is bytes
    with pytest.raises(TypeError):
        _Reranker.voyageai(model=b"rerank-2")

def test_voyageai_model_is_none_string():
    # Should raise ValueError if model is string "None"
    with pytest.raises(ValueError):
        _Reranker.voyageai(model="None")

def test_voyageai_model_is_none_case_variation():
    # Should raise ValueError if model is string "none"
    with pytest.raises(ValueError):
        _Reranker.voyageai(model="none")

# --------------------------
# Large Scale Test Cases
# --------------------------

def test_voyageai_many_valid_and_invalid_models():
    # Test a mix of valid and invalid models in a loop (under 1000 iterations)
    valid_models = ["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]
    invalid_models = [f"rerank-{i}" for i in range(3, 100)]  # 97 invalid
    # All valid should succeed
    for model in valid_models:
        codeflash_output = _Reranker.voyageai(model=model); config = codeflash_output
    # All invalid should fail
    for model in invalid_models:
        with pytest.raises(ValueError):
            _Reranker.voyageai(model=model)

def test_voyageai_stress_with_none():
    # Stress test: call with None 500 times
    for _ in range(500):
        codeflash_output = _Reranker.voyageai(model=None); config = codeflash_output # 441μs -> 358μs (23.4% faster)

def test_voyageai_stress_with_valid_models():
    # Stress test: call with valid models in a loop
    valid_models = ["rerank-2", "rerank-2-lite", "rerank-lite-1", "rerank-1"]
    for i in range(250):
        model = valid_models[i % 4]
        codeflash_output = _Reranker.voyageai(model=model); config = codeflash_output # 236μs -> 233μs (1.20% faster)

⏪ Replay Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `test_pytest_testcollectiontest_batch_py_testcollectiontest_classes_generative_py_testcollectiontest_confi__replay_test_0.py::test_weaviate_collections_classes_config__Reranker_voyageai` | 12.3μs | 12.5μs | -1.91% ⚠️ |

To edit these changes, `git checkout codeflash/optimize-_Reranker.voyageai-mh355k5l` and push.

codeflash-ai bot requested a review from mashraf-222 October 23, 2025 08:09

codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 23, 2025
