
@codeflash-ai codeflash-ai bot commented Oct 23, 2025

📄 11% (0.11x) speedup for extract_deepgram_headers in src/deepgram/extensions/core/telemetry_events.py

⏱️ Runtime: 1.18 milliseconds → 1.07 milliseconds (best of 241 runs)

📝 Explanation and details

The optimization eliminates a redundant `key.lower()` call by storing the lowercased key in a variable, `lkey`, and reusing it. In the original code, each matching `x-dg-` header triggers `key.lower()` twice: once for the `startswith` check and again when the key is stored in the dictionary. The optimized version calls `key.lower()` exactly once per iteration and reuses the result.
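The function body is not shown in this comment, so the following is only a minimal sketch of the change described above, reconstructed from the explanation and the generated tests. The names `extract_original`, `extract_cached_lower`, and `PREFIX` are hypothetical and the real implementation may differ in detail.

```python
from typing import Any, Dict, Mapping, Optional

PREFIX = "x-dg-"  # prefix named in the explanation; the constant name is an assumption


def extract_original(headers: Optional[Mapping[str, Any]]) -> Optional[Dict[str, str]]:
    """Assumed shape of the pre-optimization code: lower() runs twice per match."""
    if not headers:
        return None
    found: Dict[str, str] = {}
    for key, value in headers.items():
        if key.lower().startswith(PREFIX):
            found[key.lower()] = str(value)  # second lower() on the same key
    return found or None


def extract_cached_lower(headers: Optional[Mapping[str, Any]]) -> Optional[Dict[str, str]]:
    """Same logic, but the lowercased key is computed once and reused."""
    if not headers:
        return None
    found: Dict[str, str] = {}
    for key, value in headers.items():
        lkey = key.lower()  # single lower() per iteration
        if lkey.startswith(PREFIX):
            found[lkey] = str(value)
    return found or None
```

Both versions return the same mapping for the same input. Only the number of `lower()` calls on matching keys changes, which is why the saving grows with the number of `x-dg-` headers.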

Additionally, the code binds `str.startswith` to a local name, `lower_startswith`, to avoid a repeated attribute lookup in the tight loop, which gives a small but measurable performance gain when processing many headers.
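A sketch of what localizing the method could look like, assuming the loop structure described in the explanation. The function name `extract_localized` is hypothetical; only `lkey` and `lower_startswith` come from the text above.

```python
from typing import Any, Dict, Mapping, Optional


def extract_localized(headers: Optional[Mapping[str, Any]]) -> Optional[Dict[str, str]]:
    """Sketch: look up str.startswith once, then call it through a local name."""
    if not headers:
        return None
    found: Dict[str, str] = {}
    lower_startswith = str.startswith  # attribute lookup happens once, outside the loop
    for key, value in headers.items():
        lkey = key.lower()
        # Calling the unbound method with the string as first argument is
        # equivalent to lkey.startswith("x-dg-").
        if lower_startswith(lkey, "x-dg-"):
            found[lkey] = str(value)
    return found or None
```

How much this saves depends on the interpreter version, so it is worth measuring on the target runtime rather than assuming a fixed gain.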

These optimizations matter most for inputs with many `x-dg-` headers, where the savings from the avoided `lower()` call accumulate. In the annotated tests, the largest speedups (typically 11-19%) occur in large-scale scenarios with hundreds of `x-dg-` headers, while basic cases with just a few headers improve by about 2-7%. Inputs with no `x-dg-` headers can run slightly slower (roughly 1-12% in the small cases below), because the extra variable assignment is pure overhead there, but the overall speedup of roughly 10-11% (1.18 ms → 1.07 ms) shows a clear benefit for the primary use case.
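To reproduce the trend described above locally, a small `timeit` harness along these lines could be used. It is an illustrative sketch, not Codeflash's benchmarking method: the two functions are assumed stand-ins for the original and optimized code, and `best_time` and `report` are hypothetical helpers. Absolute numbers will vary by machine and Python version.

```python
import timeit


def two_lower_calls(headers):
    """Stand-in for the original code (assumed shape)."""
    found = {}
    for key, value in headers.items():
        if key.lower().startswith("x-dg-"):
            found[key.lower()] = str(value)
    return found or None


def one_lower_call(headers):
    """Stand-in for the optimized code (assumed shape)."""
    found = {}
    for key, value in headers.items():
        lkey = key.lower()
        if lkey.startswith("x-dg-"):
            found[lkey] = str(value)
    return found or None


def best_time(func, headers, repeat=5, number=100):
    """Best per-call time in seconds over `repeat` rounds of `number` calls."""
    return min(timeit.repeat(lambda: func(headers), repeat=repeat, number=number)) / number


# Inputs mirroring the large-scale generated tests.
many = {f"X-DG-Key{i}": f"val{i}" for i in range(1000)}
none = {f"Other-Key{i}": f"val{i}" for i in range(1000)}


def report():
    for label, headers in (("all x-dg-", many), ("no x-dg-", none)):
        before = best_time(two_lower_calls, headers)
        after = best_time(one_lower_call, headers)
        print(f"{label}: {before * 1e6:.1f}us -> {after * 1e6:.1f}us")


if __name__ == "__main__":
    report()
```

Taking the minimum over several rounds reduces noise from other processes, which matters here because several of the per-test differences are only a few percent.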

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 44 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 2 Passed |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from __future__ import annotations

from typing import Dict, Mapping

# imports
import pytest  # used for our unit tests
from deepgram.extensions.core.telemetry_events import extract_deepgram_headers

# unit tests

# -------------------------
# 1. Basic Test Cases
# -------------------------

def test_basic_single_xdg_header():
    # Single x-dg- header present
    headers = {"X-DG-Test": "value"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.38μs -> 1.32μs (4.39% faster)

def test_basic_multiple_xdg_headers():
    # Multiple x-dg- headers present
    headers = {
        "X-DG-Alpha": "a",
        "X-DG-Beta": "b",
        "Content-Type": "application/json"
    }
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.78μs -> 1.67μs (7.02% faster)

def test_basic_no_xdg_headers():
    # No x-dg- headers present
    headers = {"Content-Type": "application/json", "Authorization": "Bearer xyz"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.15μs -> 1.20μs (4.42% slower)

def test_basic_none_input():
    # None input should return None
    codeflash_output = extract_deepgram_headers(None); result = codeflash_output # 306ns -> 317ns (3.47% slower)

def test_basic_empty_dict():
    # Empty dict input should return None
    codeflash_output = extract_deepgram_headers({}); result = codeflash_output # 332ns -> 337ns (1.48% slower)

def test_basic_case_insensitivity():
    # Header keys with different cases
    headers = {"x-DG-Header": "foo", "X-dg-Another": "bar"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.59μs -> 1.51μs (5.31% faster)

def test_basic_non_string_values():
    # Non-string values should be converted to string
    headers = {"X-DG-Int": 123, "X-DG-Bool": True}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.74μs -> 1.70μs (2.48% faster)

# -------------------------
# 2. Edge Test Cases
# -------------------------

def test_edge_header_with_only_xdg_prefix():
    # Header key is exactly 'x-dg-'
    headers = {"x-dg-": "empty"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.28μs -> 1.21μs (5.85% faster)

def test_edge_header_with_spaces():
    # Header key with spaces after prefix
    headers = {"X-DG- Test": "spaced"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.26μs -> 1.21μs (4.48% faster)

def test_edge_header_with_special_characters():
    # Header key with special characters after prefix
    headers = {"X-DG-@!$": "special"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.24μs -> 1.19μs (4.13% faster)

def test_edge_header_with_empty_value():
    # Header with empty string value
    headers = {"X-DG-Empty": ""}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.23μs -> 1.19μs (3.10% faster)

def test_edge_header_with_none_value():
    # Header with None value should be converted to 'None'
    headers = {"X-DG-None": None}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.43μs -> 1.36μs (5.37% faster)

def test_edge_header_with_leading_trailing_spaces():
    # Header key and value with leading/trailing spaces
    headers = {"  X-DG-LeadTrail  ": "  spaced  "}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.04μs -> 1.15μs (9.46% slower)

def test_edge_mixed_types_in_headers():
    # Mixed types in header values
    headers = {"X-DG-Num": 42, "X-DG-List": [1, 2], "X-DG-Dict": {"a": 1}}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 3.59μs -> 3.56μs (0.956% faster)


def test_edge_header_value_is_bytes():
    # Header value is bytes, should be stringified
    headers = {"X-DG-Bytes": b"bytes"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.88μs -> 1.80μs (4.90% faster)

# -------------------------
# 3. Large Scale Test Cases
# -------------------------

def test_large_scale_all_xdg_headers():
    # 1000 x-dg- headers, all should be extracted
    headers = {f"X-DG-Key{i}": f"val{i}" for i in range(1000)}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 146μs -> 128μs (13.8% faster)
    expected = {f"x-dg-key{i}": f"val{i}" for i in range(1000)}

def test_large_scale_mixed_headers():
    # 500 x-dg- headers, 500 other headers, only x-dg- should be extracted
    headers = {f"X-DG-Key{i}": f"val{i}" for i in range(500)}
    headers.update({f"Other-Key{i}": f"otherval{i}" for i in range(500)})
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 103μs -> 92.9μs (11.6% faster)
    expected = {f"x-dg-key{i}": f"val{i}" for i in range(500)}

def test_large_scale_no_xdg_headers():
    # 1000 non-x-dg- headers, should return None
    headers = {f"Non-DG-Key{i}": f"val{i}" for i in range(1000)}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 65.0μs -> 64.1μs (1.42% faster)

def test_large_scale_some_xdg_headers_with_non_string_values():
    # 500 x-dg- headers with integer values
    headers = {f"X-DG-Int{i}": i for i in range(500)}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 86.8μs -> 73.0μs (18.9% faster)
    expected = {f"x-dg-int{i}": str(i) for i in range(500)}

def test_large_scale_performance():
    # Exercise a larger input: 999 x-dg- headers plus 999 other headers
    headers = {f"X-DG-Key{i}": f"val{i}" for i in range(999)}
    headers.update({f"Other-Key{i}": f"otherval{i}" for i in range(999)})
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 203μs -> 182μs (11.3% faster)
    expected = {f"x-dg-key{i}": f"val{i}" for i in range(999)}

# -------------------------
# Miscellaneous/Mutation-Resistant Tests
# -------------------------

def test_mutation_resistant_key_case():
    # If function doesn't lowercase keys, this will fail
    headers = {"X-DG-MixedCase": "abc"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.46μs -> 1.43μs (2.24% faster)

def test_mutation_resistant_value_stringification():
    # If function doesn't stringify values, this will fail
    headers = {"X-DG-Int": 100}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.39μs -> 1.34μs (4.19% faster)

def test_mutation_resistant_no_false_positive():
    # If function includes non-x-dg- headers, this will fail
    headers = {"X-DG-Yes": "y", "Not-DG": "n"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 1.47μs -> 1.52μs (3.35% slower)

def test_mutation_resistant_empty_result():
    # If function returns {} instead of None when no x-dg- headers
    headers = {"Not-XDG": "foo"}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 972ns -> 1.08μs (10.1% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from __future__ import annotations

from typing import Dict, Mapping

# imports
import pytest  # used for our unit tests
from deepgram.extensions.core.telemetry_events import extract_deepgram_headers

# unit tests

# -------------------------------
# Basic Test Cases
# -------------------------------

def test_none_input_returns_none():
    # Test when headers is None
    codeflash_output = extract_deepgram_headers(None) # 301ns -> 330ns (8.79% slower)

def test_empty_dict_returns_none():
    # Test when headers is empty dict
    codeflash_output = extract_deepgram_headers({}) # 327ns -> 333ns (1.80% slower)

def test_no_xdg_headers_returns_none():
    # Test when headers has no x-dg- headers
    headers = {'content-type': 'application/json', 'authorization': 'Bearer abc'}
    codeflash_output = extract_deepgram_headers(headers) # 1.27μs -> 1.44μs (11.6% slower)

def test_single_xdg_header():
    # Test when headers has a single x-dg- header
    headers = {'x-dg-request-id': '12345', 'content-type': 'application/json'}
    expected = {'x-dg-request-id': '12345'}
    codeflash_output = extract_deepgram_headers(headers) # 1.56μs -> 1.58μs (1.20% slower)

def test_multiple_xdg_headers():
    # Test when headers has multiple x-dg- headers
    headers = {
        'x-dg-request-id': '12345',
        'x-dg-response-time': '100ms',
        'content-type': 'application/json'
    }
    expected = {
        'x-dg-request-id': '12345',
        'x-dg-response-time': '100ms'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.78μs -> 1.69μs (5.57% faster)

def test_header_case_insensitivity():
    # Test that keys are matched case-insensitively and stored as lowercase
    headers = {
        'X-DG-Request-ID': 'abc',
        'X-dg-response-time': '200ms',
        'x-DG-foo': 'bar',
        'Content-Type': 'application/json'
    }
    expected = {
        'x-dg-request-id': 'abc',
        'x-dg-response-time': '200ms',
        'x-dg-foo': 'bar'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.92μs -> 1.79μs (7.37% faster)

# -------------------------------
# Edge Test Cases
# -------------------------------

def test_xdg_header_with_non_string_value():
    # Test that non-string values are converted to strings
    headers = {
        'x-dg-int': 123,
        'x-dg-bool': True,
        'x-dg-none': None,
        'content-type': 'application/json'
    }
    expected = {
        'x-dg-int': '123',
        'x-dg-bool': 'True',
        'x-dg-none': 'None'
    }
    codeflash_output = extract_deepgram_headers(headers) # 2.13μs -> 2.07μs (3.24% faster)

def test_xdg_header_with_empty_string_value():
    # Test that x-dg- headers with empty string values are included
    headers = {
        'x-dg-empty': '',
        'x-dg-nonempty': 'value'
    }
    expected = {
        'x-dg-empty': '',
        'x-dg-nonempty': 'value'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.52μs -> 1.42μs (6.76% faster)

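def test_xdg_header_with_whitespace_key():
    # Test that a key with leading whitespace is not matched; trailing whitespace
    # does not affect the prefix check, so that key still matches
    headers = {
        ' x-dg-foo': 'bar',  # should NOT match (leading space breaks the prefix)
        'x-dg-bar ': 'baz',  # matches: the key still starts with 'x-dg-'
        'x-dg-baz': 'qux'    # should match
    }
    expected = {
        'x-dg-bar ': 'baz',
        'x-dg-baz': 'qux'
    }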
    codeflash_output = extract_deepgram_headers(headers) # 1.66μs -> 1.63μs (1.91% faster)

def test_xdg_header_with_mixed_prefix():
    # Test that 'x-dgish' is not matched, while 'x-dg--foo' is: it still starts with 'x-dg-'
    headers = {
        'x-dgish': 'nope',
        'x-dg--foo': 'nope',
        'x-dg-foo': 'yes'
    }
    expected = {
        'x-dg--foo': 'nope',
        'x-dg-foo': 'yes'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.59μs -> 1.62μs (1.91% slower)

def test_xdg_header_with_unicode_characters():
    # Test that unicode in keys and values is handled correctly
    headers = {
        'X-DG-Ünicode': 'välüe',
        'x-dg-normal': 'ascii'
    }
    expected = {
        'x-dg-ünicode': 'välüe',
        'x-dg-normal': 'ascii'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.96μs -> 1.77μs (10.5% faster)

def test_xdg_header_with_numeric_key():
    # Test keys like 'x-dg-123' are matched
    headers = {
        'x-dg-123': 'num',
        'x-dg-': 'empty'
    }
    expected = {
        'x-dg-123': 'num',
        'x-dg-': 'empty'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.46μs -> 1.47μs (1.22% slower)

def test_xdg_header_with_duplicate_keys_different_case():
    # Test that keys differing only in case are merged to lowercase and last value wins
    headers = {
        'X-DG-FOO': 'bar',
        'x-dg-foo': 'baz'
    }
    expected = {
        'x-dg-foo': 'baz'
    }
    codeflash_output = extract_deepgram_headers(headers) # 1.55μs -> 1.54μs (0.716% faster)


def test_xdg_header_with_only_xdg_headers():
    # Test when all headers are x-dg- headers
    headers = {
        'x-dg-a': '1',
        'x-dg-b': '2',
        'x-dg-c': '3'
    }
    expected = {
        'x-dg-a': '1',
        'x-dg-b': '2',
        'x-dg-c': '3'
    }
    codeflash_output = extract_deepgram_headers(headers) # 2.00μs -> 2.10μs (4.86% slower)

def test_xdg_header_with_headers_object():
    # Test with a Mapping type other than dict (simulate e.g. HTTP headers object)
    class DummyHeaders(Mapping):
        def __init__(self, items):
            self._items = items
        def __getitem__(self, key):
            return self._items[key]
        def __iter__(self):
            return iter(self._items)
        def __len__(self):
            return len(self._items)
        def items(self):
            return self._items.items()
    headers = DummyHeaders({'X-DG-FOO': 'bar', 'Content-Type': 'json'})
    expected = {'x-dg-foo': 'bar'}
    codeflash_output = extract_deepgram_headers(headers) # 2.22μs -> 2.19μs (1.05% faster)

# -------------------------------
# Large Scale Test Cases
# -------------------------------

def test_large_number_of_headers():
    # Test with a large number of headers, only some are x-dg-
    headers = {}
    # Add 500 non-xdg headers
    for i in range(500):
        headers[f'header-{i}'] = str(i)
    # Add 500 x-dg- headers
    for i in range(500):
        headers[f'x-dg-header-{i}'] = f'value-{i}'
    # Expected: Only the 500 x-dg- headers, all lowercased
    expected = {f'x-dg-header-{i}': f'value-{i}' for i in range(500)}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 105μs -> 97.7μs (7.98% faster)

def test_large_number_of_xdg_headers_with_duplicates():
    # Test with duplicate x-dg- headers differing in case; last value wins
    headers = {}
    for i in range(500):
        headers[f'X-DG-HEADER-{i}'] = f'value-upper-{i}'
        headers[f'x-dg-header-{i}'] = f'value-lower-{i}'
    expected = {f'x-dg-header-{i}': f'value-lower-{i}' for i in range(500)}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 135μs -> 121μs (11.5% faster)

def test_large_number_of_xdg_headers_with_varied_types():
    # Test with varied value types for large number of x-dg- headers
    headers = {}
    for i in range(250):
        headers[f'x-dg-int-{i}'] = i
    for i in range(250, 500):
        headers[f'x-dg-bool-{i}'] = (i % 2 == 0)
    expected = {f'x-dg-int-{i}': str(i) for i in range(250)}
    expected.update({f'x-dg-bool-{i}': str(i % 2 == 0) for i in range(250, 500)})
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 80.5μs -> 69.3μs (16.2% faster)

def test_large_scale_no_xdg_headers():
    # Test with a large number of headers, none are x-dg-
    headers = {f'header-{i}': str(i) for i in range(1000)}
    codeflash_output = extract_deepgram_headers(headers) # 61.9μs -> 61.5μs (0.632% faster)

def test_large_scale_all_xdg_headers():
    # Test with a large number of x-dg- headers only
    headers = {f'x-dg-header-{i}': f'value-{i}' for i in range(1000)}
    expected = {f'x-dg-header-{i}': f'value-{i}' for i in range(1000)}
    codeflash_output = extract_deepgram_headers(headers); result = codeflash_output # 142μs -> 124μs (14.8% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from deepgram.extensions.core.telemetry_events import extract_deepgram_headers

def test_extract_deepgram_headers():
    extract_deepgram_headers({'': ''})

def test_extract_deepgram_headers_2():
    extract_deepgram_headers({})
🔎 Concolic Coverage Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `codeflash_concolic_d0k9fm5y/tmpn19_883j/test_concolic_coverage.py::test_extract_deepgram_headers` | 1.13μs | 1.23μs | -8.03%⚠️ |
| `codeflash_concolic_d0k9fm5y/tmpn19_883j/test_concolic_coverage.py::test_extract_deepgram_headers_2` | 357ns | 349ns | 2.29%✅ |

To edit these changes, `git checkout codeflash/optimize-extract_deepgram_headers-mh2vmh36` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 23, 2025 03:42
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Oct 23, 2025