Conversation

@codeflash-ai codeflash-ai bot commented Oct 21, 2025

📄 14% (0.14x) speedup for parse_stream_helper in src/together/abstract/api_requestor.py

⏱️ Runtime: 1.68 milliseconds → 1.47 milliseconds (best of 169 runs)

📝 Explanation and details

The optimized code achieves a 14% speedup by replacing expensive string method calls with faster slice operations and direct byte comparisons.

Key optimizations (a minimal before/after sketch follows this list):

  1. Replaced startswith() with slice comparison: Changed line.startswith(b"data:") to line[:5] == b"data:". Slice operations are faster than method calls since they avoid function call overhead.

  2. Eliminated redundant startswith() call: The original code called startswith() twice - once to check for "data:" and again for "data: ". The optimized version uses a single length check len(line) > 5 combined with slice comparison line[5:6] == b" " to detect the space variant.

  3. Direct slice indexing instead of len() calculations: Replaced line[len(b"data: "):] and line[len(b"data:"):] with direct indexing line[6:] and line[5:]. This avoids computing string lengths at runtime.
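
For concreteness, here is a minimal sketch of what the before/after might look like, reconstructed from the description above and from the observable test behavior; it is not the verbatim source of `src/together/abstract/api_requestor.py`, and the `[DONE]` handling (case-insensitive, whitespace-tolerant) is inferred from the tests.

```python
from typing import Optional

# Hypothetical reconstruction for illustration only -- names and exact logic are
# inferred from the PR description and the generated tests, not copied from the source.

def parse_stream_helper_original(line: Optional[bytes]) -> Optional[str]:
    if line and line.startswith(b"data:"):
        if line.startswith(b"data: "):
            # Space after the colon: drop the 6-byte prefix.
            line = line[len(b"data: "):]
        else:
            # No space after the colon: drop the 5-byte prefix.
            line = line[len(b"data:"):]
        if line.strip().upper() == b"[DONE]":
            return None
        return line.decode("utf-8")
    return None


def parse_stream_helper_optimized(line: Optional[bytes]) -> Optional[str]:
    if line and line[:5] == b"data:":
        # One length check plus a one-byte slice replaces the second startswith() call.
        if len(line) > 5 and line[5:6] == b" ":
            line = line[6:]
        else:
            line = line[5:]
        if line.strip().upper() == b"[DONE]":
            return None
        return line.decode("utf-8")
    return None
```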

Performance impact by test type:

  • Empty/invalid inputs: 25-45% faster (e.g., non-data prefixes, empty lines) because slice comparison fails faster than startswith()
  • Valid data lines: 6-20% faster, with biggest gains on short payloads where the parsing overhead dominates
  • [DONE] detection: 14-16% faster due to the eliminated second startswith() call
  • Large payloads: Smaller but consistent 3-10% improvements since parsing overhead is proportionally less significant

The optimization is most effective for workloads with many short data lines or frequent invalid/edge cases, which is typical for streaming API response parsing.
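
The per-call numbers above come from the generated tests; as a sanity check of where the savings originate, a micro-benchmark of just the prefix check can be run locally (an illustrative sketch; timings vary by machine and Python build, and it isolates only the slice-vs-`startswith()` difference, not the whole function):

```python
# Illustrative micro-benchmark of the prefix check in isolation.
import timeit

valid = b"data: hello world"
invalid = b"notdata: hello world"

for name, stmt in [
    ("startswith, valid", lambda: valid.startswith(b"data:")),
    ("slice cmp, valid", lambda: valid[:5] == b"data:"),
    ("startswith, invalid", lambda: invalid.startswith(b"data:")),
    ("slice cmp, invalid", lambda: invalid[:5] == b"data:"),
]:
    # 1M iterations each; the slice comparison avoids the method-call overhead.
    print(f"{name:>20}: {timeit.timeit(stmt, number=1_000_000):.3f}s")
```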

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 4597 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

# imports
import pytest  # used for our unit tests
from together.abstract.api_requestor import parse_stream_helper

# unit tests

# ---------------------------
# Basic Test Cases
# ---------------------------

def test_basic_data_with_space():
    # Basic valid input with space after 'data: '
    codeflash_output = parse_stream_helper(b"data: hello world") # 1.73μs -> 1.62μs (6.59% faster)

def test_basic_data_without_space():
    # Basic valid input without space after 'data:'
    codeflash_output = parse_stream_helper(b"data:hello world") # 1.52μs -> 1.48μs (2.91% faster)

def test_basic_data_with_leading_spaces_in_content():
    # Content with leading spaces after 'data: '
    codeflash_output = parse_stream_helper(b"data:    hello world") # 1.60μs -> 1.45μs (10.6% faster)

def test_basic_data_empty_content_with_space():
    # 'data: ' followed by nothing
    codeflash_output = parse_stream_helper(b"data: ") # 1.39μs -> 1.34μs (3.96% faster)

def test_basic_data_empty_content_without_space():
    # 'data:' followed by nothing
    codeflash_output = parse_stream_helper(b"data:") # 1.44μs -> 1.15μs (24.6% faster)

def test_basic_done_with_space():
    # '[DONE]' with space after 'data: '
    codeflash_output = parse_stream_helper(b"data: [DONE]") # 1.44μs -> 1.25μs (14.9% faster)

def test_basic_done_without_space():
    # '[DONE]' without space after 'data:'
    codeflash_output = parse_stream_helper(b"data:[DONE]") # 1.42μs -> 1.25μs (13.6% faster)

def test_basic_done_lowercase():
    # '[done]' in lowercase
    codeflash_output = parse_stream_helper(b"data: [done]") # 1.35μs -> 1.17μs (15.7% faster)

def test_basic_done_mixed_case():
    # '[DoNe]' in mixed case
    codeflash_output = parse_stream_helper(b"data: [DoNe]") # 1.31μs -> 1.13μs (16.2% faster)

def test_basic_done_with_leading_trailing_spaces():
    # '[DONE]' with leading and trailing spaces
    codeflash_output = parse_stream_helper(b"data:    [DONE]   ") # 1.43μs -> 1.25μs (14.3% faster)

def test_basic_non_data_prefix():
    # Line does not start with 'data:'
    codeflash_output = parse_stream_helper(b"notdata: hello world") # 625ns -> 485ns (28.9% faster)

def test_basic_empty_line():
    # Empty input
    codeflash_output = parse_stream_helper(b"") # 291ns -> 271ns (7.38% faster)

def test_basic_none_line():
    # None input (should be handled gracefully)
    codeflash_output = parse_stream_helper(None) # 280ns -> 252ns (11.1% faster)

# ---------------------------
# Edge Test Cases
# ---------------------------

def test_edge_data_with_only_prefix_and_spaces():
    # 'data:' followed by only spaces
    codeflash_output = parse_stream_helper(b"data:    ") # 1.78μs -> 1.53μs (16.2% faster)

def test_edge_data_with_tab_after_prefix():
    # 'data:' followed by a tab character
    codeflash_output = parse_stream_helper(b"data:\thello") # 1.60μs -> 1.41μs (13.2% faster)

def test_edge_data_with_newline_in_content():
    # Content contains a newline character
    codeflash_output = parse_stream_helper(b"data: line1\nline2") # 1.56μs -> 1.39μs (12.8% faster)

def test_edge_data_with_unicode_content():
    # Content contains unicode characters
    codeflash_output = parse_stream_helper("data: café".encode("utf-8")) # 2.13μs -> 2.01μs (5.97% faster)

def test_edge_data_with_non_utf8_bytes():
    # Content contains non-UTF8 bytes (should raise)
    with pytest.raises(UnicodeDecodeError):
        parse_stream_helper(b"data: \xff\xfe") # 3.89μs -> 3.75μs (3.84% faster)

def test_edge_data_with_multiple_spaces_after_prefix():
    # Multiple spaces after 'data: '
    codeflash_output = parse_stream_helper(b"data:      hello") # 1.62μs -> 1.47μs (10.4% faster)

def test_edge_data_with_trailing_newline():
    # Trailing newline after content
    codeflash_output = parse_stream_helper(b"data: hello world\n") # 1.55μs -> 1.39μs (11.5% faster)

def test_edge_data_with_only_prefix():
    # Only 'data:' and nothing else
    codeflash_output = parse_stream_helper(b"data:") # 1.40μs -> 1.18μs (19.4% faster)

def test_edge_data_with_only_prefix_and_tab():
    # Only 'data:' and a tab
    codeflash_output = parse_stream_helper(b"data:\t") # 1.51μs -> 1.36μs (11.2% faster)

def test_edge_done_with_extra_content():
    # '[DONE]' with extra content after
    codeflash_output = parse_stream_helper(b"data: [DONE] extra") # 1.53μs -> 1.38μs (10.9% faster)

def test_edge_done_with_embedded_done():
    # '[DONE]' embedded in content (should not match)
    codeflash_output = parse_stream_helper(b"data: something [DONE] else") # 1.53μs -> 1.36μs (12.5% faster)

def test_edge_data_with_whitespace_before_prefix():
    # Whitespace before 'data:' (should not match)
    codeflash_output = parse_stream_helper(b"   data: hello") # 636ns -> 508ns (25.2% faster)

def test_edge_data_with_crlf_line_endings():
    # Content with CRLF line endings
    codeflash_output = parse_stream_helper(b"data: hello\r\nworld") # 1.50μs -> 1.34μs (12.1% faster)

def test_edge_data_with_only_colon():
    # Only 'data:' and colon
    codeflash_output = parse_stream_helper(b"data:") # 1.46μs -> 1.15μs (27.3% faster)

def test_edge_data_with_only_colon_and_space():
    # Only 'data: ' and space
    codeflash_output = parse_stream_helper(b"data: ") # 1.35μs -> 1.28μs (5.45% faster)

def test_edge_data_with_extra_colons():
    # Extra colons in prefix
    codeflash_output = parse_stream_helper(b"data:: something") # 1.53μs -> 1.37μs (11.3% faster)

def test_edge_data_with_leading_non_ascii():
    # Non-ascii before 'data:' (should not match)
    codeflash_output = parse_stream_helper(b"\xe2\x98\x83data: hello") # 635ns -> 505ns (25.7% faster)

def test_edge_data_with_long_prefix():
    # Prefix longer than 'data:' (should not match)
    codeflash_output = parse_stream_helper(b"data:data: hello") # 1.54μs -> 1.44μs (6.65% faster)

# ---------------------------
# Large Scale Test Cases
# ---------------------------

def test_large_scale_many_lines():
    # Many valid lines, all should parse correctly
    for i in range(1000):
        s = f"data: line {i}".encode("utf-8")
        codeflash_output = parse_stream_helper(s) # 360μs -> 320μs (12.4% faster)

def test_large_scale_many_done_lines():
    # Many '[DONE]' lines, all should return None
    for i in range(1000):
        s = b"data: [DONE]"
        codeflash_output = parse_stream_helper(s) # 330μs -> 287μs (14.7% faster)

def test_large_scale_long_content():
    # Very long content after 'data: '
    long_content = "x" * 1000
    codeflash_output = parse_stream_helper(b"data: " + long_content.encode("utf-8")) # 2.74μs -> 2.50μs (9.67% faster)

def test_large_scale_long_content_without_space():
    # Very long content after 'data:'
    long_content = "y" * 1000
    codeflash_output = parse_stream_helper(b"data:" + long_content.encode("utf-8")) # 2.47μs -> 2.31μs (6.80% faster)

def test_large_scale_mixed_valid_and_done_lines():
    # Mix valid and '[DONE]' lines
    for i in range(500):
        codeflash_output = parse_stream_helper(f"data: line {i}".encode("utf-8")) # 184μs -> 164μs (12.2% faster)
        codeflash_output = parse_stream_helper(b"data: [DONE]")

def test_large_scale_all_edge_cases():
    # Mix of edge cases and basic cases
    for i in range(250):
        # Basic valid
        codeflash_output = parse_stream_helper(f"data: {i}".encode("utf-8")) # 93.4μs -> 83.3μs (12.1% faster)
        # Empty content
        codeflash_output = parse_stream_helper(b"data: ")
        # Prefix only
        codeflash_output = parse_stream_helper(b"data:") # 88.6μs -> 76.9μs (15.2% faster)
        # Lowercase '[done]'
        codeflash_output = parse_stream_helper(b"data: [done]")
        # Non-data prefix
        codeflash_output = parse_stream_helper(f"notdata: {i}".encode("utf-8")) # 85.7μs -> 67.6μs (26.7% faster)
        # Embedded '[DONE]'
        codeflash_output = parse_stream_helper(f"data: something [DONE] {i}".encode("utf-8"))
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest
from together.abstract.api_requestor import parse_stream_helper

# =========================
# Basic Test Cases
# =========================

def test_basic_valid_data_with_space():
    # Should extract and decode the string after "data: "
    codeflash_output = parse_stream_helper(b"data: Hello, world!") # 1.69μs -> 1.60μs (5.50% faster)

def test_basic_valid_data_without_space():
    # Should extract and decode the string after "data:"
    codeflash_output = parse_stream_helper(b"data:Hello, world!") # 1.51μs -> 1.33μs (14.2% faster)

def test_basic_done_with_space():
    # Should return None for "[DONE]" (case-insensitive), with space
    codeflash_output = parse_stream_helper(b"data: [DONE]") # 1.37μs -> 1.20μs (14.8% faster)

def test_basic_done_without_space():
    # Should return None for "[DONE]" (case-insensitive), without space
    codeflash_output = parse_stream_helper(b"data:[DONE]") # 1.42μs -> 1.23μs (15.8% faster)

def test_basic_done_lowercase():
    # Should return None for "[done]" in lowercase
    codeflash_output = parse_stream_helper(b"data: [done]") # 1.35μs -> 1.18μs (14.9% faster)

def test_basic_done_mixed_case():
    # Should return None for "[DoNe]" in mixed case
    codeflash_output = parse_stream_helper(b"data: [DoNe]") # 1.27μs -> 1.09μs (16.0% faster)

def test_basic_data_with_leading_and_trailing_spaces():
    # Should preserve leading/trailing spaces after "data: "
    codeflash_output = parse_stream_helper(b"data:   foo bar   ") # 1.62μs -> 1.38μs (17.7% faster)

def test_basic_data_with_empty_payload():
    # Should return empty string if payload is empty after "data: "
    codeflash_output = parse_stream_helper(b"data: ") # 1.47μs -> 1.26μs (16.2% faster)

def test_basic_data_with_empty_payload_no_space():
    # Should return empty string if payload is empty after "data:"
    codeflash_output = parse_stream_helper(b"data:") # 1.42μs -> 1.11μs (27.2% faster)

# =========================
# Edge Test Cases
# =========================

def test_edge_empty_bytes():
    # Should return None for empty bytes
    codeflash_output = parse_stream_helper(b"") # 303ns -> 290ns (4.48% faster)

def test_edge_none_input():
    # Should return None for None input (should not raise)
    codeflash_output = parse_stream_helper(None) # 269ns -> 280ns (3.93% slower)

def test_edge_no_data_prefix():
    # Should return None if line does not start with "data:"
    codeflash_output = parse_stream_helper(b"foo: bar") # 730ns -> 506ns (44.3% faster)

def test_edge_partial_prefix():
    # Should return None if line starts with "dat:" (not "data:")
    codeflash_output = parse_stream_helper(b"dat: hello") # 665ns -> 529ns (25.7% faster)

def test_edge_data_colon_only():
    # Should return empty string if only "data:" is present
    codeflash_output = parse_stream_helper(b"data:") # 1.60μs -> 1.28μs (25.5% faster)

def test_edge_data_colon_space_only():
    # Should return empty string if only "data: " is present
    codeflash_output = parse_stream_helper(b"data: ") # 1.37μs -> 1.32μs (4.18% faster)

def test_edge_data_with_only_spaces():
    # Should return string with only spaces
    codeflash_output = parse_stream_helper(b"data:    ") # 1.54μs -> 1.43μs (7.77% faster)

def test_edge_data_with_newline():
    # Should decode newlines as part of the string
    codeflash_output = parse_stream_helper(b"data: foo\nbar") # 1.51μs -> 1.38μs (9.05% faster)

def test_edge_data_with_carriage_return():
    # Should decode carriage returns as part of the string
    codeflash_output = parse_stream_helper(b"data: foo\rbar") # 1.45μs -> 1.29μs (12.9% faster)

def test_edge_data_with_utf8_characters():
    # Should decode UTF-8 characters correctly
    codeflash_output = parse_stream_helper("data: café 😊".encode("utf-8")) # 2.53μs -> 2.20μs (14.9% faster)

def test_edge_done_with_leading_trailing_spaces():
    # Should return None if "[DONE]" is surrounded by spaces
    codeflash_output = parse_stream_helper(b"data:    [DONE]   ") # 1.46μs -> 1.27μs (15.2% faster)

def test_edge_done_with_non_ascii_case():
    # Should still match "[DONE]" even if encoded in UTF-8
    codeflash_output = parse_stream_helper("data: [DONE]".encode("utf-8")) # 1.42μs -> 1.15μs (24.0% faster)

def test_edge_data_with_extra_colons():
    # Should treat everything after the first "data:" as payload
    codeflash_output = parse_stream_helper(b"data: foo:bar:baz") # 1.52μs -> 1.28μs (19.0% faster)

def test_edge_data_with_tab_after_colon():
    # Should treat tab after colon as part of payload
    codeflash_output = parse_stream_helper(b"data:\tfoo") # 1.60μs -> 1.35μs (18.2% faster)

def test_edge_data_with_only_whitespace_after_colon():
    # Should return string with only whitespace
    codeflash_output = parse_stream_helper(b"data: \t  ") # 1.54μs -> 1.33μs (15.7% faster)

def test_edge_data_with_trailing_newline():
    # Should preserve trailing newlines in the decoded string
    codeflash_output = parse_stream_helper(b"data: foo\n") # 1.54μs -> 1.37μs (12.3% faster)

def test_edge_data_with_multiple_spaces_after_colon():
    # Should preserve all spaces after "data: "
    codeflash_output = parse_stream_helper(b"data:     foo") # 1.56μs -> 1.31μs (18.9% faster)

def test_edge_data_with_non_utf8_bytes():
    # Should raise UnicodeDecodeError for non-UTF-8 bytes
    with pytest.raises(UnicodeDecodeError):
        parse_stream_helper(b"data: \xff\xfe\xfd") # 3.85μs -> 3.86μs (0.311% slower)

# =========================
# Large Scale Test Cases
# =========================

def test_large_scale_long_payload():
    # Should handle a large payload (1000 characters)
    payload = "x" * 1000
    codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 2.66μs -> 2.56μs (3.95% faster)

def test_large_scale_many_spaces():
    # Should handle a payload with many spaces
    payload = " " * 999 + "x"
    codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 2.45μs -> 2.24μs (9.23% faster)

def test_large_scale_done_with_lots_of_spaces():
    # Should return None for "[DONE]" with lots of spaces around
    codeflash_output = parse_stream_helper(b"data:    [DONE]    ") # 1.51μs -> 1.26μs (19.7% faster)

def test_large_scale_many_lines():
    # Should correctly process multiple different large lines
    for i in range(1, 100, 10):
        payload = "y" * i
        codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 6.11μs -> 5.58μs (9.50% faster)

def test_large_scale_unicode_payload():
    # Should handle large unicode payloads
    payload = "😀" * 500  # 500 unicode emoji
    codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 4.75μs -> 4.64μs (2.33% faster)

def test_large_scale_done_in_middle_of_payload():
    # Should not return None if "[DONE]" appears in the middle of payload
    payload = "foo [DONE] bar"
    codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 1.45μs -> 1.25μs (16.3% faster)

def test_large_scale_only_done_with_extra_content():
    # Should not return None if "[DONE]" is not the entire payload (case-insensitive)
    payload = "[DONE]foo"
    codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 1.39μs -> 1.22μs (14.2% faster)

def test_large_scale_data_with_newlines():
    # Should handle payloads with many newlines
    payload = "\n".join(["abc"] * 300)
    codeflash_output = parse_stream_helper(f"data: {payload}".encode("utf-8")) # 2.68μs -> 2.51μs (6.90% faster)

# =========================
# Additional Robustness Tests
# =========================

@pytest.mark.parametrize(
    "input_bytes,expected",
    [
        (b"data: [DONE]", None),
        (b"data:[DONE]", None),
        (b"data: [done]", None),
        (b"data: [DoNe]", None),
        (b"data: [DONE]   ", None),
        (b"data:   [DONE]", None),
        (b"data:foo", "foo"),
        (b"data: foo", "foo"),
        (b"data:foo bar", "foo bar"),
        (b"data:  foo bar", " foo bar"),
        (b"data: [DONE]foo", "[DONE]foo"),
        (b"data:foo [DONE]", "foo [DONE]"),
        (b"data:foo [done]", "foo [done]"),
        (b"data:", ""),
        (b"data: ", ""),
        (b"data:   ", "  "),
        (b"", None),
        (None, None),
        (b"foo: bar", None),
        (b"dat: bar", None),
    ]
)
def test_parametrized_cases(input_bytes, expected):
    # Parametrized test for a wide range of basic and edge cases
    codeflash_output = parse_stream_helper(input_bytes) # 26.6μs -> 22.8μs (16.6% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from together.abstract.api_requestor import parse_stream_helper

def test_parse_stream_helper():
    parse_stream_helper(b' ')
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_atws5rsq/tmpzs_bqcbj/test_concolic_coverage.py::test_parse_stream_helper | 933ns | 594ns | 57.1% ✅ |

To edit these changes, `git checkout codeflash/optimize-parse_stream_helper-mgzunz1a` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 21, 2025 00:52
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 21, 2025