@codeflash-ai codeflash-ai bot commented Nov 12, 2025

📄 140% (1.40x) speedup for Resources.components_for in src/bokeh/resources.py

⏱️ Runtime : 511 microseconds → 213 microseconds (best of 250 runs)

📝 Explanation and details

The optimization replaces a list comprehension that tests membership against a list with one that tests membership against a set.

Key optimization: In the `components_for` method, instead of checking `comp in self._component_defs[kind]` for each component (which requires a linear search through the list), the code now:

  1. Pre-converts the target list to a set: `kind_set = set(kind_comps)`
  2. Uses set membership testing: `comp in kind_set` (O(1) average case vs O(m) for a list of m definitions)
  3. Adds an early exit: returns an empty list immediately when `kind_comps` is empty
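
The three steps above can be sketched as standalone functions. This is a minimal illustration rather than the PR's actual diff: `COMPONENT_DEFS`, `components_for_list`, and `components_for_set` are hypothetical names, and the JS component names are copied from the generated tests further down.

```python
# Hypothetical stand-in for Resources._component_defs. The JS names come from
# the generated tests; the CSS list is empty, as the description states.
COMPONENT_DEFS = {
    "js": ["bokeh", "bokeh-gl", "bokeh-widgets", "bokeh-tables", "bokeh-mathjax"],
    "css": [],
}


def components_for_list(components, kind):
    """Original approach: a list membership test for every component."""
    return [comp for comp in components if comp in COMPONENT_DEFS[kind]]


def components_for_set(components, kind):
    """Optimized approach: early exit, then set membership tests."""
    kind_comps = COMPONENT_DEFS[kind]  # still raises KeyError for unknown kinds
    if not kind_comps:
        return []  # nothing can match an empty definitions list
    kind_set = set(kind_comps)  # built once per call
    return [comp for comp in components if comp in kind_set]


requested = ["bokeh-widgets", "not-a-component", "bokeh", "bokeh"]
print(components_for_set(requested, "js"))   # ['bokeh-widgets', 'bokeh', 'bokeh']
print(components_for_set(requested, "css"))  # []
```

Both versions preserve input order and duplicates. One behavioral difference worth keeping in mind: set membership requires hashable components, whereas list membership only needs equality, so an unhashable entry would raise `TypeError` in the set-based version.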

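Why this is faster: Set membership testing in Python is O(1) on average because it uses hash lookups, while list membership testing scans the list linearly. For the filtering operation, this changes the overall complexity from O(n*m) to O(n+m), where n is the number of requested components and m is the size of the component definitions list for the given kind. In the generated tests the JS definitions list has five entries, so m is small and the gain on the 1,000-element inputs is a constant factor of roughly 1.8x to 3.7x.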
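
To make the O(n*m) versus O(n+m) claim concrete, the sketch below counts operations explicitly instead of timing them. It is an illustrative model only: the function names are hypothetical, the linear search is written out by hand to mimic what a list membership test does, and each set insertion or lookup is counted as one operation under the average-case assumption.

```python
def filter_with_linear_search(components, allowed):
    """Filter like `comp in allowed_list`, counting equality comparisons."""
    comparisons = 0
    kept = []
    for comp in components:
        for candidate in allowed:
            comparisons += 1
            if comp == candidate:
                kept.append(comp)
                break
    return kept, comparisons


def filter_with_set(components, allowed):
    """Filter with a set, counting one hash operation per insert or lookup."""
    allowed_set = set(allowed)
    hash_ops = len(allowed)  # one insertion per definitions entry
    kept = []
    for comp in components:
        hash_ops += 1  # one average-case O(1) lookup
        if comp in allowed_set:
            kept.append(comp)
    return kept, hash_ops


allowed = ["bokeh", "bokeh-gl", "bokeh-widgets", "bokeh-tables", "bokeh-mathjax"]
components = ["not-a-component"] * 1000  # worst case: every search scans all of `allowed`

_, comparisons = filter_with_linear_search(components, allowed)
_, hash_ops = filter_with_set(components, allowed)
print(comparisons)  # 5000 = n * m with n = 1000, m = 5
print(hash_ops)     # 1005 = n + m
```

Operation counts are not wall-clock time: a hash lookup and a string comparison have different constant costs, which is one reason the measured speedups are smaller than the 5x ratio of these counts.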

Performance impact: The generated tests show this optimization is most effective for:

  • Large-scale filtering: 177% faster for a 1,000-element list of 500 valid and 500 invalid components (49.2μs to 17.8μs), and 271% faster for 1,000 copies of one invalid component (45.5μs to 12.3μs)
  • CSS operations: up to 7472% faster (24.0μs to 317ns), since the CSS component definitions are empty and trigger the early exit
  • Mixed valid/invalid scenarios: 160% faster when filtering a large mixed list (44.5μs to 17.1μs)

The optimization trades a small upfront cost of set creation for savings during the filtering loop, which pays off when `components_for` is called with large component lists. For small lists the set-construction cost dominates: every small-input JS case in the generated tests below runs slower, by about 2% to 42%.
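
The trade-off can be checked locally with a small `timeit` harness. This is a rough sketch, not Codeflash's benchmark: `ALLOWED`, `with_list`, `with_set`, and `best_time` are hypothetical names, and absolute numbers will vary by machine and Python version, so no expected output is shown.

```python
import timeit

# Assumed JS definitions list, copied from the generated tests.
ALLOWED = ["bokeh", "bokeh-gl", "bokeh-widgets", "bokeh-tables", "bokeh-mathjax"]


def with_list(components):
    """Filter using list membership (no setup cost, O(m) per check)."""
    return [c for c in components if c in ALLOWED]


def with_set(components):
    """Filter using a set built on every call (setup cost, O(1) per check)."""
    allowed_set = set(ALLOWED)
    return [c for c in components if c in allowed_set]


def best_time(func, components, number=500, repeat=5):
    """Best per-call time in seconds across several timeit repeats."""
    runs = timeit.repeat(lambda: func(components), number=number, repeat=repeat)
    return min(runs) / number


for label, components in [("small (2 items)", ALLOWED[:2]),
                          ("large (1000 items)", ALLOWED * 200)]:
    t_list = best_time(with_list, components)
    t_set = best_time(with_set, components)
    print(f"{label}: list={t_list * 1e6:.2f}us set={t_set * 1e6:.2f}us")
```

If the small-input overhead matters in practice, one option (not part of this PR) would be to build the sets once rather than on every call.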

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 86 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest
from bokeh.resources import Resources

# unit tests

# 1. Basic Test Cases

def test_basic_js_default_components():
    # Should return all default JS components in order
    r = Resources()
    expected = ["bokeh", "bokeh-gl", "bokeh-widgets", "bokeh-tables", "bokeh-mathjax"]
    codeflash_output = r.components_for("js") # 1.68μs -> 1.72μs (2.33% slower)

def test_basic_css_default_components():
    # Should return empty list for css kind, as no default CSS components
    r = Resources()
    codeflash_output = r.components_for("css") # 1.34μs -> 547ns (145% faster)

def test_basic_js_custom_subset():
    # Should filter out only valid JS components from a custom list
    r = Resources(components=["bokeh", "bokeh-api"])
    codeflash_output = r.components_for("js") # 1.35μs -> 1.64μs (17.8% slower)

def test_basic_js_custom_with_invalid():
    # Should ignore components not in _component_defs for js
    r = Resources(components=["bokeh", "not-a-component", "bokeh-widgets"])
    codeflash_output = r.components_for("js") # 1.41μs -> 1.63μs (13.8% slower)

def test_basic_css_custom_with_invalid():
    # Should always return empty list for css kind, even with invalid components
    r = Resources(components=["bokeh", "not-a-component", "bokeh-widgets"])
    codeflash_output = r.components_for("css") # 1.15μs -> 515ns (123% faster)

def test_basic_empty_components():
    # Should return empty list for any kind if no components given
    r = Resources(components=[])
    codeflash_output = r.components_for("js") # 781ns -> 1.36μs (42.4% slower)
    codeflash_output = r.components_for("css") # 331ns -> 296ns (11.8% faster)

# 2. Edge Test Cases

def test_edge_all_invalid_components():
    # All components are invalid, should return empty list
    r = Resources(components=["foo", "bar", "baz"])
    codeflash_output = r.components_for("js") # 1.32μs -> 1.42μs (6.56% slower)
    codeflash_output = r.components_for("css") # 577ns -> 316ns (82.6% faster)

def test_edge_mixed_case_components():
    # Components are case-sensitive, so "BOKEH" should not match "bokeh"
    r = Resources(components=["BOKEH", "bokeh"])
    codeflash_output = r.components_for("js") # 1.18μs -> 1.47μs (20.0% slower)

def test_edge_duplicate_components():
    # Duplicates should be preserved in output if present in input
    r = Resources(components=["bokeh", "bokeh", "bokeh-gl"])
    codeflash_output = r.components_for("js") # 1.29μs -> 1.52μs (15.2% slower)

def test_edge_empty_string_component():
    # Empty string is not a valid component, should be ignored
    r = Resources(components=["", "bokeh"])
    codeflash_output = r.components_for("js") # 1.17μs -> 1.39μs (15.7% slower)

def test_edge_none_as_component():
    # None is not a valid component, should be ignored
    r = Resources(components=[None, "bokeh"])
    # Filter out None from input, as None is not a string
    expected = ["bokeh"]
    codeflash_output = r.components_for("js"); result = codeflash_output # 1.26μs -> 1.39μs (9.54% slower)

def test_edge_numeric_component():
    # Numeric values are not valid components, should be ignored
    r = Resources(components=[123, "bokeh"])
    # Filter out non-str components
    expected = ["bokeh"]
    codeflash_output = r.components_for("js"); result = codeflash_output # 1.28μs -> 1.43μs (10.8% slower)

def test_edge_order_preservation():
    # Output order should match input order for valid components
    r = Resources(components=["bokeh-widgets", "bokeh-gl", "bokeh"])
    codeflash_output = r.components_for("js") # 1.32μs -> 1.63μs (18.9% slower)

def test_edge_components_for_invalid_kind():
    # Should raise KeyError for unknown kind
    r = Resources()
    with pytest.raises(KeyError):
        r.components_for("not-a-kind") # 1.35μs -> 759ns (77.3% faster)

# 3. Large Scale Test Cases

def test_large_all_valid_js_components():
    # Use all valid JS components repeated up to 1000 times
    valid_js = Resources._component_defs["js"]
    large_list = valid_js * (1000 // len(valid_js))
    r = Resources(components=large_list)
    codeflash_output = r.components_for("js") # 41.7μs -> 18.7μs (123% faster)
    codeflash_output = r.components_for("css") # 24.2μs -> 356ns (6699% faster)

def test_large_mixed_valid_invalid_components():
    # Mix valid and invalid components in a large list
    valid_js = Resources._component_defs["js"]
    invalids = ["foo", "bar", "baz"]
    large_list = (valid_js + invalids) * (1000 // (len(valid_js) + len(invalids)))
    r = Resources(components=large_list)
    # Only valid JS components should be returned, in order and with correct count
    expected = []
    for group in range(1000 // (len(valid_js) + len(invalids))):
        expected.extend(valid_js)
    codeflash_output = r.components_for("js") # 44.5μs -> 17.1μs (160% faster)

def test_large_all_invalid_components():
    # Large list of only invalid components
    large_list = ["not-a-component"] * 1000
    r = Resources(components=large_list)
    codeflash_output = r.components_for("js") # 45.5μs -> 12.3μs (271% faster)
    codeflash_output = r.components_for("css") # 24.0μs -> 317ns (7472% faster)

def test_large_empty_components():
    # Large scale: empty input should always return empty
    r = Resources(components=[])
    codeflash_output = r.components_for("js") # 713ns -> 1.15μs (38.2% slower)
    codeflash_output = r.components_for("css") # 335ns -> 311ns (7.72% faster)

def test_large_duplicates():
    # Large list with many duplicates of a valid component
    large_list = ["bokeh"] * 1000
    r = Resources(components=large_list)
    codeflash_output = r.components_for("js") # 32.6μs -> 18.3μs (77.9% faster)

def test_large_performance():
    # Performance: Should not take excessive time for 1000 elements
    import time
    large_list = Resources._component_defs["js"] * (1000 // len(Resources._component_defs["js"]))
    r = Resources(components=large_list)
    start = time.time()
    codeflash_output = r.components_for("js"); result = codeflash_output # 41.3μs -> 18.0μs (130% faster)
    end = time.time()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from bokeh.resources import Resources

# unit tests

# ---------------------------
# Basic Test Cases
# ---------------------------

def test_basic_js_components_default():
    """Test that the default JS components are returned correctly."""
    r = Resources()
    expected = ["bokeh", "bokeh-gl", "bokeh-widgets", "bokeh-tables", "bokeh-mathjax"]
    codeflash_output = r.components_for("js") # 1.43μs -> 1.53μs (6.40% slower)

def test_basic_js_components_custom_subset():
    """Test that a custom subset of JS components is returned correctly."""
    r = Resources(components=["bokeh", "bokeh-gl"])
    codeflash_output = r.components_for("js") # 1.22μs -> 1.48μs (17.9% slower)

def test_basic_js_components_order_preserved():
    """Test that the order of components is preserved."""
    r = Resources(components=["bokeh-widgets", "bokeh", "bokeh-gl"])
    codeflash_output = r.components_for("js") # 1.29μs -> 1.47μs (12.1% slower)

def test_basic_css_components_default():
    """Test that no CSS components are returned when none are defined."""
    r = Resources()
    codeflash_output = r.components_for("css") # 1.33μs -> 496ns (168% faster)

def test_basic_empty_components():
    """Test that an empty components list returns an empty list for any kind."""
    r = Resources(components=[])
    codeflash_output = r.components_for("js") # 843ns -> 1.34μs (36.9% slower)
    codeflash_output = r.components_for("css") # 362ns -> 313ns (15.7% faster)

# ---------------------------
# Edge Test Cases
# ---------------------------

def test_edge_unknown_component():
    """Test that unknown components are ignored."""
    r = Resources(components=["bokeh", "unknown", "bokeh-gl"])
    codeflash_output = r.components_for("js") # 1.46μs -> 1.50μs (2.34% slower)

def test_edge_only_unknown_components():
    """Test that only unknown components returns an empty list."""
    r = Resources(components=["foo", "bar"])
    codeflash_output = r.components_for("js") # 1.23μs -> 1.37μs (10.6% slower)
    codeflash_output = r.components_for("css") # 532ns -> 323ns (64.7% faster)

def test_edge_duplicate_components():
    """Test that duplicate components are preserved in output."""
    r = Resources(components=["bokeh", "bokeh", "bokeh-gl", "bokeh-gl"])
    codeflash_output = r.components_for("js") # 1.35μs -> 1.51μs (10.5% slower)

def test_edge_kind_is_css_with_custom_components():
    """Test that kind='css' always returns an empty list, even with custom components."""
    r = Resources(components=["bokeh", "bokeh-gl", "bokeh-widgets"])
    codeflash_output = r.components_for("css") # 1.09μs -> 499ns (119% faster)

def test_edge_case_sensitive_components():
    """Test that component matching is case-sensitive."""
    r = Resources(components=["BOKEH", "bokeh"])
    codeflash_output = r.components_for("js") # 1.22μs -> 1.54μs (20.8% slower)

def test_edge_invalid_kind():
    """Test that an invalid kind raises a KeyError."""
    r = Resources()
    with pytest.raises(KeyError):
        r.components_for("invalid_kind") # 1.39μs -> 757ns (83.2% faster)

def test_edge_none_kind():
    """Test that None as kind raises a TypeError or KeyError."""
    r = Resources()
    with pytest.raises(Exception):
        r.components_for(None) # 1.39μs -> 774ns (79.1% faster)

def test_edge_empty_string_kind():
    """Test that empty string as kind raises a KeyError."""
    r = Resources()
    with pytest.raises(KeyError):
        r.components_for("") # 1.35μs -> 742ns (82.5% faster)

def test_edge_components_with_spaces():
    """Test that components with extra spaces are not matched."""
    r = Resources(components=[" bokeh ", "bokeh"])
    codeflash_output = r.components_for("js") # 1.37μs -> 1.67μs (17.8% slower)

# ---------------------------
# Large Scale Test Cases
# ---------------------------

def test_large_many_components():
    """Test with a large number of valid and invalid components."""
    valid = Resources._component_defs["js"]
    # Create 500 valid, 500 invalid components
    components = valid * 100 + ["foo", "bar"] * 250
    r = Resources(components=components)
    # Only valid components should be returned, in the order they appear
    expected = valid * 100
    codeflash_output = r.components_for("js") # 49.2μs -> 17.8μs (177% faster)

def test_large_only_invalid_components():
    """Test with a large number of invalid components."""
    components = ["foo{}".format(i) for i in range(1000)]
    r = Resources(components=components)
    codeflash_output = r.components_for("js") # 46.1μs -> 24.3μs (89.3% faster)
    codeflash_output = r.components_for("css") # 24.1μs -> 343ns (6934% faster)

def test_large_duplicate_valid_components():
    """Test with many duplicates of valid components."""
    components = ["bokeh"] * 500 + ["bokeh-gl"] * 500
    r = Resources(components=components)
    expected = ["bokeh"] * 500 + ["bokeh-gl"] * 500
    codeflash_output = r.components_for("js") # 36.2μs -> 19.9μs (82.1% faster)

def test_large_mixed_case_components():
    """Test with large number of valid and similar-looking invalid components."""
    valid = Resources._component_defs["js"]
    # Mix valid, uppercase, and lowercase variants
    components = []
    for v in valid:
        components.extend([v, v.upper(), v.lower()])
    r = Resources(components=components * 50)
    # Only exact matches should be returned, in order
    expected = valid * 50
    codeflash_output = r.components_for("js") # 42.9μs -> 17.1μs (151% faster)

def test_large_order_preservation():
    """Test that order is preserved with large input."""
    valid = Resources._component_defs["js"]
    components = []
    # Interleave valid and invalid components
    for i in range(200):
        components.append(valid[i % len(valid)])
        components.append("invalid{}".format(i))
    r = Resources(components=components)
    expected = [valid[i % len(valid)] for i in range(200)]
    codeflash_output = r.components_for("js") # 20.3μs -> 10.9μs (85.8% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
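
The equivalence check described in the comment above can be approximated by running both implementations on the same inputs and comparing results. Codeflash's own comparison harness is not shown on this page, so the sketch below is an assumption about the idea only; `original`, `optimized`, and `CASES` are hypothetical names, with inputs borrowed from the edge-case tests.

```python
# Assumed JS definitions list, copied from the generated tests.
ALLOWED_JS = ["bokeh", "bokeh-gl", "bokeh-widgets", "bokeh-tables", "bokeh-mathjax"]


def original(components):
    """List-based filtering, as described for the pre-optimization code."""
    return [c for c in components if c in ALLOWED_JS]


def optimized(components):
    """Set-based filtering with the early exit for empty definitions."""
    if not ALLOWED_JS:
        return []
    allowed = set(ALLOWED_JS)
    return [c for c in components if c in allowed]


# Inputs taken from the edge-case tests: case sensitivity, duplicates,
# empty strings, non-string entries, stray whitespace, and ordering.
CASES = [
    [],
    ["BOKEH", "bokeh"],
    ["bokeh", "bokeh", "bokeh-gl"],
    ["", "bokeh"],
    [None, "bokeh"],
    [123, "bokeh"],
    [" bokeh ", "bokeh"],
    ["bokeh-widgets", "bokeh-gl", "bokeh"],
]

for case in CASES:
    # Same elements, same order, same duplicates.
    assert original(case) == optimized(case), case
print("all cases match")
```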

To edit these changes, run `git checkout codeflash/optimize-Resources.components_for-mhvdbgxr`, commit your changes, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 12, 2025 02:15
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: Medium" (Optimization Quality according to Codeflash) labels Nov 12, 2025