@codeflash-ai codeflash-ai bot commented Oct 7, 2025

📄 10% (0.10x) speedup for _flatten_action_profile in quantecon/game_theory/mclennan_tourky.py

⏱️ Runtime : 377 microseconds → 342 microseconds (best of 456 runs)

📝 Explanation and details

The optimization eliminates the expensive pure2mixed function call for pure actions by inlining the conversion directly in _flatten_action_profile.

Key changes:

  • Removed function call overhead: Instead of calling pure2mixed(num_actions, action_profile[i]) for pure actions, the code now directly sets the output array slice to zero and then sets the specific action index to 1.
  • Eliminated temporary array creation: The original code created a temporary mixed_action array via pure2mixed, then copied it to the output. The optimized version writes directly to the output array.
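
The two changes above can be sketched side by side. This is a minimal sketch under stated assumptions, not the code from the PR: `flatten_original` and `flatten_inlined` are hypothetical names, the `pure2mixed` body is a stand-in for the quantecon helper, and the loop structure is inferred from the description.

```python
import numbers

import numpy as np


def pure2mixed(num_actions, action):
    """Stand-in for the helper named above: one-hot vector for a pure action."""
    mixed_action = np.zeros(num_actions)
    mixed_action[action] = 1
    return mixed_action


def flatten_original(action_profile, indptr):
    """Assumed shape of the pre-change code: build a temporary, then copy it."""
    N = len(indptr) - 1
    out = np.empty(indptr[-1])
    for i in range(N):
        if isinstance(action_profile[i], numbers.Integral):  # pure action
            num_actions = indptr[i + 1] - indptr[i]
            mixed_action = pure2mixed(num_actions, action_profile[i])
        else:  # mixed action
            mixed_action = action_profile[i]
        out[indptr[i]:indptr[i + 1]] = mixed_action
    return out


def flatten_inlined(action_profile, indptr):
    """Assumed shape of the optimized code: write the one-hot vector in place."""
    N = len(indptr) - 1
    out = np.empty(indptr[-1])
    for i in range(N):
        start, stop = indptr[i], indptr[i + 1]
        action = action_profile[i]
        if isinstance(action, numbers.Integral):  # pure action
            segment = out[start:stop]  # view into out, no data copy
            segment[:] = 0
            segment[action] = 1  # indexing the view keeps the bounds check
        else:  # mixed action: same path as before
            out[start:stop] = action
    return out


profile = [np.array([0.5, 0.5]), 2, np.array([0.1, 0.9])]
indptr = [0, 2, 5, 7]
print(np.array_equal(flatten_original(profile, indptr),
                     flatten_inlined(profile, indptr)))
```

Indexing the per-player view rather than `out[start + action]` is a design choice in this sketch: an out-of-range pure action still raises `IndexError` instead of writing into another player's slice.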

Why this is faster:

  • Python function calls carry noticeable overhead: about 37.5% of the original runtime was spent in pure2mixed
  • Eliminates one array allocation and copy operation per pure action
  • Reduces memory allocation pressure and improves cache locality
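
One way to check the allocation-and-copy point locally is a small `timeit` comparison. The sketch below is illustrative only: `one_hot_with_temporary` and `one_hot_in_place` are hypothetical helpers, the array size and repeat count are arbitrary, and the timings depend on the machine, so it does not reproduce the figures reported here.

```python
import timeit

import numpy as np


def one_hot_with_temporary(out, start, stop, action):
    # Allocate a temporary one-hot vector, then copy it into the output slice.
    tmp = np.zeros(stop - start)
    tmp[action] = 1
    out[start:stop] = tmp


def one_hot_in_place(out, start, stop, action):
    # Zero the output slice and set a single entry; no temporary array.
    segment = out[start:stop]
    segment[:] = 0
    segment[action] = 1


out_a = np.empty(5)
out_b = np.empty(5)
one_hot_with_temporary(out_a, 0, 5, 2)
one_hot_in_place(out_b, 0, 5, 2)
same_result = np.array_equal(out_a, out_b)

# Time each variant over the same number of calls.
t_temporary = timeit.timeit(
    lambda: one_hot_with_temporary(out_a, 0, 5, 2), number=20_000)
t_in_place = timeit.timeit(
    lambda: one_hot_in_place(out_b, 0, 5, 2), number=20_000)
print(f"temporary: {t_temporary:.4f}s  in place: {t_in_place:.4f}s  "
      f"same result: {same_result}")
```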

Performance characteristics:
The optimization shows the best gains for test cases made up of pure actions (9.1-21.6% speedup in the all-pure tests below), while mixed-action-only cases change little (under 3% in the large-scale tests, and at most about 8% in the microsecond-scale ones). This makes sense since mixed actions still follow the same code path, but pure actions now avoid the expensive function call entirely.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 22 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
# imports
import numpy as np
import pytest  # used for our unit tests
from quantecon.game_theory.mclennan_tourky import _flatten_action_profile

# unit tests

# 1. Basic Test Cases

def test_single_player_pure_action():
    # One player, pure action
    action_profile = [2]
    indptr = [0, 4]  # 4 actions for player 0
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 8.34μs -> 7.35μs (13.5% faster)
    expected = np.array([0, 0, 1, 0])

def test_single_player_mixed_action():
    # One player, mixed action
    action_profile = [np.array([0.1, 0.3, 0.6])]
    indptr = [0, 3]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 3.83μs -> 3.84μs (0.209% slower)
    expected = np.array([0.1, 0.3, 0.6])

def test_two_players_pure_actions():
    # Two players, both pure actions
    action_profile = [1, 0]
    indptr = [0, 2, 4]  # player 0: 2 actions, player 1: 2 actions
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 8.03μs -> 7.36μs (9.09% faster)
    expected = np.array([0, 1, 1, 0])

def test_two_players_mixed_actions():
    # Two players, both mixed actions
    action_profile = [np.array([0.2, 0.8]), np.array([0.5, 0.5])]
    indptr = [0, 2, 4]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 4.08μs -> 3.97μs (2.97% faster)
    expected = np.array([0.2, 0.8, 0.5, 0.5])

def test_two_players_mixed_and_pure():
    # Mixed and pure actions
    action_profile = [np.array([0.7, 0.3]), 1]
    indptr = [0, 2, 4]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 6.40μs -> 5.68μs (12.8% faster)
    expected = np.array([0.7, 0.3, 0, 1])

def test_three_players_varied_actions():
    # Three players, mixed, pure, mixed
    action_profile = [np.array([0.5, 0.5]), 2, np.array([0.1, 0.9])]
    indptr = [0, 2, 5, 7]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 6.60μs -> 6.10μs (8.20% faster)
    expected = np.array([0.5, 0.5, 0, 0, 1, 0.1, 0.9])

# 2. Edge Test Cases

def test_empty_action_profile():
    # No players
    action_profile = []
    indptr = [0]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 1.91μs -> 1.82μs (4.83% faster)
    expected = np.array([])

def test_player_with_one_action_pure():
    # Player with only one possible action (pure)
    action_profile = [0]
    indptr = [0, 1]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 7.06μs -> 5.93μs (19.0% faster)
    expected = np.array([1])

def test_player_with_one_action_mixed():
    # Player with only one possible action (mixed)
    action_profile = [np.array([1.0])]
    indptr = [0, 1]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 3.61μs -> 3.41μs (5.72% faster)
    expected = np.array([1.0])

def test_all_pure_actions_zero():
    # All players choose action 0
    action_profile = [0, 0, 0]
    indptr = [0, 2, 4, 6]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 9.16μs -> 7.54μs (21.6% faster)
    expected = np.array([1, 0, 1, 0, 1, 0])

def test_mixed_action_not_sum_to_one():
    # Mixed action does not sum to one (should still be accepted)
    action_profile = [np.array([0.5, 0.2, 0.2])]
    indptr = [0, 3]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 3.41μs -> 3.37μs (1.16% faster)
    expected = np.array([0.5, 0.2, 0.2])


def test_action_index_out_of_bounds():
    # Pure action index out of bounds
    action_profile = [3]
    indptr = [0, 2]
    # Should raise IndexError
    with pytest.raises(IndexError):
        _flatten_action_profile(action_profile, indptr) # 11.9μs -> 14.1μs (15.7% slower)

def test_mixed_action_wrong_length():
    # Mixed action of wrong length
    action_profile = [np.array([0.5, 0.5, 0.0])]
    indptr = [0, 2]
    # Should raise ValueError due to broadcasting error
    with pytest.raises(ValueError):
        _flatten_action_profile(action_profile, indptr) # 7.15μs -> 6.51μs (9.88% faster)


def test_mixed_action_as_list():
    # Mixed action as list instead of numpy array
    action_profile = [[0.2, 0.8]]
    indptr = [0, 2]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 9.30μs -> 8.99μs (3.47% faster)
    expected = np.array([0.2, 0.8])

def test_mixed_action_with_zeroes():
    # Mixed action with all zeroes
    action_profile = [np.array([0.0, 0.0, 0.0])]
    indptr = [0, 3]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 5.29μs -> 4.92μs (7.65% faster)
    expected = np.array([0.0, 0.0, 0.0])

def test_empty_mixed_action():
    # Mixed action of length zero
    action_profile = [np.array([])]
    indptr = [0, 0]
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 4.42μs -> 4.11μs (7.54% faster)
    expected = np.array([])

def test_empty_indptr_and_profile():
    # Both action_profile and indptr are empty
    action_profile = []
    indptr = []
    # Should raise IndexError due to indptr[-1] access
    with pytest.raises(IndexError):
        _flatten_action_profile(action_profile, indptr) # 1.03μs -> 981ns (4.59% faster)

# 3. Large Scale Test Cases

def test_many_players_all_pure():
    # 100 players, each with 5 actions, all pure actions
    N = 100
    action_profile = [i % 5 for i in range(N)]
    indptr = [0]
    for _ in range(N):
        indptr.append(indptr[-1] + 5)
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 83.9μs -> 70.5μs (19.1% faster)
    expected = np.zeros(N * 5)
    for i in range(N):
        expected[indptr[i] + (i % 5)] = 1

def test_many_players_all_mixed():
    # 50 players, each with 10 actions, all mixed actions
    N = 50
    action_profile = [np.ones(10) / 10 for _ in range(N)]
    indptr = [0]
    for _ in range(N):
        indptr.append(indptr[-1] + 10)
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 19.9μs -> 19.9μs (0.136% slower)
    expected = np.ones(N * 10) / 10

def test_large_varied_profile():
    # 200 players, alternating pure and mixed, 3 actions each
    N = 200
    action_profile = []
    for i in range(N):
        if i % 2 == 0:
            action_profile.append(i % 3)
        else:
            arr = np.zeros(3)
            arr[i % 3] = 1.0
            action_profile.append(arr)
    indptr = [0]
    for _ in range(N):
        indptr.append(indptr[-1] + 3)
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 115μs -> 101μs (14.5% faster)
    expected = np.zeros(N * 3)
    for i in range(N):
        # Pure and one-hot mixed actions both place a 1 at offset i % 3
        expected[indptr[i] + (i % 3)] = 1.0

def test_large_random_mixed_profile():
    # 100 players, random mixed actions, 5 actions each
    np.random.seed(42)
    N = 100
    action_profile = [np.random.dirichlet(np.ones(5)) for _ in range(N)]
    indptr = [0]
    for _ in range(N):
        indptr.append(indptr[-1] + 5)
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 36.2μs -> 35.5μs (1.89% faster)
    expected = np.concatenate(action_profile)

def test_large_sparse_profile():
    # 50 players, each with 20 actions, mostly zero mixed actions
    N = 50
    action_profile = []
    for i in range(N):
        arr = np.zeros(20)
        arr[i % 20] = 1.0
        action_profile.append(arr)
    indptr = [0]
    for _ in range(N):
        indptr.append(indptr[-1] + 20)
    codeflash_output = _flatten_action_profile(action_profile, indptr); result = codeflash_output # 19.8μs -> 19.4μs (2.22% faster)
    expected = np.zeros(N * 20)
    for i in range(N):
        expected[indptr[i] + (i % 20)] = 1.0
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-_flatten_action_profile-mgg0c4ej and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 7, 2025 03:36
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 7, 2025