Skip to content

Conversation

@codeflash-ai
Copy link

@codeflash-ai codeflash-ai bot commented Oct 7, 2025

📄 16% (0.16x) speedup for LQMarkov.stationary_values in quantecon/_lqcontrol.py

⏱️ Runtime : 12.6 seconds 10.9 seconds (best of 5 runs)

📝 Explanation and details

The optimized code achieves a 15% speedup through several key optimizations focused on reducing memory allocations and improving computational efficiency:

Matrix Buffer Optimizations:

  • Replaced np.copy(Ps) with np.empty_like(Ps) to avoid unnecessary copying
  • Implemented buffer swapping (Ps, Ps1 = Ps1, Ps) instead of expensive array copying operations
  • Used .fill(0.0) instead of slice assignment for zeroing arrays, which is more efficient

Precomputation and Vectorization:

  • Pre-transposed matrices (Bs_T, As_T, Ns_T) to avoid repeated transpose operations in inner loops
  • Vectorized inner loop computations by precomputing intermediate matrix products like pij_Bs = pij @ Bs[i]
  • Reduced Python-level arithmetic by consolidating matrix operations

Improved Linear Algebra:

  • Used LU factorization (lu_factor/lu_solve) instead of direct solve() calls for better numerical stability and potential reuse
  • Optimized the solve operations in the Riccati iteration by restructuring the computation flow

Memory Management:

  • Preallocated temporary arrays to avoid repeated allocations in tight loops
  • Used more efficient array initialization patterns

The optimizations are particularly effective for larger systems (50+ Markov states show 16.9% speedup) and medium-scale problems (10 states/10 dimensions show 8.11% speedup), where the reduced memory overhead and vectorization benefits compound. For very small systems, some optimizations introduce slight overhead due to additional setup costs, but the overall trend favors larger, more computationally intensive problems.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 71 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
import numpy as np
# imports
import pytest  # used for our unit tests
from quantecon._lqcontrol import LQMarkov

# unit tests

# ----------- Basic Test Cases ------------

def test_single_state_scalar():
    # One Markov state, scalar system
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 704μs -> 643μs (9.40% faster)

def test_two_states_2d():
    # Two Markov states, 2D system
    Π = [[0.9, 0.1], [0.2, 0.8]]
    Qs = [np.eye(1), np.eye(1)*2]
    Rs = [np.eye(2), np.eye(2)*3]
    As = [np.array([[0.5, 0.1],[0.2, 0.7]]), np.array([[0.6, 0.0],[0.0, 0.6]])]
    Bs = [np.ones((2,1)), np.ones((2,1))]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 2.37ms -> 2.26ms (4.74% faster)
    # Check symmetry
    for i in range(2):
        pass
    # Check positive semi-definiteness
    for i in range(2):
        eigs = np.linalg.eigvalsh(Ps[i])

def test_cross_term_payoff():
    # Test with cross-term N(s)
    Π = [[1.0]]
    Qs = [[2.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    Ns = [[0.5]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, Ns=Ns)
    Ps, ds, Fs = lq.stationary_values() # 610μs -> 675μs (9.52% slower)
    # Check that cross-term affects policy
    Fs_no_N, _, _ = LQMarkov(Π, Qs, Rs, As, Bs).stationary_values() # 560μs -> 679μs (17.5% slower)

def test_deterministic_vs_stochastic():
    # Test with stochastic C(s)
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    Cs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs)
    Ps, ds, Fs = lq.stationary_values() # 574μs -> 576μs (0.229% slower)

def test_discount_factor():
    # Test with beta < 1
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    Ps, ds, Fs = lq.stationary_values() # 524μs -> 563μs (6.99% slower)

# ----------- Edge Test Cases ------------

def test_zero_control_matrix():
    # B(s) = 0, control has no effect
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[0.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values()

def test_zero_payoff_matrices():
    # Q(s) and R(s) are zero
    Π = [[1.0]]
    Qs = [[0.0]]
    Rs = [[0.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values()

def test_singular_Q_matrix():
    # Q(s) is singular, but system is controllable
    Π = [[1.0]]
    Qs = [[0.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 230μs -> 229μs (0.424% faster)

def test_nonergodic_markov_chain():
    # Markov chain with zero transition probability
    Π = [[1.0, 0.0], [0.0, 1.0]]
    Qs = [np.eye(1), np.eye(1)]
    Rs = [np.eye(1), np.eye(1)]
    As = [np.eye(1), np.eye(1)]
    Bs = [np.eye(1), np.eye(1)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 1.49ms -> 1.32ms (13.4% faster)

def test_high_discount_factor():
    # beta very close to 1
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.999999)
    Ps, ds, Fs = lq.stationary_values() # 471μs -> 453μs (4.01% faster)

def test_negative_eigenvalues_in_A():
    # A(s) with negative eigenvalues, system should still converge
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[-0.5]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 401μs -> 373μs (7.71% faster)

def test_zero_transition_matrix():
    # Π is zero everywhere (invalid Markov chain)
    Π = [[0.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    # Should not raise, but output should be Rs
    Ps, ds, Fs = lq.stationary_values() # 166μs -> 173μs (4.19% slower)

def test_max_iter_exceeded():
    # Test that max_iter is enforced
    Π = [[1.0]]
    Qs = [[1.0]]
    Rs = [[1.0]]
    As = [[1.0]]
    Bs = [[1.0]]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    # Artificially set tolerance very small to force fail
    lq._solve_discrete_riccati_system = lambda *a, **k: \
        LQMarkov._solve_discrete_riccati_system(lq, *a, tolerance=1e-30, max_iter=1)
    with pytest.raises(ValueError):
        lq.stationary_values(max_iter=1) # 105μs -> 102μs (2.78% faster)

# ----------- Large Scale Test Cases ------------

def test_large_markov_states():
    # 50 Markov states, 2D system
    m = 50
    Π = np.eye(m) * 0.9 + np.ones((m, m)) * 0.1 / m
    Qs = [np.eye(1) for _ in range(m)]
    Rs = [np.eye(2) for _ in range(m)]
    As = [np.eye(2) * 0.95 for _ in range(m)]
    Bs = [np.ones((2,1)) for _ in range(m)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 11.6s -> 9.94s (16.9% faster)
    # Check symmetry for a few states
    for i in [0, m//2, m-1]:
        pass

def test_large_state_space():
    # 1 Markov state, 20D system
    n = 20
    Π = [[1.0]]
    Qs = [np.eye(1)]
    Rs = [np.eye(n)]
    As = [np.eye(n) * 0.99]
    Bs = [np.ones((n,1))]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values()
    # Check positive semi-definiteness
    eigs = np.linalg.eigvalsh(Ps[0])

def test_large_control_space():
    # 1 Markov state, 2D system, 10 controls
    n = 2
    k = 10
    Π = [[1.0]]
    Qs = [np.eye(k)]
    Rs = [np.eye(n)]
    As = [np.eye(n) * 0.95]
    Bs = [np.ones((n,k))]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 6.00ms -> 7.48ms (19.8% slower)

def test_many_markov_states_and_large_state_space():
    # 10 Markov states, 10D system
    m = 10
    n = 10
    Π = np.eye(m) * 0.8 + np.ones((m, m)) * 0.2 / m
    Qs = [np.eye(1) for _ in range(m)]
    Rs = [np.eye(n) for _ in range(m)]
    As = [np.eye(n) * 0.95 for _ in range(m)]
    Bs = [np.ones((n,1)) for _ in range(m)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values() # 482ms -> 446ms (8.11% faster)
    # Check symmetry for a few states
    for i in [0, m//2, m-1]:
        pass

def test_performance_large_system():
    # 5 Markov states, 50D system
    m = 5
    n = 50
    Π = np.eye(m) * 0.8 + np.ones((m, m)) * 0.2 / m
    Qs = [np.eye(1) for _ in range(m)]
    Rs = [np.eye(n) for _ in range(m)]
    As = [np.eye(n) * 0.99 for _ in range(m)]
    Bs = [np.ones((n,1)) for _ in range(m)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs)
    Ps, ds, Fs = lq.stationary_values()
    # Check symmetry for a few states
    for i in [0, m//2, m-1]:
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import numpy as np
# imports
import pytest  # used for our unit tests
from quantecon._lqcontrol import LQMarkov

# ------------------- UNIT TESTS -------------------

# Helper for deterministic model setup
def make_simple_lqmarkov(m=1, n=1, k=1, beta=0.95):
    Π = np.eye(m)
    Qs = [np.eye(k) for _ in range(m)]
    Rs = [np.eye(n) for _ in range(m)]
    As = [np.eye(n) for _ in range(m)]
    Bs = [np.ones((n, k)) for _ in range(m)]
    return LQMarkov(Π, Qs, Rs, As, Bs, beta=beta)

# 1. BASIC TEST CASES

def test_single_state_scalar():
    # One Markov state, scalar system
    lq = make_simple_lqmarkov(m=1, n=1, k=1, beta=0.9)
    Ps, ds, Fs = lq.stationary_values() # 576μs -> 518μs (11.2% faster)

def test_two_state_symmetric():
    # Two Markov states, symmetric matrices
    m, n, k = 2, 2, 1
    Π = np.array([[0.7, 0.3], [0.2, 0.8]])
    Qs = [np.eye(k)*2, np.eye(k)*3]
    Rs = [np.eye(n), np.eye(n)*2]
    As = [np.eye(n), np.eye(n)*0.9]
    Bs = [np.ones((n, k)), np.ones((n, k))*2]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    Ps, ds, Fs = lq.stationary_values() # 12.6ms -> 11.1ms (13.2% faster)
    # Ps should be positive semi-definite
    for i in range(2):
        pass

def test_deterministic_vs_stochastic():
    # Compare deterministic vs stochastic (with Cs)
    m, n, k, j = 1, 2, 1, 1
    Π = np.eye(m)
    Qs = [np.eye(k)]
    Rs = [np.eye(n)]
    As = [np.eye(n)]
    Bs = [np.ones((n, k))]
    Cs = [np.ones((n, j))]
    lq_det = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    lq_stoch = LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, beta=0.95)
    Ps_det, ds_det, Fs_det = lq_det.stationary_values() # 10.7ms -> 13.0ms (18.2% slower)
    Ps_stoch, ds_stoch, Fs_stoch = lq_stoch.stationary_values() # 10.5ms -> 12.9ms (18.4% slower)

# 2. EDGE TEST CASES

def test_zero_discount():
    # beta = 0, only immediate reward matters
    lq = make_simple_lqmarkov(m=1, n=2, k=1, beta=0.0)
    Ps, ds, Fs = lq.stationary_values() # 177μs -> 209μs (15.1% slower)

def test_identity_transition():
    # Π is identity, so no switching
    lq = make_simple_lqmarkov(m=3, n=1, k=1, beta=0.9)
    Ps, ds, Fs = lq.stationary_values() # 2.86ms -> 3.49ms (18.0% slower)
    # Each state's Ps should be same as if solved independently
    for i in range(3):
        pass

def test_high_discount():
    # beta very close to 1, should still converge
    lq = make_simple_lqmarkov(m=2, n=2, k=1, beta=0.999)
    Ps, ds, Fs = lq.stationary_values()

def test_nontrivial_Ns():
    # Test with nonzero cross-product Ns
    m, n, k = 1, 2, 1
    Π = np.eye(m)
    Qs = [np.eye(k)]
    Rs = [np.eye(n)]
    As = [np.eye(n)]
    Bs = [np.ones((n, k))]
    Ns = [np.ones((k, n))]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, Ns=Ns, beta=0.95)
    Ps, ds, Fs = lq.stationary_values()

def test_nontrivial_Cs():
    # Test with nonzero Cs (stochastic shocks)
    m, n, k, j = 1, 2, 1, 2
    Π = np.eye(m)
    Qs = [np.eye(k)]
    Rs = [np.eye(n)]
    As = [np.eye(n)]
    Bs = [np.ones((n, k))]
    Cs = [np.eye(n, j)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, beta=0.95)
    Ps, ds, Fs = lq.stationary_values() # 10.6ms -> 9.58ms (11.1% faster)

def test_max_iter_fail():
    # Should raise ValueError if max_iter is too small
    lq = make_simple_lqmarkov(m=1, n=1, k=1, beta=0.95)
    with pytest.raises(ValueError):
        lq.stationary_values(max_iter=1) # 99.3μs -> 113μs (12.3% slower)

def test_negative_definite_Q_R():
    # Should handle positive semi-definite, but not negative definite Q/R
    m, n, k = 1, 2, 1
    Π = np.eye(m)
    Qs = [-np.eye(k)]  # Negative definite Q
    Rs = [np.eye(n)]
    As = [np.eye(n)]
    Bs = [np.ones((n, k))]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    with pytest.raises(np.linalg.LinAlgError):
        lq.stationary_values()

def test_non_square_matrices():
    # Should fail if matrices are not square where required
    m, n, k = 1, 2, 1
    Π = np.eye(m)
    Qs = [np.ones((k, k+1))]  # Not square
    Rs = [np.eye(n)]
    As = [np.eye(n)]
    Bs = [np.ones((n, k))]
    with pytest.raises(ValueError):
        LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95).stationary_values() # 54.5μs -> 52.3μs (4.10% faster)

# 3. LARGE SCALE TEST CASES


def test_large_state_dimension():
    # Test with large n (state dimension)
    m, n, k = 2, 20, 5
    Π = np.eye(m)
    Qs = [np.eye(k) for _ in range(m)]
    Rs = [np.eye(n) for _ in range(m)]
    As = [np.eye(n) for _ in range(m)]
    Bs = [np.ones((n, k)) for _ in range(m)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    Ps, ds, Fs = lq.stationary_values() # 48.5ms -> 42.9ms (12.9% faster)

def test_large_control_dimension():
    # Test with large k (control dimension)
    m, n, k = 2, 5, 20
    Π = np.eye(m)
    Qs = [np.eye(k) for _ in range(m)]
    Rs = [np.eye(n) for _ in range(m)]
    As = [np.eye(n) for _ in range(m)]
    Bs = [np.ones((n, k)) for _ in range(m)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    Ps, ds, Fs = lq.stationary_values() # 48.0ms -> 61.3ms (21.7% slower)

def test_large_scale_randomized():
    # Large random matrices, but still well-conditioned
    m, n, k = 10, 10, 10
    np.random.seed(42)
    Π = np.eye(m) * 0.9 + np.ones((m, m)) * 0.1 / m
    Qs = [np.eye(k) + np.random.rand(k, k)*0.1 for _ in range(m)]
    Rs = [np.eye(n) + np.random.rand(n, n)*0.1 for _ in range(m)]
    As = [np.eye(n) + np.random.rand(n, n)*0.01 for _ in range(m)]
    Bs = [np.random.rand(n, k) for _ in range(m)]
    lq = LQMarkov(Π, Qs, Rs, As, Bs, beta=0.95)
    Ps, ds, Fs = lq.stationary_values() # 300ms -> 296ms (1.38% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-LQMarkov.stationary_values-mggxz774 and push.

Codeflash

The optimized code achieves a **15% speedup** through several key optimizations focused on reducing memory allocations and improving computational efficiency:

**Matrix Buffer Optimizations:**
- Replaced `np.copy(Ps)` with `np.empty_like(Ps)` to avoid unnecessary copying
- Implemented buffer swapping (`Ps, Ps1 = Ps1, Ps`) instead of expensive array copying operations
- Used `.fill(0.0)` instead of slice assignment for zeroing arrays, which is more efficient

**Precomputation and Vectorization:**
- Pre-transposed matrices (`Bs_T`, `As_T`, `Ns_T`) to avoid repeated transpose operations in inner loops
- Vectorized inner loop computations by precomputing intermediate matrix products like `pij_Bs = pij @ Bs[i]`
- Reduced Python-level arithmetic by consolidating matrix operations

**Improved Linear Algebra:**
- Used LU factorization (`lu_factor`/`lu_solve`) instead of direct `solve()` calls for better numerical stability and potential reuse
- Optimized the solve operations in the Riccati iteration by restructuring the computation flow

**Memory Management:**
- Preallocated temporary arrays to avoid repeated allocations in tight loops
- Used more efficient array initialization patterns

The optimizations are particularly effective for **larger systems** (50+ Markov states show 16.9% speedup) and **medium-scale problems** (10 states/10 dimensions show 8.11% speedup), where the reduced memory overhead and vectorization benefits compound. For very small systems, some optimizations introduce slight overhead due to additional setup costs, but the overall trend favors larger, more computationally intensive problems.
@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 7, 2025 19:17
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants