codeflash-ai bot commented on Oct 7, 2025

📄 16% (0.16x) speedup for IVP._integrate_fixed_trajectory in quantecon/_ivp.py

⏱️ Runtime : 14.9 milliseconds → 12.8 milliseconds (best of 311 runs)

📝 Explanation and details

The optimization replaces the O(n²) performance bottleneck caused by repeated np.vstack() operations with an efficient pre-allocation strategy.

Key optimization:

  • Pre-allocates solution array: Instead of starting with a single row and repeatedly calling np.vstack() to grow the array (which requires copying all existing data each time), the code pre-allocates chunks of 128 rows and grows by additional chunks only when needed.
  • Direct array indexing: Replaces expensive np.hstack() and np.vstack() calls with direct array assignment using indexing (sol[idx, 0] = self.t).
  • Final trimming: After integration completes, unused pre-allocated rows are trimmed with a single slice operation.
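The chunked-growth pattern described above can be sketched as follows. This is a minimal illustration, not the actual quantecon code: `step_fn` is a hypothetical stand-in for one call to scipy's `ode.integrate`, and the 128-row chunk size matches the value mentioned above.

```python
import numpy as np

def integrate_preallocated(step_fn, t0, y0, n_steps, chunk=128):
    """Grow the solution array chunk-by-chunk instead of vstack-ing each row."""
    y0 = np.atleast_1d(np.asarray(y0, dtype=float))
    sol = np.empty((chunk, 1 + y0.size))      # pre-allocate the first chunk
    sol[0, 0], sol[0, 1:] = t0, y0
    idx, t, y = 0, t0, y0
    for _ in range(n_steps):
        t, y = step_fn(t, y)
        idx += 1
        if idx >= sol.shape[0]:               # out of room: add another chunk
            sol = np.vstack([sol, np.empty((chunk, 1 + y0.size))])
        sol[idx, 0] = t                       # direct assignment, no hstack
        sol[idx, 1:] = y
    return sol[:idx + 1]                      # trim unused pre-allocated rows
```

With a trivial step function the trimmed result has exactly `n_steps + 1` rows, regardless of how many chunks were allocated along the way.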

Why it's faster:
The original code's np.vstack() operations had O(n) cost per iteration, leading to O(n²) total complexity as the solution array grew. The line profiler shows 8.5% of runtime was spent on vstack operations. The optimized version achieves amortized O(1) growth by pre-allocating space and only occasionally resizing, reducing memory allocations and data copying.
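The effect of chunked growth can be seen without a profiler by counting how many existing rows each strategy copies while the array grows. The helpers below are illustrative only (the names are hypothetical) and assume the 128-row chunk size described above:

```python
def copies_naive_vstack(n_rows):
    """Per-row vstack copies every row built so far: O(n^2) total copies."""
    return sum(range(1, n_rows))

def copies_chunked(n_rows, chunk=128):
    """Chunked growth copies rows only when a new chunk must be appended."""
    total, capacity = 0, chunk
    for i in range(1, n_rows):
        if i >= capacity:
            total += capacity    # growing copies the current rows once
            capacity += chunk
    return total
```

For 1000 rows this counts 499,500 copied rows under per-row `vstack` versus 3,584 under chunked growth.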

Performance gains by test type:

  • Small integrations (few steps): 20-30% speedup due to eliminating array creation overhead
  • Medium integrations (10-100 steps): 40-60% speedup as vstack costs become more significant
  • Large systems (1000+ variables): Smaller but consistent 3-10% gains since integration dominates runtime
  • Many steps (500+ iterations): 60-65% speedup where the O(n²) vstack penalty was most severe

The optimization maintains identical output format and behavior while dramatically improving scalability for longer integration trajectories.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 54 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import numpy as np
# imports
import pytest  # used for our unit tests
from quantecon._ivp import IVP
from scipy import integrate

# ---- UNIT TESTS ----

# Basic Test Cases

def test_basic_linear_ode():
    # Test a simple ODE: dy/dt = 1, y(0) = 0, analytic solution: y = t
    def f(t, y): return 1
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    # Integrate from t=0 to t=1 in steps of h=0.2
    h = 0.2
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 117μs -> 75.3μs (55.7% faster)
    # Check solution is approximately t
    for row in sol:
        assert abs(row[1] - row[0]) < 1e-4  # loose tolerance

def test_basic_quadratic_ode():
    # dy/dt = 2t, y(0) = 0, analytic solution: y = t^2
    def f(t, y): return 2 * t
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = 0.1
    T = 0.5
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 105μs -> 68.1μs (55.6% faster)
    # Check solution is close to t^2
    for row in sol:
        assert abs(row[1] - row[0] ** 2) < 1e-4  # loose tolerance

def test_basic_system_ode():
    # System: dy1/dt = y2, dy2/dt = -y1, y1(0)=1, y2(0)=0 (simple harmonic oscillator)
    def f(t, y): return [y[1], -y[0]]
    ivp = IVP(f)
    ivp.set_initial_value([1, 0], 0)
    h = 0.1
    T = 0.3
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 199μs -> 122μs (62.4% faster)

# Edge Test Cases

def test_zero_step_size():
    # h=0 should not advance time; should return only initial condition
    def f(t, y): return 1
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = 0.0
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 62.8μs -> 50.8μs (23.6% faster)

def test_negative_step_forward_in_time():
    # Negative h, but T > t0, should not advance; should return only initial condition
    def f(t, y): return 1
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = -0.1
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 29.9μs -> 23.9μs (24.8% faster)

def test_backward_integration():
    # Integrate backwards in time: h<0, T < t0
    def f(t, y): return -1
    ivp = IVP(f)
    ivp.set_initial_value(1, 1)
    h = -0.2
    T = 0.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 49.0μs -> 34.8μs (40.8% faster)
    # Check solution is approximately y = t
    for row in sol:
        assert abs(row[1] - row[0]) < 1e-4  # loose tolerance

def test_non_integer_steps():
    # h does not evenly divide T-t0
    def f(t, y): return 2
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = 0.3
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 106μs -> 65.4μs (62.6% faster)
    # Solution should be close to y = 2*t
    for row in sol:
        assert abs(row[1] - 2 * row[0]) < 1e-4  # loose tolerance

def test_multidimensional_y():
    # ODE with y as a 3-element vector
    def f(t, y): return [1, 2, 3]
    ivp = IVP(f)
    ivp.set_initial_value([0, 0, 0], 0)
    h = 0.5
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 109μs -> 69.7μs (57.6% faster)
    # Each y should be [t, t*2, t*3]
    for row in sol:
        for i, rate in enumerate((1, 2, 3)):
            assert abs(row[i + 1] - rate * row[0]) < 1e-4  # loose tolerance

def test_initial_condition_at_target():
    # t0 == T, should return only initial condition
    def f(t, y): return 1
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 1.0)
    h = 0.1
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 25.9μs -> 20.8μs (24.8% faster)

def test_non_list_return():
    # f returns a scalar
    def f(t, y): return 2
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = 0.1
    T = 0.2
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 106μs -> 65.3μs (62.9% faster)
    # Solution should be close to y = 2*t
    for row in sol:
        assert abs(row[1] - 2 * row[0]) < 1e-4  # loose tolerance

# Large Scale Test Cases

def test_large_scale_linear_ode():
    # Large number of steps, dy/dt = 1, y(0)=0
    def f(t, y): return 1
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = 0.001
    T = 0.999
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 112μs -> 67.7μs (66.0% faster)
    # Solution should be close to y = t
    for row in sol[::100]:  # check every 100th row for efficiency
        assert abs(row[1] - row[0]) < 1e-4  # loose tolerance

def test_large_scale_multidimensional_ode():
    # Large system: y is a 10-element vector, dy/dt = range(1,11)
    def f(t, y): return list(range(1, 11))
    ivp = IVP(f)
    ivp.set_initial_value([0]*10, 0)
    h = 0.01
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 129μs -> 83.2μs (55.3% faster)
    # Each y[i] should be close to (i+1)*t
    for row in sol[::10]:  # check every 10th row for efficiency
        for i in range(10):
            assert abs(row[i + 1] - (i + 1) * row[0]) < 1e-4  # loose tolerance

def test_large_scale_backward_integration():
    # Large backward integration: dy/dt = -2, y(2)=2, integrate to t=1
    def f(t, y): return -2
    ivp = IVP(f)
    ivp.set_initial_value(2, 2)
    h = -0.001
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 61.5μs -> 41.8μs (47.2% faster)
    # Solution should be close to y = 6 - 2*t (y' = -2 with y(2) = 2)
    for row in sol[::100]:  # check every 100th row for efficiency
        assert abs(row[1] - (6 - 2 * row[0])) < 1e-4  # loose tolerance

def test_large_scale_noninteger_steps():
    # Large scale, h does not evenly divide T-t0
    def f(t, y): return 3
    ivp = IVP(f)
    ivp.set_initial_value(0, 0)
    h = 0.007
    T = 0.999
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 113μs -> 68.6μs (65.2% faster)
    # Solution should be close to y = 3*t
    for row in sol[::100]:  # check every 100th row for efficiency
        assert abs(row[1] - 3 * row[0]) < 1e-4  # loose tolerance
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import numpy as np
# imports
import pytest  # used for our unit tests
from quantecon._ivp import IVP
from scipy import integrate

# unit tests

# ---- Basic Test Cases ----

def test_basic_scalar_ode_forward():
    # Test integrating dy/dt = -y, y(0)=1, analytical solution y(t)=exp(-t)
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = 0.1
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 185μs -> 117μs (57.6% faster)

def test_basic_vector_ode_forward():
    # Test integrating dx/dt = y, dy/dt = -x, x(0)=1, y(0)=0 (circle)
    def f(t, Y): return [Y[1], -Y[0]]
    ivp = IVP(f)
    ivp.set_initial_value([1.0, 0.0], 0.0)
    h = 0.1
    T = np.pi/2
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 302μs -> 178μs (69.0% faster)

def test_basic_backward_integration():
    # Test integrating backward in time
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 1.0)
    h = -0.1
    T = 0.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 183μs -> 115μs (59.4% faster)

# ---- Edge Test Cases ----

def test_zero_step_size():
    # Test h=0 should not advance time, should break immediately
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = 0.0
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 59.6μs -> 47.5μs (25.6% faster)

def test_t_equals_T():
    # Test initial time equals target time
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(2.0, 2.0)
    h = 0.1
    T = 2.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 31.8μs -> 27.2μs (16.9% faster)

def test_negative_h_forward_T():
    # Test negative h with T > t0, should not advance (wrong direction)
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = -0.1
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 29.9μs -> 24.5μs (22.1% faster)

def test_large_h_overshoot():
    # Test h larger than (T-t0), should overshoot T in one step
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = 5.0
    T = 1.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 187μs -> 115μs (61.5% faster)

def test_non_list_output():
    # Test ODE function returning a numpy array
    def f(t, y): return np.array([-y])
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = 0.1
    T = 0.2
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 127μs -> 87.8μs (44.7% faster)

def test_jacobian_argument():
    # Test with a jacobian provided
    def f(t, y): return -y
    def jac(t, y): return np.array([[-1]])
    ivp = IVP(f, jac=jac)
    ivp.set_initial_value(1.0, 0.0)
    h = 0.1
    T = 0.2
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 120μs -> 81.8μs (47.6% faster)

def test_non_successful_integration():
    # Test with ODE that fails after first step (by returning tuple)
    def f(t, y): return (y,)
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = 0.1
    T = 0.2
    step = True
    relax = False
    # Should raise an error (scipy.ode expects ndarray/list, not tuple)
    with pytest.raises(Exception):
        ivp._integrate_fixed_trajectory(h, T, step, relax)

# ---- Large Scale Test Cases ----

def test_large_number_of_steps():
    # Test integrating with many steps, but <1000
    def f(t, y): return -y
    ivp = IVP(f)
    ivp.set_initial_value(1.0, 0.0)
    h = 0.01
    T = 5.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 415μs -> 250μs (65.8% faster)
    # Number of steps should be about T/h + 1 (allow off-by-one at the endpoint)
    expected_steps = int(T / h) + 1
    assert abs(len(sol) - expected_steps) <= 1

def test_large_vector_ode():
    # Test system with 1000 variables, each decaying independently
    def f(t, y): return [-yi for yi in y]
    y0 = [1.0]*1000
    ivp = IVP(f)
    ivp.set_initial_value(y0, 0.0)
    h = 0.1
    T = 0.2
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 1.82ms -> 1.76ms (3.42% faster)

def test_performance_large_steps():
    # Test that integration of a large system is reasonably fast and doesn't crash
    def f(t, y): return [-yi for yi in y]
    y0 = [1.0]*999
    ivp = IVP(f)
    ivp.set_initial_value(y0, 0.0)
    h = 0.5
    T = 10.0
    step = True
    relax = False
    codeflash_output = ivp._integrate_fixed_trajectory(h, T, step, relax); sol = codeflash_output # 10.1ms -> 9.17ms (9.68% faster)
    # Should not crash, and shape should be correct (allow off-by-one at the endpoint)
    expected_steps = int(T / h) + 1
    assert abs(len(sol) - expected_steps) <= 1
    assert sol.shape[1] == 1000  # time column + 999 variables
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-IVP._integrate_fixed_trajectory-mggp3cyz` and push.

codeflash-ai bot requested a review from mashraf-222 on Oct 7, 2025 at 15:09.
codeflash-ai bot added the ⚡️ codeflash label (Optimization PR opened by Codeflash AI).