# INTEGRATE Timing Analysis Example

This example demonstrates how to perform a comprehensive timing analysis of the INTEGRATE workflow
using the built-in timing_compute() and timing_plot() functions.

The timing analysis benchmarks four main components:
1. Prior model generation (layered geological models)
2. Forward modeling using GA-AEM electromagnetic simulation  
3. Rejection sampling for probabilistic inversion
4. Posterior statistics computation

Results are automatically saved and comprehensive plots are generated showing:
- Performance scaling with dataset size and processor count
- Speedup analysis and parallel efficiency
- Comparisons with traditional least squares and MCMC methods
- Component-wise timing breakdowns

In [1]:
try:
    # get_ipython() is defined only inside an IPython kernel (e.g. a Jupyter notebook)
    ipython = get_ipython()
    # Enable autoreload so edits to imported modules are picked up automatically
    ipython.run_line_magic('load_ext', 'autoreload')
    ipython.run_line_magic('autoreload', '2')
except NameError:
    # get_ipython is undefined: running as a plain Python script, so skip the magics
    pass

In [2]:
import integrate as ig
import numpy as np
import matplotlib.pyplot as plt
import time

# Check if parallel computations can be performed
parallel = ig.use_parallel(showInfo=1)

Notebook detected. Parallel processing is OK


## Quick Timing Test

This example runs a quick timing test with a small subset of dataset sizes 
and processor counts to demonstrate the timing functions.

In [3]:
print("# Running Quick Timing Test")
print("="*50)

# Define test parameters - small arrays for quick demonstration
N_arr_quick = [100, 500, 1000]  # Small dataset sizes for quick test
Nproc_arr_quick = [1, 2, 4]     # Limited processor counts

# Run timing computation
timing_file = ig.timing_compute(N_arr=N_arr_quick, Nproc_arr=Nproc_arr_quick)

print(f"\nTiming results saved to: {timing_file}")

# Running Quick Timing Test
Notebook detected. Parallel processing is OK
# TIMING TEST
Hostname (system): 3990X (Linux) 
Number of processors: 64
Using data file: DAUGAARD_AVG.h5
Using GEX file: TX07_20231016_2x4_RC20-33.gex
Testing on 3 data sets of size(s): [100, 500, 1000]
Testing on 3 sets of core(s): [1, 2, 4]
Writing results to timing_3990X-Linux-64core_Nproc3_N3.npz 
TIMING: N=100, Ncpu=1, Ncpu_min=0
prior_data_gaaem: Using 1 parallel threads.
prior_data_gaaem: Time=  3.1s/100 soundings. 31.2ms/sounding, 32.0it/s
integrate_rejection: Time=  1.9s/11693 soundings,  0.2ms/sounding, 6023.2it/s. T_av=2350.6, EV_av=-875.5
TIMING: N=500, Ncpu=1, Ncpu_min=0
prior_data_gaaem: Using 1 parallel threads.
prior_data_gaaem: Time= 16.6s/500 soundings. 33.2ms/sounding, 30.1it/s
integrate_rejection: Time=  2.9s/11693 soundings,  0.2ms/sounding, 4058.9it/s. T_av=512.3, EV_av=-312.4
TIMING: N=1000, Ncpu=1, Ncpu_min=0
prior_data_gaaem: Using 1 parallel threads.
prior_data_gaaem: Time= 30.0s/1000 soundings. 30.0ms/sounding, 33.3it/s
integrate_rejection: Time=  3.8s/11693 soundings,  0.3ms/sounding, 3059.0it/s. T_av=314.6, EV_av=-231.6
TIMING: N=100, Ncpu=2, Ncpu_min=0
prior_data_gaaem: Using 2 parallel threads.
prior_data_gaaem: Time=  3.6s/100 soundings. 36.3ms/sounding, 27.6it/s
integrate_rejection: Time=  1.2s/11693 soundings,  0.1ms/sounding, 9559.0it/s. T_av=2040.2, EV_av=-756.3
TIMING: N=500, Ncpu=2, Ncpu_min=0
prior_data_gaaem: Using 2 parallel threads.
prior_data_gaaem: Time=  7.4s/500 soundings. 14.9ms/sounding, 67.2it/s
integrate_rejection: Time=  1.6s/11693 soundings,  0.1ms/sounding, 7413.6it/s. T_av=625.1, EV_av=-331.6
TIMING: N=1000, Ncpu=2, Ncpu_min=0
prior_data_gaaem: Using 2 parallel threads.
prior_data_gaaem: Time= 14.9s/1000 soundings. 14.9ms/sounding, 67.2it/s
integrate_rejection: Time=  2.1s/11693 soundings,  0.2ms/sounding, 5619.3it/s. T_av=417.0, EV_av=-313.7
TIMING: N=100, Ncpu=4, Ncpu_min=0
prior_data_gaaem: Using 4 parallel threads.
prior_data_gaaem: Time=  1.2s/100 soundings. 11.8ms/sounding, 85.1it/s
integrate_rejection: Time=  0.7s/11693 soundings,  0.1ms/sounding, 15654.3it/s. T_av=2192.2, EV_av=-972.6
TIMING: N=500, Ncpu=4, Ncpu_min=0
prior_data_gaaem: Using 4 parallel threads.
prior_data_gaaem: Time=  3.9s/500 soundings.  7.8ms/sounding, 128.3it/s
integrate_rejection: Time=  0.9s/11693 soundings,  0.1ms/sounding, 12685.3it/s. T_av=519.7, EV_av=-293.0
TIMING: N=1000, Ncpu=4, Ncpu_min=0
prior_data_gaaem: Using 4 parallel threads.
prior_data_gaaem: Time=  9.9s/1000 soundings.  9.9ms/sounding, 100.8it/s
integrate_rejection: Time=  1.3s/11693 soundings,  0.1ms/sounding, 9238.2it/s. T_av=394.6, EV_av=-285.2

Timing results saved to: timing_3990X-Linux-64core_Nproc3_N3.npz
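
The forward-modeling times printed in the log above are enough to compute speedup and parallel efficiency by hand. The following is a minimal sketch, not part of INTEGRATE: the values are copied from the N=1000 log lines, and the variable names are chosen here for illustration only.

```python
import numpy as np

# Forward-modeling wall times (seconds) for N=1000, copied from the log above
nproc = np.array([1, 2, 4])
t_forward = np.array([30.0, 14.9, 9.9])

# Speedup relative to the single-process run, and efficiency per process
speedup = t_forward[0] / t_forward
efficiency = speedup / nproc

for p, s, e in zip(nproc, speedup, efficiency):
    print(f"Nproc={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

An efficiency near 1 means the extra processes are fully used; values well below 1 suggest overhead or a serial part that does not parallelize. Timings from such a short run are noisy, so treat these numbers as indicative only.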




## Generate Comprehensive Timing Plots

The timing_plot() function generates multiple analysis plots automatically.

In [8]:
print("\n# Generating Timing Plots")
print("="*50)

# Generate comprehensive timing plots
ig.timing_plot(f_timing=timing_file)

print(f"Timing plots generated with prefix: {timing_file.split('.')[0]}")
print("Generated plots include:")
print("- Total execution time analysis")
print("- Forward modeling performance and speedup")
print("- Rejection sampling scaling analysis") 
print("- Posterior statistics performance")
print("- Cumulative time breakdowns")


# Generating Timing Plots
Plotting timing results from timing_3990X-Linux-64core_Nproc3_N3.npz
Timing plots generated with prefix: timing_3990X-Linux-64core_Nproc3_N3
Generated plots include:
- Total execution time analysis
- Forward modeling performance and speedup
- Rejection sampling scaling analysis
- Posterior statistics performance
- Cumulative time breakdowns
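
Because the timing results are stored in a NumPy .npz archive, they can also be inspected directly. The sketch below lists whatever arrays an archive contains without assuming their names, since the layout of the timing file is not documented in this example. `summarize_npz` is a hypothetical helper written for this illustration, not an INTEGRATE function.

```python
import numpy as np

def summarize_npz(path):
    """Return {array name: (shape, dtype)} for every array stored in a .npz file."""
    with np.load(path) as data:
        return {name: (data[name].shape, str(data[name].dtype)) for name in data.files}

# Example usage (assumes `timing_file` from the quick test above):
# for name, (shape, dtype) in summarize_npz(timing_file).items():
#     print(f"{name}: shape={shape}, dtype={dtype}")
```

If the archive contains pickled object arrays, `np.load` refuses to read them by default and raises an error; in that case pass `allow_pickle=True` only for files you trust.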


## Medium Scale Timing Test

This example shows how to run a more comprehensive timing test with larger datasets.
Uncomment the code below to run a medium-scale test (takes longer to complete).

In [5]:
# Uncomment the block below for medium-scale timing test
"""
print("\n# Running Medium Scale Timing Test")
print("="*50)

# Define medium-scale test parameters  
N_arr_medium = [100, 500, 1000, 5000, 10000]  # Medium dataset sizes
Nproc_arr_medium = [1, 2, 4, 8]               # More processor counts

# Run timing computation
timing_file_medium = ig.timing_compute(N_arr=N_arr_medium, Nproc_arr=Nproc_arr_medium)

print(f"Medium-scale timing results saved to: {timing_file_medium}")

# Generate plots
ig.timing_plot(f_timing=timing_file_medium)
print(f"Medium-scale timing plots generated with prefix: {timing_file_medium.split('.')[0]}")
"""


## Full Scale Timing Test  

For production timing analysis, you can run the full test with the default parameters.
This will test a wide range of dataset sizes and all available processor counts.

In [6]:
# Uncomment the block below for full-scale timing test (takes significant time)
"""
print("\n# Running Full Scale Timing Test")
print("="*50)

# Run with default parameters (comprehensive test)
timing_file_full = ig.timing_compute()  # Uses default N_arr and Nproc_arr

print(f"Full-scale timing results saved to: {timing_file_full}")

# Generate comprehensive plots
ig.timing_plot(f_timing=timing_file_full)
print(f"Full-scale timing plots generated with prefix: {timing_file_full.split('.')[0]}")
"""


## Custom Timing Configuration

You can also customize the timing test for specific scenarios.

In [7]:
print("\n# Example: Custom Timing Configuration")
print("="*50)

# Example: Focus on specific dataset sizes of interest
N_arr_custom = [1000, 5000, 10000]  # Focus on medium-large datasets
Nproc_arr_custom = [1, 4, 8]        # Test specific processor counts

print("Custom test configuration:")
print(f"Dataset sizes: {N_arr_custom}")  
print(f"Processor counts: {Nproc_arr_custom}")
print(f"This configuration tests {len(N_arr_custom)} × {len(Nproc_arr_custom)} = {len(N_arr_custom) * len(Nproc_arr_custom)} combinations")

# Uncomment to run custom timing test
"""
timing_file_custom = ig.timing_compute(N_arr=N_arr_custom, Nproc_arr=Nproc_arr_custom)
ig.timing_plot(f_timing=timing_file_custom)
print(f"Custom timing analysis complete: {timing_file_custom}")
"""


# Example: Custom Timing Configuration
Custom test configuration:
Dataset sizes: [1000, 5000, 10000]
Processor counts: [1, 4, 8]
This configuration tests 3 × 3 = 9 combinations
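
Before running a custom configuration, it can help to enumerate the combinations and get a rough idea of the cost. The sketch below does this for the forward-modeling step only. It assumes about 30 ms per sounding on one process (the order of magnitude seen in the quick-test log) and ideal scaling with process count, so the result is an optimistic lower bound. `MS_PER_SOUNDING` and `forward_time_estimate` are hypothetical names introduced here.

```python
from itertools import product

N_arr_custom = [1000, 5000, 10000]
Nproc_arr_custom = [1, 4, 8]

# Assumption: ~30 ms per sounding on a single process, taken from the quick-test log
MS_PER_SOUNDING = 30.0

def forward_time_estimate(n, nproc, ms_per_sounding=MS_PER_SOUNDING):
    """Optimistic forward-modeling time in seconds, assuming ideal parallel scaling."""
    return n * ms_per_sounding / 1000.0 / nproc

# One entry per (processor count, dataset size) combination
runs = [(n, p, forward_time_estimate(n, p))
        for p, n in product(Nproc_arr_custom, N_arr_custom)]
total = sum(t for _, _, t in runs)

for n, p, t in runs:
    print(f"N={n:6d}, Nproc={p}: ~{t:6.1f} s")
print(f"{len(runs)} combinations, ~{total / 60:.0f} min of forward modeling in total")
```

The real test also spends time on prior generation, rejection sampling, and posterior statistics, and parallel scaling is rarely ideal, so expect the actual run to take longer.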



## Understanding Timing Results

The timing analysis provides insights into:

### Performance Scaling
- How execution time varies with dataset size
- Parallel efficiency across different processor counts
- Identification of computational bottlenecks
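
One simple way to quantify how execution time varies with dataset size is to fit a power law, t ∝ N^k, as a straight line in log-log space. The sketch below uses the single-process times from the quick-test log earlier in this example. With only three points and short runs, the fitted exponents are rough indications, not measurements.

```python
import numpy as np

# Single-process (Ncpu=1) wall times in seconds, copied from the quick-test log
N = np.array([100, 500, 1000])
t_forward = np.array([3.1, 16.6, 30.0])
t_rejection = np.array([1.9, 2.9, 3.8])

def scaling_exponent(n, t):
    """Slope k of a least-squares fit of log(t) = k*log(n) + c."""
    slope, _intercept = np.polyfit(np.log(n), np.log(t), 1)
    return slope

k_forward = scaling_exponent(N, t_forward)
k_rejection = scaling_exponent(N, t_rejection)
print(f"forward modeling:   t ~ N^{k_forward:.2f}")
print(f"rejection sampling: t ~ N^{k_rejection:.2f}")
```

An exponent near 1 indicates cost roughly proportional to N. An exponent well below 1 over a small range of N often means a fixed overhead dominates, so the fit should be repeated on the larger dataset sizes before drawing conclusions.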

### Component Analysis  
- Relative time spent in each workflow component
- Which components benefit most from parallelization
- Memory vs compute-bound identification
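
The relative time per component can be estimated directly from logged times. The sketch below uses the two components whose times appear in the quick-test log (N=1000, Ncpu=1). Prior generation and posterior statistics are not printed in that log, so they are left out here and the fractions cover only part of the workflow.

```python
# Wall times in seconds for N=1000, Ncpu=1, copied from the quick-test log
timings = {"forward": 30.0, "rejection": 3.8}

total = sum(timings.values())
fractions = {name: t / total for name, t in timings.items()}

for name, frac in fractions.items():
    print(f"{name}: {100 * frac:.1f}% of logged time")
```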

### Comparison Baselines
- Performance relative to traditional least squares methods
- Comparison with MCMC sampling approaches
- Cost-benefit analysis of different configurations

### Optimization Guidance
- Optimal processor counts for different dataset sizes
- Sweet spots for price-performance ratios
- Scaling behavior for production deployments
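
As one possible aid for choosing processor counts, the Karp-Flatt metric estimates the serial fraction of a computation from a measured speedup. The sketch below applies it to the forward-modeling times for N=1000 from the quick-test log. `karp_flatt` is a helper written for this illustration, not an INTEGRATE function, and the short runs make the estimates noisy.

```python
def karp_flatt(speedup, nproc):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    if nproc < 2:
        raise ValueError("nproc must be at least 2")
    return (1.0 / speedup - 1.0 / nproc) / (1.0 - 1.0 / nproc)

# Forward-modeling wall times in seconds for N=1000, copied from the quick-test log
t1 = 30.0
measured = {2: 14.9, 4: 9.9}

for p, tp in measured.items():
    s = t1 / tp
    print(f"Nproc={p}: speedup={s:.2f}, serial fraction ~ {karp_flatt(s, p):.3f}")
```

A serial fraction that grows with the number of processes points to parallel overhead rather than inherently serial work. Slightly negative values can occur when the measured speedup exceeds the process count, which in short runs usually reflects timing noise.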