This Streamlit application provides an interactive benchmark to compare the execution speed of common numerical operations performed using standard Python lists and loops versus NumPy's optimized array operations.
The goal is to visually and quantitatively demonstrate the significant performance advantages that NumPy offers for numerical computing in Python, especially as data sizes increase.
- Configurable Data Size:
  - Users can select the number of elements (from 10,000 up to 5,000,000) for the lists/arrays used in the benchmark via an interactive slider.
- Selectable Numerical Operations:
  - Users can choose which common numerical tasks to compare, including:
    - Summing all elements.
    - Element-wise multiplication of two lists/arrays followed by a sum (similar to a 1D dot product).
    - Applying an element-wise function (e.g., square root).
    - Conditional filtering of elements based on a threshold.
- Performance Timing:
  - The application precisely times the execution of each selected operation for:
    - A vanilla Python implementation using standard lists and `for` loops.
    - A NumPy implementation using its vectorized functions and array operations.
  - Both implementations operate on identical randomly generated data for a fair comparison.
- Results Display:
  - Presents a clear table showing the execution time (in seconds) for each operation, comparing "Vanilla Python" and "NumPy". The faster time for each operation is highlighted.
  - Visualizes the comparison with an interactive bar chart (using Plotly Express). The chart may use a logarithmic Y-axis if performance differences are very large, enhancing readability.
- Educational Explanation:
  - Includes a dedicated section explaining the core reasons why NumPy typically outperforms vanilla Python for numerical tasks (e.g., vectorization, compiled C code, memory efficiency).
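The comparison described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the helper names (`time_call`, `py_sum`, `py_dot`, `py_sqrt`, `py_filter`), the data size, the `0.5` threshold, and the choice of `time.perf_counter` are all assumptions made for the example.

```python
import math
import random
import time

import numpy as np


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Vanilla Python versions: plain lists and explicit for loops.
def py_sum(a):
    total = 0.0
    for x in a:
        total += x
    return total


def py_dot(a, b):
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def py_sqrt(a):
    out = []
    for x in a:
        out.append(math.sqrt(x))
    return out


def py_filter(a, threshold):
    out = []
    for x in a:
        if x > threshold:
            out.append(x)
    return out


n = 100_000  # hypothetical size; the app's slider covers 10,000 to 5,000,000
random.seed(0)
list_a = [random.random() for _ in range(n)]
list_b = [random.random() for _ in range(n)]
# Build the arrays from the same values so both versions see identical data.
arr_a, arr_b = np.array(list_a), np.array(list_b)

# Each entry holds ((python_result, python_seconds), (numpy_result, numpy_seconds)).
results = {
    "sum": (time_call(py_sum, list_a), time_call(np.sum, arr_a)),
    "multiply_and_sum": (
        time_call(py_dot, list_a, list_b),
        time_call(lambda a, b: np.sum(a * b), arr_a, arr_b),
    ),
    "sqrt": (time_call(py_sqrt, list_a), time_call(np.sqrt, arr_a)),
    "filter": (
        time_call(py_filter, list_a, 0.5),
        time_call(lambda a, t: a[a > t], arr_a, 0.5),
    ),
}

for name, ((_, py_t), (_, np_t)) in results.items():
    print(f"{name:18s} python={py_t:.5f}s  numpy={np_t:.5f}s")
```

A single timed call per operation keeps the sketch short; repeating each call and taking the minimum would give steadier numbers on a noisy machine.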
This application serves to:
- Illustrate NumPy's Speed: Provide a clear, hands-on demonstration of NumPy's computational efficiency.
- Educate on Vectorization: Help users understand the concept of vectorization and its impact on performance.
- Reinforce Best Practices: Encourage the use of NumPy for numerical tasks in Python by showcasing its benefits.
- Python 3.8+
- The libraries listed in `requirements.txt` (or install manually):
  - streamlit
  - numpy
  - pandas (used for structuring and displaying results)
  - plotly
- Clone the repository (or download the `app.py` script).
- Create and activate a virtual environment (recommended):
  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install the required packages. If a `requirements.txt` file is provided with the script:
  ```bash
  pip install -r requirements.txt
  ```
  Otherwise, install them manually:
  ```bash
  pip install streamlit numpy pandas plotly
  ```
Once the dependencies are installed, navigate to the directory containing the application script (`app.py`) and run:

```bash
streamlit run app.py
```

Your web browser should open automatically to the application's URL (usually http://localhost:8501).
- Configure Data Size:
  - In the sidebar, use the "Number of Elements in Arrays/Lists" slider to select the size of the data to be processed.
- Select Operations:
  - In the sidebar, check the boxes next to the numerical operations you wish to compare.
- Run Comparison:
  - Click the "🚀 Run Comparison" button in the sidebar.
- View Results:
  - The main panel will show a spinner while computations are in progress.
  - Once complete, a table with detailed timing results and an interactive bar chart visualizing these results will be displayed.
  - Read the "💡 Why is NumPy Faster?" section for an explanation of the performance differences.
- Streamlit: For creating the interactive web application.
- NumPy: The fundamental package for scientific computing with Python, used for array operations.
- Pandas: Used for organizing benchmark results into a DataFrame for easy display and for melting data for Plotly.
- Plotly (Plotly Express): For generating interactive charts.
- Python `time` module: For performance timing.
- Python `random` module: For generating data for vanilla Python lists.