interactive benchmark to compare the execution speed of common numerical operations performed using standard Python lists and loops versus NumPy's optimized array operations.

manyan-chan/numpy-vs-python

Vanilla Python vs. NumPy Performance Comparison

Screenshot of the Streamlit app

This Streamlit application provides an interactive benchmark to compare the execution speed of common numerical operations performed using standard Python lists and loops versus NumPy's optimized array operations.

The goal is to visually and quantitatively demonstrate the significant performance advantages that NumPy offers for numerical computing in Python, especially as data sizes increase.

Features

  • Configurable Data Size:
    • Users can select the number of elements (from 10,000 up to 5,000,000) for the lists/arrays used in the benchmark via an interactive slider.
  • Selectable Numerical Operations:
    • Users can choose which common numerical tasks to compare, including:
      • Summing all elements.
      • Element-wise multiplication of two lists/arrays followed by a sum (similar to a 1D dot product).
      • Applying an element-wise function (e.g., square root).
      • Conditional filtering of elements based on a threshold.
  • Performance Timing:
    • The application times the execution of each selected operation for:
      1. A vanilla Python implementation using standard lists and for loops.
      2. A NumPy implementation using its vectorized functions and array operations.
    • Both implementations operate on identical randomly generated data for a fair comparison.
  • Results Display:
    • Presents a clear table showing the execution time (in seconds) for each operation, comparing "Vanilla Python" and "NumPy". The faster time for each operation is highlighted.
    • Visualizes the comparison with an interactive bar chart (using Plotly Express). The chart may use a logarithmic Y-axis if performance differences are very large, enhancing readability.
  • Educational Explanation:
    • Includes a dedicated section explaining the core reasons why NumPy typically outperforms vanilla Python for numerical tasks (e.g., vectorization, compiled C code, memory efficiency).
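
The four operations and the timing approach listed above might look roughly like the following. This is a minimal sketch, not the app's actual code: the function names, the `time_call` helper, the 0.5 threshold, the fixed seed, the data size, and the choice of `time.perf_counter` from the `time` module are all assumptions made for illustration.

```python
import math
import random
import time

import numpy as np


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Vanilla Python versions: plain lists and explicit for loops.
def py_sum(xs):
    total = 0.0
    for x in xs:
        total += x
    return total


def py_dot(xs, ys):
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def py_sqrt(xs):
    out = []
    for x in xs:
        out.append(math.sqrt(x))
    return out


def py_filter(xs, threshold):
    out = []
    for x in xs:
        if x > threshold:
            out.append(x)
    return out


# NumPy versions: one vectorized expression each, no Python-level loop.
def np_sum(a):
    return a.sum()


def np_dot(a, b):
    return np.sum(a * b)


def np_sqrt(a):
    return np.sqrt(a)


def np_filter(a, threshold):
    return a[a > threshold]


# Hypothetical configuration; the app takes the size from a slider.
n = 100_000
threshold = 0.5
random.seed(0)

# Generate the data once, then convert, so both sides see identical values.
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]
a, b = np.array(xs), np.array(ys)

cases = [
    ("Sum", (py_sum, (xs,)), (np_sum, (a,))),
    ("Dot product", (py_dot, (xs, ys)), (np_dot, (a, b))),
    ("Square root", (py_sqrt, (xs,)), (np_sqrt, (a,))),
    ("Filter", (py_filter, (xs, threshold)), (np_filter, (a, threshold))),
]

timings = {}
for name, (py_func, py_args), (np_func, np_args) in cases:
    _, py_time = time_call(py_func, *py_args)
    _, np_time = time_call(np_func, *np_args)
    timings[name] = (py_time, np_time)
    print(f"{name:<12} python={py_time:.6f}s  numpy={np_time:.6f}s")
```

A single timed call per operation is noisy, especially at small sizes. Repeating each call several times and keeping the best time would likely give steadier numbers.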

Why This App?

This application serves to:

  • Illustrate NumPy's Speed: Provide a clear, hands-on demonstration of NumPy's computational efficiency.
  • Educate on Vectorization: Help users understand the concept of vectorization and its impact on performance.
  • Reinforce Best Practices: Encourage the use of NumPy for numerical tasks in Python by showcasing its benefits.

Requirements

  • Python 3.8+
  • The libraries listed in requirements.txt (or install manually):
    • streamlit
    • numpy
    • pandas (used for structuring and displaying results)
    • plotly

Installation

  1. Clone the repository (or download the app.py script).

  2. Create and activate a virtual environment (recommended):

     ```bash
     python -m venv venv

     # On Windows
     venv\Scripts\activate

     # On macOS/Linux
     source venv/bin/activate
     ```

  3. Install the required packages. If a requirements.txt file is provided with the script:

     ```bash
     pip install -r requirements.txt
     ```

     Otherwise, install them manually:

     ```bash
     pip install streamlit numpy pandas plotly
     ```

Running the App

Once the dependencies are installed, navigate to the directory containing the application script (app.py) and run:

```bash
streamlit run app.py
```

Your web browser should open automatically to the application's URL (usually http://localhost:8501).

How to Use

  1. Configure Data Size:
    • In the sidebar, use the "Number of Elements in Arrays/Lists" slider to select the size of the data to be processed.
  2. Select Operations:
    • In the sidebar, check the boxes next to the numerical operations you wish to compare.
  3. Run Comparison:
    • Click the "🚀 Run Comparison" button in the sidebar.
  4. View Results:
    • The main panel will show a spinner while computations are in progress.
    • Once complete, a table with detailed timing results and an interactive bar chart visualizing these results will be displayed.
    • Read the "💡 Why is NumPy Faster?" section for an explanation of the performance differences.

Technical Stack

  • Streamlit: For creating the interactive web application.
  • NumPy: The fundamental package for scientific computing with Python, used for array operations.
  • Pandas: Used for organizing benchmark results into a DataFrame for easy display and for melting data for Plotly.
  • Plotly (Plotly Express): For generating interactive charts.
  • Python time module: For performance timing.
  • Python random module: For generating data for vanilla Python lists.
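
To illustrate the Pandas step mentioned above, here is a small sketch of reshaping a results table from wide to long form with `DataFrame.melt`. The column names, the `LOG_AXIS_RATIO` threshold, and the timing values are hypothetical placeholders, not measurements or code from the app.

```python
import pandas as pd

# Placeholder timings in seconds, invented purely to show the reshaping.
results = pd.DataFrame(
    {
        "Operation": ["Sum", "Dot product", "Square root", "Filter"],
        "Vanilla Python": [0.040, 0.090, 0.120, 0.060],
        "NumPy": [0.001, 0.002, 0.004, 0.005],
    }
)

# Wide form has one column per implementation, which suits a table.
# Long form has one row per (operation, implementation) pair, which is
# the shape grouped bar charts usually expect.
long_results = results.melt(
    id_vars="Operation",
    var_name="Implementation",
    value_name="Time (s)",
)

# Assumed heuristic: switch to a log axis when the slowest and fastest
# timings differ by more than a chosen ratio.
LOG_AXIS_RATIO = 50
times = long_results["Time (s)"]
use_log_axis = bool(times.max() / times.min() > LOG_AXIS_RATIO)

print(long_results)
print("Use log axis:", use_log_axis)
```

The long-form frame could then be passed to `plotly.express.bar` with `x="Operation"`, `y="Time (s)"`, `color="Implementation"`, `barmode="group"`, and `log_y=use_log_axis`, though the app's real chart code may differ.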
