Skip to content

A Python tool to analyze and plot the performance (time, comparisons, swaps) of various sorting algorithms across different array sizes and types.

License

Notifications You must be signed in to change notification settings

msmrexe/python-sorting-analyzer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Sorting Algorithm Analyzer

This project is a Python tool to analyze and plot the performance of various sorting algorithms. It was built for an algorithms course to demonstrate the practical differences between algorithms based on time complexity and implementation.

The analyzer runs multiple sorting algorithms on arrays of different sizes and types (random, sorted, reversed). It records the execution time, number of comparisons, and number of swaps, then saves the raw data to a .csv file and generates a full set of performance plots.

Features

  • Multiple Algorithms: Compares Insertion Sort, Bubble Sort, Merge Sort, Quicksort, and Python's built-in Timsort.
  • Detailed Metrics: Measures Time (ms), Comparisons, and Swaps/Movements for a comprehensive analysis.
  • Case Analysis: Tests algorithms against Random (average case), Sorted (best case), and Reversed (worst case) arrays.
  • Modular Package: Code is cleanly structured in a sorting_analyzer package.
  • Data Export: Saves all raw experiment data to a .csv file.
  • Automatic Plotting: Uses seaborn and matplotlib to automatically generate and save publication-quality plots to a plots/ directory.

Project Structure

python-sorting-analyzer/
├── .gitignore
├── LICENSE
├── README.md                # This documentation
├── requirements.txt         # Project dependencies
├── main.py                  # Main runnable script (CLI)
└── sorting_analyzer/
    ├── __init__.py          # Makes 'sorting_analyzer' a package
    ├── algorithms.py        # All sorting function implementations
    ├── analyzer.py          # The main experiment-running logic
    └── plotter.py           # Plot generation logic

How It Works

The tool is orchestrated by main.py:

  1. Setup: The user runs main.py, optionally specifying the max array size (--max-size) and number of steps (--steps).
  2. Analysis (analyzer.py):
    • The run_analysis function iterates through every combination of algorithm, array size, and array type.
    • For each run, it generates the specified array (e.g., a 1000-element reversed list).
    • It starts a high-precision timer (time.perf_counter).
    • It calls the sorting function (e.g., algorithms.insertion_sort()).
    • It stops the timer and records the elapsed milliseconds.
    • The sorting function itself returns the number of comparisons and swaps it performed, which are also recorded.
    • All results are collected into a pandas DataFrame.
  3. Data Export: The DataFrame is saved to a .csv file (e.g., sorting_results.csv).
  4. Plotting (plotter.py):
    • The generate_plots function takes the DataFrame.
    • It iterates through each metric ("Time", "Comparisons", "Swaps") and array type ("Random", "Sorted").
    • It uses seaborn.lineplot to create a chart comparing all algorithms for that specific scenario (e.g., "Time vs. Array Size for Random Arrays").
    • Each plot is saved as a .png file to the plots/ directory.

How to Run

  1. Clone the repository:

    git clone https://github.com/msmrexe/python-sorting-analyzer.git
    cd python-sorting-analyzer
  2. Create a virtual environment (Recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Run the analysis:

    # Run with default settings (max size 2000, 10 steps)
    python main.py
    
    # Run a larger analysis
    python main.py --max-size 10000 --steps 20
    
    # Specify output files
    python main.py --csv my_data.csv --plots-dir my_plots

    After running, check the plots/ directory for your .png graphs and sorting_results.csv for the raw data.

Performance Analysis

The plots generated by this tool visually demonstrate core computer science concepts:

  • $O(n^2)$ vs. $O(n \log n)$:

    • On Random arrays, the "Time" and "Comparisons" plots show the lines for Bubble Sort and Insertion Sort curving up quadratically (like $x^2$), while Merge Sort, Quicksort, and Timsort look almost linear (log-linear, $n \log n$). This is the most important comparison.
  • Best-Case Scenarios (Sorted Arrays):

    • Insertion Sort: The plot for Sorted arrays shows Insertion Sort is extremely fast. It becomes $O(n)$ (linear time) because it only makes one comparison per element and no swaps.
    • Bubble Sort: Our optimized version also becomes $O(n)$ on a sorted array, as it will make one pass and then exit.
    • Quicksort (Worst-Case): The Sorted array plot shows Quicksort's "Time" and "Comparisons" spike dramatically, performing as badly as (or worse than) Bubble Sort. This is the classic $O(n^2)$ worst-case for Quicksort when the pivot is always the smallest/largest element.
  • Worst-Case Scenarios (Reversed Arrays):

    • Insertion Sort: The plot shows this is the worst-case. It requires the maximum number of comparisons and swaps, clearly showing its $O(n^2)$ nature.
    • Quicksort: This is also the $O(n^2)$ worst-case, just like the sorted array.
  • Algorithm of Choice:

    • Timsort (Python's built-in) is the fastest in all "Time" plots. It's a hybrid, highly-optimized algorithm (mergesort + insertion sort) written in C.
    • Merge Sort is be the most consistent. Its "Time" and "Comparisons" plots look nearly identical across Random, Sorted, and Reversed arrays, proving its $O(n \log n)$ guarantee in all cases.

Author

Feel free to connect or reach out if you have any questions!


License

This project is licensed under the MIT License. See the LICENSE file for full details.

About

A Python tool to analyze and plot the performance (time, comparisons, swaps) of various sorting algorithms across different array sizes and types.

Topics

Resources

License

Stars

Watchers

Forks

Languages