A production-ready desktop application that sorts large binary files of IEEE-754 double-precision floating-point numbers using External Merge Sort, with a full PySide6 GUI and step-by-step demo mode.
Built as a university assignment for Advanced Data Structures & Algorithms.
- Algorithm Overview
- Features
- Project Structure
- Installation
- CLI Usage
- GUI Usage
- Building an Executable
- License
External Merge Sort is designed to sort data that does not fit entirely in RAM. It works in two phases:
- Read the input file in fixed-size chunks of
run_capacitydoubles. - Sort each chunk in memory using Python's built-in Timsort.
- Write each sorted chunk to a temporary run file on disk.
- Group the run files into batches of
k(default: 2). - Merge each group using buffered I/O:
- 2-way merge: classic two-pointer technique.
- k-way merge (k > 2): min-heap (
heapq) selects the smallest element acrosskinput buffers.
- Write merged results to new run files.
- Repeat until only one sorted run remains — this becomes the output file.
| Property | Value |
|---|---|
| Time complexity | O(N log N) |
| I/O passes | O(log_k(N / M)) |
| Memory usage | O(M) where M = run_capacity × 8 bytes |
| Merge strategy | Buffered k-way merge with heapq |
- CLI mode — sort files from the command line with progress bar
- GUI mode — full PySide6 desktop application
- Demo mode — step-by-step visualisation of every comparison and output during merge
- Configurable parameters — run capacity, buffer size, k-way fan-in
- Verification — automatically verifies the output is sorted
- No numpy — lightweight, uses only
structfor binary I/O - Windows compatible — tested on Windows 10/11
external-merge-sort-visualizer/
├── app/
│ ├── __init__.py
│ ├── main.py # CLI entry-point & argument parser
│ └── ui_main.py # PySide6 GUI (QThread workers, demo table)
├── core/
│ ├── __init__.py
│ ├── binary_io.py # Read/write IEEE-754 doubles (no numpy)
│ ├── run_generation.py # Phase 1: split input into sorted runs
│ ├── merge.py # Phase 2: 2-way & k-way buffered merge
│ ├── external_sort.py # Full pipeline orchestrator
│ ├── demo_steps.py # Generator-based demo (yields events)
│ └── verify.py # Post-sort verification
├── samples/
│ └── generate_samples.py
├── docs/
│ └── USER_GUIDE.md
├── README.md
├── requirements.txt
├── .gitignore
└── LICENSE
- Python 3.10+
- pip
# 1. Clone the repository
git clone https://github.com/your-username/external-merge-sort-visualizer.git
cd external-merge-sort-visualizer
# 2. (Recommended) Create a virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS / Linux
# 3. Install dependencies
pip install -r requirements.txtpython samples/generate_samples.py --count 100000 --output samples/data.binpython -m app.main --input samples/data.bin --output samples/sorted.bin --verifypython -m app.main \
--input samples/data.bin \
--output samples/sorted.bin \
--run-capacity 20000 \
--buffer 8192 \
--k 4 \
--verifypython samples/generate_samples.py --count 20 --output samples/tiny.bin --seed 42
python -m app.main --input samples/tiny.bin --output samples/tiny_sorted.bin --demo| Option | Default | Description |
|---|---|---|
--input / -i |
required | Input binary file path |
--output / -o |
required | Output binary file path |
--run-capacity |
50000 | Doubles per in-memory run |
--buffer |
4096 | Doubles per I/O buffer |
--k |
2 | Merge fan-in (2, 4, or 8) |
--demo |
off | Step-by-step demo output |
--verify |
off | Verify output is sorted |
--keep-runs |
off | Preserve temp run files |
--gui |
off | Launch GUI instead of CLI |
python -m app.main --gui- Select input file using the Browse button.
- Select output file location.
- Adjust Run capacity, Buffer size, and k-way as needed.
- (Optional) Check Demo Mode to visualise the merge step by step.
- Click Start.
- Use Pause / Resume to control execution.
- In Demo Mode, use Step to advance one event at a time.
- Watch the progress bar and log area for status updates.
- The demo table shows every compare and output event with colour highlighting.
Use PyInstaller to create a standalone .exe:
pip install pyinstaller
pyinstaller --noconfirm --onedir --windowed --name ExternalSortApp app/main.pyThe output will be in dist/ExternalSortApp/. Run ExternalSortApp.exe directly.
MIT License — see LICENSE for details.