This project compares Sequential, Parallel, and Distributed image preprocessing techniques using Python.
images_dataset/
│
├── cars/
├── Cat/
├── dogs/
└── Flowers/
Each folder holds the images for one class; every image is resized and watermarked during processing.
Reads all images, resizes them to 128×128, adds a watermark, and saves them to output_seq/.
Run Command:
python sequential_process.py
Output Example:
Sequential Processing Time: 0.23 seconds
Performs the same operations in parallel using multiple worker processes.
Run Command:
python parallel_process.py
This script uses Python’s concurrent.futures.ProcessPoolExecutor to test configurations with 1, 2, 4, and 8 workers.
Actual Output:
| Workers | Time (s) | Speedup (vs. 1 worker) |
|---|---|---|
| 1 | 0.59 | 1.00x |
| 2 | 0.68 | 0.87x |
| 4 | 1.15 | 0.52x |
| 8 | 1.31 | 0.45x |
Results are saved in output_parallel/.
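The worker sweep can be sketched as follows. This is a minimal, self-contained version: `work` is a trivial stand-in for the per-image resize + watermark step, and `benchmark_workers` is a hypothetical name, not the script's actual function.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def work(x: int) -> int:
    """Stand-in for per-image work (resize + watermark); kept trivial here."""
    return x * x


def benchmark_workers(items, worker_counts=(1, 2, 4, 8)):
    """Return {n_workers: elapsed_seconds} for mapping `work` over `items`."""
    timings = {}
    for n in worker_counts:
        start = time.perf_counter()
        # A fresh pool per configuration, so process start-up cost is included
        # in the measurement -- just as it is in the real benchmark.
        with ProcessPoolExecutor(max_workers=n) as pool:
            list(pool.map(work, items))
        timings[n] = time.perf_counter() - start
    return timings
```

Because each configuration pays the full pool start-up cost, tiny tasks like these can get slower as workers are added, which matches the table above.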
Simulates a distributed environment using multiprocessing.Manager() and logical “nodes” within one system.
Each node processes half of the dataset and reports its time.
Run Command:
python distributed_process.py
Actual Output:
Node 1 processed 47 images in 0.13s
Node 2 processed 47 images in 0.12s
Total distributed time: 0.61s
Efficiency: 0.38x over sequential
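A minimal sketch of the two-node simulation, assuming the pattern described above (multiprocessing.Manager plus one process per logical node); the names `node_task` and `run_distributed`, and the summing stand-in for image work, are illustrative.

```python
import multiprocessing as mp
import time


def node_task(node_id, items, results):
    """One logical 'node': process its share of the data and report its time."""
    start = time.perf_counter()
    _ = sum(items)  # stand-in for processing this node's images
    results[node_id] = {"count": len(items),
                        "time": time.perf_counter() - start}


def run_distributed(items, n_nodes=2):
    """Split `items` across n_nodes processes; collect results via a Manager."""
    with mp.Manager() as manager:
        results = manager.dict()  # shared dict each node writes into
        chunks = [items[i::n_nodes] for i in range(n_nodes)]
        procs = [mp.Process(target=node_task, args=(i + 1, chunks[i], results))
                 for i in range(n_nodes)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(results)
```

The Manager proxy adds IPC overhead on every write, which is part of why the simulated distribution trails the sequential run on this workload.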
Generates a short performance report comparing all methods and configurations.
Run Command:
python generate_report.py
This creates a file named report.pdf summarizing:
- Execution time comparison
- Speedup table
- Best configuration
- Discussion on performance and bottlenecks
| Mode | Configuration | Time (s) | Speedup (vs. sequential) |
|---|---|---|---|
| Sequential | — | 0.23 | 1.00x |
| Parallel | 1 Worker | 0.59 | 0.39x |
| Parallel | 2 Workers | 0.68 | 0.34x |
| Parallel | 4 Workers | 1.15 | 0.20x |
| Parallel | 8 Workers | 1.31 | 0.18x |
| Distributed | 2 Nodes | 0.61 | 0.38x |
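The speedup column above is simply the sequential time divided by each configuration's time. A quick check of the reported numbers (the helper name is illustrative):

```python
def speedup_table(seq_time: float, timings: dict) -> dict:
    """speedup = sequential_time / configuration_time, rounded to 2 places."""
    return {name: round(seq_time / t, 2) for name, t in timings.items()}


# Times taken from the summary table above.
results = speedup_table(0.23, {
    "parallel_1": 0.59,
    "parallel_2": 0.68,
    "parallel_4": 1.15,
    "parallel_8": 1.31,
    "distributed_2": 0.61,
})
```

Any value below 1.00x means that configuration was slower than the plain sequential run.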
Sequential execution achieved the best performance at 0.23 seconds.
Parallel and distributed versions were slower due to:
- Small dataset size
- Lightweight operations (resize + watermark)
- High process and I/O overhead compared to actual computation time
For larger datasets or CPU-intensive image processing, parallelism and distribution would likely outperform sequential execution.
Although parallelism is designed to improve performance, it provided no speedup here because of the nature of the workload:
- The dataset was small, and image operations were I/O-bound (reading/writing files).
- Multiprocessing overhead (process creation, data transfer) outweighed benefits.
- The sequential version avoided these costs, completing faster overall.
- Process initialization overhead for each worker
- Disk I/O contention during concurrent file access
- Limited CPU workload per image
- Python multiprocessing overhead on small tasks
- Use ThreadPoolExecutor for I/O-heavy workloads
- Increase dataset size or apply heavier transformations
- Store data on SSD/RAM disk for faster I/O
- Explore GPU acceleration (e.g., CuPy, CUDA) or frameworks like Dask/Ray
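The first recommendation can be sketched as below: threads share one process, so there is no pickling or process start-up cost, and I/O waits (simulated here with `time.sleep`) overlap. The function names are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_task(i: int) -> int:
    """Stand-in for an I/O-bound step such as reading or writing an image file."""
    time.sleep(0.01)  # simulated disk latency
    return i


def run_threaded(n_items: int = 8, workers: int = 8):
    """Run io_task over n_items using a thread pool instead of processes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(io_task, range(n_items)))
```

Because blocking I/O releases the GIL, threads suit this project's read/resize/write pattern better than processes would, at least until the per-image CPU work grows.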
Example processed images are stored in the sample_output/ folder, showing:
- Resized and watermarked images from each mode:
output_seq/, output_parallel/, and output_distributed/
A detailed report file is included:
📘 report.pdf
Muhammad Faizan Sajid
Python Image Processing — Performance Benchmark Project (2025)
This project is open for educational and research use.