This project compares Sequential, Parallel, and Distributed image preprocessing techniques using Python.
images_dataset/
│
├── cars/
├── Cat/
├── dogs/
└── Flowers/
Each folder holds the images for one class; every image is resized and watermarked during processing.
Reads all images, resizes them to 128×128, adds a watermark, and saves them to output_seq/.
Run Command:
python sequential_process.py
Output Example:
Sequential Processing Time: 0.23 seconds
Performs the same operations in parallel using multiple worker processes.
Run Command:
python parallel_process.py
This script uses Python’s concurrent.futures.ProcessPoolExecutor to test configurations with 1, 2, 4, and 8 workers.
Actual Output:
| Workers | Time (s) | Speedup (vs. 1 worker) |
|---|---|---|
| 1 | 0.59 | 1.00x |
| 2 | 0.68 | 0.87x |
| 4 | 1.15 | 0.52x |
| 8 | 1.31 | 0.45x |
Results are saved in output_parallel/.
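The worker sweep can be sketched as follows. This is a minimal, self-contained version: `work` is a trivial stand-in for the per-image resize + watermark step, and `benchmark_workers` is a hypothetical name, not the script's actual function.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def work(x: int) -> int:
    """Stand-in for per-image work (resize + watermark); kept trivial here."""
    return x * x


def benchmark_workers(items, worker_counts=(1, 2, 4, 8)):
    """Return {n_workers: elapsed_seconds} for mapping `work` over `items`."""
    timings = {}
    for n in worker_counts:
        start = time.perf_counter()
        # A fresh pool per configuration, so process start-up cost is included
        # in the measurement -- just as it is in the real benchmark.
        with ProcessPoolExecutor(max_workers=n) as pool:
            list(pool.map(work, items))
        timings[n] = time.perf_counter() - start
    return timings
```

Because each configuration pays the full pool start-up cost, tiny tasks like these can get slower as workers are added, which matches the table above.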
Simulates a distributed environment using multiprocessing.Manager() and logical “nodes” within one system.
Each node processes half of the dataset and reports its time.
Run Command:
python distributed_process.py
Actual Output:
Node 1 processed 47 images in 0.13s
Node 2 processed 47 images in 0.12s
Total distributed time: 0.61s
Efficiency: 0.38x over sequential
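A minimal sketch of the two-node simulation, assuming the pattern described above (multiprocessing.Manager plus one process per logical node); the names `node_task` and `run_distributed`, and the summing stand-in for image work, are illustrative.

```python
import multiprocessing as mp
import time


def node_task(node_id, items, results):
    """One logical 'node': process its share of the data and report its time."""
    start = time.perf_counter()
    _ = sum(items)  # stand-in for processing this node's images
    results[node_id] = {"count": len(items),
                        "time": time.perf_counter() - start}


def run_distributed(items, n_nodes=2):
    """Split `items` across n_nodes processes; collect results via a Manager."""
    with mp.Manager() as manager:
        results = manager.dict()  # shared dict each node writes into
        chunks = [items[i::n_nodes] for i in range(n_nodes)]
        procs = [mp.Process(target=node_task, args=(i + 1, chunks[i], results))
                 for i in range(n_nodes)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(results)
```

The Manager proxy adds IPC overhead on every write, which is part of why the simulated distribution trails the sequential run on this workload.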
Generates a short performance report comparing all methods and configurations.
Run Command:
python generate_report.py
This creates a file named report.pdf summarizing:
- Execution time comparison
- Speedup table
- Best configuration
- Discussion on performance and bottlenecks
| Mode | Configuration | Time (s) | Speedup (vs. sequential) |
|---|---|---|---|
| Sequential | — | 0.23 | 1.00x |
| Parallel | 1 Worker | 0.59 | 0.39x |
| Parallel | 2 Workers | 0.68 | 0.34x |
| Parallel | 4 Workers | 1.15 | 0.20x |
| Parallel | 8 Workers | 1.31 | 0.18x |
| Distributed | 2 Nodes | 0.61 | 0.38x |
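The speedup column above is simply the sequential time divided by each configuration's time. A quick check of the reported numbers (the helper name is illustrative):

```python
def speedup_table(seq_time: float, timings: dict) -> dict:
    """speedup = sequential_time / configuration_time, rounded to 2 places."""
    return {name: round(seq_time / t, 2) for name, t in timings.items()}


# Times taken from the summary table above.
results = speedup_table(0.23, {
    "parallel_1": 0.59,
    "parallel_2": 0.68,
    "parallel_4": 1.15,
    "parallel_8": 1.31,
    "distributed_2": 0.61,
})
```

Any value below 1.00x means that configuration was slower than the plain sequential run.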
Sequential execution achieved the best performance at 0.23 seconds.
Parallel and distributed versions were slower due to:
- Small dataset size
- Lightweight operations (resize + watermark)
- High process and I/O overhead compared to actual computation time
For larger datasets or CPU-intensive image processing, parallelism and distribution would likely outperform sequential execution.
Although parallelism is designed to improve performance, it provided no speedup here because of the nature of the workload:
- The dataset was small, and image operations were I/O-bound (reading/writing files).
- Multiprocessing overhead (process creation, data transfer) outweighed benefits.
- The sequential version avoided these costs, completing faster overall.
- Process initialization overhead for each worker
- Disk I/O contention during concurrent file access
- Limited CPU workload per image
- Python multiprocessing overhead on small tasks
- Use ThreadPoolExecutor for I/O-heavy workloads
- Increase dataset size or apply heavier transformations
- Store data on SSD/RAM disk for faster I/O
- Explore GPU acceleration (e.g., CuPy, CUDA) or frameworks like Dask/Ray
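The first recommendation can be sketched as below: threads share one process, so there is no pickling or process start-up cost, and I/O waits (simulated here with `time.sleep`) overlap. The function names are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_task(i: int) -> int:
    """Stand-in for an I/O-bound step such as reading or writing an image file."""
    time.sleep(0.01)  # simulated disk latency
    return i


def run_threaded(n_items: int = 8, workers: int = 8):
    """Run io_task over n_items using a thread pool instead of processes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(io_task, range(n_items)))
```

Because blocking I/O releases the GIL, threads suit this project's read/resize/write pattern better than processes would, at least until the per-image CPU work grows.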
Example processed images are stored in the sample_output/ folder, showing:
- Resized and watermarked images from each mode:
output_seq/, output_parallel/, and output_distributed/
A detailed report file is included:
📘 report.pdf
Muhammad Faizan Sajid
Python Image Processing — Performance Benchmark Project (2025)
This project is open for educational and research use.