GitHub - abidayalan/Python-Parallel-text-handling-Processor-: A scalable Python-based parallel text processing system that efficiently handles large datasets using multiprocessing. Includes chunking, pattern matching, rule-based sentiment analysis, and database-backed search.

🚀 Python Parallel Text Handling Processor

📌 Project Overview

The Python Parallel Text Handling Processor is a scalable and lightweight text-processing system designed to efficiently handle large volumes of textual data using Python’s parallel execution capabilities.

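Instead of processing text sequentially, the system divides large text files into smaller chunks and processes them concurrently in separate worker processes using multiprocessing. On multi-core machines this can shorten execution time for CPU-bound work, because each chunk is handled independently rather than waiting on the ones before it.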
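The chunk-then-process idea can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `split_paragraphs`, `count_words`, and `process_parallel` are hypothetical, and word counting stands in for whatever per-chunk analysis the project performs.

```python
import re
from multiprocessing import Pool


def split_paragraphs(text):
    """Split text into non-empty paragraph chunks on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def count_words(chunk):
    """Per-chunk work (assumed for illustration): count word tokens."""
    return len(re.findall(r"\w+", chunk))


def process_parallel(text, workers=2):
    """Run count_words on every chunk in a pool of worker processes.

    pool.map returns results in the same order as the input chunks.
    """
    chunks = split_paragraphs(text)
    with Pool(processes=workers) as pool:
        return pool.map(count_words, chunks)
```

When calling `process_parallel` from a script, place the call under an `if __name__ == "__main__":` guard so that worker processes started with the spawn method do not re-run it on import.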

The project integrates parallel processing, rule-based text analysis, structured database storage, and search functionality into a single streamlined pipeline.

🎯 Key Features

⚡ Parallel text processing using multiprocessing

📂 Intelligent text chunking (paragraph/sentence/character-based)

🔍 Pattern matching using Regular Expressions

😊 Rule-based sentiment scoring system

🗄️ Structured database storage (SQLite/PostgreSQL)

🔎 Search and filtering functionality using SQL

📊 CSV export for reporting and further analysis

📧 Optional email reporting support
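As a rough illustration of the chunking modes and regex matching listed above, the sketch below splits text by paragraph, sentence, or fixed character count and extracts email-like strings from a chunk. The names `chunk_text`, `EMAIL_RE`, and `find_emails` are assumptions made for this example, and the sentence rule is deliberately naive.

```python
import re


def chunk_text(text, mode="paragraph", size=200):
    """Split text by paragraph, sentence, or fixed character count."""
    if mode == "character":
        # Fixed-width slices; the last slice may be shorter than size.
        return [text[i:i + size] for i in range(0, len(text), size)]
    if mode == "paragraph":
        parts = re.split(r"\n\s*\n", text)
    elif mode == "sentence":
        # Naive rule: break at whitespace that follows '.', '!' or '?'.
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unknown chunking mode: {mode!r}")
    return [p.strip() for p in parts if p.strip()]


# Simplified email pattern, for illustration only (not RFC-complete).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_emails(chunk):
    """Return all email-like substrings found in a chunk."""
    return EMAIL_RE.findall(chunk)
```

Sentence splitting on punctuation alone will mis-split abbreviations such as "e.g."; a real pipeline would likely need a more careful rule.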

🧠 Core Concepts Used

Parallel Computing (Multiprocessing / Multithreading)

Text Preprocessing & Pattern Matching

Rule-Based Sentiment Analysis

Relational Database Management

File Handling & Data Export
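To make the rule-based sentiment concept concrete, here is one possible shape for such a scorer. The word lists, the single-word negation rule, and the names `sentiment_score` and `label` are all assumptions for this sketch; the project's own rules may differ.

```python
import re

# Hypothetical lexicons, chosen only for this example.
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "poor", "terrible", "sad", "hate"}
NEGATORS = {"not", "no", "never"}


def sentiment_score(text):
    """Sum +1 per positive word and -1 per negative word.

    A sentiment word directly preceded by a negator has its sign flipped.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because each chunk is scored independently, a function like this can be passed to a worker pool unchanged.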

🛠️ Technologies Used

Language: Python

Parallel Processing: multiprocessing, threading, concurrent.futures

Text Processing: re (Regular Expressions)

Database: SQLite (default), PostgreSQL (optional)

Version Control: Git & GitHub

IDE: VS Code / PyCharm
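The storage, search, and CSV export steps could look roughly like this with the standard-library `sqlite3` and `csv` modules. The `chunks` table, its columns, and the function names are hypothetical; the repository's `database.py` may use a different schema.

```python
import csv
import io
import sqlite3


def create_store(path=":memory:"):
    """Open a SQLite database and create the (assumed) chunks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "id INTEGER PRIMARY KEY, content TEXT NOT NULL, score INTEGER NOT NULL)"
    )
    return conn


def save_results(conn, results):
    """Insert an iterable of (content, score) pairs."""
    conn.executemany("INSERT INTO chunks (content, score) VALUES (?, ?)", results)
    conn.commit()


def search(conn, keyword, min_score=None):
    """Find chunks containing keyword, optionally filtered by minimum score."""
    sql = "SELECT id, content, score FROM chunks WHERE content LIKE ?"
    params = [f"%{keyword}%"]
    if min_score is not None:
        sql += " AND score >= ?"
        params.append(min_score)
    sql += " ORDER BY id"
    return conn.execute(sql, params).fetchall()


def export_csv(rows):
    """Render (id, content, score) rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "content", "score"])
    writer.writerows(rows)
    return buf.getvalue()
```

Parameterised queries (`?` placeholders) keep user-supplied search terms out of the SQL string itself.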

💡 Why This Project?

Large text datasets (logs, documents, research data, etc.) can take a long time to process sequentially. This project demonstrates how parallel computing can reduce execution time on multi-core hardware while keeping results in structured, searchable storage.

It provides a lightweight alternative to heavy NLP frameworks, making it suitable for academic projects, research prototypes, and small-scale analytical systems.

📈 Future Enhancements

Performance benchmarking dashboard

Advanced NLP integration

REST API support

Web-based user interface

🏁 Conclusion

This project showcases efficient large-scale text handling by combining performance optimization, rule-based analysis, and database-backed search into a modular and scalable architecture.

Repository files (5 commits):

.gitignore

README.md

database.py

executer.py

main.py

parallel_text.py

readme.md

sample.txt

sentiment.py
