python document_analyzer

Document Analysis Reporter (Python GUI)

A desktop application built with Python and Tkinter for analyzing documents by paragraph, using a Hugging Face Transformer model for real-time text classification.

Features

GUI Interface: Simple, cross-platform interface for selecting documents.

Multi-File Support: Reads TXT and MD files directly. PDF and DOCX conversion is simulated; full support for those formats requires external libraries.

Paragraph Chunking: Splits documents reliably by double newlines (\n\n) to analyze content chunk-by-chunk.

NLP Integration: Uses the transformers library for sentiment analysis, with a graceful fallback to a keyword-based heuristic if the model is unavailable.

Report Generation: Saves the analysis results into structured CSV and human-readable TXT files in a user-selected directory.
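The paragraph chunking described above can be sketched as follows. This is a minimal illustration, not the code in document_analyzer.py: the function name `split_paragraphs` is hypothetical, and treating whitespace-only lines as blank is an assumption that goes slightly beyond a literal split on `\n\n`.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split text into paragraphs on blank lines, dropping empty chunks."""
    # Normalize Windows and old Mac line endings so "\r\n\r\n" also counts as a break.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline, optional spaces or tabs, then one or more newlines marks a boundary.
    chunks = re.split(r"\n[ \t]*\n+", normalized)
    # Strip surrounding whitespace and discard chunks that end up empty.
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

Single newlines inside a paragraph are preserved, so hard-wrapped text stays in one chunk.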

Installation and Setup

Clone the Repository:

git clone https://github.com/sabdulraqeb/python-document_analyzer.git
cd python-document_analyzer

Create a Virtual Environment (Recommended):

python -m venv venv
source venv/bin/activate  # On Windows, use venv\Scripts\activate

Install Dependencies:

pip install -r requirements.txt

Note: Installing torch and transformers is necessary to enable the real NLP model. If you skip this step, the app will automatically use the simulated, keyword-based analysis.
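One way the fallback described in this note could work is sketched below. It is an assumption-laden illustration rather than the app's actual code: `keyword_sentiment`, `load_classifier`, and the two word lists are hypothetical names and contents. The only real library call is `transformers.pipeline("sentiment-analysis")`, which returns a list of dicts with `label` and `score` keys.

```python
import re

# Hypothetical word lists; the real app may use different keywords.
POSITIVE_WORDS = {"good", "great", "excellent", "success", "improve"}
NEGATIVE_WORDS = {"bad", "terrible", "poor", "failure", "problem"}


def keyword_sentiment(text: str) -> dict:
    """Label text by counting positive and negative keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE_WORDS for word in words)
    neg = sum(word in NEGATIVE_WORDS for word in words)
    if pos == neg:
        return {"label": "NEUTRAL", "score": 0.0}
    label = "POSITIVE" if pos > neg else "NEGATIVE"
    # Score is the margin between the two counts as a share of all keyword hits.
    return {"label": label, "score": abs(pos - neg) / (pos + neg)}


def load_classifier():
    """Return (classify_function, status_text), preferring the real model."""
    try:
        # Raises ImportError if transformers (or its torch backend) is missing.
        from transformers import pipeline

        model = pipeline("sentiment-analysis")
        return (lambda text: model(text)[0]), "NLP Model Active!"
    except Exception:
        # Any import or model-loading failure falls back to the heuristic.
        return keyword_sentiment, "Using Keyword Simulation"
```

Both branches return a dict with `label` and `score`, so the rest of the program does not need to know which classifier is active.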

Usage

Run the main script from your terminal:

python document_analyzer.py

A window will open showing the application status (either "NLP Model Active!" or "Using Keyword Simulation").

Click "Select Document" and choose a .txt, .md, .pdf (simulated), or .docx (simulated) file.

After processing, a second dialog will appear asking you to select the folder where the analysis reports (_report.csv and _report.txt) should be saved.

A final confirmation message will appear once the files are saved successfully.
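The report-saving step could look roughly like the sketch below. The function name `save_reports`, the column names, and the choice to prefix the `_report.csv` and `_report.txt` suffixes with the source file's stem are all assumptions made for illustration; the actual script may lay out its reports differently.

```python
import csv
from pathlib import Path


def save_reports(results, source_path, output_dir):
    """Write per-paragraph results to <stem>_report.csv and <stem>_report.txt."""
    stem = Path(source_path).stem
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    csv_path = out / f"{stem}_report.csv"
    txt_path = out / f"{stem}_report.txt"

    # Structured report: one row per paragraph (hypothetical column names).
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["paragraph", "label", "score", "text"])
        writer.writeheader()
        writer.writerows(results)

    # Human-readable report: a summary line followed by the paragraph text.
    with txt_path.open("w", encoding="utf-8") as fh:
        for row in results:
            fh.write(f"Paragraph {row['paragraph']}: {row['label']} ({row['score']:.2f})\n")
            fh.write(row["text"] + "\n\n")

    return csv_path, txt_path
```

Here `results` is expected to be a list of dicts with the keys `paragraph`, `label`, `score`, and `text`, and `output_dir` would be the folder chosen in the second dialog.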
