Document Analysis Reporter (Python GUI)
A desktop application built with Python and Tkinter for analyzing documents by paragraph, using a Hugging Face Transformers model for real-time text classification.
Features
GUI Interface: Simple, cross-platform interface for selecting documents.
Multi-File Support: Reads TXT and MD files directly. PDF and DOCX conversion is only simulated; full support for those formats requires external libraries.
Paragraph Chunking: Splits documents reliably by double newlines (\n\n) to analyze content chunk-by-chunk.
NLP Integration: Uses the transformers library for sentiment analysis, with a graceful fallback to a keyword-based heuristic if the model is unavailable.
Report Generation: Saves the analysis results into structured CSV and human-readable TXT files in a user-selected directory.
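The paragraph chunking and the model-or-fallback behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the function names (split_paragraphs, keyword_sentiment, load_classifier) and the keyword lists are assumptions made for the example. The only real library call is pipeline("sentiment-analysis") from transformers.

```python
import re

# Hypothetical keyword lists; the real application's heuristic may differ.
POSITIVE = {"good", "great", "excellent", "success", "improve"}
NEGATIVE = {"bad", "poor", "fail", "error", "problem"}


def split_paragraphs(text):
    """Split text on blank lines and drop empty chunks."""
    # Normalise Windows line endings so "\r\n\r\n" also counts as a break.
    text = text.replace("\r\n", "\n")
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def keyword_sentiment(paragraph):
    """Fallback heuristic: compare counts of positive and negative keywords."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"


def load_classifier():
    """Return a paragraph -> label function, preferring the Transformers model."""
    try:
        from transformers import pipeline
        # Downloads a default sentiment model on first use.
        model = pipeline("sentiment-analysis")
    except Exception:
        # Missing packages, no network, etc.: use the keyword heuristic instead.
        return keyword_sentiment
    return lambda paragraph: model(paragraph, truncation=True)[0]["label"]
```

A caller would run classify = load_classifier() once at startup and then apply classify to each item returned by split_paragraphs. Note that the default model returns only POSITIVE or NEGATIVE labels, while this illustrative heuristic also returns NEUTRAL.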
Installation and Setup
Clone the Repository:
git clone https://github.com/sabdulraqeb/python-document_analyzer
cd python-document_analyzer
Create a Virtual Environment (Recommended):
python -m venv venv
source venv/bin/activate # On Windows, use venv\Scripts\activate
Install Dependencies:
pip install -r requirements.txt
Note: Installing torch and transformers is necessary to enable the real NLP model. If you skip this step, the app will automatically use the simulated, keyword-based analysis.
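To see in advance which mode the app is likely to use, one option is to check whether the two packages are importable without actually loading a model. The helper name nlp_packages_installed is an assumption made for this sketch and is not part of the project.

```python
import importlib.util


def nlp_packages_installed(packages=("torch", "transformers")):
    """Return True only if every named package can be found on the import path."""
    return all(importlib.util.find_spec(name) is not None for name in packages)


if __name__ == "__main__":
    if nlp_packages_installed():
        print("torch and transformers found: the NLP model can be loaded.")
    else:
        print("torch or transformers missing: expect the keyword-based fallback.")
```

Finding the packages does not guarantee that the model loads (the first run also needs to download model weights), so treat a positive result as a precondition rather than a confirmation.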
Usage
Run the main script from your terminal:
python document_analyzer.py
A window will open showing the application status (either "NLP Model Active!" or "Using Keyword Simulation").
Click "Select Document" and choose a .txt, .md, .pdf (simulated), or .docx (simulated) file.
After processing, a second dialog will appear asking you to select the folder where the analysis reports (_report.csv and _report.txt) should be saved.
A final confirmation message will appear once the files are saved successfully.
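The select, analyze, and save flow described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in document_analyzer.py: the names write_reports and choose_and_save, the CSV column names, the text layout, and the use of the source file's stem as a filename prefix are all hypothetical. The Tkinter calls (filedialog.askopenfilename, filedialog.askdirectory, messagebox.showinfo) are standard library functions.

```python
import csv
from pathlib import Path


def write_reports(source_path, results, output_dir):
    """Write <stem>_report.csv and <stem>_report.txt; return both paths.

    `results` is a list of (paragraph, label) pairs. The column names and
    text layout are assumptions, not the application's actual format.
    """
    stem = Path(source_path).stem
    out = Path(output_dir)
    csv_path = out / f"{stem}_report.csv"
    txt_path = out / f"{stem}_report.txt"

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["paragraph_index", "label", "text"])
        for i, (paragraph, label) in enumerate(results, start=1):
            writer.writerow([i, label, paragraph])

    with open(txt_path, "w", encoding="utf-8") as f:
        for i, (paragraph, label) in enumerate(results, start=1):
            f.write(f"Paragraph {i} [{label}]\n{paragraph}\n\n")

    return csv_path, txt_path


def choose_and_save(analyze):
    """GUI flow: pick a document, analyze it, pick a folder, save the reports.

    `analyze` is a caller-supplied function: file path -> list of
    (paragraph, label) pairs.
    """
    # Imported here so write_reports stays usable without a display.
    from tkinter import Tk, filedialog, messagebox

    root = Tk()
    root.withdraw()  # show only the dialogs, not an empty main window
    try:
        source = filedialog.askopenfilename(
            title="Select Document",
            filetypes=[("Documents", "*.txt *.md *.pdf *.docx")],
        )
        if not source:
            return None  # user cancelled the file dialog
        folder = filedialog.askdirectory(title="Select folder for reports")
        if not folder:
            return None  # user cancelled the folder dialog
        paths = write_reports(source, analyze(source), folder)
        messagebox.showinfo("Saved", f"Reports saved to {folder}")
        return paths
    finally:
        root.destroy()
```

Keeping the file writing in its own function separates it from the dialogs, so the report format can be checked without opening a window.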