This project is a Natural Language Processing (NLP) application focused on analyzing and classifying Arabic text related to human behavior and manners.
Understanding and categorizing the human values and behaviors expressed in text is a challenging task, especially in Arabic due to its linguistic complexity. This project aims to automatically classify textual content (citations, poems, and short paragraphs) according to whether it represents positive (good) manners or negative (bad) manners.
- Collected over 5,000 Arabic text samples through web scraping using Python
- Data includes:
  - Citations
  - Poems
  - Paragraphs related to ethics and behavior
- Automated data collection from multiple online sources
- Text cleaning (removal of noise, punctuation, etc.)
- Normalization of Arabic text
- Named Entity Recognition (NER)
- Stemming for Arabic words
- Trained a Machine Learning model to classify text into:
  - ✅ Good manners
  - ❌ Bad manners
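The cleaning and normalization steps listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the function name `normalize_arabic` and the specific normalization rules (unifying alef variants, alef maqsura, and teh marbuta) are assumptions about what a typical Arabic pipeline does. NER and stemming are not covered here.

```python
import re

# Diacritics (tashkeel, U+064B-U+0652), superscript alef (U+0670), tatweel (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")
# Anything that is neither an Arabic letter (U+0621-U+064A) nor whitespace.
_NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")


def normalize_arabic(text: str) -> str:
    """Hypothetical helper: strip noise and unify common letter variants."""
    text = _DIACRITICS.sub("", text)
    # Alef with madda / hamza above / hamza below -> bare alef.
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)
    # Alef maqsura -> yeh, teh marbuta -> heh.
    text = text.replace("\u0649", "\u064A").replace("\u0629", "\u0647")
    # Replace punctuation, digits, and Latin characters with spaces.
    text = _NON_ARABIC.sub(" ", text)
    # Collapse runs of whitespace.
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize_arabic("\u0623\u064E\u062D\u0652\u0645\u064E\u062F!"))
```

Removing diacritics before filtering non-letters keeps the order of operations simple: every later rule only ever sees bare letters and whitespace.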
The final model takes Arabic text as input and predicts whether it represents good or bad manners.
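This README does not state which model was trained, so the following is only a sketch of how a two-class text classifier of this kind can work. It assumes a multinomial Naive Bayes model with add-one smoothing, written with the standard library. The class name, the `good`/`bad` label strings, and the four training sentences are invented for illustration and are not taken from the project's dataset.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter(labels)        # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for word in text.split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, already-normalized examples (illustrative only).
train_texts = ["الصدق فضيله", "الامانه خلق كريم", "الكذب رذيله", "الغش خلق سيء"]
train_labels = ["good", "good", "bad", "bad"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict(train_texts[0]))
```

In practice the input would first pass through the same preprocessing used at training time, so that the tokens seen at prediction match those the model was fitted on.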
Before starting this project, ensure you have the following software installed:
- Python (version 3.x)
- VSCode (or any preferred code editor)
The project also depends on the following Python libraries:
- requests: For sending HTTP requests.
- BeautifulSoup: For parsing HTML and XML documents.
- camel_tools: A library for processing and analyzing Arabic text, including:
  - SentimentAnalyzer
  - MorphologyDB
  - Analyzer
  - simple_word_tokenize
- xlsxwriter: For creating and writing to Excel files.
- csv: A built-in Python library for working with CSV files.
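As an illustration of how the built-in `csv` module might be used to store labelled samples, here is a small sketch. The helper names `save_samples` and `load_samples`, the file name, and the `text`/`label` column names are assumptions made for this example, not the project's actual schema.

```python
import csv


def save_samples(path, samples):
    """Write (text, label) pairs to a UTF-8 CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text", "label"])
        writer.writerows(samples)


def load_samples(path):
    """Read the file back as a list of (text, label) tuples."""
    with open(path, newline="", encoding="utf-8") as f:
        return [(row["text"], row["label"]) for row in csv.DictReader(f)]


if __name__ == "__main__":
    save_samples("samples.csv", [("الصدق فضيله", "good"), ("الكذب رذيله", "bad")])
    print(load_samples("samples.csv"))
```

Opening the file with an explicit `encoding="utf-8"` avoids depending on the platform default, which matters for Arabic text, and `newline=""` lets the `csv` module handle line endings itself.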
- Clone the repository:
    git clone https://github.com/your-username/your-repository-name.git
    cd your-repository-name
- Create a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows, use: venv\Scripts\activate
- Install the required libraries:
    pip install requests beautifulsoup4 camel-tools xlsxwriter
- Open the project in VSCode or your preferred code editor.
- Run the main script:
    python main.py