# Indonesian Hate Speech Detection - GUI Application Guide

This notebook is a guide to the Indonesian Hate Speech Detection GUI application (`gui.py`), which provides an easy-to-use interface for detecting hate speech in Indonesian text.

## Table of Contents
1. [Prerequisites](#prerequisites)
2. [Running the GUI Application](#running-the-gui)
3. [GUI Features Overview](#features)
4. [How to Use the Application](#usage)
5. [Understanding Results](#results)


## 1. Prerequisites

Before running the GUI application, make sure the following are in place.

### Required Files
The following files must exist in the `../models/` directory (one level up from the `notebooks` directory):
- `best_hate_speech_model.pkl` (trained machine learning model)
- `tfidf_vectorizer.pkl` (text vectorizer)
- `model_metadata.json` (model information)
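
A quick pre-flight check can confirm that these files are present before you launch the GUI. The sketch below is illustrative only: the helper name `find_missing_model_files` is hypothetical and is not part of `gui.py`.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = (
    "best_hate_speech_model.pkl",
    "tfidf_vectorizer.pkl",
    "model_metadata.json",
)


def find_missing_model_files(models_dir):
    """Return the required file names that are not present in models_dir."""
    models_dir = Path(models_dir)
    return [name for name in REQUIRED_FILES if not (models_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes this is run from the notebooks directory.
    missing = find_missing_model_files("../models")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required model files found.")
```

An empty list means all three files were found.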

### Required Python Packages
- `tkinter` (bundled with most Python installations; some Linux distributions package it separately, for example as `python3-tk`)
- `pandas`
- `numpy`
- `scikit-learn`
- `joblib`
- `re` (built-in)
- `threading` (built-in)

Install any missing third-party packages with:
```bash
pip install pandas numpy scikit-learn joblib
```
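
To see which packages are actually missing, you can ask Python's import system directly. This is a minimal sketch; `find_missing_modules` is a hypothetical helper name. Note that `scikit-learn` is imported under the name `sklearn`.

```python
import importlib.util

# Import names for the packages listed above. scikit-learn is
# imported as "sklearn", which differs from its pip package name.
REQUIRED_MODULES = ("tkinter", "pandas", "numpy", "sklearn", "joblib")


def find_missing_modules(module_names=REQUIRED_MODULES):
    """Return the module names that the import system cannot locate."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    print("Missing modules:", missing if missing else "none")
```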

### System Requirements
- Python 3.6 or higher
- Windows/Mac/Linux with GUI support
- At least 2GB RAM (for model loading)


## 2. Running the GUI Application

### Step 1: From the project root, navigate to the notebooks directory
```bash
cd notebooks
```

### Step 2: Run the GUI application
```bash
python gui.py
```

### What happens when you run it:
1. The application will load models from `../models/`
2. You'll see console messages indicating success or failure
3. If successful, a GUI window will open automatically

### Expected Console Output:
```
Script directory: C:\Users\...\indonesian-hate-speech-detection\notebooks
Looking for models in: C:\Users\...\indonesian-hate-speech-detection\models
Model file exists: True
Vectorizer file exists: True
Loading models...
SUCCESS: Models loaded successfully
```
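
The first two console lines suggest that the models directory is located relative to the script itself rather than the current working directory. The sketch below shows one way such a lookup could work, assuming the `notebooks/` and `models/` folders are siblings; `resolve_models_dir` is a hypothetical name, and the actual logic in `gui.py` may differ.

```python
from pathlib import Path


def resolve_models_dir(script_path):
    """Return the models directory that sits beside the script's folder.

    For a script at <project>/notebooks/gui.py this yields <project>/models.
    """
    script_dir = Path(script_path).resolve().parent
    return script_dir.parent / "models"
```

Inside a script, this would typically be called as `resolve_models_dir(__file__)`.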


## 3. GUI Features Overview

The Indonesian Hate Speech Detection GUI includes the following features:

### Window Title
**Indonesian Hate Speech Detector** / *Deteksi Ujaran Kebencian dalam Bahasa Indonesia*

### Interface Sections

#### Input Text Section
- **Large text area**: For entering Indonesian text to analyze
- **Placeholder text**: Shows example usage when empty
- **Focus behavior**: Placeholder disappears when you click to type

#### Control Buttons
- **Analyze Text**: Performs hate speech detection analysis
- **Clear**: Clears input text and resets results
- **Example**: Loads example text for testing

#### Analysis Results Section
- **Prediction Result**: Shows classification with color coding
- **Confidence Score**: Model confidence percentage
- **Text Statistics**: Character, word, and sentence counts
- **Processing Time**: Analysis duration in seconds

### Visual Design Elements
- **Color-coded results**:
  - Green background: Normal/safe text
  - Red background: Hate speech detected
- **Indonesian language support**: Interface uses both English and Indonesian
- **Professional layout**: Clean, user-friendly design
- **Real-time updates**: Statistics update as you type


## 4. How to Use the Application

### Step-by-Step Usage Guide

#### Step 1: Launch the Application
- Follow the instructions in Section 2 to start the GUI
- Wait for the "SUCCESS: Models loaded successfully" message
- The GUI window should appear automatically

#### Step 2: Enter Text for Analysis
1. **Click in the text input area**
   - The placeholder text will disappear
   - You can now type or paste your text

2. **Input Indonesian text**
   - The application works best with Indonesian language text
   - You can enter anything from single words to multiple paragraphs
   - Examples: comments, social media posts, articles, messages

#### Step 3: Analyze the Text
1. **Click the "Analyze Text" button**
2. **Wait for processing** (typically 0.1-3 seconds, depending on text length)
3. **View results** in the Analysis Results section

#### Step 4: Interpret Results
- Check the **prediction** (Normal vs Hate Speech)
- Note the **confidence score** (higher is more certain)
- Review **text statistics** for context

### Quick Testing Options

#### Use Example Text
- Click **"Example"** to load sample text
- Multiple examples available for testing
- Good way to verify the application is working

#### Clear and Reset
- Click **"Clear"** to start over
- Resets all text and results
- Returns interface to initial state

### Usage Tips

#### Text Input Best Practices
- **Language**: Use Indonesian text for best results
- **Length**: Any length works, but longer texts may take more time
- **Format**: Plain text works best (avoid complex formatting)
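
If your source text carries markup (for example, text copied from a web page), you could reduce it to plain text before pasting it in. The helper below is a hypothetical pre-processing sketch, not the cleaning step that `gui.py` performs internally.

```python
import html
import re


def to_plain_text(raw):
    """Reduce pasted text to plain text: drop tags, decode entities, tidy spaces."""
    text = re.sub(r"<[^>]+>", " ", raw)  # remove HTML-like tags
    text = html.unescape(text)           # e.g. "&amp;" becomes "&"
    text = re.sub(r"\s+", " ", text)     # collapse runs of whitespace
    return text.strip()
```

For example, `to_plain_text("<p>Terima   kasih &amp; salam</p>")` returns `"Terima kasih & salam"`.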

#### Multiple Analyses
- You can analyze different texts consecutively
- No need to restart the application
- Previous results are replaced with new ones

#### Monitoring Performance
- Check processing time to gauge system performance
- Large texts or complex sentences may take longer
- Confidence scores help assess result reliability


## 5. Understanding Results

### Classification Categories

#### Normal Text
- **Display**: Green background with "Normal" label
- **Meaning**: Text is classified as non-offensive/safe
- **Examples**: 
  - Regular conversations: "Terima kasih atas bantuan Anda"
  - News articles: "Pemerintah mengumumkan kebijakan baru"
  - Positive comments: "Saya sangat menghargai pendapat Anda"

#### Hate Speech
- **Display**: Red background with "Hate Speech" label  
- **Meaning**: Text contains elements classified as hate speech
- **Examples**: Text with abusive language, discriminatory content, or offensive material
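
The label-to-color rule described above amounts to a small lookup. This is a sketch under assumptions: the color names are placeholders, since the exact values used by `gui.py` are not documented here, and `background_for` is a hypothetical helper.

```python
# Labels and colors as described in this guide. The color names are
# placeholders; the exact values used by gui.py may differ.
RESULT_STYLES = {
    "Normal": "green",
    "Hate Speech": "red",
}


def background_for(label):
    """Return the background color name for a prediction label."""
    try:
        return RESULT_STYLES[label]
    except KeyError:
        raise ValueError(f"Unknown prediction label: {label!r}") from None
```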

### Confidence Score Interpretation

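The confidence score shows how certain the model is about its prediction. It reflects the model's own certainty rather than guaranteed correctness, so a high score can still accompany a wrong prediction.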

- **90-100%**: Very High Confidence
  - Model is very certain about the classification
  - Results are highly reliable

- **70-89%**: High Confidence  
  - Model is fairly certain about the classification
  - Results are generally reliable

- **50-69%**: Moderate Confidence
  - Model has some uncertainty
  - Consider context and manual review

- **Below 50%**: Low Confidence
  - Model is uncertain about the classification
  - Text may be ambiguous or an edge case
  - Manual review recommended
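
The bands above can be expressed as a simple threshold function. The sketch assumes the confidence score is the highest class probability reported by the model, expressed as a percentage; how `gui.py` actually derives the score is not stated in this guide, and both function names are hypothetical.

```python
def confidence_percent(class_probabilities):
    """Confidence as the highest class probability, expressed in percent.

    Assumption: the score is derived from per-class probabilities.
    """
    return max(class_probabilities) * 100.0


def confidence_band(percent):
    """Map a confidence percentage to the bands listed above."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 90:
        return "Very High Confidence"
    if percent >= 70:
        return "High Confidence"
    if percent >= 50:
        return "Moderate Confidence"
    return "Low Confidence"
```

Under this assumption, a two-class model's highest probability is never below 50%, so the lowest band would only appear if the application computes confidence in some other way.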

### Text Statistics

#### Character Count
- Total characters including spaces and punctuation
- Helps understand text length and complexity

#### Word Count  
- Number of words separated by spaces
- Useful for understanding text density

#### Sentence Count
- Estimated number of sentences (based on punctuation)
- Helps gauge text structure complexity
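
The three counts above can be approximated in a few lines. This is a minimal sketch, assuming words are split on whitespace and sentences on runs of `.`, `!`, or `?`; the exact rules in `gui.py` may differ, and `text_statistics` is a hypothetical name.

```python
import re


def text_statistics(text):
    """Return character, word, and estimated sentence counts for text."""
    characters = len(text)     # includes spaces and punctuation
    words = len(text.split())  # whitespace-separated tokens
    # Split on runs of sentence-ending punctuation and count the
    # non-empty pieces, so a trailing unpunctuated sentence still counts.
    pieces = re.split(r"[.!?]+", text)
    sentences = sum(1 for piece in pieces if piece.strip())
    return {"characters": characters, "words": words, "sentences": sentences}
```

Because the sentence count relies on punctuation alone, abbreviations and decimal numbers can inflate it, which is why it is only an estimate.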

#### Processing Time
- Time taken for complete analysis (in seconds)
- Includes text cleaning, vectorization, and prediction
- Typical range: 0.1-3.0 seconds depending on text length
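
A processing time of this kind is usually measured by wrapping the whole pipeline in a timer. The sketch below assumes three stages (clean, vectorize, predict) passed in as callables; the stand-in stages in the demo are placeholders, not the real model, and `timed_analysis` is a hypothetical name.

```python
import time


def timed_analysis(text, clean, vectorize, predict):
    """Run clean -> vectorize -> predict and return (result, seconds)."""
    start = time.perf_counter()
    result = predict(vectorize(clean(text)))
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in steps for illustration only; they are not the real model.
    result, seconds = timed_analysis(
        "Terima kasih atas bantuan Anda",
        clean=str.lower,
        vectorize=str.split,
        predict=lambda tokens: "Normal",
    )
    print(f"{result} ({seconds:.3f} s)")
```

`time.perf_counter()` is used here because it is intended for measuring short durations.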

### Important Notes

#### Model Limitations
- **Training Data**: Model performance depends on training data coverage
- **Context Sensitivity**: May miss subtle context or sarcasm
- **Language Variants**: Works best with standard Indonesian
- **Cultural Nuances**: May not capture all cultural references

#### Best Practices for Interpretation
1. **Consider confidence scores** when making decisions
2. **Review low-confidence predictions** manually
3. **Use text statistics** to understand processing complexity
4. **Test with known examples** to validate behavior
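
Testing with known examples can be automated with a small harness. The sketch below takes any `predict` callable that maps text to a label; the stand-in predictor in the demo is hypothetical and would be replaced by a call into the loaded model.

```python
def check_known_examples(predict, labeled_examples):
    """Return (text, expected, actual) for each example the predictor gets wrong."""
    mismatches = []
    for text, expected in labeled_examples:
        actual = predict(text)
        if actual != expected:
            mismatches.append((text, expected, actual))
    return mismatches


if __name__ == "__main__":
    # Normal examples taken from this guide.
    examples = [
        ("Terima kasih atas bantuan Anda", "Normal"),
        ("Pemerintah mengumumkan kebijakan baru", "Normal"),
    ]

    # Stand-in predictor; replace with a call into the loaded model.
    def always_normal(text):
        return "Normal"

    print(check_known_examples(always_normal, examples))  # prints []
```

An empty list means every example received its expected label.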


## Summary

The Indonesian Hate Speech Detection GUI provides an intuitive interface for detecting hate speech in Indonesian text.

### Quick Start Guide:

1. **Navigate to notebooks directory**: `cd notebooks`
2. **Run the application**: `python gui.py`
3. **Wait for success message**: "SUCCESS: Models loaded successfully"
4. **Use the GUI**: Enter Indonesian text and click "Analyze Text"

### Key Features:
- Real-time hate speech detection for Indonesian text
- Confidence scoring (0-100%)
- Text statistics (characters, words, sentences)
- Color-coded results (green=normal, red=hate speech)
- Processing time monitoring

### Best Practices:
- Use Indonesian language text for optimal results
- Consider confidence scores when interpreting results
- Test with example text to verify functionality
- Processing typically takes 0.1-3 seconds, depending on text length

For technical details about the underlying machine learning model, data processing, or training procedures, refer to the previous notebooks (01-04).
