# Persian Banking Sentiment Analysis - Data Exploration

This notebook walks through basic usage of the Persian sentiment analysis pipeline: collecting a sample of app-store comments and preprocessing Persian text.

## Setup

In [None]:
import sys
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from data_collection.cafe_bazaar_scraper import CafeBazaarScraper
from preprocessing.persian_cleaner import PersianTextPreprocessor

# Set up plotting
# Note: the 'seaborn-v0_8' style name requires matplotlib >= 3.6
plt.style.use('seaborn-v0_8')
sns.set_palette('Set2')
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 12

print("Setup complete!")

## Data Collection Example

In [None]:
# Initialize scraper
scraper = CafeBazaarScraper()

# Get sample comments
sample_df = scraper.get_sample_comments('com.tejarat.ezam', count=50)
print(f"Collected {len(sample_df)} sample comments")
sample_df.head()

## Text Preprocessing Example

In [None]:
# Initialize preprocessor
preprocessor = PersianTextPreprocessor()

# Example text
sample_text = "این بانک خیلی خوبه! سرویسش عالیه 😊"

print(f"Original: {sample_text}")
print(f"Preprocessed: {preprocessor.preprocess_text(sample_text)}")
print(f"Tokens: {preprocessor.preprocess_text(sample_text, return_tokens=True)}")
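
To give a feel for what a Persian preprocessing step typically involves, here is a minimal standalone sketch. It is not the implementation of `PersianTextPreprocessor`, whose internals are not shown in this notebook. The function names `simple_normalize` and `simple_tokenize` are hypothetical, and the choices made here (mapping Arabic yeh/kaf to their Persian forms, replacing punctuation and emoji with spaces, splitting on whitespace) are assumptions for illustration only.

```python
import re
import unicodedata

# Arabic code points that are commonly mapped to their Persian equivalents.
# (Assumed mapping for illustration; the real pipeline may handle more cases.)
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian keheh
})


def simple_normalize(text: str) -> str:
    """Unify character variants and strip punctuation, symbols, and emoji."""
    text = text.translate(ARABIC_TO_PERSIAN)
    # Unicode categories starting with "P" are punctuation and "S" are symbols
    # (most emoji are "So"). Replace them with spaces so that neighbouring
    # words are not glued together. Format characters such as the zero-width
    # non-joiner (category "Cf") are kept, since Persian spelling relies on it.
    chars = [
        " " if unicodedata.category(ch)[0] in ("P", "S") else ch
        for ch in text
    ]
    # Collapse runs of whitespace left behind by the replacements.
    return re.sub(r"\s+", " ", "".join(chars)).strip()


def simple_tokenize(text: str) -> list:
    """Whitespace tokenization on top of simple_normalize."""
    return simple_normalize(text).split()
```

Calling `simple_tokenize(sample_text)` on the example above would be one way to compare this naive baseline against the tokens returned by the project's own preprocessor.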

## Next Steps

1. Collect the full dataset with `scraper.scrape_all_banking_apps()`
2. Preprocess all collected comments
3. Label the data for sentiment (positive/negative/neutral)
4. Train and evaluate models
5. Generate insights and reports
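
Steps 2 and 3 above could be sketched roughly as follows. This is an illustrative outline rather than part of the project code: `clean_and_dedupe` and `label_distribution` are hypothetical helper names, and the cleaning function is passed in as a parameter (in this notebook it would presumably be `preprocessor.preprocess_text`). Only the label set (positive/negative/neutral) comes from the list above.

```python
from collections import Counter
from typing import Callable, Iterable, List

# Label set taken from step 3 of the list above.
VALID_LABELS = {"positive", "negative", "neutral"}


def clean_and_dedupe(comments: Iterable[str],
                     clean_fn: Callable[[str], str]) -> List[str]:
    """Apply clean_fn to every comment, dropping empties and exact duplicates.

    The first occurrence of each cleaned comment is kept, in input order.
    """
    seen = set()
    result = []
    for comment in comments:
        cleaned = clean_fn(comment)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


def label_distribution(labels: Iterable[str]) -> Counter:
    """Count labels, rejecting anything outside VALID_LABELS."""
    labels = list(labels)
    unknown = set(labels) - VALID_LABELS
    if unknown:
        raise ValueError(f"Unexpected labels: {sorted(unknown)}")
    return Counter(labels)
```

Checking the label distribution before training is a cheap way to spot class imbalance, which is common in app-store reviews.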