An intelligent system that analyzes tourist sentiment toward Saudi cities by scraping reviews from Google Maps and analyzing them using an Arabic AI model (CAMeLBERT).
User types "AlUla" → React sends request → C# creates report → C# sends to Python
→ Python scrapes reviews from Google Maps → Python analyzes sentiment with CAMeLBERT
→ Python returns results → C# saves to database → React displays charts
1️⃣ index.html — Base HTML page
- First file loaded by the browser
- Loads Tajawal (Arabic) and Inter (English) fonts from Google Fonts
- Loads TailwindCSS for styling
- Defines project colors: dark blue (primary), green (secondary), gold (accent)
- Contains `<div id="root">`, where React renders everything
2️⃣ index.tsx — React entry point
- Connects the App component to the root HTML element
- Uses React.StrictMode for error detection during development
- Rarely modified
3️⃣ index.css — Global styles
- Sets page direction to Right-to-Left (RTL)
- Contains custom scrollbar styling
4️⃣ App.tsx ⭐ — Main frontend file (most important!)
A large file containing all components and pages:
Components:
- Button — Unified button with 4 styles (primary, secondary, outline, ghost)
- Input — Input field with icon
- Card — White card with soft shadow
- Navbar — Top navigation bar (logo + username + language toggle + logout)
- Sidebar — Side menu (Dashboard, New Analysis, Reports, About)
- DashboardLayout — General layout (Navbar + Sidebar + Content)
Charts:
- SentimentChart — Pie chart for Positive/Negative/Neutral percentages
- FrequencyChart — Bar chart for top 5 most frequent words
- WordCloud — Word cloud with varying sizes and colors
Pages:
| Route | Page |
|---|---|
| `/` | Login |
| `/signup` | Sign Up |
| `/forgot-password` | Forgot Password |
| `/city-input` | City Input |
| `/dashboard` | Results Dashboard |
| `/report` | Report |
| `/about` | About |
5️⃣ src/types.ts — Data type definitions
- `Sentiment` — Sentiment classification (Positive, Negative, Neutral)
- `Review` — Review shape (text, source, date, author, rating)
- `User` — User data (name, email, role)
- `AnalysisStats` — Statistics (positive, negative, neutral counts)
- `WordFreq` — Word + occurrence count
- `CityAnalysisData` — Complete city analysis data
6️⃣ src/context/AuthContext.tsx — Authentication management
- Stores JWT Token in localStorage
- Auto-checks if user is logged in on page load
- Provides: `login()`, `register()`, `resetPassword()`, `logout()`
- Accessed via the `useAuth()` hook
7️⃣ src/context/LanguageContext.tsx — Language management (Arabic/English)
- Saves selected language in localStorage
- Automatically switches page direction (RTL ↔ LTR)
- Provides a `t('key')` function for translated text
- Accessed via the `useLanguage()` hook
8️⃣ src/components/LanguageSwitcher.tsx — Language toggle button
- Shows "English" if current language is Arabic, and "عربي" if English
- Calls `toggleLanguage()` on click
9️⃣ src/translations/ar.ts — Arabic text strings
- Dictionary with 100+ Arabic UI text entries
🔟 src/translations/en.ts — English text strings
- Same dictionary in English
1️⃣1️⃣ src/services/reportApi.ts — Report API service
- `generateReport()` — Sends city name to backend to start analysis
- `getReports()` — Fetches list of previous reports
- `getLatestReport()` — Fetches the most recent report
- `getReportById()` — Fetches a specific report by ID
- All requests include the JWT token for authentication
1️⃣2️⃣ SmartTourism.API/Program.cs — Server entry point
- Connects SQLite database
- Configures CORS (allows frontend from localhost:3000)
- Sets up JWT authentication (validates token on every request)
- Enables Swagger for API documentation
1️⃣3️⃣ SmartTourism.API/appsettings.json — Server configuration
- Database path: `SmartTourism.db`
- JWT secret key (used to sign and validate tokens)
- Issuer and Audience settings for the token
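
To make the role of the secret key, Issuer, and Audience concrete, here is a minimal Python sketch of what validating an HS256-signed JWT involves. It is an illustration only: in this project the checks are performed by the ASP.NET JWT Bearer middleware, and the function names, the HS256 choice, and the claim values shown are assumptions rather than the project's actual settings.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: str) -> str:
    """Build header.payload.signature for the given claims (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def validate_token(token: str, secret: str, issuer: str, audience: str) -> dict:
    """Return the claims if signature, issuer, audience and expiry all check out."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # compare_digest avoids leaking information through comparison timing
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("iss") != issuer or claims.get("aud") != audience:
        raise ValueError("wrong issuer or audience")
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims
```

A token signed with a different secret, or issued for a different Issuer or Audience, is rejected with `ValueError`, which is the behaviour the three settings above exist to enforce.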
1️⃣4️⃣ SmartTourism.API/SmartTourism.API.csproj — Project definition
- .NET 8
- Libraries: BCrypt (password hashing), JWT Bearer (security), SQLite (database), Swagger (docs)
1️⃣5️⃣ Controllers/AuthController.cs ⭐ — Authentication controller
| Endpoint | Function |
|---|---|
| `POST /api/auth/register` | Register new account (hashes password with BCrypt) |
| `POST /api/auth/login` | Login (verifies password, issues JWT Token) |
| `POST /api/auth/reset-password` | Reset password |
| `GET /api/auth/me` | Get current user data (protected with `[Authorize]`) |
1️⃣6️⃣ Controllers/ReportsController.cs ⭐⭐ — Reports controller (most important backend file!)
| Endpoint | Function |
|---|---|
| `POST /api/reports/generate` | Receives city name → sends to Python → saves results → returns them |
| `GET /api/reports` | Fetches all user reports |
| `GET /api/reports/latest` | Fetches latest completed report |
| `GET /api/reports/{id}` | Fetches report by ID |
Smart features:
- Uses SHA256 to prevent duplicate reports (returns existing report if same city + settings)
- Uses SHA256 for review deduplication
- Saves reviews in batches for better performance
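
The three features above can be sketched in a few lines. The real logic lives in the C# controller; this Python version only illustrates the idea. The key fields follow the (userId + city + sources + date) combination described in this document, while the normalisation rules, the separator, the batch size, and all function names are assumptions.

```python
import hashlib
from typing import Iterable, Iterator, List


def report_key(user_id: str, city: str, sources: Iterable[str], date: str) -> str:
    """SHA256 key for a report request; equal requests yield equal keys."""
    # Assumed normalisation: trim/lowercase the city, sort and lowercase the sources
    parts = [
        user_id,
        city.strip().lower(),
        ",".join(sorted(s.strip().lower() for s in sources)),
        date,
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def review_hash(text: str) -> str:
    """SHA256 of the trimmed review text, usable as a ReviewHash value."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


def unique_reviews(texts: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each review, comparing by hash."""
    seen = set()
    unique = []
    for text in texts:
        digest = review_hash(text)
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def batches(items: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield fixed-size chunks so rows can be saved a batch at a time."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Before starting a new analysis, the backend would look up `report_key(...)` in the Reports table and return the stored report on a hit. Only reviews that survive `unique_reviews` would then be written, one batch per save.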
1️⃣7️⃣ Models/User.cs — Users table
Id (GUID) | FirstName | LastName | Email | PasswordHash | Role | CreatedAt
1️⃣8️⃣ Models/Report.cs — Reports table
Id | UserId | City | Sources | Status (Processing/Completed/Failed) | TotalReviews | PositiveCount | NegativeCount | NeutralCount | ReportJson
1️⃣9️⃣ Models/Review.cs — Individual reviews table
Id | ReportId | Source | ReviewText | PredictedLabel | Score (0-1 confidence) | KeywordsJson | ReviewHash
2️⃣0️⃣ Data/AppDbContext.cs — Database context
- Defines 3 tables: Users, Reports, Reviews
- Unique index on Email (no duplicates)
- Index on ReportKey to prevent duplicate reports
2️⃣1️⃣ DTOs/AuthDtos.cs — Authentication data transfer objects
- `RegisterDto` — Registration data (name + email + password)
- `LoginDto` — Login data (email + password)
- `ResetPasswordDto` — Password reset data
- `UserDto` — User data sent to frontend (no password for security)
2️⃣2️⃣ DTOs/ReportDtos.cs — Report data transfer objects
- `GenerateReportDto` — Report creation request (city + sources + date)
- `ReportSummaryDto` — Report summary for list view
- `ReportDetailDto` — Full report details with JSON
2️⃣3️⃣ Migrations/ — Database migrations (schema evolution)
- `InitialCreate` — Created Users table
- `AddPasswordResetToken` — Added password reset field
- `AddReportsAndReviews` — Created Reports and Reviews tables
2️⃣4️⃣ SmartTourism.ML/main.py ⭐⭐⭐ — AI server (most important file in the project!)
Part 1: Arabic Text Cleaning
- Removes URLs, mentions (@), and hashtags (#)
- Removes diacritics (Tashkeel) and Tatweel (ـــ)
- Normalizes Hamza variations: أ/إ/آ → ا
- Detects and ignores commercial ads (phone numbers + promotional keywords)
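
A minimal sketch of these cleaning steps using only the standard library. The exact patterns in `main.py` may differ: the phone-number format, the promotional keyword list, and the rule that an ad needs both a phone number and a keyword are assumptions made for illustration.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_HASHTAG_RE = re.compile(r"[@#]\S+")
# Tashkeel (U+064B..U+0652), superscript Alef (U+0670) and Tatweel (U+0640)
DIACRITICS_TATWEEL_RE = re.compile(r"[\u064B-\u0652\u0670\u0640]")
# Alef with Hamza above, Hamza below, and Madda, all mapped to bare Alef
ALEF_VARIANTS_RE = re.compile(r"[\u0623\u0625\u0622]")
# Assumed format: Saudi mobile numbers such as 05XXXXXXXX or +9665XXXXXXXX
PHONE_RE = re.compile(r"(?:\+?966|0)5\d{8}")
# Hypothetical keyword list ("to contact", "discount", "special offer")
PROMO_KEYWORDS = ("للتواصل", "خصم", "عرض خاص")


def clean_arabic(text: str) -> str:
    """Strip URLs, mentions, hashtags, diacritics and Tatweel; normalise Alef."""
    text = URL_RE.sub(" ", text)
    text = MENTION_HASHTAG_RE.sub(" ", text)
    text = DIACRITICS_TATWEEL_RE.sub("", text)
    text = ALEF_VARIANTS_RE.sub("\u0627", text)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_ad(text: str) -> bool:
    """Flag a review containing both a phone number and a promotional keyword."""
    return bool(PHONE_RE.search(text)) and any(k in text for k in PROMO_KEYWORDS)
```

Reviews for which `looks_like_ad` returns `True` would be skipped before analysis, and the rest passed through `clean_arabic`.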
Part 2: Review Scraping
- Opens Chrome in headless mode using Selenium
- Searches for the location in Google Maps
- Automatically clicks the "Reviews" tab
- Scrolls down and collects reviews (text + star rating)
- Avoids detection with anti-bot techniques
Part 3: AI Sentiment Analysis
- Loads CAMeLBERT model (Arabic dialect-specific) at startup
- Passes each review through the model → returns: label (Positive/Negative/Neutral) + confidence score
- Extracts keywords from each review
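
The load-once, classify-each-review flow might look roughly like the sketch below. The HuggingFace model ID is assumed to be the published checkpoint matching the CAMeLBERT dialect sentiment model, and the function names and the output shape are illustrative, not taken from `main.py`. The classifier is passed in as a parameter so the aggregation logic can be exercised without downloading the model.

```python
from collections import Counter
from typing import Callable, Dict, List

# A classifier maps a list of texts to [{"label": ..., "score": ...}, ...],
# the shape returned by a HuggingFace text-classification pipeline.
Classifier = Callable[[List[str]], List[Dict[str, object]]]


def load_camelbert() -> Classifier:
    """Load the model once at startup (model ID is an assumption)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline(
        "text-classification",
        model="CAMeL-Lab/bert-base-arabic-camelbert-da-sentiment",
    )


def analyze_reviews(reviews: List[str], classify: Classifier) -> dict:
    """Label every review and count Positive/Negative/Neutral results."""
    predictions = classify(reviews)
    results = [
        {
            "text": text,
            "label": str(pred["label"]).capitalize(),
            "score": float(pred["score"]),  # model confidence between 0 and 1
        }
        for text, pred in zip(reviews, predictions)
    ]
    counts = Counter(item["label"] for item in results)
    stats = {label: counts.get(label, 0) for label in ("Positive", "Negative", "Neutral")}
    return {"results": results, "stats": stats}
```

At startup the service would call `load_camelbert()` once and reuse the returned classifier for every request, for example `analyze_reviews(cleaned_reviews, classifier)`.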
Part 4: API
- `POST /api/analyze` — Receives location name → scrapes → cleans → analyzes → returns results
2️⃣5️⃣ SmartTourism.ML/requirements.txt — Python dependencies
- `fastapi` + `uvicorn` — Fast web server
- `selenium` + `webdriver-manager` — Browser-based scraping
- `torch` + `transformers` — AI libraries (PyTorch + HuggingFace)
- `pandas` — Data processing
- dist/index.html — Final built HTML page
- dist/assets/index-*.js — All React code bundled into one file (675KB)
- dist/assets/index-*.css — Minified styles
`dist` = the final production-ready version, generated by running `npm run build`.
| File | Purpose |
|---|---|
| `package.json` | Frontend project definition and dependencies |
| `vite.config.ts` | Vite build tool settings (port 3000) |
| `tsconfig.json` | TypeScript configuration |
| `.gitignore` | Files excluded from Git |
| `SmartTourism.sln` | C# solution file |
| Technology | Usage |
|---|---|
| React + TypeScript | User Interface |
| TailwindCSS | Styling |
| Recharts + D3.js | Charts & Visualizations |
| C# .NET 8 | Backend API |
| Entity Framework | Database ORM |
| SQLite | Database |
| JWT + BCrypt | Authentication & Password Hashing |
| Python FastAPI | AI Service Server |
| Selenium | Web Scraping |
| CAMeLBERT | Arabic Sentiment Analysis Model |
| HuggingFace Transformers | AI Model Runtime |
# 1. Start the AI server (Python)
cd SmartTourism.ML
venv/bin/python main.py # Runs on http://localhost:8000
# 2. Start the backend (C#)
cd SmartTourism.API
dotnet run # Runs on http://localhost:5165
# 3. Start the frontend (React)
npm run dev # Runs on http://localhost:3000

Q1: Why split the backend into two services (C# and Python)? We used a microservices architecture. The C# service handles the core API, data management, and security because it is fast and robust. The Python service runs the AI model because Python has the most mature tooling for it (HuggingFace Transformers, PyTorch). Keeping the two separate lets each service be deployed and scaled independently.
Q2: How do you scrape data from Google Maps? We use Selenium to open Chrome in headless mode, search for the location on Google Maps, click the reviews tab, and scroll automatically to collect reviews. We avoid detection by masking the bot's fingerprint.
Q3: What AI model is used and why? We use CAMeLBERT-da-sentiment from the CAMeL Lab at NYU Abu Dhabi. We chose it because it's specifically trained on Arabic dialects (not just Modern Standard Arabic), and tourists typically write in colloquial Arabic.
Q4: How are passwords secured? We never store plain-text passwords. BCrypt hashes each password together with a salt, and only the hash is saved. JWT tokens carry the session, so no session data is stored on the server.
Q5: Why TypeScript instead of JavaScript? TypeScript catches type errors at compile time (before the code runs) by enforcing type definitions. This reduces bugs and speeds up development.
Q6: What are DTOs and why use them? Data Transfer Objects define exactly what data is sent to the client, without exposing all database columns (e.g., we send user data without the password hash).
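Q7: How do you prevent duplicate reports and reviews? We use SHA256 hashing. Each report gets a unique key derived from (userId + city + sources + date). If the same request is made again, we return the existing report instead of re-analyzing. Reviews are deduplicated the same way: each review is stored with a SHA256 hash (ReviewHash), so a repeated review is detected by its hash.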
Q8: What happens if the Python server fails? C# wraps the request in a try-catch. If it fails, the report status is set to "Failed" and a 500 error is returned to the user without crashing the entire system.
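
The failure path described in Q8 can be sketched as follows. The real handler is written in C#; this Python version only mirrors the control flow, and the function name, the report dictionary, and the injected `call_ml` callable are hypothetical.

```python
from typing import Callable, Tuple


def generate_report(city: str, call_ml: Callable[[str], dict]) -> Tuple[int, dict]:
    """Return (HTTP status, report); a failing ML call marks the report Failed."""
    report = {"city": city, "status": "Processing"}
    try:
        report["result"] = call_ml(city)  # request to the Python service
        report["status"] = "Completed"
        return 200, report
    except Exception as exc:  # network error, timeout, malformed response, ...
        # The error is recorded on the report instead of propagating,
        # so one failed analysis does not take down the API process.
        report["status"] = "Failed"
        report["error"] = str(exc)
        return 500, report
```

Because the exception is caught inside the handler, later requests are unaffected, and the user can see the report's "Failed" status.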
Q9: What's the difference between the src and dist folders?
src = development files (source code we write). dist = the final built and minified version ready to deploy to the internet.
Q10: How does the site support both Arabic and English?
We use React Context API. LanguageContext stores the current language and switches page direction (RTL/LTR). All text lives in separate translation files (ar.ts and en.ts) and is accessed via the t('key') function.
Supervisor: Dr. Mufreh Al-Qahtani
Team: Ali Abdullah Al-Mastur • Abdullah Hussein Al-Awad • Abdullah Mosfer • Yazan Yahya • Mahdi Hamoud Al-Dosari • Abdulrahman Adawi