AWAL GPT is a high-performance, production-ready multilingual assistant specialized in the Amazigh (Tamazight) language and culture. It leverages cutting-edge LLMs and a custom RAG (Retrieval-Augmented Generation) pipeline to provide accurate, culturally-aware responses in Tamazight Latin script.
- Multilingual Intelligence: Native support for Tamazight (Latin & Tifinagh), French, Arabic (Darija/MSA), and English.
- Elite Tamazight Logic: Advanced normalization, transliteration, and linguistic validation for high-quality Latin script output.
- Groq Acceleration: Powered by Groq's Llama 3.3 (70B) for near-instantaneous response times (< 1s).
- Custom RAG Pipeline: Local TF-IDF indexing for cultural facts, proverbs, and specialized domain vocabulary.
- Premium UI/UX: Responsive React frontend with dark mode support, mobile-optimized sidebar overlay, and localized "Thinking" state animations.
- Secure API: JWT-based authentication, rate-limiting, password hashing (Bcrypt), and MongoDB-backed conversation history.
- Backend framework: FastAPI (Python 3.13)
- Database: MongoDB (Motor async driver)
- LLM: Groq (Llama-3.3-70b-versatile)
- ML Engine: Scikit-Learn (Training) + NumPy (Optimized Runtime Cosine Similarity)
- Security: JWT, Bcrypt, Rate-Limiting Middleware
- Frontend framework: React.js
- Styling: Vanilla CSS (Modern, Responsive Design)
- Features: "Thinking" state (Ar iswingim...), Sidebar Overlay for mobile, Multi-domain support.
The AWAL GPT pipeline follows a 5-stage transformation:
- Normalization: Cleans input text, handles numeric transliteration (e.g., 7→h, 9→q), and strips HTML and extra whitespace.
- Intent Detection: A dedicated ML model classifies the query (Salutation, Question, Translation, Culture, etc.).
- Semantic RAG:
  - TF-IDF Search: Finds relevant cultural context or proverbs.
  - Multi-Lang Search: Cross-references EN/FR/AR semantics to find verified Tamazight equivalents.
- LLM Synthesis: Generates a structured response via Groq with cross-lingual semantic anchors.
- Post-Processing: Final script validation and "salvage" logic to ensure the output is pure Tamazight Latin.
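The normalization stage can be pictured with a minimal sketch like the one below. It is not the project's actual code: the function name `normalize`, the regular expressions, and the rule that a digit is transliterated only when it touches a letter are assumptions made for illustration. Only the `7→h` and `9→q` mappings come from the description above.

```python
import html
import re

# Only 7→h and 9→q are taken from the pipeline description; a real map may be larger.
DIGIT_TO_LATIN = {"7": "h", "9": "q"}

_TAG_RE = re.compile(r"<[^>]+>")
_WS_RE = re.compile(r"\s+")
# Assumption: a mapped digit is treated as a letter only when it is adjacent
# to a letter, so standalone numbers such as "1997" are left untouched.
_DIGIT_IN_WORD_RE = re.compile(r"(?<=[^\W\d_])[79]|[79](?=[^\W\d_])")


def normalize(text: str) -> str:
    """Strip HTML tags, decode entities, transliterate digits, collapse whitespace."""
    text = _TAG_RE.sub(" ", text)   # drop HTML tags
    text = html.unescape(text)      # decode entities such as &amp;
    text = _DIGIT_IN_WORD_RE.sub(lambda m: DIGIT_TO_LATIN[m.group()], text)
    return _WS_RE.sub(" ", text).strip()
```

Keeping the digit rule conservative avoids rewriting years or quantities, at the cost of missing a digit that stands alone as a whole word.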
Set up the backend:

    cd backend
    pip install -r requirements.txt

Create a .env file:

    GROQ_API_KEY=your_groq_key
    MONGO_URI=mongodb://localhost:27017
    JWT_SECRET=your_super_secret_key
    ADMIN_TOKEN=your_admin_token

Initialize the indices and intent model:

    python train.py

Run the server:

    uvicorn main:app --reload

Set up and start the frontend:

    cd frontend
    npm install
    npm start

| Endpoint | Method | Description |
|---|---|---|
| /auth/register | POST | Create a new user account |
| /auth/login | POST | Authenticate and receive JWT |
| /chat | POST | Send message and get AI response |
| /conversations | GET | List user's conversation history |
| /health | GET | System health and index status |
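As a rough illustration of calling the chat endpoint listed above, the sketch below builds an authenticated request with the standard library. The request schema is not documented here, so the `message` field, the `Bearer` authorization scheme, and the helper names `build_chat_request` and `send` are all assumptions. The base URL assumes uvicorn's default port 8000.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def build_chat_request(token: str, message: str) -> urllib.request.Request:
    """Build a POST /chat request (hypothetical body and auth header)."""
    # Assumption: the endpoint accepts a JSON body with a "message" field.
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the JWT from /auth/login is sent as a Bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode a JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(build_chat_request(token, "Azul"))` would return the decoded reply, provided the real schema matches these assumptions.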
- Input: Text of any length in Tamazight (Latin/Tifinagh), French, Arabic, or English.
- Output:
  - [TAM]: Optimized Tamazight Latin.
  - [FR]: Corresponding French translation.
  - [AR]: Arabic equivalent.
  - [EN]: English context.
[TAM] AWAL GPT d asghiwel n tutlayt tamazight s tarrayt n AI tameqqrant.
- Architecture: N-isisfiw Stage-by-Stage pipeline i-xd-amn deg backend.
- Setup: N-rni iwaliwen iwakken setup n project ad i-li d professional s 3 n steps.
- Optimized: N-ssmras Groq d NumPy iwakken tazzla ad t-ili d tasfayt ikemlen.
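A client that consumes the tagged output format described above may want to split a reply into its language sections. The sketch below is one possible way to do that, assuming each section starts with its tag inline (as in the `[TAM]` example) and runs until the next tag. The function name `parse_tagged` is hypothetical.

```python
import re

# The four tags come from the output format; the inline layout is an assumption.
_TAG_RE = re.compile(r"\[(TAM|FR|AR|EN)\]")


def parse_tagged(reply: str) -> dict[str, str]:
    """Split a reply like '[TAM] ... [FR] ...' into a tag-to-text mapping."""
    # re.split with one capture group yields [prefix, tag, text, tag, text, ...].
    parts = _TAG_RE.split(reply)
    sections: dict[str, str] = {}
    for tag, text in zip(parts[1::2], parts[2::2]):
        sections[tag] = text.strip()
    return sections
```

For example, `parse_tagged("[TAM] Azul [FR] Bonjour")` gives a dictionary with the keys `TAM` and `FR`. Text before the first tag is ignored, and a repeated tag overwrites the earlier section.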