PromptFlow AI is a high-performance, memory-optimized LLM orchestrator built with FastAPI and Gemini 2.5 Flash. It features a custom "Wisdom Layer" to handle long-term context retention while staying within the limits of free-tier infrastructure.
- Hierarchical Context Compression: Instead of passing massive chat histories, the system distills conversations into a "Wisdom Layer" (What, Why, and How), significantly reducing token usage.
- Context Guardrails: Integrated intent-checking to prevent "context drift," ensuring the AI stays focused on the project mission (e.g., ShopEasy E-commerce).
- Session Handoff: Generates a "Memory Anchor" at the end of every session, allowing the user to resume work the next day from the saved context.
- Concurrency-First Design: Uses Python's asyncio to run the Guardrail check and Gemini processing concurrently, reducing response latency.
- M0-Tier Optimized: Specifically tuned for MongoDB Atlas Free Tier, with connection pooling and $slice operators to keep queries small and fast.
- Backend: FastAPI (Python 3.12+)
- AI Engine: Gemini 2.5 Flash (via google-genai SDK)
- Database: MongoDB Atlas (Motor Async Driver)
- Safety: Custom NVIDIA-inspired Guardrail logic
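
The Concurrency-First Design item above can be illustrated with a minimal sketch. It is not the project's actual code: `check_intent`, `generate_reply`, and `handle_message` are hypothetical names, and both coroutines are stubs standing in for the real guardrail check and the Gemini call.

```python
import asyncio


async def check_intent(message: str, mission: str) -> bool:
    """Hypothetical guardrail stub: on-topic if the message shares a word with the mission."""
    await asyncio.sleep(0)  # stands in for real async work
    mission_words = set(mission.lower().split())
    return any(word in mission_words for word in message.lower().split())


async def generate_reply(message: str) -> str:
    """Hypothetical stand-in for the Gemini call."""
    await asyncio.sleep(0)
    return f"draft reply to: {message}"


async def handle_message(message: str, mission: str) -> str:
    # Start both coroutines together; gather returns results in argument order.
    on_topic, draft = await asyncio.gather(
        check_intent(message, mission),
        generate_reply(message),
    )
    # The draft is discarded if the guardrail rejects the message.
    return draft if on_topic else "That request is outside the project scope."
```

Running the two steps concurrently trades some wasted model calls (for rejected messages) for lower latency on accepted ones; a sequential guardrail-first design would make the opposite trade.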
git clone https://github.com/rkbharti/promptflow-ai.git
cd promptflow-ai
python -m venv venv
▶️ On Windows:
venv\Scripts\activate
▶️ On macOS / Linux:
source venv/bin/activate
After activation, you should see (venv) in your terminal:
(venv) your-project-folder>
pip install --upgrade pip
Create a .env file:
GOOGLE_API_KEY=your_gemini_api_key
MONGO_URI=your_mongodb_connection_string
PROJECT_NAME=PromptFlow_AI
pip install -r requirements.txt
uvicorn app.main:app --reload
The system employs a "Save Game" logic. When a session reaches a certain message threshold, the AI automatically summarizes the technical state into a compressed_context field in MongoDB. The next request then needs only the summary and the last few messages, which keeps prompts small and responses fast.
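
The "Save Game" flow could look roughly like the sketch below. Only the `compressed_context` field name comes from this README; the threshold value, the window size, the `messages` field name, and the helpers `maybe_compress` and `build_prompt_context` are assumptions for illustration, and the summarizer is passed in as a plain function rather than a real model call.

```python
from typing import Callable

MESSAGE_THRESHOLD = 20  # assumed value; the README only says "a certain message threshold"
RECENT_MESSAGES = 5     # assumed size of the "last few messages" window

# Projection a Motor/PyMongo find_one call could use to fetch only the summary
# and the tail of the messages array (the "messages" field name is an assumption).
SESSION_PROJECTION = {"compressed_context": 1, "messages": {"$slice": -RECENT_MESSAGES}}


def maybe_compress(session: dict, summarize: Callable[[list[str]], str]) -> dict:
    """Return the session unchanged, or a summarized copy once it reaches the threshold."""
    messages = session.get("messages", [])
    if len(messages) < MESSAGE_THRESHOLD:
        return session
    older, recent = messages[:-RECENT_MESSAGES], messages[-RECENT_MESSAGES:]
    previous = session.get("compressed_context", "")
    # Fold the previous summary into the new one so earlier context is not lost.
    summary = summarize(([previous] if previous else []) + older)
    return {**session, "compressed_context": summary, "messages": recent}


def build_prompt_context(session: dict) -> list[str]:
    """Summary first (if any), then the most recent messages."""
    summary = session.get("compressed_context", "")
    recent = session.get("messages", [])[-RECENT_MESSAGES:]
    return ([f"Summary: {summary}"] if summary else []) + recent
```

With these assumed values, a 20-message session is reduced to one summary plus its last five messages, so the prompt size stays bounded no matter how long the conversation runs.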