OpenChatGPT

A free, 100% on-device ChatGPT application. Your data never leaves your hardware. Private. Local. Limitless.
OpenChatGPT is a production-ready, fully open-source, on-device AI assistant built as a professional-grade ChatGPT-style application. It combines agentic reasoning, real-time web search, session-based memory, and a clean iMessage-inspired interface to deliver accurate, contextual, and polished conversations.

The project demonstrates that high-quality, agentic AI experiences can be achieved entirely on local infrastructure. By combining stateful reasoning, real-time information retrieval, and a polished conversational interface, OpenChatGPT provides a practical, transparent alternative to proprietary AI platforms.


Open-Source, Local-First AI

OpenChatGPT is built from the ground up with openness and user control as first-class principles:

  • 100% open-source stack spanning UI, backend, orchestration, and model runtime
  • Runs entirely on-device, keeping prompts, context, and reasoning local
  • No SaaS model APIs, subscriptions, or hidden inference costs
  • Full inspectability of agent logic, memory flow, and tool usage

This makes OpenChatGPT suitable for developers, researchers, and organizations that require transparency, data ownership, and long-term maintainability.


Comparable to Paid AI Assistants

Despite being fully local and open, OpenChatGPT is engineered to deliver outcomes on par with commercial AI services:

  • Large-scale local model inference via gpt-oss:20b
  • Agentic reasoning with LangGraph for structured task completion
  • Real-time web search and synthesis using Tavily
  • Multi-turn session memory with consistent contextual awareness
  • Professional, tool-agnostic response formatting

The result is a system capable of research, analysis, and conversational assistance that closely mirrors the experience of paid, cloud-hosted AI platforms.


Privacy by Architecture

OpenChatGPT does not rely on external model providers or remote inference:

  • All LLM reasoning occurs locally through Ollama
  • Memory and session state are stored in-process
  • External calls are limited to optional search APIs, with no prompt leakage
  • API keys are managed locally without environment exposure

This architecture ensures maximum privacy while preserving modern AI capabilities.


Key Features

Agentic Logic

  • Implemented using LangGraph
  • Stateful, single-agent loop for structured task execution
  • Designed for reliable multi-step reasoning and completion
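The stateful loop can be illustrated with a minimal, dependency-free sketch. This stands in for the project's actual LangGraph graph; the state shape and the decide-then-act logic here are illustrative assumptions, not the real implementation:

```python
# Minimal stand-in for a stateful single-agent loop: the agent repeatedly
# inspects its state, decides the next step, and loops until the task is done.
# (Illustrative only -- the real project builds this loop with LangGraph.)

def run_agent(question: str, tools: dict, max_steps: int = 5) -> dict:
    state = {"question": question, "observations": [], "answer": None}
    for _ in range(max_steps):
        # Decide: with no evidence yet, call the search tool; otherwise answer.
        if not state["observations"]:
            result = tools["search"](state["question"])   # tool call
            state["observations"].append(result)          # update state
        else:
            state["answer"] = f"Based on: {state['observations'][0]}"
            break                                         # task complete
    return state

# Usage with a stubbed search tool:
fake_tools = {"search": lambda q: f"result for {q!r}"}
final = run_agent("price of SPY", fake_tools)
```

The loop terminates either when the agent marks the task complete or when the step budget runs out, which is the usual guard against unbounded tool-calling.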

Large Language Model

  • Powered by gpt-oss:20b via Ollama
  • Local inference with high-quality, large-context responses
  • No dependency on hosted proprietary model APIs

Real-Time Web Search

  • Integrated Tavily API for live information retrieval
  • Tool usage is fully abstracted from the user
  • Responses are synthesized in a natural, professional tone

Session-Based Memory

  • Thread-based memory using an in-memory saver
  • Maintains conversational context across multiple turns
  • Supports multi-session usage without cross-contamination
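The isolation guarantee above amounts to keying every message history by `thread_id`. A minimal sketch of that idea (the project itself uses LangGraph's in-memory checkpointer; this class is a hypothetical stand-in):

```python
# Sketch of thread-scoped session memory: each thread_id owns its own
# message list, so concurrent sessions never see each other's context.
# (Illustrative -- the real project uses LangGraph's in-memory saver.)

from collections import defaultdict

class InMemorySaver:
    def __init__(self):
        self._threads = defaultdict(list)  # thread_id -> message history

    def append(self, thread_id: str, role: str, content: str) -> None:
        self._threads[thread_id].append({"role": role, "content": content})

    def history(self, thread_id: str) -> list:
        # Return a copy so callers cannot mutate stored state.
        return list(self._threads[thread_id])

memory = InMemorySaver()
memory.append("thread-a", "user", "Hello")
memory.append("thread-b", "user", "Unrelated question")
```

Because state lives in a per-thread list rather than a shared buffer, "no cross-contamination" falls out of the data structure itself.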

Automated Key Management

  • Secure, local API key loading from:
    • TavilyKey.txt
    • LangSmithKey.txt
  • No hard-coded secrets or environment variable leakage

iMessage-Inspired UI

  • Clean, text-only chat interface built with Streamlit
  • Right-aligned green user message bubbles
  • Left-aligned blue AI message bubbles
  • No icons or visual clutter
  • Custom auto-scrolling for smooth conversational flow
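The bubble layout reduces to a small piece of per-message HTML that a Streamlit app can inject via `st.markdown(..., unsafe_allow_html=True)`. A sketch of what such markup might look like; the colors and inline styles here are illustrative assumptions, not the app's actual CSS:

```python
# Renders one chat message as an iMessage-style bubble: user messages
# right-aligned in green, AI messages left-aligned in blue.
# (Colors and inline styles are illustrative, not the app's actual CSS.)

import html

def render_bubble(role: str, text: str) -> str:
    is_user = role == "user"
    align = "right" if is_user else "left"
    color = "#34c759" if is_user else "#0a84ff"  # green vs. blue
    return (
        f'<div style="text-align:{align};">'
        f'<span style="background:{color};color:white;'
        f'border-radius:18px;padding:8px 12px;display:inline-block;">'
        f'{html.escape(text)}</span></div>'
    )

bubble = render_bubble("user", "Hello!")
```

Escaping the message text before interpolation keeps model or user output from injecting markup into the page.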

Production-Ready Backend

  • FastAPI-based backend service
  • Hot-reloading enabled for rapid iteration
  • Clean separation between UI, API, and agent logic
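That separation can be sketched as a thin handler that only validates input and delegates to the agent layer. The function names below are hypothetical, and `agent_reply` is a stub standing in for the LangGraph/Ollama invocation; the project's actual FastAPI routes may differ:

```python
# Sketch of the layering: the API handler validates the request and delegates
# to the agent layer, which owns all reasoning. Swapping the web framework
# would only touch this thin handler, not the agent logic.
# (agent_reply is a stub standing in for the real LangGraph agent call.)

def agent_reply(message: str, thread_id: str) -> str:
    # Stub for the agent layer (LangGraph + Ollama in the real project).
    return f"[{thread_id}] echo: {message}"

def chat_handler(payload: dict) -> dict:
    # API layer: validate, delegate, shape the response -- no agent logic here.
    if not payload.get("message"):
        return {"error": "message is required"}
    reply = agent_reply(payload["message"], payload.get("thread_id", "default"))
    return {"reply": reply}

response = chat_handler({"message": "Hi", "thread_id": "t1"})
```

Keeping the handler free of agent logic is what makes hot-reloading the backend cheap: most iteration happens in the agent layer without touching the API surface.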

System Architecture

End-to-end request flow from user input to synthesized response.

```mermaid
flowchart TD
    %% Nodes
    U[User]
    UI[Streamlit Frontend<br/>iMessage-style UI]
    API[FastAPI Backend<br/>API Layer]
    AGENT[LangGraph Agent<br/>Stateful Loop]
    LLM[Ollama<br/>gpt-oss:20b]
    SEARCH[Tavily API<br/>Real-Time Search]
    MEM[In-Memory Saver<br/>Session Context]

    %% Flow
    U --> UI --> API --> AGENT
    AGENT --> LLM
    AGENT --> SEARCH
    LLM --> MEM
    SEARCH --> MEM
    MEM --> AGENT

    %% Styling
    classDef user fill:#f5f5f5,stroke:#333,stroke-width:1px;
    classDef ui fill:#e3f2fd,stroke:#1565c0,stroke-width:1.5px;
    classDef backend fill:#ede7f6,stroke:#5e35b1,stroke-width:1.5px;
    classDef agent fill:#e8f5e9,stroke:#2e7d32,stroke-width:1.5px;
    classDef model fill:#fff3e0,stroke:#ef6c00,stroke-width:1.5px;
    classDef tool fill:#fce4ec,stroke:#ad1457,stroke-width:1.5px;
    classDef memory fill:#e0f2f1,stroke:#00695c,stroke-width:1.5px;

    class U user;
    class UI ui;
    class API backend;
    class AGENT agent;
    class LLM model;
    class SEARCH tool;
    class MEM memory;
```

Agent Execution Flow

Single-agent, stateful reasoning and tool orchestration.

```mermaid
flowchart TD
    INPUT[User Input]
    AGENT[LangGraph Agent<br/>Stateful Node]
    THINK[Reasoning Step<br/>Plan / Decide]
    LLM[LLM Inference<br/>Ollama]
    TOOL[Web Search<br/>Tavily]
    MEMORY[Memory Update<br/>thread_id]

    INPUT --> AGENT --> THINK
    THINK --> LLM
    THINK --> TOOL
    LLM --> MEMORY
    TOOL --> MEMORY
    MEMORY --> AGENT

    %% Styling
    classDef input fill:#f5f5f5,stroke:#333,stroke-width:1px;
    classDef agent fill:#e8f5e9,stroke:#2e7d32,stroke-width:1.5px;
    classDef process fill:#ede7f6,stroke:#5e35b1,stroke-width:1.5px;
    classDef model fill:#fff3e0,stroke:#ef6c00,stroke-width:1.5px;
    classDef tool fill:#fce4ec,stroke:#ad1457,stroke-width:1.5px;
    classDef memory fill:#e0f2f1,stroke:#00695c,stroke-width:1.5px;

    class INPUT input;
    class AGENT agent;
    class THINK process;
    class LLM model;
    class TOOL tool;
    class MEMORY memory;
```

Local Deployment Topology

Local-first execution with clear separation of concerns.

```mermaid
flowchart TD
    DEV[Developer Machine]
    FE[Streamlit Frontend<br/>app.py]
    BE[FastAPI Backend<br/>main.py]
    OLLAMA[Ollama Runtime<br/>gpt-oss:20b]
    SECRETS[Local Secrets]
    TAVILY[TavilyKey.txt]
    LANGSMITH[LangSmithKey.txt]

    DEV --> FE
    DEV --> BE
    DEV --> OLLAMA
    DEV --> SECRETS
    SECRETS --> TAVILY
    SECRETS --> LANGSMITH

    %% Styling
    classDef host fill:#f5f5f5,stroke:#333,stroke-width:1px;
    classDef ui fill:#e3f2fd,stroke:#1565c0,stroke-width:1.5px;
    classDef backend fill:#ede7f6,stroke:#5e35b1,stroke-width:1.5px;
    classDef model fill:#fff3e0,stroke:#ef6c00,stroke-width:1.5px;
    classDef secrets fill:#fce4ec,stroke:#ad1457,stroke-width:1.5px;

    class DEV host;
    class FE ui;
    class BE backend;
    class OLLAMA model;
    class SECRETS,TAVILY,LANGSMITH secrets;
```

Getting Started

Prerequisites

Ensure the following are installed and configured:

  • Python 3.10 or later

  • Ollama, with the required model pulled:

      ollama pull gpt-oss:20b

  • API keys stored locally:
    • TavilyKey.txt
    • LangSmithKey.txt

Running the Application

1. Start the backend:

      python main.py

   • The backend runs at http://localhost:8000

2. Start the frontend:

      streamlit run app.py

   • The frontend is available at http://localhost:8501

Verification and Validation

Real-Time Search and Synthesis

  • Verified that the agent can perform live searches (e.g., “What’s the current price of SPY?”)
  • Responses are synthesized cleanly without exposing internal tool usage
  • Output maintains a professional, authoritative tone

Session Memory

  • Confirmed persistent conversational context across multiple turns
  • The thread_id-based session system reliably maintains state
  • No memory leakage between sessions

Who This Is For

  • Developers seeking a drop-in open-source alternative to paid AI assistants
  • Researchers exploring agentic AI systems without cloud constraints
  • Organizations requiring on-device inference for security or compliance
  • Builders looking for a reference implementation of local-first AI design