Cortex is a private, secure, and highly responsive desktop AI assistant designed for seamless interaction with local Large Language Models (LLMs) through the Ollama framework. It prioritizes data privacy and deep personalization by keeping all models and user data entirely on the local machine. Its core feature is a sophisticated permanent memory system, allowing the AI to build a persistent knowledge base tailored to the user's specific context.
In an era of cloud-centric AI, Cortex champions a local-first approach. The guiding philosophy is that your data and your AI interactions should belong to you. By leveraging the power of locally-run models via Ollama, Cortex provides a powerful conversational experience without compromising on privacy or control. It is designed not as a service, but as a personal tool for thought, research, and development.
- Local-First AI Interaction: All communication happens directly with your local Ollama instance. No data is ever sent to the cloud (see the sketch after this list).
- Persistent Chat History: Conversations are automatically saved locally and can be reloaded at any time, preserving the full context of your interactions.
- Permanent Memory System: Go beyond simple chat history. Explicitly instruct the AI to remember key facts, preferences, or project details using simple in-chat tags.
  - Add Memories: Tell the AI `<memo>My project 'Apollo' is written in Go.</memo>` and it will remember this context for future questions.
  - Forget Memories: Full control to clear the AI's memory bank with a `<clear_memory />` command.
- Intuitive User Interface: A clean, modern UI built with PySide6 (Qt) provides a fluid and responsive user experience.
- Theming Support: Switch between focused light and dark themes to suit your preference.
- Model Flexibility: Easily switch between different chat models available on your Ollama instance directly from the settings menu.
- Asynchronous Processing: The UI remains responsive at all times, with AI query processing handled in a background thread.
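To make the local-first claim concrete, here is a minimal sketch of a direct call to a local Ollama instance. This is not Cortex's internal code: the helper name and default model are illustrative, and the endpoint shown is Ollama's standard REST chat API.

```python
# Minimal sketch of talking to a local Ollama instance over its REST API.
# Assumes Ollama is running on its default port (11434). The model name
# is illustrative -- substitute any model you have pulled locally.
import requests

def ask_local_llm(prompt: str, model: str = "qwen3:8b") -> str:
    """Send one chat message to the local Ollama instance and return the reply."""
    response = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,  # ask for the full reply as a single JSON object
        },
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["message"]["content"]

if __name__ == "__main__":
    print(ask_local_llm("In one sentence, what is local-first software?"))
```

Everything in this exchange stays on your machine; the only network traffic is to localhost.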
Cortex is built on a robust, multi-layered architecture founded on the principle of Separation of Concerns. This design ensures the application is maintainable, scalable, and easy to reason about.
```
+--------------------------------+
|        UI Layer (View)         |  (PySide6 Widgets, Dialogs, Styles)
| Handles presentation & input.  |
+--------------------------------+
                ^
                | (Signals & Slots)
                v
+--------------------------------+
| Orchestration Layer (Control)  |  (Orchestrator, Workers in Chat_LLM.py)
|  Manages state & async tasks.  |
+--------------------------------+
                ^
                | (Method Calls)
                v
+--------------------------------+
| Service & Logic Layer (Model)  |  (SynthesisAgent, Memory Managers)
| Handles business logic & data. |
+--------------------------------+
```
- UI Layer: Responsible for rendering the interface and capturing user events. It is completely decoupled from the application's business logic.
- Orchestration Layer: The `Orchestrator` class acts as the central nervous system, mediating communication between the UI and the backend services. It manages application state and chat threads, and offloads all blocking operations (such as LLM requests) to dedicated `QThread` workers.
- Service & Logic Layer: Contains the "brains" of the application. The `SynthesisAgent` is responsible for building prompts and communicating with the Ollama API. The various `MemoryManager` classes handle the rules for short-term context, long-term history persistence, and the permanent memory bank.
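The threading model described above can be pictured with a short PySide6 sketch. This is not Cortex's actual source (the class and signal names are illustrative, and it reuses the `ask_local_llm` helper from the earlier sketch), but it shows the standard Qt `moveToThread` idiom the architecture implies:

```python
# Sketch of the worker-thread pattern described above, using PySide6's
# standard moveToThread idiom. Class and signal names are illustrative,
# not Cortex's actual implementation.
from PySide6.QtCore import QObject, QThread, Signal, Slot

class LLMWorker(QObject):
    """Runs a blocking LLM request off the UI thread."""
    finished = Signal(str)  # carries the model's reply back to the UI thread
    failed = Signal(str)    # carries an error message instead

    @Slot(str)
    def run_query(self, prompt: str) -> None:
        try:
            reply = ask_local_llm(prompt)  # blocking call (see earlier sketch)
            self.finished.emit(reply)
        except Exception as exc:
            self.failed.emit(str(exc))

class Orchestrator(QObject):
    """Mediates between the UI and the worker thread."""
    query_requested = Signal(str)

    def __init__(self) -> None:
        super().__init__()
        self._thread = QThread()
        self._worker = LLMWorker()
        self._worker.moveToThread(self._thread)
        # Cross-thread connections are queued automatically, so run_query
        # executes on the worker thread and on_reply back on the UI thread.
        self.query_requested.connect(self._worker.run_query)
        self._worker.finished.connect(self.on_reply)
        self._thread.start()

    def send(self, prompt: str) -> None:
        self.query_requested.emit(prompt)

    @Slot(str)
    def on_reply(self, text: str) -> None:
        print("AI:", text)  # a real UI would update a chat widget here
```

Because the UI and the worker communicate only through signals and slots, the layers stay decoupled exactly as the diagram suggests.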
Follow these steps to set up and run Cortex on your local machine.
- Python: Python 3.10 or newer is required.
- Git: Required to clone the repository.
- Ollama: Cortex is a client for Ollama. You must have Ollama installed and running.
1. Clone the repository:

   ```sh
   git clone https://github.com/dovvnloading/Cortex.git
   cd Cortex
   ```

2. Create and activate a virtual environment:

   ```sh
   # For Windows
   python -m venv venv
   .\venv\Scripts\activate

   # For macOS/Linux
   python3 -m venv venv
   source venv/bin/activate
   ```

3. Install the required dependencies:

   ```sh
   pip install -r requirements.txt
   ```

4. Download the necessary Ollama models. Cortex uses two models by default: a primary model for generation and a smaller, faster model for generating chat titles.

   ```sh
   ollama pull qwen3:8b
   ollama pull granite4:tiny-h
   ```

   Note: You can configure Cortex to use other models after installation via the in-app settings.

5. Run the application:

   ```sh
   python Chat_LLM.py
   ```
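If the app fails to connect, a quick check against Ollama's standard `/api/tags` endpoint confirms that your local instance is running and the default models are present. This snippet is a convenience sketch, not part of Cortex:

```python
# Sanity check: is a local Ollama instance reachable, and are the default
# models pulled? Uses Ollama's standard /api/tags model-listing endpoint.
import requests

def check_ollama(required=("qwen3:8b", "granite4:tiny-h")) -> None:
    resp = requests.get("http://localhost:11434/api/tags", timeout=5)
    resp.raise_for_status()
    installed = {m["name"] for m in resp.json().get("models", [])}
    for model in required:
        if model in installed:
            print(f"{model}: ok")
        else:
            print(f"{model}: missing -- run `ollama pull {model}`")

if __name__ == "__main__":
    check_ollama()
```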
- Chatting: Type your message in the input box at the bottom and press Enter or click the "Send" button.
- New Chat: Click the "+ New Chat" button in the top-left panel to start a new conversation.
- Managing Chats: Right-click on any conversation in the history panel to access options for renaming or deleting it.
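As noted in the features list, conversations are persisted locally. The on-disk format is not documented here, so the following is only an illustrative sketch of JSON-based history storage; the directory and message schema are assumptions:

```python
# Illustrative sketch of local chat-history persistence as JSON files.
# The storage directory and message schema below are assumptions for the
# example, not Cortex's actual on-disk format.
import json
from pathlib import Path

HISTORY_DIR = Path.home() / ".cortex" / "chats"  # hypothetical location

def save_chat(chat_id: str, messages: list[dict]) -> None:
    """Write one conversation to disk as pretty-printed JSON."""
    HISTORY_DIR.mkdir(parents=True, exist_ok=True)
    path = HISTORY_DIR / f"{chat_id}.json"
    path.write_text(json.dumps(messages, indent=2), encoding="utf-8")

def load_chat(chat_id: str) -> list[dict]:
    """Reload a saved conversation, preserving its full context."""
    path = HISTORY_DIR / f"{chat_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))

save_chat("demo", [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
])
print(load_chat("demo"))
```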
You can control the AI's permanent memory directly from the chat; a sketch of how these tags might be parsed follows the list below.
- To save a memory: Include a `<memo>` tag anywhere in your response. The AI will save the enclosed fact.

  ```
  User: My project is called 'Apollo' and it is written in the Go programming language. From now on, remember that.

  AI: Understood. I will remember that your project 'Apollo' is written in Go.
  <memo>User's project is named 'Apollo' and is written in Go.</memo>
  ```

- To clear all memories: Ask the AI to forget everything. It will use the `<clear_memory />` tag.

  ```
  User: Please forget everything you know about me.

  AI: As you wish. I have cleared all permanent memories.
  <clear_memory />
  ```

- Managing Memories Manually: Click the settings cog in the title bar, and in the "Permanent Memory" section, click "Manage..." to open a dialog where you can view, edit, add, or delete all stored facts.
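Here is the promised sketch of how these tags could be recognized in an AI reply. It is an illustration only; Cortex's actual parsing logic may differ, and `process_memory_tags` is a hypothetical helper for this example:

```python
# Illustrative parser for <memo> and <clear_memory /> tags in an AI reply.
# Cortex's real implementation may differ; process_memory_tags is a
# hypothetical helper for this example.
import re

MEMO_RE = re.compile(r"<memo>(.*?)</memo>", re.DOTALL)
CLEAR_RE = re.compile(r"<clear_memory\s*/>")

def process_memory_tags(reply: str, memories: list[str]) -> str:
    """Apply any memory tags found in a reply and return the cleaned text."""
    if CLEAR_RE.search(reply):
        memories.clear()               # the AI asked to forget everything
    for fact in MEMO_RE.findall(reply):
        memories.append(fact.strip())  # remember each enclosed fact
    # Strip the tags so they never appear in the rendered chat bubble.
    cleaned = CLEAR_RE.sub("", MEMO_RE.sub("", reply))
    return cleaned.strip()

memories: list[str] = []
text = process_memory_tags(
    "Understood. <memo>User's project 'Apollo' is written in Go.</memo>",
    memories,
)
print(memories)  # ["User's project 'Apollo' is written in Go."]
print(text)      # Understood.
```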
This project includes a Visual Studio Code settings file (`.vscode/settings.json`) with recommended settings for formatting and linting to maintain code consistency.
To contribute, please fork the repository and submit a pull request. For major changes, please open an issue first to discuss the proposed changes.
Cortex is an actively developed project. Potential future enhancements include:
- Streaming Responses: Displaying the AI's response token-by-token for a "typewriter" effect.
- Plugin System: An architecture to allow for extensions, such as web search or local file access.
- Advanced Memory Management: Exploring more sophisticated techniques for automatic memory retrieval and summarization.
- UI Enhancements: Additional user experience improvements, such as global keyboard shortcuts.
This project is licensed under the MIT License. See the `LICENSE` file for details.
Developed by Matt Wesney.
This application is powered by exceptional open-source technologies, including PySide6 (Qt) and Ollama.

Icon credits: Anthony Bossard
We're excited to introduce a new experimental feature: Permanent Memory. This system is designed to allow the AI to remember key facts you share over time, leading to more personalized and context-aware conversations.
As this is a new and complex addition, you may encounter instances where the AI's memory behaves unpredictably. For example, it might recall a fact at an irrelevant moment or misinterpret the context of your conversation. We are actively working on refining its reasoning capabilities to make this feature more reliable and intelligent.
If you find this feature to be disruptive or problematic for your use case, you can easily disable it at any time:
- Navigate to Settings (click the gear icon ⚙️ in the title bar).
- Under the Permanent Memory section, uncheck the box labeled "Enable the AI to remember key facts".
This will prevent the AI from accessing or saving any long-term memories.
Your feedback is crucial for improving this feature! If you experience any odd behavior, bugs, or have suggestions, please help us by opening an issue on this repository.
When reporting an issue, please provide as much detail as possible, including conversation context, so we can effectively diagnose and resolve the problem.
Thank you for your understanding and for helping us build a smarter, more capable assistant.