```
  ____                    _                      _____
 / __ \                  | |                    / ____|
| |  | |_   _  __ _ _ __ | |_ _   _ _ __ ___   | (___   ___ _ ____   _____ _ __
| |  | | | | |/ _` | '_ \| __| | | | '_ ` _ \   \___ \ / _ \ '__\ \ / / _ \ '__|
| |__| | |_| | (_| | | | | |_| |_| | | | | | |  ____) |  __/ |   \ V /  __/ |
 \___\_\\__,_|\__,_|_| |_|\__|\__,_|_| |_| |_| |_____/ \___|_|    \_/ \___|_|
```
Quantum Server is the backend component that powers the Quantum CLI tool. It provides a FastAPI-based server that interfaces with a Chain-of-Thought AI model (QwQ) through Ollama and LangChain. The server handles the AI interaction and exposes a streaming API endpoint for real-time AI responses; a minimal sketch of this setup follows the feature list below.

## Features
- Streaming AI Responses: Real-time streaming of AI responses using Server-Sent Events (SSE)
- Chain of Thought Processing: Structured thinking process in AI responses
- FastAPI Integration: Modern, fast (high-performance) web framework for building APIs
- Ollama Integration: Direct integration with Ollama for local AI model execution
- LangChain Implementation: Leveraging LangChain for enhanced AI interactions
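
To make these pieces concrete, here is a minimal sketch of how a FastAPI + LangChain + Ollama streaming endpoint can be wired together. It is an illustration only, assuming the `langchain-ollama` package and a locally pulled `qwq` model; the `/chat` route and the `ChatRequest` model are hypothetical names, not necessarily what `chat_with_qwq.py` actually uses.

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from langchain_ollama import ChatOllama
from pydantic import BaseModel

app = FastAPI()
llm = ChatOllama(model="qwq")  # talks to the local Ollama daemon

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")  # hypothetical route name
async def chat(request: ChatRequest) -> StreamingResponse:
    async def event_stream():
        # astream() yields partial message chunks as the model generates
        # them, so tokens reach the client in real time
        async for chunk in llm.astream(request.message):
            yield f"data: {chunk.content}\n\n"  # SSE framing: "data: ..." plus a blank line
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```

SSE keeps a single HTTP response open and pushes each chunk as a `data:` line, which is what lets the CLI render the model's chain of thought as it is produced.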

## Prerequisites

- Ollama (the CLI tool will guide you through the installation)
- The QwQ AI model, pulled through Ollama:

  ```bash
  ollama pull qwq
  ```

  (or `ollama run qwq` to pull the model and start an interactive session)
- Python 3.10 or later
- Recommended hardware: 32 GB of RAM; on a MacBook Pro, an M1 chip or newer

## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/quantum_server.git
  cd quantum_server
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the server:

  ```bash
  uvicorn chat_with_qwq:app --reload
  ```

The server will start on http://localhost:8000.

## API

### Streaming endpoint

Streams AI responses for a given input message.

Request body:

```json
{
  "message": "your question here"
}
```

Response: a Server-Sent Events (SSE) stream of AI responses.
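
For illustration, a client could consume the stream as sketched below, using the `httpx` library; the `/chat` path is an assumption carried over from the server sketch above, so check the source for the real route.

```python
import httpx

# POST a message and print SSE chunks as they arrive.
with httpx.stream(
    "POST",
    "http://localhost:8000/chat",  # hypothetical route
    json={"message": "your question here"},
    timeout=None,  # streaming responses can stay open for a while
) as response:
    for line in response.iter_lines():
        if line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)
```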

## Testing

To run the tests:
```bash
pytest tests/ -v
```
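
As a starting point, an endpoint test might look like the sketch below; it again assumes the hypothetical `/chat` route from the earlier sketch and requires a running Ollama instance with the `qwq` model pulled, whereas the project's real tests live in `tests/`.

```python
# Hypothetical test sketch; the actual suite lives in tests/.
from fastapi.testclient import TestClient

from chat_with_qwq import app

client = TestClient(app)

def test_chat_streams_server_sent_events():
    with client.stream("POST", "/chat", json={"message": "hello"}) as response:
        assert response.status_code == 200
        assert response.headers["content-type"].startswith("text/event-stream")
```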

## Contributing

Contributions are welcome! Please check the CONTRIBUTING file for details.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

If this project helps you speed up your development, you can buy me a coffee to fuel the creation of more features. Please check the link above.