A FastAPI microservice that transforms casual user prompts into expert-level AI engineering prompts using your choice of LLM provider (Anthropic Claude, OpenAI, Ollama, Groq, and more).
- Single endpoint POST /refine for prompt transformation
- Multi-provider support: Choose from cloud or local LLMs
- Free options: Ollama (local), Groq (cloud free tier), LM Studio (local)
- Paid options: Anthropic Claude, OpenAI, or any OpenAI-compatible API
- Adds software development context and specificity
- Includes best practices, error handling, and edge cases
- Production-ready with Docker support
- Simple environment variable-based provider switching
- Test locally for free with Ollama before using paid APIs
# 1. Install Ollama (if not already installed)
# Visit https://ollama.ai or run: curl -fsSL https://ollama.com/install.sh | sh
# 2. Pull a model
ollama pull llama3.1
# 3. Clone and setup RefAIne
git clone <repo-url>
cd RefAIne
# 4. Create .env file
cat > .env << EOF
LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=ollama
OPENAI_MODEL=llama3.1
EOF
# 5. Install and run
uv pip install -r pyproject.toml
uvicorn main:app --reload
# 6. Test it!
curl -X POST http://localhost:8000/refine \
-H "Content-Type: application/json" \
-d '{"prompt": "create a REST API"}'- Python 3.12+
- uv package manager
- LLM provider of your choice:
- Ollama (free, local) - recommended for testing
- Groq (free tier, cloud) - fast inference
- Anthropic Claude (paid, cloud) - high quality
- OpenAI (paid, cloud) - widely compatible
- Or any OpenAI-compatible API
- Clone and setup
git clone <repo-url>
cd RefAIne

- Configure environment
Option A: Using Ollama (Free, Local):
# Install Ollama from https://ollama.ai
ollama pull llama3.1
# Create .env file
cp .env.example .env
# Edit .env and set:
# LLM_PROVIDER=openai
# OPENAI_BASE_URL=http://localhost:11434/v1
# OPENAI_API_KEY=ollama
# OPENAI_MODEL=llama3.1

Option B: Using Anthropic Claude:
cp .env.example .env
# Edit .env and set:
# LLM_PROVIDER=anthropic
# ANTHROPIC_API_KEY=your_api_key_here

Option C: Using Groq (Free Tier):
cp .env.example .env
# Edit .env and set:
# LLM_PROVIDER=openai
# OPENAI_BASE_URL=https://api.groq.com/openai/v1
# OPENAI_API_KEY=your_groq_api_key
# OPENAI_MODEL=llama-3.3-70b-versatile

- Install dependencies
uv pip install -r pyproject.toml

- Run the service
uvicorn main:app --reload

The API will be available at http://localhost:8000
- Configure your .env file with your chosen provider (see Configuration section below)
- Build and run with docker-compose
docker-compose up -d

- Or build manually
docker build -t refaine .
# For Anthropic Claude
docker run -p 8000:8000 -e LLM_PROVIDER=anthropic -e ANTHROPIC_API_KEY=your_key refaine
# For Groq
docker run -p 8000:8000 -e LLM_PROVIDER=openai -e OPENAI_BASE_URL=https://api.groq.com/openai/v1 -e OPENAI_API_KEY=your_key -e OPENAI_MODEL=llama-3.3-70b-versatile refaine

curl http://localhost:8000/

Response:
{
"service": "refAIne",
"status": "healthy",
"version": "1.0.0"
}

Endpoint: POST /refine
Request:
curl -X POST http://localhost:8000/refine \
-H "Content-Type: application/json" \
-d '{"prompt": "make a function to sort a list"}'Response (example from Ollama with llama3.1):
{
"original": "make a function to sort a list",
"refined": "Create a Python function that sorts a list with the following requirements:\n\n1. Function signature: Accept a list of comparable elements as input\n2. Return a new sorted list (do not modify the original)\n3. Use Python's built-in sorting (efficient O(n log n) Timsort)\n4. Add type hints for better code clarity\n5. Include error handling for None or non-list inputs\n6. Add docstring with examples\n7. Consider edge cases: empty list, single element, already sorted, reverse sorted\n8. Make it generic to work with any comparable types (int, str, float, etc.)\n\nProvide clean, PEP 8 compliant code with appropriate documentation.",
"model": "llama3.1"
}

The refined output quality and style will vary depending on your chosen LLM provider and model.
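If you prefer to call the service from Python rather than curl, a minimal client could look like the sketch below. The `refine_prompt` helper, the base URL, and the 120-second timeout are illustrative choices, not part of RefAIne; the field names follow the example response above.

```python
import requests

def refine_prompt(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Illustrative helper: POST the prompt to /refine and return the refined text."""
    response = requests.post(
        f"{base_url}/refine",
        json={"prompt": prompt},
        timeout=120,  # local models can be slow; tune for your provider
    )
    response.raise_for_status()
    return response.json()["refined"]

if __name__ == "__main__":
    print(refine_prompt("make a function to sort a list"))
```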
Basic prompt:
curl -X POST http://localhost:8000/refine \
-H "Content-Type: application/json" \
-d '{"prompt": "create a REST API"}'More specific prompt:
curl -X POST http://localhost:8000/refine \
-H "Content-Type: application/json" \
-d '{"prompt": "build user authentication"}'Algorithm request:
curl -X POST http://localhost:8000/refine \
-H "Content-Type: application/json" \
-d '{"prompt": "optimize database queries"}'RefAIne supports multiple LLM providers. Choose your provider using the LLM_PROVIDER environment variable.
LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=ollama
OPENAI_MODEL=llama3.1

Prerequisites:
- Install Ollama: https://ollama.ai
- Pull a model: ollama pull llama3.1
- Start the server: ollama serve
Supported Ollama Models:
- llama3.1, llama3.2, llama3.3
- qwen2.5, mistral, codellama
- Any model available in the Ollama library
LLM_PROVIDER=openai
OPENAI_BASE_URL=https://api.groq.com/openai/v1
OPENAI_API_KEY=your-groq-api-key
OPENAI_MODEL=llama-3.3-70b-versatile

Get API Key: https://console.groq.com
Supported Groq Models:
- llama-3.3-70b-versatile
- llama-3.1-70b-versatile
- mixtral-8x7b-32768
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here
CLAUDE_MODEL=claude-sonnet-4-20250514

Supported Claude Models:
- claude-sonnet-4-20250514 (default)
- claude-opus-4-20250514
- claude-3-5-sonnet-20241022
LLM_PROVIDER=openai
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=sk-your-openai-key
OPENAI_MODEL=gpt-4-turbo-preview

Supported OpenAI Models:
- gpt-4-turbo-preview
- gpt-4o
- gpt-3.5-turbo
LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_API_KEY=lm-studio
OPENAI_MODEL=your-model-name

Prerequisites:
- Download LM Studio: https://lmstudio.ai
- Load a model in LM Studio
- Start the server from the LM Studio UI
RefAIne works with any LLM service that implements the OpenAI API format:
LLM_PROVIDER=openai
OPENAI_BASE_URL=https://your-api-endpoint/v1
OPENAI_API_KEY=your-api-key
OPENAI_MODEL=your-model-name

Compatible services include: vLLM, Text Generation Inference, FastChat, LocalAI, and more.
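All of these OpenAI-compatible backends can be reached from Python with the standard openai SDK simply by overriding the base URL. The snippet below is a sketch of that pattern, not RefAIne's internal code; the environment variable names mirror the configuration table below, and the fallback values assume a local Ollama server.

```python
import os
from openai import OpenAI  # openai>=1.0

# Point the standard client at any OpenAI-compatible endpoint
# (Ollama, Groq, LM Studio, vLLM, ...) by changing the base URL.
client = OpenAI(
    base_url=os.getenv("OPENAI_BASE_URL", "http://localhost:11434/v1"),
    api_key=os.getenv("OPENAI_API_KEY", "ollama"),  # local servers accept a dummy key
)

completion = client.chat.completions.create(
    model=os.getenv("OPENAI_MODEL", "llama3.1"),
    messages=[{"role": "user", "content": "create a REST API"}],
)
print(completion.choices[0].message.content)
```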
| Variable | Description | Default | Required |
|---|---|---|---|
| `LLM_PROVIDER` | Provider type: `anthropic` or `openai` | `anthropic` | No |
| **Anthropic Settings** | | | |
| `ANTHROPIC_API_KEY` | Anthropic API key | - | Required if using Anthropic |
| `CLAUDE_MODEL` | Claude model name | `claude-sonnet-4-20250514` | No |
| **OpenAI-Compatible Settings** | | | |
| `OPENAI_API_KEY` | API key (for OpenAI, Groq, etc.) | - | Required if `LLM_PROVIDER=openai` |
| `OPENAI_BASE_URL` | API endpoint URL | `https://api.openai.com/v1` | No |
| `OPENAI_MODEL` | Model name | `gpt-4-turbo-preview` | No |
Note: For Ollama, use LLM_PROVIDER=openai with OPENAI_BASE_URL=http://localhost:11434/v1
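To make the defaults in the table concrete, here is one way a service could read these variables and pick a provider client. This is an illustrative sketch, not the actual main.py, and it assumes the official anthropic and openai Python packages are installed.

```python
import os
from anthropic import Anthropic
from openai import OpenAI

def build_client():
    """Illustrative provider switch mirroring the table above (not RefAIne's own code)."""
    provider = os.getenv("LLM_PROVIDER", "anthropic")
    if provider == "anthropic":
        model = os.getenv("CLAUDE_MODEL", "claude-sonnet-4-20250514")
        return Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]), model
    # Any other value falls through to the OpenAI-compatible path.
    model = os.getenv("OPENAI_MODEL", "gpt-4-turbo-preview")
    client = OpenAI(
        base_url=os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
        api_key=os.environ["OPENAI_API_KEY"],
    )
    return client, model
```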
Once running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
RefAIne/
├── main.py # FastAPI application
├── pyproject.toml # Dependencies
├── Dockerfile # Container image
├── docker-compose.yml # Orchestration
├── .env.example # Environment template
└── README.md # Documentation
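main.py itself is not reproduced in this README. As a rough sketch of the shape this layout implies (the route paths and response fields follow the API section above; everything else, including the call_llm stub, is a placeholder, not the project's actual code):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="refAIne")

class RefineRequest(BaseModel):
    prompt: str

class RefineResponse(BaseModel):
    original: str
    refined: str
    model: str

def call_llm(prompt: str) -> str:
    """Placeholder for the configured provider call."""
    return f"Refined: {prompt}"

@app.get("/")
def health() -> dict:
    # Matches the health response documented above.
    return {"service": "refAIne", "status": "healthy", "version": "1.0.0"}

@app.post("/refine", response_model=RefineResponse)
def refine(request: RefineRequest) -> RefineResponse:
    refined = call_llm(request.prompt)
    return RefineResponse(original=request.prompt, refined=refined, model="placeholder")
```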
# Install with dev dependencies
uv pip install -r pyproject.toml
# Run the service
uvicorn main:app --reload
# Test in another terminal
curl -X POST http://localhost:8000/refine \
-H "Content-Type: application/json" \
-d '{"prompt": "test prompt"}'- Provider selection: Choose based on your needs (cost, speed, quality, privacy)
- Rate limiting: Set appropriate limits for your chosen provider
- Authentication: Add API keys or authentication if exposing publicly
- Monitoring: Track API usage and costs (for paid providers)
- Caching: Implement caching for common prompts to reduce latency and costs (see the sketch after this list)
- Timeouts: Configure request timeouts based on provider speed
- Scaling: For local providers (Ollama), ensure adequate hardware resources
- Security: Keep API keys secure, use environment variables, never commit to git
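As one concrete take on the caching point above, a small in-process cache keyed on the incoming prompt avoids repeated LLM calls for identical input. This sketch assumes a single worker process and a synchronous helper named call_provider (both hypothetical); multiple workers or replicas would need an external cache such as Redis.

```python
from functools import lru_cache

def call_provider(prompt: str) -> str:
    """Stand-in for the configured LLM call (name is illustrative)."""
    return f"Refined: {prompt}"

@lru_cache(maxsize=512)  # identical prompts are answered from memory
def refine_cached(prompt: str) -> str:
    return call_provider(prompt)
```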
MIT