unison-inference

External inference service for Unison, providing LLM integration with multiple providers.

Features

  • Multi-provider support: OpenAI, Ollama (local), Azure OpenAI
  • Intent-driven: Handles inference.request and inference.response intents
  • Provider abstraction: Easy switching between providers via configuration
  • Cost-aware: Designed to work with the Policy service for cost/risk checks
  • Observability: Structured JSON logging and Prometheus metrics

Supported Providers

OpenAI

  • Environment: OPENAI_API_KEY, OPENAI_BASE_URL (optional)
  • Models: gpt-4, gpt-3.5-turbo, etc.

Ollama (Local)

  • Environment: OLLAMA_BASE_URL (default: http://ollama:11434)
  • Models: llama3.2, mistral, codellama, etc.
  • No API keys required

Azure OpenAI

  • Environment: AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_VERSION
  • Models: Your Azure deployment names
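
As a quick reference, the per-provider environment can be set with ordinary exports. The variable names and defaults below come from the Configuration table; the key and endpoint values are placeholders:

# OpenAI
export OPENAI_API_KEY=sk-...                                        # placeholder key
export OPENAI_BASE_URL=https://api.openai.com/v1                    # optional; this is the default

# Ollama (local, no key required)
export OLLAMA_BASE_URL=http://ollama:11434                          # default; point at your Ollama host

# Azure OpenAI
export AZURE_OPENAI_ENDPOINT=https://my-resource.openai.azure.com   # placeholder endpoint
export AZURE_OPENAI_API_KEY=...                                     # placeholder key
export AZURE_OPENAI_API_VERSION=2024-02-15-preview                  # default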

Configuration

Variable                    Default                     Description
UNISON_INFERENCE_PROVIDER   ollama                      Default provider (openai/ollama/azure)
UNISON_INFERENCE_MODEL      llama3.2                    Default model name
OPENAI_API_KEY              -                           OpenAI API key
OPENAI_BASE_URL             https://api.openai.com/v1   OpenAI base URL
OLLAMA_BASE_URL             http://ollama:11434         Ollama API URL
AZURE_OPENAI_ENDPOINT       -                           Azure OpenAI endpoint
AZURE_OPENAI_API_KEY        -                           Azure OpenAI API key
AZURE_OPENAI_API_VERSION    2024-02-15-preview          Azure API version
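
To switch the default provider and model at runtime, these variables can be overridden when starting the service. A minimal sketch, assuming the server reads them at startup as the table implies:

# Example: use OpenAI's gpt-4 instead of the local Ollama default
UNISON_INFERENCE_PROVIDER=openai \
UNISON_INFERENCE_MODEL=gpt-4 \
OPENAI_API_KEY=sk-... \
python src/server.py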

API Endpoints

POST /inference/request

Handles an inference request expressed as an intent.

Request:

{
  "intent": "summarize.doc",
  "prompt": "Summarize this document...",
  "provider": "ollama",
  "model": "llama3.2",
  "max_tokens": 1000,
  "temperature": 0.7
}

Response:

{
  "ok": true,
  "intent": "summarize.doc",
  "provider": "ollama",
  "model": "llama3.2",
  "result": "Document summary...",
  "event_id": "uuid",
  "timestamp": 1698673200
}
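
The endpoint can be exercised with curl; this sketch assumes the service listens on port 8087, as in the Docker example under Development:

curl -s -X POST http://localhost:8087/inference/request \
  -H "Content-Type: application/json" \
  -d '{
        "intent": "summarize.doc",
        "prompt": "Summarize this document...",
        "provider": "ollama",
        "model": "llama3.2",
        "max_tokens": 1000,
        "temperature": 0.7
      }'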

GET /health

Service health check.

GET /ready

Readiness check including provider availability.

GET /metrics

Prometheus metrics.
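
These operational endpoints can be probed directly, again assuming port 8087:

curl -s http://localhost:8087/health     # liveness
curl -s http://localhost:8087/ready      # readiness, including provider availability
curl -s http://localhost:8087/metrics    # Prometheus exposition format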

Development

# Install dependencies
pip install -r requirements.txt

# Run locally
python src/server.py

# Run with Docker
docker build -t unison-inference .
docker run -p 8087:8087 unison-inference
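
When running the container against a local Ollama instance, the provider URL must be reachable from inside the container. One illustrative invocation (the flag values here are assumptions, not documented defaults):

docker run -p 8087:8087 \
  -e UNISON_INFERENCE_PROVIDER=ollama \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  unison-inference
# host.docker.internal resolves on Docker Desktop; on Linux, add
# --add-host=host.docker.internal:host-gateway or use the host's IP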

Integration with Unison

The inference service integrates with:

  • Orchestrator: Registers inference intents and routes requests
  • Policy: Cost/risk evaluation for external API calls
  • Context: Stores inference history and results
  • Storage: Persists prompts and responses

Example Intents

  • summarize.doc: Summarize documents or text
  • analyze.code: Analyze or generate code
  • translate.text: Translate between languages
  • generate.idea: Brainstorm ideas or suggestions
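
Each intent uses the request schema documented above. For instance, a translate.text call might look like this (the prompt is illustrative; omitting provider and model should fall back to the configured defaults, per the Configuration table):

curl -s -X POST http://localhost:8087/inference/request \
  -H "Content-Type: application/json" \
  -d '{"intent": "translate.text", "prompt": "Translate to French: Good morning.", "max_tokens": 100}'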
