# Install the solo-server package using pip
pip install solo-server
# Run the solo server setup in simple mode
solo setup
- Seamless Setup: Manage your on-device AI with a simple CLI and HTTP servers
- Open Model Registry: Pull models from registries like Ollama & Hugging Face
- Cross-Platform Compatibility: Deploy AI models effortlessly on your hardware
- Configurable Framework: Auto-detects hardware (CPU, GPU, RAM) and sets configs
- 🐳 Docker: Required for containerization
Install 'uv' using these docs: https://docs.astral.sh/uv/getting-started/installation/
# Install uv
# On Windows (PowerShell)
iwr https://astral.sh/uv/install.ps1 -useb | iex
# If you run into permission issues, see: https://github.com/astral-sh/uv/issues/3116
powershell -ExecutionPolicy Bypass -c "pip install uv"
# On Unix/MacOS
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create virtual environment
uv venv
# Activate the virtual environment
source .venv/bin/activate # On Unix/MacOS
# OR
.venv\Scripts\activate # On Windows
uv pip install solo-server
This creates an isolated environment using uv for performance and stability.
Run the interactive setup to configure Solo Server:
solo setup
✔️ Detects CPU, GPU, and RAM for hardware-optimized execution
✔️ Auto-configures solo.conf with optimal settings
✔️ Recommends a compute backend OCI image (CUDA, HIP, SYCL, Vulkan, CPU, Metal)
Example Output:
╭────────────────── System Information ──────────────────╮
│ Operating System: Windows                               │
│ CPU: AMD64 Family 23 Model 96 Stepping 1, AuthenticAMD  │
│ CPU Cores: 8                                            │
│ Memory: 15.42GB                                         │
│ GPU: NVIDIA                                             │
│ GPU Model: NVIDIA GeForce GTX 1660 Ti                   │
│ GPU Memory: 6144.0MB                                    │
│ Compute Backend: CUDA                                   │
╰─────────────────────────────────────────────────────────╯
🔧 Starting Solo Server Setup...
📋 Available Server Options:
• Ollama
• vLLM
• Llama.cpp
✨ Ollama is recommended for your system
Choose server [ollama]:
solo serve -s ollama -m llama3.2
Command Options:
╭─ Options ───────────────────────────────────────────────────────────────────────────╮
│ --server  -s  TEXT     Server type (ollama, vllm, llama.cpp)  [default: ollama]     │
│ --model   -m  TEXT     Model name or path  [default: None]                          │
│ --port    -p  INTEGER  Port to run the server on  [default: None]                   │
│ --help                 Show this message and exit.                                  │
╰─────────────────────────────────────────────────────────────────────────────────────╯
You can now interact with the model through the API endpoints exposed by Solo Server. Send a POST request to http://localhost:11434/api/generate or http://localhost:11434/api/chat with a JSON payload containing the model name and the prompt or messages you want to send.
curl http://localhost:11434/api/generate -d '{
"model": "llama3.2",
"prompt":"Why is the sky blue?"
}'
curl http://localhost:11434/api/chat -d '{
"model": "llama3.2",
"messages": [
{ "role": "user", "content": "why is the sky blue?" }
]
}'
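These endpoints follow the Ollama HTTP API, so any HTTP client can call them. Below is a minimal Python sketch (not part of Solo Server itself) that assumes the ollama backend is serving llama3.2 on the default port 11434; setting "stream": false returns a single JSON object instead of a streamed response.

```python
import requests

# Chat with the locally served model through the Ollama-compatible API.
# Assumes Solo Server is running the ollama backend on the default port 11434.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
        "stream": False,  # return one JSON object instead of a line-by-line stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```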
solo status
Example Output:
🔹 Running Models:
-------------------------------------------
| Name | Model | Backend | Port |
|----------|--------|---------|------|
| llama3 | Llama3 | CUDA | 8080 |
| gptj | GPT-J | CPU | 8081 |
-------------------------------------------
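If the ollama backend is the one serving, you can also ask it directly which models it has available. A small sketch, assuming the default port 11434:

```python
import requests

# List the models known to the Ollama backend behind Solo Server.
# /api/tags is part of Ollama's HTTP API; adjust the port if you changed it.
resp = requests.get("http://localhost:11434/api/tags", timeout=10)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])
```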
solo stop
Example Output:
🛑 Stopping Solo Server...
✅ Solo server stopped successfully.
After setup, all settings are stored in:
~/.solo_server/solo.json
Example:
# Solo Server Configuration
{
"hugging_face": {
"token": ""
},
"system_info": {
"os": "Windows",
"cpu_model": "AMD64 Family 23 Model 96 Stepping 1, AuthenticAMD",
"cpu_cores": 8,
"memory_gb": 15.42,
"gpu_vendor": "NVIDIA",
"gpu_model": "NVIDIA GeForce GTX 1660 Ti",
"gpu_memory": 6144.0,
"compute_backend": "CUDA"
},
"starfish": {
"api_key": ""
},
"hardware": {
"use_gpu": true
}
}
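Since the file is plain JSON, other tools can read the detected settings directly. A minimal sketch, assuming the field names shown in the example above:

```python
import json
from pathlib import Path

# Read Solo Server's saved configuration and print the detected hardware.
config_path = Path.home() / ".solo_server" / "solo.json"
config = json.loads(config_path.read_text())

system = config["system_info"]
print(f"Compute backend: {system['compute_backend']}")
print(f"GPU model: {system.get('gpu_model', 'none')}")
print(f"Use GPU: {config['hardware']['use_gpu']}")
```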
Refer to example_apps for sample applications.
# Clone the repository
git clone https://github.com/GetSoloTech/solo-server.git
# Navigate to the directory
cd solo-server
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # Unix/MacOS
# OR
.venv\Scripts\activate # Windows
# Install in editable mode
pip install -e .
This project wouldn't be possible without the help of other projects like:
- uv
- llama.cpp
- ramalama
- ollama
- whisper.cpp
- vllm
- podman
- huggingface
- llamafile
- cog
If you like using Solo, consider leaving us a ⭐ on GitHub.