Troubleshooting

Common issues and solutions for TITAN. If your problem isn't covered here, open an issue.

1. Gateway Startup Issues

Port Already in Use

Error:

Port 48420 is already in use. Is TITAN already running?

Cause: Another process (or a previous TITAN instance) is already listening on the gateway port.

Fix:

# Find what's using the port
lsof -i :48420

# Kill the process
kill <PID>

# Or start TITAN on a different port
titan gateway --port 48421
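If lsof is not available, the same check can be scripted. The following is a minimal sketch in Python; port_in_use and first_free_port are hypothetical helper names, not TITAN commands, and the check assumes the gateway listens on the loopback interface.

```python
import socket
from typing import Optional


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


def first_free_port(start: int = 48420, attempts: int = 10) -> Optional[int]:
    """Return the first port in [start, start + attempts) with no listener."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    return None
```

The value returned by first_free_port() can then be passed to titan gateway --port.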

Auth Configuration Problems

Symptom: All API requests return 401 Unauthorized.

Cause: gateway.auth.mode is set to "token" but no token is configured. TITAN logs:

Auth mode is "token" but no token configured — denying request.
Set gateway.auth.token or switch to mode "password".

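Fix: Either set gateway.auth.token or switch to another auth mode. The example below disables auth, which is suitable only for local development or trusted networks: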

// titan.json
{
  "gateway": {
    "auth": {
      "mode": "none"
    }
  }
}

Auth Mode	When to Use
`"none"`	Local development, trusted networks
`"token"`	API access with a static bearer token
`"password"`	Browser-based access with login page
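The misconfiguration described above can be caught before the gateway starts. A minimal sketch in Python, assuming the config has already been loaded into a dict with the gateway.auth keys named on this page; auth_problems is a hypothetical helper, and treating a missing mode as invalid is an assumption, since the default mode is not stated here.

```python
from typing import List

KNOWN_MODES = ("none", "token", "password")


def auth_problems(config: dict) -> List[str]:
    """Return human-readable problems found in the gateway.auth section."""
    auth = config.get("gateway", {}).get("auth", {})
    mode = auth.get("mode")
    problems = []
    if mode not in KNOWN_MODES:
        # Assumption: a missing or unrecognized mode is reported as a problem.
        problems.append(f"unknown auth mode: {mode!r}")
    elif mode == "token" and not auth.get("token"):
        problems.append('mode is "token" but gateway.auth.token is not set')
    elif mode == "password" and not auth.get("password"):
        problems.append('mode is "password" but gateway.auth.password is not set')
    return problems
```

An empty list means the section is internally consistent; it does not prove that the token or password is the one clients are sending.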

Request Limits

Limit	Value	Error
Max body size	1 MB	`413 Payload too large (max 1MB)`
Max concurrent requests	5	`503 Server busy — too many concurrent requests`
Rate limit exceeded	varies	`429 Too many requests` (check `Retry-After` header)
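A client can react to these limits by waiting before it retries. The sketch below is one possible policy in Python, not TITAN's own behavior: retry_delay is a hypothetical helper, and the exponential backoff capped at 30 seconds is an assumption used when no Retry-After value is available.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int) -> Optional[float]:
    """Seconds to wait before retrying, or None if a retry cannot help."""
    backoff = min(2.0 ** attempt, 30.0)  # assumed client-side policy
    if status == 429:
        lowered = {key.lower(): value for key, value in headers.items()}
        value = lowered.get("retry-after", "").strip()
        # Only the delay-in-seconds form is handled; an HTTP-date falls back to backoff.
        return float(value) if value.isdigit() else backoff
    if status == 503:
        return backoff
    # 413 and other errors will fail again unless the request itself changes.
    return None
```

For example, a 429 response carrying Retry-After: 7 yields a 7-second wait, while a 413 yields None because the payload has to shrink first.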

2. Ollama Connection Problems

ECONNREFUSED

Error:

fetch failed: connect ECONNREFUSED 127.0.0.1:11434

Cause: Ollama is not running or is on a different address.

Fix:

# Start Ollama
ollama serve

# Verify it's running
curl http://localhost:11434/api/tags

If Ollama is on a different host, configure the URL:

// titan.json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://192.168.1.11:11434"
    }
  }
}

The URL resolution order is:

1. config.providers.ollama.baseUrl in titan.json
2. OLLAMA_BASE_URL environment variable
3. http://localhost:11434 (default)
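The resolution order above can be expressed as a short function, which is useful for checking which URL a given setup would end up with. This is a sketch in Python under the assumption that the config is already loaded as a dict; resolve_ollama_url is a hypothetical name, not TITAN's implementation.

```python
import os
from typing import Mapping, Optional

DEFAULT_OLLAMA_URL = "http://localhost:11434"


def resolve_ollama_url(config: Mapping, env: Optional[Mapping[str, str]] = None) -> str:
    """Apply the documented order: config file, then environment, then default."""
    env = os.environ if env is None else env
    from_config = config.get("providers", {}).get("ollama", {}).get("baseUrl")
    return from_config or env.get("OLLAMA_BASE_URL") or DEFAULT_OLLAMA_URL
```

Passing env explicitly makes it possible to test a configuration without touching the real environment.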

Model Not Found

Error (from model switch):

Model 'llama3.2:3b' not found in Ollama. Pull it first: ollama pull llama3.2:3b

Fix:

ollama pull llama3.2:3b

Ollama Unreachable During Model Switch

Error:

Cannot verify model 'qwen3.5:9b' — Ollama is unreachable at http://localhost:11434. Check Ollama is running: ollama serve

The health probe times out after 3 seconds. If Ollama is slow to respond (e.g., loading a large model), wait and retry.
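To reproduce the probe by hand and retry while Ollama warms up, something like the following Python sketch can help. It queries the same /api/tags endpoint used earlier on this page with a 3-second timeout; ollama_reachable and wait_for_ollama are hypothetical helpers, and the retry count and pause are assumptions.

```python
import http.client
import time
import urllib.request


def ollama_reachable(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> bool:
    """Return True if GET /api/tags answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, refused connections, and timeouts are all OSError subclasses.
        return False


def wait_for_ollama(base_url: str, retries: int = 5, pause: float = 2.0,
                    probe=ollama_reachable, sleep=time.sleep) -> bool:
    """Probe up to `retries` times, pausing between failed attempts."""
    for attempt in range(retries):
        if probe(base_url):
            return True
        if attempt < retries - 1:
            sleep(pause)
    return False
```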

Ollama Timeouts

TITAN does not set an explicit request timeout on Ollama chat calls — it relies on OS/network defaults. If inference is slow (especially on CPU), the request may appear to hang.

Tip: Check what Ollama is doing:

# See loaded models and memory usage
ollama ps

# Watch Ollama logs
journalctl -u ollama -f   # systemd
# or
ollama serve               # foreground with logs

Context Window Defaults

Mode	`num_ctx`	`num_predict`
Local models	16,384	16,384
Cloud models (`:cloud` suffix)	131,072	32,768

Models stay loaded in Ollama VRAM for 30 minutes after last use (keep_alive: '30m').
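When calling Ollama directly to compare behavior, the same defaults can be applied by hand. The sketch below builds a request body for Ollama's /api/chat endpoint in Python; chat_payload is a hypothetical helper, and treating both the :cloud and -cloud suffixes as cloud models is an assumption based on the suffixes mentioned on this page.

```python
from typing import Dict, List

LOCAL_DEFAULTS = {"num_ctx": 16_384, "num_predict": 16_384}
CLOUD_DEFAULTS = {"num_ctx": 131_072, "num_predict": 32_768}


def is_cloud_model(model: str) -> bool:
    """Assumption: a cloud model is identified purely by its name suffix."""
    return model.endswith(":cloud") or model.endswith("-cloud")


def chat_payload(model: str, messages: List[Dict[str, str]]) -> dict:
    """Build a non-streaming /api/chat body using the defaults from the table."""
    defaults = CLOUD_DEFAULTS if is_cloud_model(model) else LOCAL_DEFAULTS
    return {
        "model": model,
        "messages": messages,
        "stream": False,
        "keep_alive": "30m",       # keep the model loaded for 30 minutes
        "options": dict(defaults),  # copy so callers can adjust it safely
    }
```

Lowering num_ctx in the returned options is a quick way to test whether a slow or failing load is caused by context-window memory.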

3. GPU / VRAM Troubleshooting

GPU Not Detected

TITAN probes GPUs in this order: Apple Silicon → AMD ROCm → NVIDIA.

Check that the tools TITAN probes with can see your GPU:

# NVIDIA
nvidia-smi

# AMD
rocm-smi

# Apple Silicon — always detected on macOS
system_profiler SPDisplaysDataType

If nvidia-smi or rocm-smi fails or times out (5-second limit), TITAN falls back to CPU mode:

No GPU detected — stall timeout increased to 120s for CPU inference
CPU-only mode: maxConcurrentTasks auto-tuned to 2

Important: Integrated GPUs (AMD APUs, Intel UHD) are generally NOT used by Ollama. ollama ps will show 100% CPU even if /dev/kfd exists.
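The probe order and the 5-second limit can be approximated outside TITAN to see which branch a machine would take. This is a rough sketch in Python, not TITAN's detection code: detect_gpu and tool_responds are hypothetical names, and running each tool with no arguments and checking its exit code is an assumption.

```python
import platform
import shutil
import subprocess
from typing import Optional


def tool_responds(command: str, timeout: float = 5.0) -> bool:
    """True if `command` is on PATH and exits with code 0 within `timeout` seconds."""
    if shutil.which(command) is None:
        return False
    try:
        result = subprocess.run([command], capture_output=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def detect_gpu(run=tool_responds, system=platform.system,
               machine=platform.machine) -> Optional[str]:
    """Follow the documented order: Apple Silicon, then AMD ROCm, then NVIDIA."""
    if system() == "Darwin" and machine() == "arm64":
        return "apple"
    if run("rocm-smi"):
        return "amd"
    if run("nvidia-smi"):
        return "nvidia"
    return None  # corresponds to the CPU-only fallback
```

A result of None here suggests TITAN would also fall back to CPU mode, although its real checks may differ.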

Low VRAM

TITAN emits a warning when free VRAM drops below 500 MB. The orchestrator reserves 1024 MB by default to prevent OOM.

Symptoms:

Slow inference (model partially offloaded to CPU)
Model swap failures

Fix:

# Check current VRAM usage
nvidia-smi  # or rocm-smi

# See what Ollama has loaded
ollama ps

# Unload unused models
ollama stop <model-name>

VRAM Acquire Failures

These errors come from the VRAM orchestrator API (POST /api/vram/acquire):

Error	Meaning
`GPU state unavailable (no supported GPU detected — requires NVIDIA, AMD ROCm, or Apple Silicon)`	No GPU found at all
`Not enough VRAM: need XMB, available YMB (auto-swap disabled)`	Insufficient VRAM and `vram.autoSwapModel` is `false`
`Not enough VRAM: need XMB, available YMB (no models to evict)`	Nothing to evict — reserved VRAM is consumed by non-TITAN processes
`Evicted <models> but still not enough VRAM: need XMB, have YMB`	Eviction freed some VRAM but still not enough
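The four outcomes in the table can be read as branches of a single decision. The Python sketch below is a guess at that logic for illustration only: plan_acquire is a hypothetical function, and both the "available = free minus reserve" arithmetic and the eviction order are assumptions rather than documented TITAN behavior. The first row (no GPU detected) is not modeled.

```python
from typing import List, Tuple


def plan_acquire(need_mb: int, free_mb: int, reserve_mb: int,
                 loaded: List[Tuple[str, int]], auto_swap: bool) -> Tuple[bool, str]:
    """Decide whether `need_mb` can be granted; `loaded` lists (model, size_mb)."""
    available = free_mb - reserve_mb  # assumption: the reserve is never handed out
    if available >= need_mb:
        return True, "ok"
    if not auto_swap:
        return False, (f"Not enough VRAM: need {need_mb}MB, "
                       f"available {available}MB (auto-swap disabled)")
    if not loaded:
        return False, (f"Not enough VRAM: need {need_mb}MB, "
                       f"available {available}MB (no models to evict)")
    evicted = []
    for name, size_mb in loaded:  # assumption: evict in list order
        evicted.append(name)
        available += size_mb
        if available >= need_mb:
            return True, "evict " + ", ".join(evicted)
    return False, (f"Evicted {', '.join(evicted)} but still not enough VRAM: "
                   f"need {need_mb}MB, have {available}MB")
```

Reading an error this way points to the relevant setting: the second branch is governed by vram.autoSwapModel and the first by vram.reserveMB.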

VRAM Configuration

// titan.json
{
  "vram": {
    "enabled": true,
    "pollIntervalMs": 10000,
    "reserveMB": 1024,
    "autoSwapModel": true,
    "fallbackModel": "qwen3:7b",
    "ollamaUrl": "http://localhost:11434"
  }
}

Set autoSwapModel: true to let TITAN automatically evict models when VRAM is needed.

4. Model Tool-Calling Issues

Models That Work Well

Model	Size	Notes
`qwen3.5:4b`	4B	Native tool calling, 256K context
`qwen3.5:9b`	9B	Recommended default
`qwen3.5:35b`	35B	Most reliable tool calling
`qwen3-coder:32b`	32B	Best for code tasks
`llama3.2:3b`	3B	Fast but hallucinates extra tool calls
`devstral-small-2`	~22B	Good for dev tasks

Models to Avoid

Model	Problem
DeepSeek-R1 (all sizes)	Malformed JSON schemas, ignores tool definitions
LLaMA 3.1	Poor tool calling reliability
Mistral/Mixtral (local)	Inconsistent across quantizations
Phi-3/Phi-4	No native tool calling in Ollama
Gemma 2	Narrates instead of calling tools
dolphin3	Returns `"does not support tools"` error
arcee-agent	No tool calling despite marketing claims

"Does Not Support Tools" Error

When Ollama rejects a request because the model does not support tools, TITAN logs:

Model <name> does not support native tool calling — running in chat-only mode

TITAN automatically retries the request without tools. If the model truly doesn't support tools, it will run in chat-only mode (no skills, no autonomous actions).

Tool Call Failure Self-Heal

If a model fails to generate tool calls for 3 consecutive rounds despite tools being available, the stall detector triggers tool_call_failure:

First attempt: Switches to a fallback model that supports tool calling
Second attempt: If still failing, returns an honest status to the user
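The two-step recovery above behaves like a small counter-driven state machine. The following Python sketch illustrates the idea; ToolCallFailureTracker and its return values are hypothetical, and resetting the counter after each trigger is an assumption.

```python
class ToolCallFailureTracker:
    """Counts consecutive rounds without tool calls and escalates in two steps."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.rounds_without_tools = 0
        self.heal_attempts = 0

    def record_round(self, made_tool_call: bool) -> str:
        if made_tool_call:
            self.rounds_without_tools = 0
            return "continue"
        self.rounds_without_tools += 1
        if self.rounds_without_tools < self.threshold:
            return "continue"
        # Threshold reached: start counting afresh for the next attempt.
        self.rounds_without_tools = 0
        self.heal_attempts += 1
        if self.heal_attempts == 1:
            return "switch_to_fallback"
        return "report_to_user"
```

A single successful tool call resets the streak, so only uninterrupted failures escalate.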

Hardware Recommendations

Hardware	Recommended Model	Speed
8–12 GB RAM, CPU-only	`llama3.2:3b`	~16 tok/s
8 GB VRAM (laptop)	`qwen3.5:4b`	~150 tok/s
16 GB RAM	`qwen3.5:9b`	~80–120 tok/s
24 GB VRAM (RTX 4090)	`qwen3-coder:32b`	~20–40 tok/s
32 GB VRAM (RTX 5090)	`qwen3.5:35b`	~20–40 tok/s

5. Provider Routing & Fallback

How Fallback Works

When a provider fails with a retryable error, TITAN tries the fallback chain (configured in agent.fallbackChain), then falls back to a provider-level failover scan.

Retryable errors (these trigger fallback):

HTTP 429, 500, 502, 503
rate limit / rate_limit
timeout / timed out / ETIMEDOUT
ECONNREFUSED / ECONNRESET
Messages containing overloaded
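To predict whether a given failure would trigger fallback, the list above can be turned into a classifier. This is a sketch in Python; is_retryable is a hypothetical name, and matching the message case-insensitively as a substring is an assumption about how the patterns are applied.

```python
from typing import Optional

RETRYABLE_STATUS = {429, 500, 502, 503}
RETRYABLE_MARKERS = (
    "rate limit", "rate_limit",
    "timeout", "timed out", "etimedout",
    "econnrefused", "econnreset",
    "overloaded",
)


def is_retryable(message: str, status: Optional[int] = None) -> bool:
    """True if the HTTP status or the error text matches the retryable list."""
    if status in RETRYABLE_STATUS:
        return True
    lowered = message.lower()
    return any(marker in lowered for marker in RETRYABLE_MARKERS)
```

For instance, the ECONNREFUSED message from the Ollama section is classified as retryable, while a 400 Invalid JSON response is not.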

Fallback Chain

// titan.json
{
  "agent": {
    "fallbackChain": ["ollama/qwen3.5:9b", "anthropic/claude-sonnet-4-20250514"],
    "fallbackMaxRetries": 3
  }
}

Max retries: fallbackMaxRetries (default: 3)
Fallback state expires after 5 minutes — TITAN retries the primary model after that
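The 5-minute expiry means a fallback is sticky for a while and then drops away on its own. A minimal Python sketch of that behavior follows; FallbackState is a hypothetical class, and the injectable clock exists only so the expiry can be tested without waiting.

```python
import time
from typing import Callable, Optional


class FallbackState:
    """Remembers an active fallback model until it expires."""

    def __init__(self, primary: str, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self.primary = primary
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.fallback: Optional[str] = None
        self.since = 0.0

    def activate(self, fallback_model: str) -> None:
        """Record that requests should go to `fallback_model` from now on."""
        self.fallback = fallback_model
        self.since = self.clock()

    def current_model(self) -> str:
        """Return the fallback while it is fresh, otherwise the primary."""
        if self.fallback is not None and self.clock() - self.since >= self.ttl_seconds:
            self.fallback = None  # expired: retry the primary model
        return self.fallback or self.primary
```

If requests keep landing on the fallback long after the primary has recovered, this window is the first thing to rule out.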

Provider-Level Failover

If the fallback chain is exhausted, TITAN scans providers in this order: anthropic → openai → google → ollama. It picks the first healthy provider with a model name matching the original prefix (e.g., claude-*).
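The scan described above can be sketched as a loop over providers in a fixed order. The Python below is illustrative only: pick_failover is a hypothetical function, the shape of the providers dict is invented for the example, and glob-style matching of the prefix pattern is an assumption.

```python
import fnmatch
from typing import Dict, Optional

SCAN_ORDER = ("anthropic", "openai", "google", "ollama")


def pick_failover(pattern: str, providers: Dict[str, dict]) -> Optional[str]:
    """Return 'provider/model' for the first healthy provider with a matching model.

    `providers` maps a provider name to {"healthy": bool, "models": [names]}.
    """
    for name in SCAN_ORDER:
        info = providers.get(name)
        if not info or not info.get("healthy"):
            continue
        for model in info.get("models", []):
            if fnmatch.fnmatchcase(model, pattern):
                return f"{name}/{model}"
    return None
```

With the pattern claude-*, an unhealthy anthropic provider is skipped and the scan moves on to the next provider that offers a matching model name, if any.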

Cloud Model Bypass

Ollama cloud models (with a -cloud or :cloud suffix) may be silently rerouted to OpenRouter when a mapping exists. Look for this line in the logs:

[CloudBypass] qwen3.5:9b-cloud → openrouter/<model> (parallel-capable)

Unknown Provider

Error:

Unknown provider: <name>. Available: anthropic, openai, ollama, google, ...

Check your model ID format: it must be provider/model-name (e.g., anthropic/claude-sonnet-4-20250514).
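A model ID can be validated before it is written to the config. The sketch below splits on the first slash in Python; parse_model_id is a hypothetical helper, and the provider tuple contains only the four names shown in the error message above, which ends with an ellipsis, so the real list is longer.

```python
from typing import Sequence, Tuple

# Only the providers named on this page; the real list is longer.
KNOWN_PROVIDERS = ("anthropic", "openai", "ollama", "google")


def parse_model_id(model_id: str,
                   known: Sequence[str] = KNOWN_PROVIDERS) -> Tuple[str, str]:
    """Split 'provider/model-name' and reject malformed or unknown providers."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model-name, got {model_id!r}")
    if provider not in known:
        raise ValueError(f"Unknown provider: {provider}. Available: {', '.join(known)}")
    return provider, model
```

Because only the first slash is significant, Ollama tags such as qwen3.5:9b pass through unchanged as the model part.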

6. Agent Stall Detection

TITAN monitors agent sessions for stalls and automatically intervenes.

Stall Types

Type	Trigger	Default Threshold
`silence`	No activity	30s (120s on CPU-only / autonomous mode)
`tool_loop`	Same tool + same args repeated	3 times in a row
`empty_response`	LLM returns < 3 characters	Immediate
`max_rounds`	Tool round budget exhausted	Depends on mode (up to 25 in autonomous)
`tool_call_failure`	Model ignores tools for N rounds	3 consecutive rounds
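Two of these triggers are simple enough to check against a session transcript by hand. The Python sketch below shows one way to do that; is_tool_loop and is_empty_response are hypothetical helpers, and comparing arguments by their sorted JSON form and stripping whitespace before counting characters are both assumptions.

```python
import json
from typing import Any, Dict, List, Tuple


def is_tool_loop(calls: List[Tuple[str, Dict[str, Any]]], limit: int = 3) -> bool:
    """True if the last `limit` calls used the same tool with the same arguments.

    `calls` is ordered oldest first; each entry is (tool_name, arguments).
    """
    if len(calls) < limit:
        return False
    # sort_keys makes the comparison independent of argument order.
    keys = {(name, json.dumps(args, sort_keys=True)) for name, args in calls[-limit:]}
    return len(keys) == 1


def is_empty_response(text: str) -> bool:
    """True if the reply has fewer than 3 characters once whitespace is stripped."""
    return len(text.strip()) < 3
```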

What Happens on Stall

TITAN sends a nudge message to the agent (up to 2 attempts, or 5 in autonomous mode)
If nudges don't help, the agent gives up with: "I've been unable to make progress on this task."

Adjusting Thresholds

Stall thresholds auto-adjust based on hardware:

GPU detected: 30-second silence timeout
No GPU (CPU-only): 120-second silence timeout
Autonomous mode: 120-second silence timeout, 5 nudge attempts
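The adjustments above reduce to two inputs: whether a GPU was detected and whether the agent is in autonomous mode. A compact Python sketch, where stall_limits and its dictionary keys are hypothetical names chosen for illustration:

```python
def stall_limits(gpu_detected: bool, autonomous: bool) -> dict:
    """Return the silence timeout and nudge budget for a session."""
    # CPU-only inference and autonomous mode both get the longer timeout.
    slow = autonomous or not gpu_detected
    return {
        "silence_timeout_s": 120 if slow else 30,
        "max_nudges": 5 if autonomous else 2,
    }
```

This makes it easy to see that a GPU-backed interactive session is the only combination that keeps the 30-second timeout.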

7. Common HTTP Error Codes

Code	Error	Cause	Fix
400	`Invalid JSON`	Malformed request body	Check your JSON syntax
401	`Unauthorized`	Missing or invalid auth token	Add `Authorization: Bearer <token>` header
401	`Invalid password`	Wrong login password	Check `gateway.auth.password` in config
404	`Model '<name>' not found in Ollama`	Model not pulled	Run `ollama pull <name>`
413	`Payload too large (max 1MB)`	Request body > 1 MB	Reduce payload size
429	`Too many requests`	Rate limit exceeded	Wait for `Retry-After` seconds
503	`Server busy — too many concurrent requests`	> 5 concurrent requests	Wait and retry
503	`Cannot verify model — Ollama is unreachable`	Ollama down or unreachable	Start Ollama: `ollama serve`

WebSocket Errors

Close Code	Message	Cause
1008	`Unauthorized`	Invalid or missing auth on WS connection
1008	`Mesh auth failed`	Mesh peer HMAC authentication rejected

8. Quick Troubleshooting Checklist

Run through the diagnostic commands below when something isn't working.

Useful Diagnostic Commands

# TITAN health check
curl http://localhost:48420/api/health

# TITAN system stats
curl http://localhost:48420/api/stats

# VRAM status
curl http://localhost:48420/api/vram

# Currently loaded Ollama models
ollama ps

# GPU memory
nvidia-smi --query-gpu=memory.used,memory.free --format=csv

# Run TITAN's built-in doctor
titan doctor --json

Getting Help

GitHub Issues: https://github.com/Djtony707/TITAN/issues
Wiki: https://github.com/Djtony707/TITAN/wiki
Discord: Check the #titan-support channel
