Merged
19 changes: 19 additions & 0 deletions docker-compose.yml
@@ -151,6 +151,9 @@ services:
volumes:
- ./.env:/app/.env:ro
- ./llm_config.json:/app/llm_config.json:ro
- ./llm_config.premium.json:/app/llm_config.premium.json:ro
- ./llm_config.frontier.json:/app/llm_config.frontier.json:ro
- ./llm_config.custom.json:/app/llm_config.custom.json:ro
- ./run:/app/run
restart: unless-stopped
develop:
@@ -166,6 +169,15 @@ services:
- action: sync
path: ./llm_config.json
target: /app/llm_config.json
- action: sync
path: ./llm_config.premium.json
target: /app/llm_config.premium.json
- action: sync
path: ./llm_config.frontier.json
target: /app/llm_config.frontier.json
- action: sync
path: ./llm_config.custom.json
target: /app/llm_config.custom.json
- action: sync
path: ./.env
target: /app/.env
@@ -196,6 +208,13 @@ services:
PLANEXE_FRONTEND_MULTIUSER_ADMIN_PASSWORD: ${PLANEXE_FRONTEND_MULTIUSER_ADMIN_PASSWORD:-admin}
ports:
- "${PLANEXE_FRONTEND_MULTIUSER_PORT:-5001}:5000"
volumes:
- ./.env:/app/.env:ro
- ./llm_config.json:/app/llm_config.json:ro
- ./llm_config.premium.json:/app/llm_config.premium.json:ro
- ./llm_config.frontier.json:/app/llm_config.frontier.json:ro
- ./llm_config.custom.json:/app/llm_config.custom.json:ro
- ./run:/app/run
healthcheck:
test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5000/healthcheck').read()"]
interval: 10s
108 changes: 66 additions & 42 deletions docs/llm_config.md
@@ -1,16 +1,72 @@
---
title: LLM config (llm_config.json)
title: LLM config profiles
---

# LLM config (llm_config.json)
# LLM config profiles

This file defines which LLM providers and models PlanExe can use. Each top‑level key is a model id used in the UI and pipeline.
PlanExe supports **four model profiles**:

`llm_config.json` lives in the PlanExe repo root and is read at runtime. Environment variables are substituted from `.env`.
- `baseline`
- `premium`
- `frontier`
- `custom`

Each profile maps to a separate config file:

- `baseline` → `llm_config.json`
- `premium` → `llm_config.premium.json`
- `frontier` → `llm_config.frontier.json`
- `custom` → `llm_config.custom.json` (or `PLANEXE_LLM_CONFIG_CUSTOM_FILENAME`)

If the selected profile file is missing or invalid, PlanExe safely falls back to `llm_config.json`.
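The mapping and fallback described above can be sketched as follows (a minimal illustration; `resolve_profile_config` and `PROFILE_FILES` are hypothetical names, not PlanExe's actual API):

```python
from pathlib import Path

# Profile -> config filename mapping, as documented above.
PROFILE_FILES = {
    "baseline": "llm_config.json",
    "premium": "llm_config.premium.json",
    "frontier": "llm_config.frontier.json",
    "custom": "llm_config.custom.json",
}

def resolve_profile_config(profile: str, root: Path) -> Path:
    """Return the config path for a profile, falling back to llm_config.json."""
    candidate = root / PROFILE_FILES.get(profile, "llm_config.json")
    if candidate.is_file():
        return candidate
    # Missing or unknown profile file: fall back to the baseline config.
    return root / "llm_config.json"
```

(Validity checks on the file's JSON content are omitted here for brevity.)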

---

## How profile selection works

### Runtime env var

Set:

- `PLANEXE_MODEL_PROFILE=baseline|premium|frontier|custom`

This is passed end-to-end in worker execution paths (frontend/API/task parameters → worker pipeline).

### Request/task parameter

Task producers (web frontend, MCP) can include:

- `model_profile`

Invalid values are normalized to `baseline`.
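The normalization can be illustrated with a stand-alone sketch (the real implementation is `worker_plan_api.model_profile.normalize_model_profile`; whether it folds case and whitespace as shown here is an assumption):

```python
VALID_PROFILES = {"baseline", "premium", "frontier", "custom"}

def normalize_profile(value) -> str:
    """Map any input to a known profile name, defaulting to 'baseline'."""
    if isinstance(value, str) and value.strip().lower() in VALID_PROFILES:
        return value.strip().lower()
    # None, non-strings, and unknown names all normalize to the default.
    return "baseline"
```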

---

## Strict filename validation

Config filenames are strictly validated:

- must be a **filename only** (no `/` or `\` separators, and not an absolute path)
- must match: `llm_config*.json`

This prevents path traversal and unsafe file selection.

Legacy override `PLANEXE_LLM_CONFIG_NAME` is still supported for backward compatibility, but profile-based selection is preferred.
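The two rules can be checked in a few lines (a sketch of the described validation, not the exact code PlanExe uses):

```python
import fnmatch
import os

def is_safe_config_filename(name: str) -> bool:
    """Accept only a bare filename matching llm_config*.json."""
    # Reject anything with path components or an absolute path (traversal guard).
    if os.path.basename(name) != name or os.path.isabs(name):
        return False
    if "/" in name or "\\" in name:
        return False
    # Enforce the documented llm_config*.json pattern.
    return fnmatch.fnmatch(name, "llm_config*.json")
```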

---

## Provider-priority ordering per profile

Within each profile config file, priority is defined per model entry:

- lower `priority` value = tried first
- higher `priority` value = tried later (fallback)

`auto` mode uses this profile-specific priority ordering.
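The ordering rule can be sketched like this (illustrative only; it mirrors the sort used by the multi-user frontend, with entries lacking an integer `priority` sorted last):

```python
def models_in_priority_order(config: dict) -> list[str]:
    """Return model ids from a loaded profile config, lowest priority value first."""
    def key(item):
        model_id, data = item
        priority = data.get("priority") if isinstance(data, dict) else None
        # Entries without an integer priority sort after all prioritized ones.
        return (priority if isinstance(priority, int) else 10**6, model_id)
    return [model_id for model_id, _ in sorted(config.items(), key=key)]
```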

---

## File structure
## File format (same for all profile files)

```json
{
@@ -24,8 +80,6 @@ This file defines which LLM providers and models PlanExe can use. Each top‑lev
"api_key": "${OPENROUTER_API_KEY}",
"temperature": 0.1,
"timeout": 60.0,
"is_function_calling_model": false,
"is_chat_model": true,
"max_tokens": 8192,
"max_retries": 5
}
@@ -35,41 +89,11 @@ This file defines which LLM providers and models PlanExe can use. Each top‑lev

---

## Top-level fields
## Backward compatibility

- **comment**: Plain‑text description for humans. Optional.
- **priority**: Lower number = higher priority when `auto` is selected. Optional.
- **luigi_workers**: Number of Luigi workers used for this model. Use `1` for local models (Ollama/LM Studio).
- **class**: Provider class name (e.g., `OpenRouter`, `OpenAI`, `Ollama`, `LMStudio`, `OpenAILike`).
- **arguments**: Provider‑specific settings passed to the LLM client.

---

## Common arguments

These keys are common across most providers:

- **model** / **model_name**: Provider model identifier.
- **api_key**: API key reference (usually `${ENV_VAR}`).
- **base_url** / **api_base**: Override the provider base URL.
- **temperature**: Controls randomness. Lower is more deterministic.
- **timeout** / **request_timeout**: Max time per request in seconds.
- **max_tokens** / **max_completion_tokens**: Output token limit (provider specific).
- **max_retries**: Retry count on transient errors.
- **is_function_calling_model**: Whether the model supports structured/tool output.
- **is_chat_model**: Whether the model uses chat format.

---

## Choosing values

- Use **luigi_workers = 1** for local models (Ollama / LM Studio).
- Use **luigi_workers > 1** for cloud models if you want parallel tasks.
- Keep **timeout** higher for slower models.

---
When no profile is provided, PlanExe defaults to:

## Notes
- `baseline`
- `llm_config.json`

- If `llm_config.json` is missing, PlanExe logs a warning and proceeds with defaults.
- Changes to `llm_config.json` require a container restart (or rebuild if baked into the image).
So existing deployments continue to work without changes.
1 change: 1 addition & 0 deletions frontend_multi_user/Dockerfile
@@ -13,6 +13,7 @@ WORKDIR /app
COPY worker_plan/worker_plan_api /app/worker_plan_api
COPY database_api /app/database_api
COPY frontend_multi_user /app/frontend_multi_user
COPY llm_config*.json /app/

# Install dependencies from frontend_multi_user pyproject
RUN set -eux; \
51 changes: 49 additions & 2 deletions frontend_multi_user/src/app.py
@@ -53,6 +53,7 @@

from worker_plan_api.planexe_dotenv import DotEnvKeyEnum, PlanExeDotEnv
from worker_plan_api.planexe_config import PlanExeConfig
from worker_plan_api.model_profile import ModelProfileEnum, normalize_model_profile

RUN_DIR = "run"

@@ -122,6 +123,43 @@ def wrapper(*args, **kwargs):
return view(*args, **kwargs)
return wrapper


def _profile_model_name_map() -> Dict[str, list[str]]:
profile_to_models: Dict[str, list[str]] = {}
for profile in ModelProfileEnum:
config = PlanExeConfig.load(model_profile_override=profile)
config_path = config.llm_config_json_path
if config_path is None:
profile_to_models[profile.value] = []
continue
try:
with config_path.open("r", encoding="utf-8") as fh:
model_map = json.load(fh)
except Exception:
profile_to_models[profile.value] = []
continue
if not isinstance(model_map, dict):
profile_to_models[profile.value] = []
continue

def sort_key(item: tuple[str, dict]) -> tuple[int, str]:
data = item[1] if isinstance(item[1], dict) else {}
priority = data.get("priority")
if not isinstance(priority, int):
priority = 999999
return priority, item[0]

names: list[str] = []
for model_id, model_data in sorted(model_map.items(), key=sort_key):
model_name = model_id
if isinstance(model_data, dict):
args = model_data.get("arguments")
if isinstance(args, dict) and isinstance(args.get("model"), str):
model_name = args["model"]
names.append(model_name)
profile_to_models[profile.value] = names
return profile_to_models

class MyFlaskApp:
def __init__(self):
logger.info(f"MyFlaskApp.__init__. Starting...")
@@ -1887,6 +1925,7 @@ def index():
nonce=nonce,
user_id=user_id,
example_prompts=example_prompts,
model_profile_models_json=json.dumps(_profile_model_name_map()),
)

@self.app.route('/healthcheck')
@@ -2401,6 +2440,12 @@ def run():
if len(parameters) == 0:
parameters = None

# Normalize model profile to a known value with backward-compatible baseline default.
if not isinstance(parameters, dict):
parameters = {}
raw_profile = parameters.get("model_profile")
parameters["model_profile"] = normalize_model_profile(raw_profile).value

# Get length of prompt_param in bytes and in characters
prompt_param_bytes = len(prompt_param.encode('utf-8'))
prompt_param_characters = len(prompt_param)
@@ -2502,8 +2547,10 @@ def create_plan():
parameters.pop('user_id', None)
parameters.pop('nonce', None)
parameters.pop('redirect_to_plan', None)
if len(parameters) == 0:
parameters = None

# Normalize model profile to a known value with backward-compatible baseline default.
raw_profile = parameters.get("model_profile")
parameters["model_profile"] = normalize_model_profile(raw_profile).value

prompt_param_bytes = len(prompt_param.encode('utf-8'))
prompt_param_characters = len(prompt_param)
3 changes: 2 additions & 1 deletion frontend_multi_user/templates/demo_run.html
@@ -211,6 +211,7 @@ <h1>Demo Run</h1>
<input type="hidden" name="nonce" value="{{ nonce }}">
<!-- Values are submitted only when enabled (not disabled) -->
<input type="hidden" name="speed_vs_detail" id="form-speed-vs-detail" value="ping_llm">
<input type="hidden" name="model_profile" id="form-model-profile" value="baseline">
<input type="hidden" name="developer" id="form-developer" value="true">
</form>

@@ -282,7 +283,7 @@ <h1>Demo Run</h1>

if (methodSelect.value === 'GET') {
// GET method: build URL with query parameters
let url = `/run?prompt=${encodeURIComponent(promptValue)}&user_id={{ user_id }}&nonce={{ nonce }}&speed_vs_detail=${encodeURIComponent(speedVsDetailValue)}`;
let url = `/run?prompt=${encodeURIComponent(promptValue)}&user_id={{ user_id }}&nonce={{ nonce }}&speed_vs_detail=${encodeURIComponent(speedVsDetailValue)}&model_profile=baseline`;
if (developerChecked) {
url += '&developer';
}
43 changes: 43 additions & 0 deletions frontend_multi_user/templates/index.html
@@ -468,6 +468,21 @@ <h2>Start a New Plan</h2>
<form id="new-plan-form" method="POST" action="{{ url_for('create_plan') }}">
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
<input type="hidden" name="speed_vs_detail" value="all_details_but_slow">
<label for="model-profile" style="display:block; margin-bottom:8px; font-size:0.9rem; color:#a0aec0;">Model profile</label>
<select id="model-profile" name="model_profile" style="margin-bottom:12px; width:100%; max-width:240px; padding:8px; border-radius:8px;">
<option value="baseline" selected>baseline (default balanced)</option>
<option value="premium">premium (higher-cost ordering)</option>
<option value="frontier">frontier (highest-capability ordering)</option>
<option value="custom">custom (your custom file)</option>
</select>
<div style="margin-bottom:12px; font-size:0.85rem; color:#6b7280; line-height:1.4;">
baseline -> <code>llm_config.json</code>,
premium -> <code>llm_config.premium.json</code>,
frontier -> <code>llm_config.frontier.json</code>,
custom -> <code>llm_config.custom.json</code> (or <code>PLANEXE_LLM_CONFIG_CUSTOM_FILENAME</code>).
The actual models are read from the selected file's priority order.
</div>
<div id="model-profile-models" style="margin-bottom:12px; font-size:0.85rem; color:#4b5563; line-height:1.4;"></div>
<textarea name="prompt" id="plan-prompt" placeholder="Describe your project or idea in detail. The more context you provide, the better the plan will be." required></textarea>
<div class="char-count" id="char-count">0 characters</div>
<div class="new-plan-footer">
@@ -586,4 +601,32 @@ <h3>Avoid Surprises</h3>
});
</script>
{% endif %}
{% if user %}
<script>
var profileToModels = {{ model_profile_models_json | safe }};
var profileSelect = document.getElementById('model-profile');
var profileModelsDiv = document.getElementById('model-profile-models');

function renderProfileModels() {
if (!profileSelect || !profileModelsDiv) {
return;
}
var profile = profileSelect.value || 'baseline';
var models = profileToModels[profile] || [];
if (models.length === 0) {
profileModelsDiv.innerHTML = '<strong>Models in ' + profile + ':</strong> none found';
return;
}
var lines = models.map(function(modelName) {
return '<li><code>' + modelName + '</code></li>';
}).join('');
profileModelsDiv.innerHTML = '<strong>Models in ' + profile + ':</strong><ul style="margin:6px 0 0 16px;">' + lines + '</ul>';
}

if (profileSelect) {
profileSelect.addEventListener('change', renderProfileModels);
}
renderProfileModels();
</script>
{% endif %}
{% endblock %}
2 changes: 1 addition & 1 deletion frontend_single_user/Dockerfile
@@ -19,7 +19,7 @@ RUN pip install --no-cache-dir --upgrade pip \
# Copy application code and supporting files
COPY worker_plan/worker_plan_api /app/worker_plan_api
COPY frontend_single_user /app/frontend_single_user
COPY llm_config.json /app/
COPY llm_config*.json /app/

# Default location for generated plans
RUN mkdir -p /app/run
Expand Down