Local Dockerised Eval Server #50
Changes from all commits
2a8a461
e9a4af4
700d1a9
7ec8302
ce6b75f
7724f65
4cda0cd
78b020b
1ab0393
855fb9a
d588841
38b725c
@@ -0,0 +1,45 @@
+# Evaluation Server Configuration
+# Copy this file to .env and configure your settings
+
+# Server Configuration
+PORT=8080
+HOST=127.0.0.1
+
+# LLM Provider API Keys
+# Configure one or more providers for evaluation
+
+# OpenAI Configuration
+OPENAI_API_KEY=sk-your-openai-api-key-here
+
+# LiteLLM Configuration (if using a LiteLLM server)
+LITELLM_ENDPOINT=http://localhost:4000
+LITELLM_API_KEY=your-litellm-api-key-here
+
+# Groq Configuration
+GROQ_API_KEY=gsk_your-groq-api-key-here
+
+# OpenRouter Configuration
+OPENROUTER_API_KEY=sk-or-v1-your-openrouter-api-key-here
+
+# Default LLM Configuration for Evaluations
+# These will be used as fallbacks when not specified in evaluation requests
+DEFAULT_PROVIDER=openai
+DEFAULT_MAIN_MODEL=gpt-4
+DEFAULT_MINI_MODEL=gpt-4-mini
+DEFAULT_NANO_MODEL=gpt-3.5-turbo
+
+# Logging Configuration
+LOG_LEVEL=info
+LOG_DIR=./logs
+
+# Client Configuration
+CLIENTS_DIR=./clients
+EVALS_DIR=./evals
+
+# RPC Configuration
+RPC_TIMEOUT=30000
+
+# Security
+# Set this to enable authentication for client connections
+# Leave empty to disable authentication
+AUTH_SECRET_KEY=
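
The template asks for one or more providers to be configured. As a minimal sketch of a pre-flight check, a POSIX shell function could confirm that at least one of the four key variables is non-empty before the server starts. The helper names `env_value` and `has_provider_key` are hypothetical and not part of this PR, and the sketch assumes plain `KEY=value` lines without quoting or `export`, as in the template above:

```shell
# Hypothetical helpers (not part of the PR) for inspecting a .env-style file.

# Print the value assigned to a key; if the key appears twice, the last one wins.
# $1 = env file, $2 = key name (letters, digits, underscores only)
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Succeed if at least one provider API key has a non-empty value.
# $1 = env file
has_provider_key() {
  for key in OPENAI_API_KEY LITELLM_API_KEY GROQ_API_KEY OPENROUTER_API_KEY; do
    if [ -n "$(env_value "$1" "$key")" ]; then
      return 0
    fi
  done
  return 1
}
```

A call such as `has_provider_key .env || echo "no provider key configured"` would catch a file where every key was blanked. It would not catch an unmodified copy of the template, because the placeholder values are themselves non-empty.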
@@ -49,7 +49,7 @@ validation:
 - "Delegated price filter setting to action_agent"
 - "Coordinated property type selection through action_agent"
 - "Applied search filters through proper action_agent calls"
-- "Extracted property listings with schema_based_extractor"
+- "Extracted property listings with extract_data"

🛠️ Refactor suggestion
🧩 Analysis chain
Ensure repo-wide extractor rename alignment.
Double-check that client schema/tool enums and templates now use extract_data. Run:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Find lingering old extractor names..."
rg -n -C1 -g '!**/node_modules/**' -e 'schema_based_extractor' -e 'extract_schema_data' || true
echo
echo "Validate schema/tool enums reference extract_data:"
fd default-client.yaml | xargs -I{} rg -n -C1 'extract_data' {}
fd client.schema.json | xargs -I{} rg -n -C2 'extract_data|extract_schema_data|schema_based_extractor' {}
Length of output: 465
Ensure repo-wide extractor rename alignment
default-client.yaml already uses "extract_data" (lines 29–31), but client.schema.json still enumerates "extract_schema_streamlined" alongside "extract_data" (lines 131–135). Update or remove the legacy enum value and align any templates/libs to "extract_data", then run a repo-wide search for other old extractor names.

 - "Returned structured property data in readable text format (not JSON)"
 - "Each property includes address, price, bedrooms, bathrooms, and other key details"
 - "Properties are clearly numbered or organized for easy comparison"
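
The script in the review comment above relies on `rg` and `fd`. Where those tools are unavailable, roughly the same search could be done with `grep`. The following is only a sketch: the function name `check_legacy_extractors` is hypothetical, and the list of legacy names is taken from the comment itself (including `extract_schema_streamlined`, which the reviewer calls a legacy enum value).

```shell
# Hypothetical helper (not part of the PR): scan a directory tree for legacy
# extractor names, skipping node_modules.
# Exit status: 0 = none found, 1 = at least one match (printed as file:line:text),
#              2 = grep itself failed.
# $1 = directory to scan
check_legacy_extractors() {
  status=0
  grep -rn --exclude-dir=node_modules \
    -e 'schema_based_extractor' \
    -e 'extract_schema_data' \
    -e 'extract_schema_streamlined' \
    "$1" || status=$?
  case "$status" in
    0) return 1 ;;  # grep found matches, so the tree is not clean
    1) return 0 ;;  # grep found nothing
    *) return 2 ;;  # grep reported an error
  esac
}
```

Because the function fails when a match is found, it could serve as a CI gate, for example `check_legacy_extractors . || exit 1`.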
@@ -49,7 +49,7 @@ validation:
 - "Delegated price filter setting to action_agent"
 - "Coordinated property type selection through action_agent"
 - "Applied search filters through proper action_agent calls"
-- "Extracted property listings with schema_based_extractor"
+- "Extracted property listings with extract_data"

🛠️ Refactor suggestion
🧩 Analysis chain
Consistency check: fully migrate to extract_data across configs/schemas.
Ensure no lingering references to schema_based_extractor or extract_schema_data remain in evals, templates, or schemas. Run to verify:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for old extractor names..."
rg -n -C1 -g '!**/node_modules/**' -e 'schema_based_extractor' -e 'extract_schema_data' || true
echo
echo "Confirming extract_data is used for evaluation.tool values..."
rg -n -C1 -g '!**/node_modules/**' -e 'tool:\s*"extract_data"' -e '"tool"\s*:\s*"extract_data"' || true
echo
echo "Check client schema enum still doesn't reference old names..."
fd client.schema.json | xargs -I{} rg -n -C2 -e 'extract_schema_data|schema_based_extractor|extract_data' {}
Length of output: 629
Consistency check: fully migrate to extract_data across configs/schemas.

 - "Returned structured property data in readable text format (not JSON)"
 - "Each property includes address, price, bedrooms, bathrooms, and other key details"
 - "Properties are clearly numbered or organized for easy comparison"
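
The verification script above confirms that `tool: "extract_data"` appears, but it does not show which other tool names are still in use. A complementary sketch could list every distinct value so that unexpected names stand out. The function name `list_tool_values` is hypothetical, and the sketch assumes YAML entries of the form `tool: "name"` on a single line with double quotes, matching the pattern the script searches for:

```shell
# Hypothetical helper (not part of the PR): print the distinct values of
# `tool: "..."` entries found under a directory, one per line, sorted.
# $1 = directory to scan
list_tool_values() {
  grep -rh --exclude-dir=node_modules \
    -E '^[[:space:]]*tool:[[:space:]]*"[^"]*"' "$1" \
    | sed -E 's/^[[:space:]]*tool:[[:space:]]*"([^"]*)".*/\1/' \
    | sort -u
}
```

If the migration is complete, the old names should no longer appear in the output of a call such as `list_tool_values ./evals` (the directory is taken from `EVALS_DIR` in the env template and may differ in practice).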
🛠️ Refactor suggestion
🧩 Analysis chain
Tool rename to "extract_data" and longer timeout look good; verify end‑to‑end support.
Confirm server/tool registry, schemas, and tests accept "extract_data" (no lingering "extract_schema_data" usage).
🏁 Script executed:
Length of output: 9746
Replace remaining "extract_schema_data" references with "extract_data" and confirm server/tool registry + tests accept the new name.
Docs/examples still use the old tool name while eval YAMLs/templates use "extract_data" — update the docs/snippets and ensure any server enums/switches or tests accept "extract_data".
Affected files (examples found):
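
As a sketch of the cleanup this comment asks for, the old names could be rewritten file by file with `sed`. The function name `rename_extractor` is hypothetical and not part of the PR. It writes through a temporary file rather than using `sed -i`, whose syntax differs between GNU and BSD, and it maps both `extract_schema_data` and `schema_based_extractor` to `extract_data`, which is an assumption based on the renames visible in this diff.

```shell
# Hypothetical helper (not part of the PR): replace legacy extractor names in
# one file with "extract_data", keeping the file's original permissions.
# $1 = file to rewrite in place
rename_extractor() {
  tmp=$(mktemp) || return 1
  if sed -e 's/extract_schema_data/extract_data/g' \
         -e 's/schema_based_extractor/extract_data/g' "$1" > "$tmp"; then
    # Copy contents back instead of moving, so the target keeps its mode.
    cat "$tmp" > "$1"
    rc=$?
  else
    rc=1
  fi
  rm -f "$tmp"
  return "$rc"
}
```

A blind substitution can also touch prose that deliberately mentions the old name, so the resulting diff should be reviewed before committing.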