Agent Skills for the full Build → Evaluate → Optimize lifecycle of LLM pipelines on orq.ai.
Skills are multi-step workflows that require reasoning (e.g. build an agent, run an experiment);
Commands are quick actions for immediate results (list traces, show analytics).
Each skill encodes best practices from prompt engineering, agent design, evaluation methodology, and experimentation into repeatable workflows, covering everything from creating agents and writing prompts, through trace analysis and dataset generation, to running validated experiments and iterating on results.
Built on the standard Agent Skills format, so the skills work with any compatible agent (Claude Code, Cursor, Gemini CLI, and others).
- An orq.ai account
- An API key from Settings → API Keys
export ORQ_API_KEY=your-key-here
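A minimal sketch of the setup step above, with an optional sanity check bolted on (the placeholder check is just a convenience for catching an unconfigured key, not part of orq.ai):

```shell
# Export the key (replace the placeholder with your real key from Settings → API Keys)
export ORQ_API_KEY=your-key-here

# Optional sanity check: warn if the key is missing or still the placeholder
if [ -z "${ORQ_API_KEY:-}" ] || [ "$ORQ_API_KEY" = "your-key-here" ]; then
  echo "warning: ORQ_API_KEY is not configured yet" >&2
fi
```

Add the `export` line to your shell profile (e.g. `~/.bashrc` or `~/.zshrc`) so the key is available in every session where skills or MCP calls run.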
Use this if you want easy access to all components — skills, MCP tools, and trace hooks — in one install. Installed via the orq-ai/claude-plugins marketplace.
# In Claude Code:
/plugin marketplace add orq-ai/claude-plugins
# Install all 3 plugins
/plugin install orq-skills@orq-claude-plugin
/plugin install orq-mcp@orq-claude-plugin
/plugin install orq-trace@orq-claude-plugin

| Plugin | What it gives you |
|---|---|
| orq-skills | Skills, commands, and agents for the Build → Evaluate → Optimize lifecycle |
| orq-mcp | MCP server registration — Claude can call orq.ai APIs directly |
| orq-trace | OTLP tracing hooks that capture Claude Code sessions into orq.ai |
Verify with the interactive onboarding — checks ORQ_API_KEY, MCP reachability, and credentials:
/orq:quickstart
Use this when you're on a non-Claude agent (Cursor, Gemini CLI, Cline, Copilot CLI, Codex, Windsurf, and many others), or when you only want the skills without MCP/trace hooks.
npx skills add orq-ai/orq-skills

Auto-detects your agent and writes skills to the correct location (e.g. .claude/skills/, .cursor/rules/). Run it inside your project directory.
Agent-specific install guides:
Use this when you want orq.ai MCP tools in a tool that isn't the Claude Code plugin (Claude Desktop, other MCP-capable clients, or manual Claude Code setup).
# Manual registration in Claude Code
claude mcp add --transport http orq-workspace https://my.orq.ai/v2/mcp \
--header "Authorization: Bearer ${ORQ_API_KEY}"

Most other clients accept a JSON block with a url and headers:
{
"mcpServers": {
"orq-workspace": {
"type": "http",
"url": "https://my.orq.ai/v2/mcp",
"headers": { "Authorization": "Bearer ${ORQ_API_KEY}" }
}
}
}

Quick-action slash commands. Use /orq:<command> in Claude Code.
| Command | What It Does | Usage |
|---|---|---|
| quickstart | Interactive onboarding — credentials, MCP setup, skills tour | /orq:quickstart |
| workspace | Workspace overview — agents, deployments, prompts, datasets, experiments | /orq:workspace [section] |
| traces | Query and summarize traces with filters | /orq:traces [--deployment name] [--status error] [--last 24h] |
| models | List available AI models by provider | /orq:models [search-term] |
| analytics | Usage analytics — requests, cost, tokens, errors | /orq:analytics [--last 24h] [--group-by model] |
/orq:workspace agents # Show only agents
/orq:traces --status error --last 1h # Recent errors
/orq:models gpt-4 # Search for GPT-4 variants
/orq:analytics --group-by deployment # Cost per deployment
Skills are triggered by describing what you need. Claude picks the right skill automatically.
| Skill | What It Does | Documentation |
|---|---|---|
| setup-observability | Set up orq.ai observability for LLM applications — AI Router proxy, OpenTelemetry, tracing setup, and trace enrichment | SKILL.md |
| invoke-deployment | Invoke orq.ai deployments, agents, and models via the Python SDK or HTTP API — pass prompt variables, stream responses, and generate integration code | SKILL.md |
| build-agent | Design, create, and configure an orq.ai Agent with tools, instructions, knowledge bases, and memory | SKILL.md |
| build-evaluator | Create validated LLM-as-a-Judge evaluators following evaluation best practices | SKILL.md |
| analyze-trace-failures | Read production traces, identify what's failing, build failure taxonomies, and categorize issues | SKILL.md |
| run-experiment | Create and run orq.ai experiments — compare configurations with specialized agent, conversation, and RAG evaluation | SKILL.md |
| compare-agents | Run cross-framework agent comparisons using evaluatorq — compare orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, and others | SKILL.md |
| generate-synthetic-dataset | Generate and curate evaluation datasets — structured generation, quick from description, expansion, and dataset maintenance | SKILL.md |
| optimize-prompt | Analyze and optimize system prompts using a structured prompting guidelines framework | SKILL.md |
"I need a customer support agent" → build-agent
"Create test cases for it" → generate-synthetic-dataset
"Build an evaluator for response accuracy" → build-evaluator
"Run an experiment to get a baseline" → run-experiment
/orq:traces --status error --last 24h # Find errors
"Analyze these failures" → analyze-trace-failures
"Fix the prompt based on the failure analysis" → optimize-prompt
"Re-run the experiment to verify the fix" → run-experiment
/orq:analytics --group-by deployment # Spot high error rates
"Analyze traces for the checkout agent" → analyze-trace-failures
"Build evaluators for the failure modes" → build-evaluator
"Generate a dataset covering edge cases" → generate-synthetic-dataset
"Run an experiment and compare" → run-experiment
"Optimize the prompt based on results" → optimize-prompt
"My prompt isn't performing well, help me improve it" → optimize-prompt
"Create test cases to compare before and after" → generate-synthetic-dataset
"Build an evaluator for [specific dimension]" → build-evaluator
"Run an experiment: current vs optimized prompt" → run-experiment
"Refine the prompt based on failure cases" → optimize-prompt