AgentHub SDK - Unified and Precise LLM SDK

AgentHub is the LLM API Hub for the Agent era, built for high-precision autonomous agents.

📢 Follow us on X or join our Discord community.

Why AgentHub?

🔗 Unified: A consistent and intuitive interface for developing agents across different LLMs.
🎯 Precise: Automatically handles interleaved thinking during multi-step tool calls, preventing performance degradation.
🧭 Traceable: Provides lightweight yet fine-grained tracing for debugging and auditing LLM executions.

Features

AutoLLMClient (Python & TypeScript)

Switch between different LLMs with zero code changes and no performance loss.

Built-in Observability

Audit LLM executions by adding a single trace_id parameter; no database is required.


Supported Models

Model Name	Vendor	Reasoning	Tool Use	Image Understanding	Image Generation	Speech Generation
Gemini 3/3.1	Official/Google Vertex AI	✅	✅	✅	✅	✅
Claude 4.6	Official/Amazon Bedrock/UModelVerse	✅	✅	✅	❌	❌
GPT-5.4/5.5	Official/UModelVerse	✅	✅	✅	❌	❌
Kimi-K2.5	Official/OpenRouter/SiliconFlow	✅	✅	✅	❌	❌
GLM-5	Official/OpenRouter/SiliconFlow	✅	✅	❌	❌	❌
Qwen3	OpenRouter/SiliconFlow/vLLM	✅	✅	❌	❌	❌

Installation

Python package

Install from PyPI:

uv add agenthub-python
# or
pip install agenthub-python

Build from source:

cd src_py && make

See src_py/README.md for comprehensive usage examples and API documentation.

TypeScript package

Install from npm:

npm install @prismshadow/agenthub

Build from source:

cd src_ts && make install && make build

See src_ts/README.md for comprehensive usage examples and API documentation.

APIs

AutoLLMClient is the main class for interacting with the AgentHub SDK. It provides the following methods:

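(async) streaming_response(messages, config): Streams the LLM response in a stateless manner; the caller passes the full list of messages on each call.
(async) streaming_response_stateful(message, config): Streams the LLM response in a stateful manner; the client keeps the conversation history, so each call passes only the new message.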
clear_history(): Clears the history of the stateful LLM client.
get_history(): Returns the history of the stateful LLM client.
set_history(history): Replaces the history of the stateful LLM client with a copy of the provided list.
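
The three history methods can be pictured with a toy stand-in. This is only a sketch of the behavior described above, not the SDK's AutoLLMClient: the class name ToyStatefulClient is hypothetical, and it assumes that "a copy" means a shallow copy of the list.

```python
from typing import Any

Message = dict[str, Any]


class ToyStatefulClient:
    """Illustrative stand-in for a stateful client; NOT the SDK's AutoLLMClient."""

    def __init__(self) -> None:
        self._history: list[Message] = []

    def get_history(self) -> list[Message]:
        return self._history

    def set_history(self, history: list[Message]) -> None:
        # Store a (shallow) copy so later changes to the caller's list do not leak in.
        self._history = list(history)

    def clear_history(self) -> None:
        self._history = []


saved = [{"role": "user", "content_items": [{"type": "text", "text": "Hi"}]}]
client = ToyStatefulClient()
client.set_history(saved)
saved.append({"role": "assistant", "content_items": []})  # caller mutates its own list
print(len(client.get_history()))  # 1
client.clear_history()
print(len(client.get_history()))  # 0
```

Because set_history stores a copy, a saved conversation can be restored into a client and then edited by the caller without affecting the client's state.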

Basic Usage

Note

We recommend using the stateful interface when calling the AgentHub SDK.

OpenAI GPT-5.5

Python Example:

import asyncio
import os
from agenthub import AutoLLMClient

os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

async def main():
    client = AutoLLMClient(model="gpt-5.5")
    async for event in client.streaming_response_stateful(
        message={
            "role": "user",
            "content_items": [{"type": "text", "text": "Say 'Hello, World!'"}]
        },
        config={"temperature": 1.0}
    ):
        print(event)

asyncio.run(main())
# {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': 'Hello'}], 'usage_metadata': None, 'finish_reason': None}
# {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': ','}], 'usage_metadata': None, 'finish_reason': None}
# {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': ' World'}], 'usage_metadata': None, 'finish_reason': None}
# {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': '!'}], 'usage_metadata': None, 'finish_reason': None}
# {'role': 'assistant', 'event_type': 'stop', 'content_items': [], 'usage_metadata': {'cached_tokens': 0, 'prompt_tokens': 12, 'thoughts_tokens': 0, 'response_tokens': 8}, 'finish_reason': 'stop'}

TypeScript Example:

import { AutoLLMClient } from "@prismshadow/agenthub";

process.env.OPENAI_API_KEY = "your-openai-api-key";

async function main() {
  const client = new AutoLLMClient({ model: "gpt-5.5" });
  for await (const event of client.streamingResponseStateful({
    message: {
      role: "user",
      content_items: [{ type: "text", text: "Say 'Hello, World!'" }]
    },
    config: {}
  })) {
    console.log(event);
  }
}

main().catch(console.error);
// {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': 'Hello'}], 'usage_metadata': null, 'finish_reason': null}
// {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': ','}], 'usage_metadata': null, 'finish_reason': null}
// {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': ' World'}], 'usage_metadata': null, 'finish_reason': null}
// {'role': 'assistant', 'event_type': 'delta', 'content_items': [{'type': 'text', 'text': '!'}], 'usage_metadata': null, 'finish_reason': null}
// {'role': 'assistant', 'event_type': 'stop', 'content_items': [], 'usage_metadata': {'cached_tokens': 0, 'prompt_tokens': 12, 'thoughts_tokens': 0, 'response_tokens': 8}, 'finish_reason': 'stop'}

Anthropic Claude 4.6

Python Example

import asyncio
import os
from agenthub import AutoLLMClient

os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key"

async def main():
    client = AutoLLMClient(model="claude-sonnet-4-6")
    async for event in client.streaming_response_stateful(
        message={
            "role": "user",
            "content_items": [{"type": "text", "text": "Say 'Hello, World!'"}]
        },
        config={}
    ):
        print(event)

asyncio.run(main())

TypeScript Example

import { AutoLLMClient } from "@prismshadow/agenthub";

process.env.ANTHROPIC_API_KEY = "your-anthropic-api-key";

async function main() {
  const client = new AutoLLMClient({ model: "claude-sonnet-4-6" });
  for await (const event of client.streamingResponseStateful({
    message: {
      role: "user",
      content_items: [{"type": "text", "text": "Say 'Hello, World!'"}]
    },
    config: {}
  })) {
    console.log(event);
  }
}

main().catch(console.error);

OpenRouter GLM-5

Python Example

import asyncio
import os
from agenthub import AutoLLMClient

os.environ["GLM_API_KEY"] = "your-openrouter-api-key"
os.environ["GLM_BASE_URL"] = "https://openrouter.ai/api/v1"

async def main():
    client = AutoLLMClient(model="z-ai/glm-5")
    async for event in client.streaming_response_stateful(
        message={
            "role": "user",
            "content_items": [{"type": "text", "text": "Say 'Hello, World!'"}]
        },
        config={}
    ):
        print(event)

asyncio.run(main())

TypeScript Example

import { AutoLLMClient } from "@prismshadow/agenthub";

process.env.GLM_API_KEY = "your-openrouter-api-key";
process.env.GLM_BASE_URL = "https://openrouter.ai/api/v1";

async function main() {
  const client = new AutoLLMClient({ model: "z-ai/glm-5" });
  for await (const event of client.streamingResponseStateful({
    message: {
      role: "user",
      content_items: [{"type": "text", "text": "Say 'Hello, World!'"}]
    },
    config: {}
  })) {
    console.log(event);
  }
}

main().catch(console.error);

SiliconFlow Qwen3-8B

Python Example

import asyncio
import os
from agenthub import AutoLLMClient

os.environ["QWEN3_API_KEY"] = "your-siliconflow-api-key"
os.environ["QWEN3_BASE_URL"] = "https://api.siliconflow.cn/v1"

async def main():
    client = AutoLLMClient(model="Qwen/Qwen3-8B")
    async for event in client.streaming_response_stateful(
        message={
            "role": "user",
            "content_items": [{"type": "text", "text": "Say 'Hello, World!'"}]
        },
        config={}
    ):
        print(event)

asyncio.run(main())

TypeScript Example

import { AutoLLMClient } from "@prismshadow/agenthub";

process.env.QWEN3_API_KEY = "your-siliconflow-api-key";
process.env.QWEN3_BASE_URL = "https://api.siliconflow.cn/v1";

async function main() {
  const client = new AutoLLMClient({ model: "Qwen/Qwen3-8B" });
  for await (const event of client.streamingResponseStateful({
    message: {
      role: "user",
      content_items: [{ type: "text", text: "Say 'Hello, World!'" }],
    },
    config: {}
  })) {
    console.log(event);
  }
}

main().catch(console.error);

Concepts: UniConfig, UniMessage and UniEvent

UniConfig

UniConfig is an object that contains the configuration for LLMs.

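Example UniConfig (a string such as "none | low | medium | high" lists the accepted values; pass exactly one of them):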

{
  "max_tokens": 1024,
  "temperature": 1.0,
  "tools": [
    {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
          "type": "object",
          "properties": {
              "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
              }
          },
          "required": ["location"]
      }
    }
  ],
  "thinking_summary": true,
  "thinking_level": "none | low | medium | high",
  "tool_choice": "auto | required | none",
  "system_prompt": "You are a helpful assistant.",
  "prompt_caching": "enable | disable | enhance",
  "image_config": {"aspect_ratio": "4:3", "image_size": "1K"},
  "tts_config": [{"voice": "Kore"}],
  "trace_id": null
}
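
Since several fields accept only one of a few listed strings, it can help to check a config before sending it. The sketch below is a local helper, not part of the SDK: check_config and ALLOWED are hypothetical names, and the allowed values are taken only from the example above.

```python
from typing import Any

# Allowed values as listed in the example UniConfig above.
ALLOWED = {
    "thinking_level": {"none", "low", "medium", "high"},
    "tool_choice": {"auto", "required", "none"},
    "prompt_caching": {"enable", "disable", "enhance"},
}


def check_config(config: dict[str, Any]) -> dict[str, Any]:
    """Raise ValueError if an enumerated field holds a value not listed above."""
    for key, allowed in ALLOWED.items():
        if key in config and config[key] not in allowed:
            raise ValueError(f"{key}={config[key]!r} is not one of {sorted(allowed)}")
    return config


config = check_config({
    "max_tokens": 1024,
    "temperature": 1.0,
    "thinking_level": "low",
    "tool_choice": "auto",
    "system_prompt": "You are a helpful assistant.",
})
print(config["thinking_level"])  # low
```

The returned dict can then be passed as the config argument shown in the Basic Usage examples.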

UniMessage

UniMessage is an object that contains the input for LLMs.

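Example UniMessage (role is either "user" or "assistant"; content_items lists the available item types side by side for illustration):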

{
  "role": "user | assistant",
  "content_items": [
    {"type": "text", "text": "How are you doing?"},
    {"type": "image_url", "image_url": "https://example.com/image.jpg"},
    {"type": "inline_data", "mime_type": "image/jpeg", "data": "base64-encoded-image"},
    {"type": "thinking", "thinking": "I am thinking.", "signature": "0x123456"},
    {"type": "inline_thinking", "mime_type": "image/jpeg", "data": "base64-encoded-image"},
    {"type": "tool_call", "name": "math", "arguments": {"expression": "2 + 3"}, "tool_call_id": "123"},
    {"type": "tool_result", "text": "2 + 3 = 5", "images": [], "tool_call_id": "123"}
  ]
}
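
A message that combines text with an inline image could be assembled as follows. This is a minimal sketch: user_message is a hypothetical helper, and it assumes the "data" field carries standard base64 text, as the example above suggests.

```python
import base64
from typing import Any, Optional


def user_message(text: str, image_bytes: Optional[bytes] = None,
                 mime_type: str = "image/jpeg") -> dict[str, Any]:
    """Build a UniMessage-shaped dict with a text item and an optional inline image."""
    items: list[dict[str, Any]] = [{"type": "text", "text": text}]
    if image_bytes is not None:
        items.append({
            "type": "inline_data",
            "mime_type": mime_type,
            # Assumption: "data" is standard base64 text, per the example above.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        })
    return {"role": "user", "content_items": items}


# The three bytes below are the usual start of a JPEG file, used here as dummy data.
message = user_message("What is in this picture?", image_bytes=b"\xff\xd8\xff")
print(message["content_items"][1]["data"])  # /9j/
```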

UniEvent

UniEvent is an object that contains streaming output of LLMs.

Example UniEvent:

{
  "role": "assistant",
  "event_type": "delta",
  "content_items": [
    {"type": "partial_tool_call", "name": "math", "arguments": "", "tool_call_id": "123"}
  ],
  "usage_metadata": {
    "cached_tokens": null,
    "prompt_tokens": 10,
    "thoughts_tokens": null,
    "response_tokens": 1
  },
  "finish_reason": null,
  "created_at": 1694502400000
}
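
Because each delta event carries only a fragment, callers usually need to join the fragments themselves. The sketch below uses a plain list of events copied from the GPT-5.5 example output instead of a live stream; collect_text is a hypothetical helper, and with the real client the same loop body would sit inside the async for shown in Basic Usage.

```python
from typing import Any, Iterable


def collect_text(events: Iterable[dict[str, Any]]) -> tuple[str, Any]:
    """Join text deltas and return them with the last non-null usage_metadata."""
    parts: list[str] = []
    usage = None
    for event in events:
        for item in event.get("content_items", []):
            if item.get("type") == "text":
                parts.append(item["text"])
        if event.get("usage_metadata") is not None:
            usage = event["usage_metadata"]
    return "".join(parts), usage


# Events copied from the GPT-5.5 example output (role and finish_reason trimmed).
events = [
    {"event_type": "delta", "content_items": [{"type": "text", "text": "Hello"}], "usage_metadata": None},
    {"event_type": "delta", "content_items": [{"type": "text", "text": ","}], "usage_metadata": None},
    {"event_type": "delta", "content_items": [{"type": "text", "text": " World"}], "usage_metadata": None},
    {"event_type": "delta", "content_items": [{"type": "text", "text": "!"}], "usage_metadata": None},
    {"event_type": "stop", "content_items": [],
     "usage_metadata": {"cached_tokens": 0, "prompt_tokens": 12, "thoughts_tokens": 0, "response_tokens": 8}},
]

text, usage = collect_text(events)
print(text)  # Hello, World!
```

Items of other types, such as partial_tool_call, are skipped here; how their fragments should be combined is not covered by this sketch.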

Token Usage

AgentHub provides detailed token usage information through the usage_metadata field in streaming events.

The usage_metadata object contains four fields:

cached_tokens: Cached input tokens
prompt_tokens: Non-cached input tokens
thoughts_tokens: Chain-of-thought output tokens
response_tokens: Non-chain-of-thought output tokens

You can calculate the total token usage as follows:

input_tokens = cached_tokens + prompt_tokens
output_tokens = thoughts_tokens + response_tokens
total_tokens = input_tokens + output_tokens

█████████████  ░░░░░░░░░░░░░ → LLM → ███████████████  ░░░░░░░░░░░░░░░
cached_tokens  prompt_tokens         thoughts_tokens  response_tokens
        input_tokens                          output_tokens
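
The formulas above can be applied to a usage_metadata object directly. This is a minimal sketch: summarize_usage is a hypothetical helper, and treating a null or missing field as 0 is an assumption, made because the example UniEvent shows null values.

```python
from typing import Optional


def summarize_usage(usage: dict[str, Optional[int]]) -> dict[str, int]:
    """Apply the three formulas above, counting a missing or null field as 0."""
    def count(key: str) -> int:
        return usage.get(key) or 0

    input_tokens = count("cached_tokens") + count("prompt_tokens")
    output_tokens = count("thoughts_tokens") + count("response_tokens")
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }


# usage_metadata from the final "stop" event in the GPT-5.5 example.
totals = summarize_usage(
    {"cached_tokens": 0, "prompt_tokens": 12, "thoughts_tokens": 0, "response_tokens": 8}
)
print(totals)  # {'input_tokens': 12, 'output_tokens': 8, 'total_tokens': 20}
```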

Tracing LLM Executions

We provide a tracer to help you monitor and debug your LLM executions. You can enable tracing by setting the trace_id parameter to a unique identifier in the config object.

async for event in client.streaming_response_stateful(
    message={
        "role": "user",
        "content_items": [{"type": "text", "text": "Say 'Hello, World!'"}]
    },
    config={"trace_id": "unique-trace-id"}
):
    print(event)

Start the tracer dashboard with either the Python or the TypeScript package:

cd src_py && uv run python -m agenthub.integration.tracer --host 127.0.0.1 --port 25750

cd src_ts && npm run tracer

Then you can view the tracing output in the dashboard at http://localhost:25750/.
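
Since trace_id should be unique, one option is to generate it per run. The sketch below is only one way to do that: new_trace_config and the "run-" prefix are hypothetical, and it assumes any string is accepted as a trace_id, as the "unique-trace-id" example suggests.

```python
import uuid


def new_trace_config(prefix: str = "run") -> dict[str, str]:
    """Return a config dict whose trace_id is unique for each call."""
    # uuid4().hex is 32 random hexadecimal characters.
    return {"trace_id": f"{prefix}-{uuid.uuid4().hex}"}


config = new_trace_config()
print(config["trace_id"].startswith("run-"))  # True
```

The resulting dict can be merged with other settings before being passed as config.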

LLM Playground

We provide an LLM playground to help you test your LLMs. Start it with either the Python or the TypeScript package:

cd src_py && uv run python -m agenthub.integration.playground --host 127.0.0.1 --port 25751

cd src_ts && npm run playground

You can access the playground at http://localhost:25751/.

Related Work

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.
