Release v1.0.1 · lightseekorg/smg

🎉 Introducing Shepherd Model Gateway v1.0.1!

We're thrilled to announce Shepherd Model Gateway v1.0.1 – formerly SGLang Model Gateway. This major release marks a new chapter with a complete architectural overhaul, new enterprise features, and production-grade improvements!

🐑 Welcome to Shepherd

SGLang Model Gateway is now Shepherd Model Gateway (SMG).

Truly Engine-Agnostic Architecture: Shepherd is your universal gateway supporting all major inference engines – SGLang, vLLM, and TensorRT-LLM – plus complete 3rd party model provider integration including OpenAI, Anthropic, and Gemini. One gateway to route them all.

Universal API Support: Native implementation of Chat Completions, Responses API, Messages API, Interactions API, and Realtime API. Whether you're running open-source models on your infrastructure or routing to cloud providers, Shepherd handles it seamlessly.

Same powerful technology, new identity focused on guiding and managing your entire LLM infrastructure at scale – regardless of where your models run.

✨ Major New Features

⚡ TensorRT-LLM Backend Support - Native gRPC integration for NVIDIA TensorRT-LLM

🔄 vLLM Prefill-Decode-Disaggregation Support
Mooncake and NIXL-based KV transfer for disaggregated inference:

Auto-discovery for seamless integration
Massive scalability improvements for large deployments
Efficient KV cache sharing across workers

🎯 smg serve - Unified Worker Management
New serve subcommand with complete worker lifecycle orchestration:

Multi-worker data parallelism with GPU assignment
ServeOrchestrator for automated worker management
Two-pass argument parsing for flexible configuration
One command to rule them all

🤖 Anthropic Messages API Support
Full implementation of Anthropic's Messages API with streaming and non-streaming support. Deploy Claude models alongside your existing inference fleet.

🔌 Industry-First: Universal Built-in Tools via MCP 🔥

Turn any MCP server into built-in tools for all models – an industry-first capability that brings OpenAI-style built-in tools (FileSearch, WebSearch, CodeInterpreter) to every LLM, not just proprietary models.

Complete MCP Orchestration Stack:

McpOrchestrator with YAML policy configuration
Built-in tool routing infrastructure with qualified names – seamlessly integrate any MCP server as a native capability
ResponseFormat transformation pipeline - expose MCP servers as built-in tools (FileSearch, WebSearch, CodeInterpreter, and custom tools)
Auth-aware connection pooling for scalable multi-tenant deployments
Batch tool execution API for efficient processing
Approval system for controlled tool execution
Automatic reconnection manager for reliability
Graceful shutdown support
HTTP header forwarding to MCP servers

Impact: Deploy Llama, Qwen, DeepSeek, or any open-source model with the same built-in tool capabilities as GPT-4. Your infrastructure, your models, OpenAI-grade tooling.

📡 Realtime API Foundation
Event types and protocol support for real-time streaming applications.

🏗️ Architectural Revolution

Workspace Modularization
Complete extraction into standalone, publishable crates:

smg-auth - JWT/OIDC authentication
smg-mesh - High availability mesh networking
smg-mcp - Model Context Protocol orchestration
smg-wasm - WebAssembly middleware
smg-grpc-client - gRPC client infrastructure
smg-grpc-proto - Protocol definitions (published to PyPI!)
smg-kv-index - Cache-aware routing engine
llm-tokenizer - Tokenization logic
llm-multimodal - Multimodal processing
openai-protocol - OpenAI API specifications
wfaas - Workflow-as-a-Service engine
And more...

Result: Faster builds, independent evolution, better maintainability, and easy integration into your own projects.

⚡ Performance Optimizations

Zero-Copy & Algorithm Improvements:

Zero-copy multimodal payload handling
Aho-Corasick algorithm for stop sequence and special token search
WASM Linker reuse across executions
Optimized consistent hashing with zero allocations

🛠️ Production Enhancements

High Availability:

Mesh service refactoring and cleanup
State synchronization improvements
Oracle external auth support for enterprise backends

Observability:

Nightly benchmark workflow for comprehensive model performance tracking
gRPC vs HTTP comparison benchmarks
GetLoads RPC for load metrics

Developer Experience:

Comprehensive documentation restructure (concept-centric)
Issue templates and PR templates
Pre-commit hooks with Ruff + mypy Python linting
Automated crate publishing workflows
Dependabot integration

Testing Infrastructure:

Kubernetes-based CI runners
Service containers for Oracle and Brave
vLLM and TensorRT-LLM gRPC E2E tests
Thread-safe test fixtures with proper resource management

🐛 Critical Bug Fixes

Fixed synthetic "empty" tenant pollution in radix tree
Prevented resource leaks causing GPU starvation
Fixed STDIO MCP server triggering
Aligned multi-server MCP output handling across routers
Fixed completion token counting for vLLM harmony streaming
Corrected proto definitions (logprobs token_ids uint32)

📚 Documentation

Complete restructure from configuration-centric to concept-centric
Architecture diagrams and gradient mesh homepage
Comprehensive README with features overview
Admin API reference
Getting started guides

🔧 Tool Parser Support

New model support:

Cohere Command models (tool parser + reasoning parser)
Qwen Coder (XML format for Qwen3 Coder and MicroThinker)

🔗 Repository: https://github.com/lightseekorg/smg

Install now: pip install smg --upgrade

🐑 Shepherd your LLM infrastructure with confidence.

⚡ Built for speed. Engineered for scale. Production-proven.

What's Changed

fix: render README images on PyPI/crates.io and bump version to 1.0.1 by Simo Lin
fix(ci): fix H200 nightly benchmark model path, worker logs and CUDA errors (#411) by @key4ng in #411
fix(ci): use single Python interpreter for Windows/macOS PyPI builds (#418) by @slin1237 in #418
chore(mesh): bump smg-mesh version to 1.1.0 (#419) by @slin1237 in #419
chore: unify workspace dependency management and bump crate versions (#344) by @slin1237 in #344
refactor: remove remaining pub use re-export aliases from lib.rs (#416) by @slin1237 in #416
refactor: remove pub use re-export aliases from lib.rs (#413) by @slin1237 in #413
refactor(protocols,gateway): redesign worker type hierarchy and consolidate protocol layer (#412) by @slin1237 in #412
fix(grpc-proto): bump grpcio minimum to >=1.78.0 (#409) by @CatherineSue in #409
chore(ci): increase chat-completions-trtllm timeout to 60 minutes (#408) by @CatherineSue in #408
fix(trtllm): tokenize and inject user stop sequences for TRT-LLM requests (#346) by @ppraneth in #346
fix(e2e): migrate genai-bench to Docker and fix router pipe hang (#403) by @key4ng in #403
chore(deps): update kube requirement from 1.1.0 to 3.0.1 (#397) by @app/dependabot in #397
chore(deps): update opentelemetry-proto requirement from 0.27 to 0.31 (#398) by @app/dependabot in #398
chore(deps): update ndarray requirement from 0.16 to 0.17 (#394) by @app/dependabot in #394
feat: support oracle external auth for oracle backend (#404) by @zhaowenzi in #404
fix(grpc-proto): reorder authors in pyproject.toml (#400) by @CatherineSue in #400
chore[ci]: upgrade oracle image (#393) by @key4ng in #393
chore(e2e): overhaul nightly benchmark summary and trim model list (#392) by @slin1237 in #392
feat: Implement ReconnectionManager for automatic MCP server recovery (#265) by @ppraneth in #265
perf(multimodal): optimize payload handling with zero-copy (#391) by @ppraneth in #391
refactor(mcp): standardize output injection ordering across routers (#388) by @slin1237 in #388
ci(grpc): add proto package publishing and codegen checks (#386) by @CatherineSue in #386
feat(grpc): add smg-grpc-proto Python package for proto definitions (#385) by @CatherineSue in #385
chore(e2e): include model size in gpt-oss nightly benchmark slug (#384) by @CatherineSue in #384
refactor(mcp): remove requested_servers and introduce ResponsesCallContext (#382) by @CatherineSue in #382
refactor(mcp): use imports instead of fully-qualified paths in McpToolSession (#383) by @CatherineSue in #383
e2e: rewrite nightly summary with gRPC vs HTTP comparison (#381) by @slin1237 in #381
feat(realtime api): realtime api event types (#349) by @pallasathena92 in #349
refactor(mcp): add session-level tool builder convenience methods (#380) by @slin1237 in #380
fix+refactor: harden OpenAI router and unify error handling (#379) by @slin1237 in #379
docs: add Slack invite links (#377) by @slin1237 in #377
docs: add release badge and Discord links (#375) by @slin1237 in #375
chore: fix nightly benchmark, prevent execution when pushing to main (#374) by @slin1237 in #374
e2e: canonical HF model IDs and nightly benchmark runner split (#373) by @slin1237 in #373
mesh: refactor and cleanup mesh code (#281) by @llfl in #281
mcp: unify session-mapped tool execution and payload builders (#370) by @slin1237 in #370
docs: fix broken getting started link in logging guide (#372) by @slin1237 in #372
docs: consolidate onboarding into getting-started and align API docs to code (#371) by @slin1237 in #371
refactor(mcp): simplify ensure_request_mcp_client and remove McpLoopConfig (#368) by @CatherineSue in #368
chore[ci]: use service containers for Oracle and Brave in E2E tests (#369) by @key4ng in #369
fix(ci): Use H100 runner for Specific Tests (#357) by @XinyueZhang369 in #357
refactor(mcp): introduce McpToolSession to bundle MCP execution state (#360) by @CatherineSue in #360
docs: update logos and fix mobile rendering (#365) by @slin1237 in #365
feat: add Anthropic Messages API e2e tests (#358) by @key4ng in #358
feat(interactions): Create protocol for interactions api (#336) by @XinyueZhang369 in #336
chore(ci): optimize Docker setup for ephemeral K8s pod runners (#348) by @key4ng in #348
refactor(routers): reduce IGW router registration boilerplate (#362) by @slin1237 in #362
fix(docs): restore gradient text rendering in light mode (#359) by @slin1237 in #359
fix(grpc): align MCP multi-server outputs for responses (#277) by @zhaowenzi in #277
fix(proto): change logprobs token_ids from int32 to uint32 (#354) by @CatherineSue in #354
fix(docs): add light mode support for custom CSS theme (#355) by @slin1237 in #355
fix: fail responses tests on missing API keys instead of silently skipping (#343) by @key4ng in #343
chore(deps): update criterion requirement from 0.5 to 0.8 (#203) by @app/dependabot in #203
docs: fix Rust lint/fmt commands to match CI and pre-commit config (#353) by @CatherineSue in #353
chore(deps): update wasm-encoder requirement from 0.242 to 0.244 (#303) by @app/dependabot in #303
chore(ci): apply uv for sglang installation (#347) by @key4ng in #347
docs: restructure Getting Started section and fix SVG paths (#350) by @slin1237 in #350
chore: simplify SGLang CI install with explicit dependencies (#333) by @key4ng in #333
fix(mcp): serialize env-mutating proxy config tests (#345) by @CatherineSue in #345
refactor(protocols): use serde_with::skip_serializing_none to reduce boilerplate (#342) by @CatherineSue in #342
fix(ci): correct working-directory paths in PyPI release workflow (#341) by @CatherineSue in #341
chore: add ruff + mypy Python linting with pre-commit and CI (#340) by @CatherineSue in #340
chore: enable unused_qualifications lint workspace-wide (#339) by @CatherineSue in #339
refactor(grpc): unify shared utilities across harmony and regular routers (#337) by @CatherineSue in #337
refactor(grpc): extract duplicated MCP streaming helpers (#334) by @CatherineSue in #334
feat(serve): add unified --connection-mode and fix gRPC health checks (#335) by @slin1237 in #335
feat(messages api): support streaming (#280) by @key4ng in #280
fix(ci): use latest tag for sglang dependency (#332) by @key4ng in #332
chore(ci): reduce CI test timeout to surface failures sooner (#328) by @key4ng in #328
refactor(ci): split nightly benchmark into per-model jobs (#327) by @slin1237 in #327
fix(e2e): release leaked worker refs causing GPU starvation (#326) by @CatherineSue in #326
fix(ci): remove ineffective caching from vLLM and SGLang actions (#323) by @CatherineSue in #323
chore(deps): bump actions/setup-go from 5 to 6 (#298) by @app/dependabot in #298
chore(deps): bump actions/cache from 4 to 5 (#299) by @app/dependabot in #299
fix(ci): reduce docker stop grace period for CI cleanup containers (#322) by @CatherineSue in #322
fix(ci): add reusable composite actions for backend setup (#319) by @CatherineSue in #319
feat(reasoning-parser): add CohereCmdParser for Command models (#317) by @slin1237 in #317
fix(e2e): thread-aware caching for setup_backend fixture (#321) by @CatherineSue in #321
feat(vllm-pd): add Mooncake KV transfer support with auto-discovery (#312) by @slin1237 in #312
test: Update the H200 benchmark schedule to run weekly. (#318) by @key4ng in #318
fix(logging): add --log-json to python cli (#316) by @zhaowenzi in #316
feat(tool-parser): add Cohere Command model tool call parser (#315) by @slin1237 in #315
fix(e2e): prevent flaky tests from port races and resource leaks (#314) by @slin1237 in #314
feat(ci): add nightly benchmark workflow for comprehensive model performance tracking (#231) by @key4ng in #231
fix(logging): fix --log-json and Python binding support (#310) by @zhaowenzi in #310
fix(e2e): vLLM PD worker tracking and test infrastructure improvements (#311) by @slin1237 in #311
fix(grpc): correct completion_tokens counting for vLLM harmony streaming (#282) by @key4ng in #282
refactor: rename SGLang Model Gateway to Shepherd Model Gateway (#297) by @slin1237 in #297
feat(e2e): add TensorRT-LLM gRPC backend support for CI (#274) by @slin1237 in #274
docs: add vLLM PD disaggregation support (#294) by @slin1237 in #294
feat(grpc): add vLLM PD disaggregation support via NIXL (#293) by @slin1237 in #293
fix(ci): Move to K8s Cpu Runners (#279) by @XinyueZhang369 in #279
fix(mcp): stdio MCP servers cannot get triggered (#273) by @xuwenyihust in #273
feat(serve): add GPU assignment for multi-worker DP (#272) by @slin1237 in #272
feat(serve): add ServeOrchestrator for worker lifecycle management (#271) by @slin1237 in #271
feat(cli): add smg serve subcommand with two-pass arg parsing (#270) by @slin1237 in #270
fix(ci): migrate benchmark-radix-tree to k8s gpu runner (#269) by @slin1237 in #269
fix: critical correctness and reliability bugs across 12 crates (#254) by @slin1237 in #254
fix(kv-index): prevent synthetic "empty" tenant from polluting tree (#268) by @slin1237 in #268
fix(ci): speed up chat-completions CI jobs (#267) by @CatherineSue in #267
fix(ci): use standard install scripts for go-bindings-benchmark (#264) by @slin1237 in #264
docs(README): resolve broken doc links (#256) by @xuwenyihust in #256
fix(ci): add apt-get update before SGLang dependency install (#255) by @slin1237 in #255
chore(bindings): rename SGLang to Shepherd in Python CLI (#247) by @xuwenyihust in #247
feat(messages api): support basic non streaming (#230) by @key4ng in #230
test(go-bindings): expand E2E test coverage and add performance benchmarks (#206) by @slin1237 in #206
fix(mcp): align OpenAI router streaming multi-server handling (#209) by @zhaowenzi in #209
run test workflow to cluster (#192) by @XinyueZhang369 in #192
feat: Add Messages API foundation (#227) by @key4ng in #227
feat: implement graceful shutdown for MCP (#228) by @ppraneth in #228
refactor(proto): Use uint32 for n, max_tokens, min_tokens in SGLang/TRT-LLM protos (#226) by @CatherineSue in #226
ci: add vLLM gRPC e2e tests (#158) by @key4ng in #158
refactor(grpc): Centralize request building dispatch in GrpcClient (#219) by @CatherineSue in #219
refactor(grpc): Standardize token counts to uint32 across all protos (#218) by @CatherineSue in #218
[model-gateway] Reuse WASM Linker across executions (#211) by @ppraneth in #211
feat(harmony): Enable vLLM and TensorRT-LLM gRPC backend support (#217) by @CatherineSue in #217
feat(grpc): Add TensorRT-LLM logprobs support in proto_wrapper (#216) by @CatherineSue in #216
feat(grpc): Add vLLM logprobs and n>1 sampling support (#215) by @CatherineSue in #215
chore(deps): update tiktoken-rs requirement from 0.7.0 to 0.9.1 (#201) by @app/dependabot in #201
docs: Add missing e2e-test.sh script (#214) by @CatherineSue in #214
feat: Rename TensorRT-LLM gRPC client from TrtLlmEngine to TrtllmService (#213) by @CatherineSue in #213
fix clippy warnings by Kun(llfl)
refactor: mesh service clean up by Kun(llfl)
refactor: mesh lib exports clean up by Kun(llfl)
chore(deps): bump actions/setup-go from 5 to 6 (#197) by @app/dependabot in #197
chore: add zoey to workflow and e2e owner, add yanbo to golang owner (#207) by @slin1237 in #207
feat(grpc/harmony): Add logprobs support for Harmony models (#205) by @CatherineSue in #205
refactor(go-bindings): eliminate code duplication and consolidate shared modules (#204) by @slin1237 in #204
chore(deps): bump actions/labeler from 5 to 6 (#198) by @app/dependabot in #198
fix(mcp): align streamable transport headers with SSE to forward custom headers consistently (#196) by @zhaowenzi in #196
feat(bindings): add safety docs and include bindings in workspace (#195) by @slin1237 in #195
ci(go-bindings): add dedicated CI job for Go bindings testing (#189) by @slin1237 in #189
feat(grpc): Add TensorRT-LLM Backend Support (#194) by @CatherineSue in #194
Fix chat_template format detection by Chang Su
fix(mcp): use headers from request payload instead of HTTP headers (#191) by @slin1237 in #191
perf(tokenizer): optimize stop sequence search using Aho-Corasick algorithm (#190) by @slin1237 in #190
chore(repo): add GitHub issue templates (#188) by @slin1237 in #188
fix(streaming): complete SSE event emission for all built-in tool types (#187) by @slin1237 in #187
docs(reasoning-parser): add comprehensive README (#186) by @slin1237 in #186
docs(wfaas): add comprehensive README for workflow engine crate (#185) by @slin1237 in #185
fix(streaming): correct SSE event emission for built-in tools and remove dead code (#184) by @slin1237 in #184
feat(mcp): complete router integration for built-in tool routing (#182) by @slin1237 in #182
[bugfix] fix ci dep script path (#183) by @slin1237 in #183
docs: gradient mesh homepage design with animations (#180) by @slin1237 in #180
feat(mcp): implement built-in tool routing infrastructure (#181) by @slin1237 in #181
fix(mcp): OpenAI router non-streaming MCP outputs for multiple servers. (#157) by @zhaowenzi in #157
feat(mcp): implement ResponseFormat transformation in tool execution pipeline (#178) by @slin1237 in #178
fix(docs): correct alignment issues in MCP architecture SVG (#175) by @slin1237 in #175
docs: restructure documentation from configuration-centric to concept-centric architecture (#174) by @slin1237 in #174
feat(mcp): forward HTTP request headers to MCP servers (#155) by @slin1237 in #155
feat(mcp): implement auth-aware connection pooling and code quality improvements (#154) by @slin1237 in #154
feat(mcp): add batch tool execution API to McpOrchestrator (#153) by @slin1237 in #153
ci: remove docker-build-test from PR workflow (#152) by @slin1237 in #152
refactor(mcp): migrate from McpManager to McpOrchestrator (#151) by @slin1237 in #151
feat(mcp): implement McpOrchestrator with YAML policy configuration (#149) by @slin1237 in #149
feat(mcp): implement ResponseFormat and ResponseTransformer (#146) by @slin1237 in #146
refactor(mcp): production hardening - remove dead code and optimize allocations (#142) by @slin1237 in #142
feat(mcp): reorganize crate structure and implement approval system (#140) by @slin1237 in #140
chore(deps): update wasmtime-wasi requirement from 38.0 to 41.0 (#134) by @app/dependabot in #134
Update placeholder in PR template (#141) by @CatherineSue in #141
doc: Add PR template (#139) by @CatherineSue in #139
chore(deps): bump actions/upload-artifact from 4 to 6 (#132) by @app/dependabot in #132
chore(deps): update metrics-exporter-prometheus requirement from 0.17.0 to 0.18.1 (#137) by @app/dependabot in #137
chore(deps): bump actions/checkout from 4 to 6 (#129) by @app/dependabot in #129
chore(deps): bump actions/upload-pages-artifact from 3 to 4 (#128) by @app/dependabot in #128
chore(deps): bump actions/setup-python from 5 to 6 (#131) by @app/dependabot in #131
chore(deps): bump actions/download-artifact from 4 to 7 (#130) by @app/dependabot in #130
feat(mcp): implement qualified tool names for collision handling (#110) by @slin1237 in #110
fix(mcp): pass raw auth token to rmcp transport (#117) by @zhaowenzi in #117
ci: add Dependabot and centralize CI gate logic (#127) by @slin1237 in #127
fix: use crates.io version for openai-harmony dependency (#126) by @slin1237 in #126
fix: handle already-published crates gracefully and set smg to v0.4.0 (#125) by @slin1237 in #125
ci: consolidate crate publishing into single tiered workflow (#124) by @slin1237 in #124
fix: rename auth to smg-auth and add protoc to publish workflow (#123) by @slin1237 in #123
fix: add version numbers to path dependencies for crates.io publishing (#122) by @slin1237 in #122
fix: use correct rust-toolchain action name (#121) by @slin1237 in #121
chore: prepare v1.0.0 release (#120) by @slin1237 in #120
ci: add automated crate publishing workflows (#118) by @slin1237 in #118
[misc] update code owner (#119) by @slin1237 in #119
fix: remove broken cli.md references in docs (#116) by @slin1237 in #116
ci: re-enable GitHub Pages deployment workflow (#115) by @slin1237 in #115
refactor: rename tokenizer and workflow packages (#113) by @slin1237 in #113
feat: add new logo assets and update branding (#111) by @slin1237 in #111
Update labeler to match the latest file tree (#112) by @CatherineSue in #112
refactor: extract main binary to model_gateway/ workspace crate (#109) by @slin1237 in #109
docs: Add admin API reference documentation (#108) by @slin1237 in #108
docs: Improve README with features overview and organization (#107) by @slin1237 in #107
docs: Comprehensive configuration documentation and architecture diagrams (#106) by @slin1237 in #106
docs(readme): Restructure for clarity and conciseness (#83) by @slin1237 in #83
[docs] disable github doc page deployment since repo is private (#71) by @slin1237 in #71
refactor: Rename sgl-model-gateway to smg across entire codebase (#70) by @slin1237 in #70
refactor: Rename Python bindings from sglang_router to smg (#69) by @slin1237 in #69
refactor: Extract grpc_client into standalone smg-grpc-client workspace crate (#68) by @slin1237 in #68
Add architecture diagram SVG (#67) by @slin1237 in #67
chore: Update CODEOWNERS for extracted workspace crates (#66) by @slin1237 in #66
refactor: Rename mcp package from 'mcp' to 'smg-mcp' (#65) by @slin1237 in #65
refactor: Extract mesh module into standalone smg-mesh workspace crate (#64) by @slin1237 in #64
Add site/ to gitignore by Simo Lin
Add MkDocs Material documentation by Simo Lin
refactor: Extract wasm module into standalone smg-wasm workspace crate (#62) by @slin1237 in #62
fix: clippy multi modal code import error (#61) by @slin1237 in #61
fix: remove dead code (#60) by @slin1237 in #60
chore: Standardize package metadata across all workspace crates (#59) by @slin1237 in #59
refactor: Migrate cache_aware policy from tree.rs to kv_index crate (#58) by @slin1237 in #58
Fix golang bindings build and multimodal crate (#57) by @slin1237 in #57
Extract multimodal module into standalone llm-multimodal crate (#56) by @slin1237 in #56
Extract data_connector to workspace crate and standardize folder naming (#55) by @slin1237 in #55
rename: kv-index to kv_index to follow Rust naming conventions (#54) by @slin1237 in #54
refactor: Extract kv-index crate and modularize CI benchmark workflows (#53) by @slin1237 in #53
refactor: Extract MCP module into standalone workspace crate (#52) by @slin1237 in #52
Consolidate workspace dependencies across all crates (#50) by @slin1237 in #50
Reduce benchmark CI overhead (#51) by @slin1237 in #51
refactor: Extract auth into standalone smg-auth workspace crate (#49) by @slin1237 in #49
Use request-scoped MCP clients for Responses API tools (#43) by @zhaowenzi in #43
refactor: Extract tokenizer into standalone llm-tokenizer workspace crate (#48) by @slin1237 in #48
refactor: Extract workflow into standalone wfaas workspace crate (#47) by @slin1237 in #47
refactor: Extract tool-parser into standalone workspace crate (#46) by @slin1237 in #46
refactor: Extract reasoning-parser into standalone workspace crate (#44) by @slin1237 in #44
Extract protocols as openai-protocol workspace crate (#45) by @slin1237 in #45
Add ci benchmark workflow (#37) by @key4ng in #37
add release-pypi.yml (#41) by @key4ng in #41
Fix test_state_synchronization by Chang Su
Fix empty tenant issue in tree.rs by Chang Su
Setup auto labeler (#40) by @CatherineSue in #40
refactor: Consolidate "unknown" model id usage (#39) by @CatherineSue in #39
refactor: unify registration workflow and duplication check (#38) by @CatherineSue in #38
Add ci workflow[wip] (#36) by @key4ng in #36
Add .gitignore (#34) by @key4ng in #34
Add Code owner (#35) by @key4ng in #35
Add pre-commit hooks and fix clippy warnings (#27) by @key4ng in #27
[smg] release 0.3.2 (#17168) by Simo Lin in https://github.com/lightseekorg/smg/pull/17168

New Contributors

@dependabot[bot] made their first contribution
Kun(llfl) made their first contribution in 10b950e3
@pallasathena92 made their first contribution in 3ade189a

Full Changelog: 6caca5b...v1.0.1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.0.1

Choose a tag to compare

Sorry, something went wrong.