v1.0.1
π Introducing Shepherd Model Gateway v1.0.1!
We're thrilled to announce Shepherd Model Gateway v1.0.1 β formerly SGLang Model Gateway. This major release marks a new chapter with a complete architectural overhaul, new enterprise features, and production-grade improvements!
π Welcome to Shepherd
SGLang Model Gateway is now Shepherd Model Gateway (SMG).
Truly Engine-Agnostic Architecture: Shepherd is your universal gateway supporting all major inference engines β SGLang, vLLM, and TensorRT-LLM β plus complete 3rd party model provider integration including OpenAI, Anthropic, and Gemini. One gateway to route them all.
Universal API Support: Native implementation of Chat Completions, Responses API, Messages API, Interactions API, and Realtime API. Whether you're running open-source models on your infrastructure or routing to cloud providers, Shepherd handles it seamlessly.
Same powerful technology, new identity focused on guiding and managing your entire LLM infrastructure at scale β regardless of where your models run.
β¨ Major New Features
β‘ TensorRT-LLM Backend Support - Native gRPC integration for NVIDIA TensorRT-LLM
π vLLM Prefill-Decode-Disaggregation Support
Mooncake and NIXL-based KV transfer for disaggregated inference:
- Auto-discovery for seamless integration
- Massive scalability improvements for large deployments
- Efficient KV cache sharing across workers
π― smg serve - Unified Worker Management
New serve subcommand with complete worker lifecycle orchestration:
- Multi-worker data parallelism with GPU assignment
- ServeOrchestrator for automated worker management
- Two-pass argument parsing for flexible configuration
- One command to rule them all
π€ Anthropic Messages API Support
Full implementation of Anthropic's Messages API with streaming and non-streaming support. Deploy Claude models alongside your existing inference fleet.
π Industry-First: Universal Built-in Tools via MCP π₯
Turn any MCP server into built-in tools for all models β an industry-first capability that brings OpenAI-style built-in tools (FileSearch, WebSearch, CodeInterpreter) to every LLM, not just proprietary models.
Complete MCP Orchestration Stack:
- McpOrchestrator with YAML policy configuration
- Built-in tool routing infrastructure with qualified names β seamlessly integrate any MCP server as a native capability
- ResponseFormat transformation pipeline - expose MCP servers as built-in tools (
FileSearch,WebSearch,CodeInterpreter, and custom tools) - Auth-aware connection pooling for scalable multi-tenant deployments
- Batch tool execution API for efficient processing
- Approval system for controlled tool execution
- Automatic reconnection manager for reliability
- Graceful shutdown support
- HTTP header forwarding to MCP servers
Impact: Deploy Llama, Qwen, DeepSeek, or any open-source model with the same built-in tool capabilities as GPT-4. Your infrastructure, your models, OpenAI-grade tooling.
π‘ Realtime API Foundation
Event types and protocol support for real-time streaming applications.
ποΈ Architectural Revolution
Workspace Modularization
Complete extraction into standalone, publishable crates:
smg-auth- JWT/OIDC authenticationsmg-mesh- High availability mesh networkingsmg-mcp- Model Context Protocol orchestrationsmg-wasm- WebAssembly middlewaresmg-grpc-client- gRPC client infrastructuresmg-grpc-proto- Protocol definitions (published to PyPI!)smg-kv-index- Cache-aware routing enginellm-tokenizer- Tokenization logicllm-multimodal- Multimodal processingopenai-protocol- OpenAI API specificationswfaas- Workflow-as-a-Service engine- And more...
Result: Faster builds, independent evolution, better maintainability, and easy integration into your own projects.
β‘ Performance Optimizations
Zero-Copy & Algorithm Improvements:
- Zero-copy multimodal payload handling
- Aho-Corasick algorithm for stop sequence and special token search
- WASM Linker reuse across executions
- Optimized consistent hashing with zero allocations
π οΈ Production Enhancements
High Availability:
- Mesh service refactoring and cleanup
- State synchronization improvements
- Oracle external auth support for enterprise backends
Observability:
- Nightly benchmark workflow for comprehensive model performance tracking
- gRPC vs HTTP comparison benchmarks
- GetLoads RPC for load metrics
Developer Experience:
- Comprehensive documentation restructure (concept-centric)
- Issue templates and PR templates
- Pre-commit hooks with Ruff + mypy Python linting
- Automated crate publishing workflows
- Dependabot integration
Testing Infrastructure:
- Kubernetes-based CI runners
- Service containers for Oracle and Brave
- vLLM and TensorRT-LLM gRPC E2E tests
- Thread-safe test fixtures with proper resource management
π Critical Bug Fixes
- Fixed synthetic "empty" tenant pollution in radix tree
- Prevented resource leaks causing GPU starvation
- Fixed STDIO MCP server triggering
- Aligned multi-server MCP output handling across routers
- Fixed completion token counting for vLLM harmony streaming
- Corrected proto definitions (logprobs token_ids uint32)
π Documentation
- Complete restructure from configuration-centric to concept-centric
- Architecture diagrams and gradient mesh homepage
- Comprehensive README with features overview
- Admin API reference
- Getting started guides
π§ Tool Parser Support
New model support:
- Cohere Command models (tool parser + reasoning parser)
- Qwen Coder (XML format for Qwen3 Coder and MicroThinker)
π Repository: https://github.com/lightseekorg/smg
Install now: pip install smg --upgrade
π Shepherd your LLM infrastructure with confidence.
β‘ Built for speed. Engineered for scale. Production-proven.
What's Changed
- fix: render README images on PyPI/crates.io and bump version to 1.0.1 by Simo Lin
- fix(ci): fix H200 nightly benchmark model path, worker logs and CUDA errors (#411) by @key4ng in #411
- fix(ci): use single Python interpreter for Windows/macOS PyPI builds (#418) by @slin1237 in #418
- chore(mesh): bump smg-mesh version to 1.1.0 (#419) by @slin1237 in #419
- chore: unify workspace dependency management and bump crate versions (#344) by @slin1237 in #344
- refactor: remove remaining pub use re-export aliases from lib.rs (#416) by @slin1237 in #416
- refactor: remove pub use re-export aliases from lib.rs (#413) by @slin1237 in #413
- refactor(protocols,gateway): redesign worker type hierarchy and consolidate protocol layer (#412) by @slin1237 in #412
- fix(grpc-proto): bump grpcio minimum to >=1.78.0 (#409) by @CatherineSue in #409
- chore(ci): increase chat-completions-trtllm timeout to 60 minutes (#408) by @CatherineSue in #408
- fix(trtllm): tokenize and inject user stop sequences for TRT-LLM requests (#346) by @ppraneth in #346
- fix(e2e): migrate genai-bench to Docker and fix router pipe hang (#403) by @key4ng in #403
- chore(deps): update kube requirement from 1.1.0 to 3.0.1 (#397) by @app/dependabot in #397
- chore(deps): update opentelemetry-proto requirement from 0.27 to 0.31 (#398) by @app/dependabot in #398
- chore(deps): update ndarray requirement from 0.16 to 0.17 (#394) by @app/dependabot in #394
- feat: support oracle external auth for oracle backend (#404) by @zhaowenzi in #404
- fix(grpc-proto): reorder authors in pyproject.toml (#400) by @CatherineSue in #400
- chore[ci]: upgrade oracle image (#393) by @key4ng in #393
- chore(e2e): overhaul nightly benchmark summary and trim model list (#392) by @slin1237 in #392
- feat: Implement ReconnectionManager for automatic MCP server recovery (#265) by @ppraneth in #265
- perf(multimodal): optimize payload handling with zero-copy (#391) by @ppraneth in #391
- refactor(mcp): standardize output injection ordering across routers (#388) by @slin1237 in #388
- ci(grpc): add proto package publishing and codegen checks (#386) by @CatherineSue in #386
- feat(grpc): add smg-grpc-proto Python package for proto definitions (#385) by @CatherineSue in #385
- chore(e2e): include model size in gpt-oss nightly benchmark slug (#384) by @CatherineSue in #384
- refactor(mcp): remove requested_servers and introduce ResponsesCallContext (#382) by @CatherineSue in #382
- refactor(mcp): use imports instead of fully-qualified paths in McpToolSession (#383) by @CatherineSue in #383
- e2e: rewrite nightly summary with gRPC vs HTTP comparison (#381) by @slin1237 in #381
- feat(realtime api): realtime api event types (#349) by @pallasathena92 in #349
- refactor(mcp): add session-level tool builder convenience methods (#380) by @slin1237 in #380
- fix+refactor: harden OpenAI router and unify error handling (#379) by @slin1237 in #379
- docs: add Slack invite links (#377) by @slin1237 in #377
- docs: add release badge and Discord links (#375) by @slin1237 in #375
- chore: fix nightly benchmark, prevent execution when pushing to main (#374) by @slin1237 in #374
- e2e: canonical HF model IDs and nightly benchmark runner split (#373) by @slin1237 in #373
- mesh: refactor and cleanup mesh code (#281) by @llfl in #281
- mcp: unify session-mapped tool execution and payload builders (#370) by @slin1237 in #370
- docs: fix broken getting started link in logging guide (#372) by @slin1237 in #372
- docs: consolidate onboarding into getting-started and align API docs to code (#371) by @slin1237 in #371
- refactor(mcp): simplify ensure_request_mcp_client and remove McpLoopConfig (#368) by @CatherineSue in #368
- chore[ci]: use service containers for Oracle and Brave in E2E tests (#369) by @key4ng in #369
- fix(ci): Use H100 runner for Specific Tests (#357) by @XinyueZhang369 in #357
- refactor(mcp): introduce McpToolSession to bundle MCP execution state (#360) by @CatherineSue in #360
- docs: update logos and fix mobile rendering (#365) by @slin1237 in #365
- feat: add Anthropic Messages API e2e tests (#358) by @key4ng in #358
- feat(interactions): Create protocol for interactions api (#336) by @XinyueZhang369 in #336
- chore(ci): optimize Docker setup for ephemeral K8s pod runners (#348) by @key4ng in #348
- refactor(routers): reduce IGW router registration boilerplate (#362) by @slin1237 in #362
- fix(docs): restore gradient text rendering in light mode (#359) by @slin1237 in #359
- fix(grpc): align MCP multi-server outputs for responses (#277) by @zhaowenzi in #277
- fix(proto): change logprobs token_ids from int32 to uint32 (#354) by @CatherineSue in #354
- fix(docs): add light mode support for custom CSS theme (#355) by @slin1237 in #355
- fix: fail responses tests on missing API keys instead of silently skipping (#343) by @key4ng in #343
- chore(deps): update criterion requirement from 0.5 to 0.8 (#203) by @app/dependabot in #203
- docs: fix Rust lint/fmt commands to match CI and pre-commit config (#353) by @CatherineSue in #353
- chore(deps): update wasm-encoder requirement from 0.242 to 0.244 (#303) by @app/dependabot in #303
- chore(ci): apply uv for sglang installation (#347) by @key4ng in #347
- docs: restructure Getting Started section and fix SVG paths (#350) by @slin1237 in #350
- chore: simplify SGLang CI install with explicit dependencies (#333) by @key4ng in #333
- fix(mcp): serialize env-mutating proxy config tests (#345) by @CatherineSue in #345
- refactor(protocols): use serde_with::skip_serializing_none to reduce boilerplate (#342) by @CatherineSue in #342
- fix(ci): correct working-directory paths in PyPI release workflow (#341) by @CatherineSue in #341
- chore: add ruff + mypy Python linting with pre-commit and CI (#340) by @CatherineSue in #340
- chore: enable unused_qualifications lint workspace-wide (#339) by @CatherineSue in #339
- refactor(grpc): unify shared utilities across harmony and regular routers (#337) by @CatherineSue in #337
- refactor(grpc): extract duplicated MCP streaming helpers (#334) by @CatherineSue in #334
- feat(serve): add unified --connection-mode and fix gRPC health checks (#335) by @slin1237 in #335
- feat(messages api): support streaming (#280) by @key4ng in #280
- fix(ci): use latest tag for sglang dependency (#332) by @key4ng in #332
- chore(ci): reduce CI test timeout to surface failures sooner (#328) by @key4ng in #328
- refactor(ci): split nightly benchmark into per-model jobs (#327) by @slin1237 in #327
- fix(e2e): release leaked worker refs causing GPU starvation (#326) by @CatherineSue in #326
- fix(ci): remove ineffective caching from vLLM and SGLang actions (#323) by @CatherineSue in #323
- chore(deps): bump actions/setup-go from 5 to 6 (#298) by @app/dependabot in #298
- chore(deps): bump actions/cache from 4 to 5 (#299) by @app/dependabot in #299
- fix(ci): reduce docker stop grace period for CI cleanup containers (#322) by @CatherineSue in #322
- fix(ci): add reusable composite actions for backend setup (#319) by @CatherineSue in #319
- feat(reasoning-parser): add CohereCmdParser for Command models (#317) by @slin1237 in #317
- fix(e2e): thread-aware caching for setup_backend fixture (#321) by @CatherineSue in #321
- feat(vllm-pd): add Mooncake KV transfer support with auto-discovery (#312) by @slin1237 in #312
- test: Update the H200 benchmark schedule to run weekly. (#318) by @key4ng in #318
- fix(logging): add --log-json to python cli (#316) by @zhaowenzi in #316
- feat(tool-parser): add Cohere Command model tool call parser (#315) by @slin1237 in #315
- fix(e2e): prevent flaky tests from port races and resource leaks (#314) by @slin1237 in #314
- feat(ci): add nightly benchmark workflow for comprehensive model performance tracking (#231) by @key4ng in #231
- fix(logging): fix --log-json and Python binding support (#310) by @zhaowenzi in #310
- fix(e2e): vLLM PD worker tracking and test infrastructure improvements (#311) by @slin1237 in #311
- fix(grpc): correct completion_tokens counting for vLLM harmony streaming (#282) by @key4ng in #282
- refactor: rename SGLang Model Gateway to Shepherd Model Gateway (#297) by @slin1237 in #297
- feat(e2e): add TensorRT-LLM gRPC backend support for CI (#274) by @slin1237 in #274
- docs: add vLLM PD disaggregation support (#294) by @slin1237 in #294
- feat(grpc): add vLLM PD disaggregation support via NIXL (#293) by @slin1237 in #293
- fix(ci): Move to K8s Cpu Runners (#279) by @XinyueZhang369 in #279
- fix(mcp): stdio MCP servers cannot get triggered (#273) by @xuwenyihust in #273
- feat(serve): add GPU assignment for multi-worker DP (#272) by @slin1237 in #272
- feat(serve): add ServeOrchestrator for worker lifecycle management (#271) by @slin1237 in #271
- feat(cli): add smg serve subcommand with two-pass arg parsing (#270) by @slin1237 in #270
- fix(ci): migrate benchmark-radix-tree to k8s gpu runner (#269) by @slin1237 in #269
- fix: critical correctness and reliability bugs across 12 crates (#254) by @slin1237 in #254
- fix(kv-index): prevent synthetic "empty" tenant from polluting tree (#268) by @slin1237 in #268
- fix(ci): speed up chat-completions CI jobs (#267) by @CatherineSue in #267
- fix(ci): use standard install scripts for go-bindings-benchmark (#264) by @slin1237 in #264
- docs(README): resolve broken doc links (#256) by @xuwenyihust in #256
- fix(ci): add apt-get update before SGLang dependency install (#255) by @slin1237 in #255
- chore(bindings): rename SGLang to Shepherd in Python CLI (#247) by @xuwenyihust in #247
- feat(messages api): support basic non streaming (#230) by @key4ng in #230
- test(go-bindings): expand E2E test coverage and add performance benchmarks (#206) by @slin1237 in #206
- fix(mcp): align OpenAI router streaming multi-server handling (#209) by @zhaowenzi in #209
- run test workflow to cluster (#192) by @XinyueZhang369 in #192
- feat: Add Messages API foundation (#227) by @key4ng in #227
- feat: implement graceful shutdown for MCP (#228) by @ppraneth in #228
- refactor(proto): Use uint32 for n, max_tokens, min_tokens in SGLang/TRT-LLM protos (#226) by @CatherineSue in #226
- ci: add vLLM gRPC e2e tests (#158) by @key4ng in #158
- refactor(grpc): Centralize request building dispatch in GrpcClient (#219) by @CatherineSue in #219
- refactor(grpc): Standardize token counts to uint32 across all protos (#218) by @CatherineSue in #218
- [model-gateway] Reuse WASM Linker across executions (#211) by @ppraneth in #211
- feat(harmony): Enable vLLM and TensorRT-LLM gRPC backend support (#217) by @CatherineSue in #217
- feat(grpc): Add TensorRT-LLM logprobs support in proto_wrapper (#216) by @CatherineSue in #216
- feat(grpc): Add vLLM logprobs and n>1 sampling support (#215) by @CatherineSue in #215
- chore(deps): update tiktoken-rs requirement from 0.7.0 to 0.9.1 (#201) by @app/dependabot in #201
- docs: Add missing e2e-test.sh script (#214) by @CatherineSue in #214
- feat: Rename TensorRT-LLM gRPC client from TrtLlmEngine to TrtllmService (#213) by @CatherineSue in #213
- fix clippy warnings by Kun(llfl)
- refactor: mesh service clean up by Kun(llfl)
- refactor: mesh lib exports clean up by Kun(llfl)
- chore(deps): bump actions/setup-go from 5 to 6 (#197) by @app/dependabot in #197
- chore: add zoey to workflow and e2e owner, add yanbo to golang owner (#207) by @slin1237 in #207
- feat(grpc/harmony): Add logprobs support for Harmony models (#205) by @CatherineSue in #205
- refactor(go-bindings): eliminate code duplication and consolidate shared modules (#204) by @slin1237 in #204
- chore(deps): bump actions/labeler from 5 to 6 (#198) by @app/dependabot in #198
- fix(mcp): align streamable transport headers with SSE to forward custom headers consistently (#196) by @zhaowenzi in #196
- feat(bindings): add safety docs and include bindings in workspace (#195) by @slin1237 in #195
- ci(go-bindings): add dedicated CI job for Go bindings testing (#189) by @slin1237 in #189
- feat(grpc): Add TensorRT-LLM Backend Support (#194) by @CatherineSue in #194
- Fix chat_template format detection by Chang Su
- fix(mcp): use headers from request payload instead of HTTP headers (#191) by @slin1237 in #191
- perf(tokenizer): optimize stop sequence search using Aho-Corasick algorithm (#190) by @slin1237 in #190
- chore(repo): add GitHub issue templates (#188) by @slin1237 in #188
- fix(streaming): complete SSE event emission for all built-in tool types (#187) by @slin1237 in #187
- docs(reasoning-parser): add comprehensive README (#186) by @slin1237 in #186
- docs(wfaas): add comprehensive README for workflow engine crate (#185) by @slin1237 in #185
- fix(streaming): correct SSE event emission for built-in tools and remove dead code (#184) by @slin1237 in #184
- feat(mcp): complete router integration for built-in tool routing (#182) by @slin1237 in #182
- [bugfix] fix ci dep script path (#183) by @slin1237 in #183
- docs: gradient mesh homepage design with animations (#180) by @slin1237 in #180
- feat(mcp): implement built-in tool routing infrastructure (#181) by @slin1237 in #181
- fix(mcp): OpenAI router non-streaming MCP outputs for multiple servers. (#157) by @zhaowenzi in #157
- feat(mcp): implement ResponseFormat transformation in tool execution pipeline (#178) by @slin1237 in #178
- fix(docs): correct alignment issues in MCP architecture SVG (#175) by @slin1237 in #175
- docs: restructure documentation from configuration-centric to concept-centric architecture (#174) by @slin1237 in #174
- feat(mcp): forward HTTP request headers to MCP servers (#155) by @slin1237 in #155
- feat(mcp): implement auth-aware connection pooling and code quality improvements (#154) by @slin1237 in #154
- feat(mcp): add batch tool execution API to McpOrchestrator (#153) by @slin1237 in #153
- ci: remove docker-build-test from PR workflow (#152) by @slin1237 in #152
- refactor(mcp): migrate from McpManager to McpOrchestrator (#151) by @slin1237 in #151
- feat(mcp): implement McpOrchestrator with YAML policy configuration (#149) by @slin1237 in #149
- feat(mcp): implement ResponseFormat and ResponseTransformer (#146) by @slin1237 in #146
- refactor(mcp): production hardening - remove dead code and optimize allocations (#142) by @slin1237 in #142
- feat(mcp): reorganize crate structure and implement approval system (#140) by @slin1237 in #140
- chore(deps): update wasmtime-wasi requirement from 38.0 to 41.0 (#134) by @app/dependabot in #134
- Update placeholder in PR template (#141) by @CatherineSue in #141
- doc: Add PR template (#139) by @CatherineSue in #139
- chore(deps): bump actions/upload-artifact from 4 to 6 (#132) by @app/dependabot in #132
- chore(deps): update metrics-exporter-prometheus requirement from 0.17.0 to 0.18.1 (#137) by @app/dependabot in #137
- chore(deps): bump actions/checkout from 4 to 6 (#129) by @app/dependabot in #129
- chore(deps): bump actions/upload-pages-artifact from 3 to 4 (#128) by @app/dependabot in #128
- chore(deps): bump actions/setup-python from 5 to 6 (#131) by @app/dependabot in #131
- chore(deps): bump actions/download-artifact from 4 to 7 (#130) by @app/dependabot in #130
- feat(mcp): implement qualified tool names for collision handling (#110) by @slin1237 in #110
- fix(mcp): pass raw auth token to rmcp transport (#117) by @zhaowenzi in #117
- ci: add Dependabot and centralize CI gate logic (#127) by @slin1237 in #127
- fix: use crates.io version for openai-harmony dependency (#126) by @slin1237 in #126
- fix: handle already-published crates gracefully and set smg to v0.4.0 (#125) by @slin1237 in #125
- ci: consolidate crate publishing into single tiered workflow (#124) by @slin1237 in #124
- fix: rename auth to smg-auth and add protoc to publish workflow (#123) by @slin1237 in #123
- fix: add version numbers to path dependencies for crates.io publishing (#122) by @slin1237 in #122
- fix: use correct rust-toolchain action name (#121) by @slin1237 in #121
- chore: prepare v1.0.0 release (#120) by @slin1237 in #120
- ci: add automated crate publishing workflows (#118) by @slin1237 in #118
- [misc] update code owner (#119) by @slin1237 in #119
- fix: remove broken cli.md references in docs (#116) by @slin1237 in #116
- ci: re-enable GitHub Pages deployment workflow (#115) by @slin1237 in #115
- refactor: rename tokenizer and workflow packages (#113) by @slin1237 in #113
- feat: add new logo assets and update branding (#111) by @slin1237 in #111
- Update labeler to match the latest file tree (#112) by @CatherineSue in #112
- refactor: extract main binary to model_gateway/ workspace crate (#109) by @slin1237 in #109
- docs: Add admin API reference documentation (#108) by @slin1237 in #108
- docs: Improve README with features overview and organization (#107) by @slin1237 in #107
- docs: Comprehensive configuration documentation and architecture diagrams (#106) by @slin1237 in #106
- docs(readme): Restructure for clarity and conciseness (#83) by @slin1237 in #83
- [docs] disable github doc page deployment since repo is private (#71) by @slin1237 in #71
- refactor: Rename sgl-model-gateway to smg across entire codebase (#70) by @slin1237 in #70
- refactor: Rename Python bindings from sglang_router to smg (#69) by @slin1237 in #69
- refactor: Extract grpc_client into standalone smg-grpc-client workspace crate (#68) by @slin1237 in #68
- Add architecture diagram SVG (#67) by @slin1237 in #67
- chore: Update CODEOWNERS for extracted workspace crates (#66) by @slin1237 in #66
- refactor: Rename mcp package from 'mcp' to 'smg-mcp' (#65) by @slin1237 in #65
- refactor: Extract mesh module into standalone smg-mesh workspace crate (#64) by @slin1237 in #64
- Add site/ to gitignore by Simo Lin
- Add MkDocs Material documentation by Simo Lin
- refactor: Extract wasm module into standalone smg-wasm workspace crate (#62) by @slin1237 in #62
- fix: clippy multi modal code import error (#61) by @slin1237 in #61
- fix: remove dead code (#60) by @slin1237 in #60
- chore: Standardize package metadata across all workspace crates (#59) by @slin1237 in #59
- refactor: Migrate cache_aware policy from tree.rs to kv_index crate (#58) by @slin1237 in #58
- Fix golang bindings build and multimodal crate (#57) by @slin1237 in #57
- Extract multimodal module into standalone llm-multimodal crate (#56) by @slin1237 in #56
- Extract data_connector to workspace crate and standardize folder naming (#55) by @slin1237 in #55
- rename: kv-index to kv_index to follow Rust naming conventions (#54) by @slin1237 in #54
- refactor: Extract kv-index crate and modularize CI benchmark workflows (#53) by @slin1237 in #53
- refactor: Extract MCP module into standalone workspace crate (#52) by @slin1237 in #52
- Consolidate workspace dependencies across all crates (#50) by @slin1237 in #50
- Reduce benchmark CI overhead (#51) by @slin1237 in #51
- refactor: Extract auth into standalone smg-auth workspace crate (#49) by @slin1237 in #49
- Use request-scoped MCP clients for Responses API tools (#43) by @zhaowenzi in #43
- refactor: Extract tokenizer into standalone llm-tokenizer workspace crate (#48) by @slin1237 in #48
- refactor: Extract workflow into standalone wfaas workspace crate (#47) by @slin1237 in #47
- refactor: Extract tool-parser into standalone workspace crate (#46) by @slin1237 in #46
- refactor: Extract reasoning-parser into standalone workspace crate (#44) by @slin1237 in #44
- Extract protocols as openai-protocol workspace crate (#45) by @slin1237 in #45
- Add ci benchmark workflow (#37) by @key4ng in #37
- add release-pypi.yml (#41) by @key4ng in #41
- Fix
test_state_synchronizationby Chang Su - Fix empty tenant issue in tree.rs by Chang Su
- Setup auto labeler (#40) by @CatherineSue in #40
- refactor: Consolidate "unknown" model id usage (#39) by @CatherineSue in #39
- refactor: unify registration workflow and duplication check (#38) by @CatherineSue in #38
- Add ci workflow[wip] (#36) by @key4ng in #36
- Add .gitignore (#34) by @key4ng in #34
- Add Code owner (#35) by @key4ng in #35
- Add pre-commit hooks and fix clippy warnings (#27) by @key4ng in #27
- [smg] release 0.3.2 (#17168) by Simo Lin in https://github.com/lightseekorg/smg/pull/17168
New Contributors
- @dependabot[bot] made their first contribution
- Kun(llfl) made their first contribution in 10b950e3
- @pallasathena92 made their first contribution in 3ade189a
Full Changelog: 6caca5b...v1.0.1