From 33f6872668e531fdd42a0a26a3a741251b0a7c0a Mon Sep 17 00:00:00 2001 From: jayantxie Date: Mon, 29 Sep 2025 18:59:48 +0800 Subject: [PATCH] feat: release kitex v0.15.1 --- .../en/blog/releases/Kitex/release-v0_15_1.md | 97 ++++++++++++++ .../streamx/StreamX_Error_Handling.md | 67 +++++++--- .../streamx/StreamX_Lifecycle_Control.md | 94 ++++++++++++++ .../code-gen/idl_enumeration_type.md | 57 +++++++++ .../zh/blog/releases/Kitex/release-v0_15_1.md | 119 ++++++++++++++++++ .../streamx/StreamX_Error_Handling.md | 57 +++++++-- .../streamx/StreamX_Lifecycle_Control.md | 94 ++++++++++++++ .../code-gen/idl_enumeration_type.md | 57 +++++++++ 8 files changed, 615 insertions(+), 27 deletions(-) create mode 100644 content/en/blog/releases/Kitex/release-v0_15_1.md create mode 100644 content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md create mode 100644 content/en/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md create mode 100644 content/zh/blog/releases/Kitex/release-v0_15_1.md create mode 100644 content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md create mode 100644 content/zh/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md diff --git a/content/en/blog/releases/Kitex/release-v0_15_1.md b/content/en/blog/releases/Kitex/release-v0_15_1.md new file mode 100644 index 00000000000..488fc0cfd43 --- /dev/null +++ b/content/en/blog/releases/Kitex/release-v0_15_1.md @@ -0,0 +1,97 @@ +--- +title: "Kitex Release v0.15.1" +linkTitle: "Release v0.15.1" +projects: ["Kitex"] +date: 2025-09-29 +description: > +--- + +## **Introduction to Key Changes** + +### **Announcements** +1. **Go Version Support Changes**: Kitex's minimum declared Go version has been adjusted to Go1.20 and supports up to Go1.25 + - Currently does not affect Go v1.18/v1.19 compilation, but after being declared for higher versions, subsequent versions will introduce features of higher versions + +### **New Features** +1. 
**Generic Call: New v2 API Supporting Multi-Services and Streaming Calls** + + The Thrift binary generic call API now provides a v2 version that supports multi-services and streaming calls. For detailed usage, see [Generic Call User Guide](/docs/kitex/tutorials/advanced-feature/generic-call/basic_usage) + +2. **Generic Call: Support for Unknown Service Handler** + + Facilitates rapid development of streaming proxies, see [Proxy Application Development Guide](/docs/kitex/tutorials/advanced-feature/proxy_application_development) for details + +3. **Generic Call: Support for Server-Side JSON/Map Streaming Generic Calls** + + See [Generic Call User Guide](/docs/kitex/tutorials/advanced-feature/generic-call/basic_usage) for details + +4. **TTHeader Streaming: Support for ctx Cancel to Control Stream Lifecycle** + + - Quickly terminate streaming calls, saving model resources + - Aligns with gRPC, for detailed usage see [Stream Lifecycle Control Best Practices](/docs/kitex/tutorials/basic-feature/streamx/StreamX_Lifecycle_Control) + - Supports the Client actively invoking cancel to end streaming calls + - Supports the Client sensing the ctx cancel signal of the Handler it runs in and ending streaming calls in cascade + +5. **Streaming Error Handling Optimization** + + - Quickly map errors to specific scenarios and speed up troubleshooting of cascade cancel chains, see [Stream Error Handling Best Practices](/docs/kitex/tutorials/basic-feature/streamx/StreamX_Error_Handling) for details + - In cascade cancel scenarios, the error description includes the complete cancel chain, so the first-hop service that actively canceled can be located quickly + - The error description includes the specific error scenario and its unique error code + - Unified and convenient cancel error handling, eliminating the need for cumbersome string matching + +### **Feature/Experience Optimization** +1. 
**Generic Client: Optimize Background Goroutine Startup Logic** + + Starting from Kitex v0.13.0, a generic client supports both Ping-Pong and streaming calls, and uses the TTHeader Streaming protocol by default. Each generic client automatically starts a background goroutine to clean up idle connections for TTHeader Streaming. + + If users previously used the generic client incorrectly (e.g., creating a generic client for each request), upgrading to Kitex v0.13.x would create a large number of background goroutines and leak them, even though streaming generic calls were never used. + + v0.15.1 creates the background goroutine only when streaming generic calls are actually used. + +### **Code Generation Tool Kitex Tool** +1. **Strict Enum Value Checking** + + Strict generation checks have been added for Thrift IDLs that define overflowing enum values, see [Kitex Tool Enum Type Checking Instructions](/docs/kitex/tutorials/code-gen/idl_enumeration_type) for details + + This change will make code generation fail for some IDLs. Their correctness is already broken, which poses a significant risk to the service! + +### **Special Change - A Few Services May Be Affected** +> Interface Breaking Change that has no impact on 99.9% of users + +Kitex will ensure compatibility with normal usage patterns of internal users. However, individual users may depend on definitions in the Kitex repository, and the adjustments in this version will affect these users. + +This version makes minor adjustments that affect non-standard usage of the `remote.Message`, `rpcinfo.RPCInfo` or `generic.Generic` interfaces. Code with such special usage needs to be adjusted to conform to the new version's interface definitions. 
+ +## **Full Change** + +### Feature +* feat(ttstream): support ctx cancel and detailed canceled error by @DMwangnima in [#1821](https://github.com/cloudwego/kitex/pull/1821) | [#1859](https://github.com/cloudwego/kitex/pull/1859) | [#1856](https://github.com/cloudwego/kitex/pull/1856) +* feat(generic): support new thrift binary generic call api, server streaming generic call and unknown service or method handler by @jayantxie in [#1837](https://github.com/cloudwego/kitex/pull/1837) | [#1857](https://github.com/cloudwego/kitex/pull/1857) +* feat(grpc): support dump MaxConcurrentStreams of HTTP2 Client by @DMwangnima in [#1820](https://github.com/cloudwego/kitex/pull/1820) + +### Fix +* fix(retry): shallow copy response to avoid data race by @jayantxie in [#1799](https://github.com/cloudwego/kitex/pull/1799) | [#1814](https://github.com/cloudwego/kitex/pull/1814) +* fix(lbcache): check the existence before new Balancer to prevent leakage by @ppzqh in [#1825](https://github.com/cloudwego/kitex/pull/1825) +* fix(generic): descriptor.HTTPRequest.GetParam nil pointer exception by @jayantxie in [#1827](https://github.com/cloudwego/kitex/pull/1827) +* fix(generic): fix generic write int range check by @HeyJavaBean in [#1861](https://github.com/cloudwego/kitex/pull/1861) +* fix(rpcinfo): protect bizErr and extra field of ri.Invocation by lock by @jayantxie in [#1850](https://github.com/cloudwego/kitex/pull/1850) +* fix(timeout): remove timer pool to avoid timer race issue by @jayantxie in [#1858](https://github.com/cloudwego/kitex/pull/1858) +* fix(tool): disable fast api for protobuf by @DMwangnima in [#1807](https://github.com/cloudwego/kitex/pull/1807) +* fix(tool): skip pb code gen for arg -use by @xiaost in [#1819](https://github.com/cloudwego/kitex/pull/1819) + +### Optimize +* optimize(grpc): access metadata.MD without ToLower by @xiaost in [#1806](https://github.com/cloudwego/kitex/pull/1806) +* optimize(ttstream): lazy init cleaning task for ObjectPool to reduce 
the impact of lots of goroutines caused by creating too many Generic Client by @DMwangnima in [#1842](https://github.com/cloudwego/kitex/pull/1842) +* optimize(tool): remove string deepcopy because the string type is read-only in Go by @jayantxie in [#1832](https://github.com/cloudwego/kitex/pull/1832) + +### Refactor +* refactor(ttstream): remove ttstream provider by @jayantxie in [#1805](https://github.com/cloudwego/kitex/pull/1805) +* refactor(rpcinfo): move service/method info from message to rpcinfo, remove protocol info from message and update min go version to 1.20 by @jayantxie in [#1818](https://github.com/cloudwego/kitex/pull/1818) | [#1855](https://github.com/cloudwego/kitex/pull/1855) +* refactor(server): remove service middleware and SupportedTransportsFunc api by @jayantxie in [#1839](https://github.com/cloudwego/kitex/pull/1839) +* refactor(server): remove useless TargetSvcInfo field by @jayantxie in [#1840](https://github.com/cloudwego/kitex/pull/1840) + +### Chore +* chore: update dependencies of kitex to support go 1.25 and new features by @jayantxie @AsterDY in [#1848](https://github.com/cloudwego/kitex/pull/1848) | [#1834](https://github.com/cloudwego/kitex/pull/1834) | [#1862](https://github.com/cloudwego/kitex/pull/1862) | [#1836](https://github.com/cloudwego/kitex/pull/1836) +* chore: update version v0.15.0 by @jayantxie in [#1864](https://github.com/cloudwego/kitex/pull/1864) +* docs: fix broken link to blogs by @scientiacoder in [#1813](https://github.com/cloudwego/kitex/pull/1813) +* chore: support custom ctx key to pass to downstream in Service-Inline by @Duslia in [#1709](https://github.com/cloudwego/kitex/pull/1709) diff --git a/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md b/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md index 0b225f43926..5a45102f612 100644 --- a/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md +++ 
b/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md @@ -1,30 +1,65 @@ --- -title: "StreamX Error Handling" -date: 2025-01-13 -weight: 4 -keywords: ["Stream Error Handling"] -description: "" +title: "Stream Error Handling Best Practices" +linkTitle: "Stream Error Handling Best Practices" +weight: 3 +date: 2025-09-29 +description: "Kitex StreamX stream error handling best practices, introducing TTHeader Streaming error codes and error handling mechanisms." --- ## Preface -Unlike RPC, stream errors can occur at any time during stream processing. For example, a server can return an error after sending multiple messages. However, once a stream has sent an error, it cannot send any more messages. +Unlike PingPong RPC, stream errors can occur at any time during stream processing. For example, a server can return an error after sending multiple messages. However, once a stream has sent an error, it cannot send any more messages. -## Error type +## Error Types -### Business exception +### Framework Exceptions -**Usage example** : For example, in the ChatGPT scenario, we need to constantly check whether the user account balance can continue to call the large model to generate returns. 
+#### Error Description Meaning -Server implementation: +``` +[ttstream error, code=12007] [server-side stream] [canceled path: ServiceA] user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively +``` + +| Error Description | Meaning | Notes | +|-------------------|---------|-------| +| [ttstream error, code=12007] | TTHeader Streaming error, error code 12007, corresponding to the scenario where upstream actively cancels | | +| [server-side stream] | Indicates that the error is thrown by the Stream on the server side | | +| [canceled path: ServiceA] | Indicates that ServiceA actively initiated cancel | | +| user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively | Specific error description | | + +#### Error Code Summary + +TTHeader Streaming Error Summary + +| Error Code | Error Description | Meaning | Notes | +|------------|-------------------|---------|-------| +| 12001 | application exception | Business exception, downstream handler returns err | | +| 12002 | unexpected header frame | Header Frame related errors | | +| 12003 | illegal biz err | Failed to parse business exception contained in Trailer Frame | | +| 12004 | illegal frame | Failed to parse basic information of Frame | | +| 12005 | illegal operation | Error due to improper Stream usage, such as Stream has been CloseSend but still Send | | +| 12006 | transport is closing | Connection exception, such as connection has been closed | | +| 12007 | user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively | Upstream actively uses cancel() | | +| 12008 | user code canceled with cancelCause(error) | Upstream uses context.WithCancelCause and actively uses cancel(err) | | +| 12009 | canceled by downstream | Canceled by downstream service | | +| 12010 | canceled by upstream | Canceled by 
upstream service | | +| 12011 | Internal canceled | Cascade cancel scenario, such as gRPC handler ctx is canceled, cascade cancel TTHeader Streaming | | +| 12012 | canceled by business handler returning | Handler exits early, but there are still asynchronous goroutines using Recv/Send | | +| 12013 | canceled by connection closed | Stream lifecycle ends due to connection closure, common in server-side service migration/update | | + +### Business Exceptions + +Usage example: For example, in the ChatGPT scenario, we need to constantly check whether the user account balance can continue to call the large model to generate returns. + +**Server Implementation:** ```go func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.Request, stream echo.TestService_ServerStreamWithErrServer) error { - // 检查用户账户余额 + // Check user account balance for isHasBalance (req.UserId) { stream.Send(ctx, res) } - // 返回用户余额不足错误 + // Return insufficient user balance error bizErr := kerrors.NewBizStatusErrorWithExtra( 10001, "insufficient user balance", map[string]string{"testKey": "testVal"}, ) @@ -32,7 +67,7 @@ func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.R } ``` -Client implementation: +**Client Implementation:** ```go stream, err = cli.ServerStreamWithErr(ctx, req) @@ -50,11 +85,11 @@ if ok { } ``` -### Other errors +### Other Errors If the Error returned by the Server is a non-business exception, the framework will be uniformly encapsulated as `(*thrift.ApplicationException)`. At this time, only the error Message can be obtained. 
-Server implementation: +**Server Implementation:** ```go func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.Request, stream echo.TestService_ServerStreamWithErrServer) error { @@ -63,7 +98,7 @@ func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.R } ``` -Client implementation: +**Client Implementation:** ```go stream, err = cli.ServerStreamWithErr(ctx, req) diff --git a/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md b/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md new file mode 100644 index 00000000000..e4f10588bae --- /dev/null +++ b/content/en/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md @@ -0,0 +1,94 @@ +--- +title: "Stream Lifecycle Control Best Practices" +linkTitle: "Stream Lifecycle Control Best Practices" +weight: 4 +date: 2025-09-29 +description: "Kitex StreamX stream lifecycle control best practices, introducing how to use ctx cancel to control streaming call lifecycle." +--- + +## Background + +When directly interacting with the model layer through streaming, the caller needs to directly notify the model layer to stop responding in certain scenarios, thereby saving model resources. + +In large model application scenarios such as classic Chat, the entire chain uses streaming interfaces that need to be connected in series, requiring perception of end-user disconnection signals and quickly ending the entire chain. + +The above scenarios essentially require the upstream to be able to actively end streaming calls, often using ctx for control. When ctx is canceled, the Stream lifecycle will also end. + +Kitex gRPC and TTHeader Streaming both support the mechanism of controlling Stream lifecycle based on ctx cancel, and TTHeader Streaming optimizes error descriptions on the basis of gRPC, which can better handle problem diagnosis in cascade cancel scenarios. 
+ +## TTHeader Streaming Supports Stream Lifecycle Control Based on ctx cancel + +**Kitex >= v0.15.1 supports this feature** + +### Upstream Actively Cancels Downstream + +Here we use ServerStreaming as an example. When the upstream receives a special response, it actively calls cancel() to end the downstream Stream lifecycle. + +#### Upstream - ServiceA + +```go +// ctx generally comes from handler +ctx, cancel := context.WithCancel(ctx) +defer cancel() +cliSt, err := cli.InvokeStreaming(ctx, req) +if err != nil { + // Log or perform other operations + return +} + +for { + resp, err := cliSt.Recv(cliSt.Context()) + if err != nil { + if err == io.EOF { + // Normal end + return + } + // Log or perform other operations + // Abnormal end + return + } + // Determine if it is a business-specific response, for example, a special flag is defined in resp to indicate end + if isBizSpecialResp(resp) { + // Cancel downstream Stream + cancel() + return + } +} +``` + +#### Downstream - ServiceB + +```go +import ( + "github.com/cloudwego/kitex/pkg/kerrors" +) + +func (impl *ServiceImpl) InvokeStreaming(ctx context.Context, stream Service_InvokeStreamingServer) (err error) { + // Downstream continuously sends data, only for demonstration + for { + if err = stream.Send(ctx, resp); err != nil { + if errors.Is(err, kerrors.ErrStreamingCanceled) { + // Upstream cancel + } + // Log or perform other operations + return + } + time.Sleep(100 * time.Millisecond) + } +} +``` + +The downstream then reports the following error description: + +``` +[ttstream error, code=12007] [server-side stream] [canceled path: ServiceA] user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively +``` + +The meaning of each part of the error description is as follows: + +| Error Description | Meaning | Notes | +|-------------------|---------|-------| +| [ttstream error, code=12007] | TTHeader Streaming error, error code 12007, corresponding to the 
scenario where upstream actively cancels | | +| [server-side stream] | Indicates that the error is thrown by the Stream on the server side | | +| [canceled path: ServiceA] | Indicates that ServiceA actively initiated cancel | | +| user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively | Specific error description | | diff --git a/content/en/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md b/content/en/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md new file mode 100644 index 00000000000..e0760089df2 --- /dev/null +++ b/content/en/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md @@ -0,0 +1,57 @@ +--- +title: "Kitex Tool Enum Type Checking Instructions" +linkTitle: "Enum Type Checking" +weight: 12 +date: 2025-09-29 +description: "Kitex Tool enum type checking instructions, introducing the checking mechanism for Thrift IDL enum value overflow issues." +--- + +## Background: Enum int32 Overflow Issue + +In the Thrift protocol, enum values are transmitted as int32. If a Thrift IDL defines enum values outside the int32 range, they overflow during transmission, so the peer receives a wrong value and cannot match it to the correct enum member. + +**Correctness is already broken in this case, which poses a significant risk to the service!** + +A common mistake looks like this (enum values are written like fixed-format error codes, but they overflow int32): + +```thrift +enum MyEnum{ + A = 3000000001000, + B = 3000000001001, + C = 3000000001002, +} +``` + +## Tool Changes: Strict Correctness Checking + +GoLand generally does not flag this Thrift issue, but any enum written this way will produce wrong values when used. 
+ +Therefore, to ensure correctness and avoid hidden risks, Kitex Tool starting from v0.15.1 (Thriftgo v0.4.3) checks enum values. When a value is out of range, code generation fails and the tool reports the location: + +``` +[WARN] enum overflow: the value (3000000001000) of enum 'xxx/base.thrift MyEnum' exceeds the range of int32. +Due to legacy implementation, thriftgo generates int64 for enums in Go code. +However, during network, values undergo int64->int32->int64 conversion. Values outside int32 will overflow. +Please adjust the enum value to fit within the int32 range [-2147483648, 2147483647]. +If you just want to define a very big constant, please use 'const i64 MyConst = xxx' instead. +``` + +This error message indicates that in the `xxx/base.thrift` file, the value `3000000001000` of the enum `MyEnum` overflows. + +## Solution: Correct Incorrect Enum Values + +The tool's error message contains the incorrect enum value: + +``` +enum overflow: the value (3000000001000) of enum 'xxx/base.thrift MyEnum' +``` + +Locate the problematic enum value from the message and change it to fit within the int32 range [-2147483648, 2147483647]. + +If this IDL belongs to another shared library, you can blame the file history and contact the owner to fix it. + +**This check cannot be skipped for now** + +## Indirect Impact of Dependency Introduction + +Unfortunately, if your IDL includes illegally defined enums from someone else's IDL, your code generation will fail as well. To eliminate this incorrect usage, all such cases are treated as failures. Please contact the owner of that IDL to fix it. 
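The int64->int32->int64 conversion mentioned in the tool's message can be reproduced with plain Go. The sketch below is only an illustration of why out-of-range values are corrupted. It is not the check that Thriftgo actually implements, and the helper names `fitsInt32` and `roundTrip` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// fitsInt32 reports whether v lies within [-2147483648, 2147483647].
func fitsInt32(v int64) bool {
	return v >= math.MinInt32 && v <= math.MaxInt32
}

// roundTrip mimics the int64 -> int32 -> int64 conversion described in the
// tool's message: the narrowing step keeps only the low 32 bits.
func roundTrip(v int64) int64 {
	return int64(int32(v))
}

func main() {
	for _, v := range []int64{1000, math.MaxInt32, 3000000001000} {
		fmt.Printf("value=%d fitsInt32=%t afterRoundTrip=%d\n", v, fitsInt32(v), roundTrip(v))
	}
}
```

Values inside the int32 range come back unchanged, while `3000000001000` comes back as a different number, which is why the peer cannot match it to the intended enum member.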
diff --git a/content/zh/blog/releases/Kitex/release-v0_15_1.md b/content/zh/blog/releases/Kitex/release-v0_15_1.md new file mode 100644 index 00000000000..4decec4533c --- /dev/null +++ b/content/zh/blog/releases/Kitex/release-v0_15_1.md @@ -0,0 +1,119 @@ +--- +title: "Kitex Release v0.15.1" +linkTitle: "Release v0.15.1" +projects: ["Kitex"] +date: 2025-09-29 +description: > +--- + +## **重要变更介绍** + +### **公告** +1. **Go 版本支持变化**:Kitex 最低声明 Go 版本调整至 Go1.20,并支持至 Go1.25 + - 暂时不影响 Go v1.18/v1.19 编译,但声明到高版本后,后续版本会引入高版本特性 + +### **新特性** +1. **泛化调用:全新 v2 API 支持 multi services 和流式调用** + + Thrift 二进制泛化调用 API 提供 v2 版本,支持 multi services 和 streaming 调用,详细用法见[泛化调用使用指南](/zh/docs/kitex/tutorials/advanced-feature/generic-call/basic_usage) + +2. **泛化调用:支持 unknown service handler** + + 便于快速开发 streaming proxy,详见[Proxy 应用开发指南](/zh/docs/kitex/tutorials/advanced-feature/proxy_application_development) + +3. **泛化调用:支持服务端 json/map 流式泛化调用** + + 详见:[泛化调用使用指南](/zh/docs/kitex/tutorials/advanced-feature/generic-call/basic_usage) + +4. **TTHeader Streaming:支持 ctx cancel 控制流生命周期** + + - 快速结束流式调用,节省模型资源 + - 对齐 gRPC,详细用法见[流生命周期控制最佳实践](/zh/docs/kitex/tutorials/basic-feature/streamx/StreamX_Lifecycle_Control) + - 支持 Client 主动调用 cancel 结束流式调用 + - 支持 Client 感知所处 Handler 的 ctx cancel 信号,级联结束流式调用 + +5. **流式错误处理优化** + + - 快速对应具体错误场景,加速级联 cancel 链路问题排查,详情见[流错误处理最佳实践](/zh/docs/kitex/tutorials/basic-feature/streamx/StreamX_Error_Handling) + - 级联 cancel 场景,错误描述包含完整 cancel 链路,快速定位主动 cancel 的第一跳服务 + - 错误描述包含具体的错误场景,以及与之唯一对应的错误码 + - 统一方便的 cancel 错误处理方式,无需使用繁琐的字符串匹配 + +### **功能/体验优化** +1. **泛化 Client:优化后台 goroutine 启动逻辑** + + 从 Kitex v0.13.0 开始,一个泛化 Client 同时支持 Ping-Pong 和流式调用,并默认使用 TTHeader Streaming 协议,每个泛化 Client 都会自动开启一个后台 goroutine 用于清理 TTHeader Streaming 的空闲连接。 + + 若用户之前使用泛化 Client 的姿势不当(例如每次请求都创建一个泛化 Client),升级到 Kitex v0.13.x 后会导致大量后台 goroutine 被创建,产生 goroutine 泄漏的现象,但实际上没使用流式泛化。 + + v0.15.1 版本只有在真正使用到流式泛化时才会创建后台 goroutine。 + +### **代码生成工具 Kitex Tool** +1. 
**严格的枚举值检查** + + 针对 Thrift IDL 定义枚举值溢出的场景,增加了严格的生成检查,详见[Kitex Tool 检查枚举类型说明](/zh/docs/kitex/tutorials/code-gen/idl_enumeration_type) + + 该变更会导致部分产物生成失败,因为正确性已经存在问题,对服务风险较大! + +### **特殊变更 - 少数服务可能会有影响** +> 对 99.9% 用户无影响的接口 Breaking Change + +Kitex 会保证内部用户正常使用方式的兼容性。但个别用户可能对 Kitex 仓库的定义有依赖,Kitex 本次版本调整对这部分用户有影响。 + +本版本对 `remote.Message`、`rpcinfo.RPCInfo` 或 `generic.Generic` 接口非普通使用方式做了微调,如果有特殊的使用需要调整至符合新版本的接口定义。 + +## **详细变更** + +### Feature +* feat(ttstream): support ctx cancel and detailed canceled error by @DMwangnima in [#1821](https://github.com/cloudwego/kitex/pull/1821) | [#1859](https://github.com/cloudwego/kitex/pull/1859) | [#1856](https://github.com/cloudwego/kitex/pull/1856) +> 特性:TTStream 支持上下文取消及详细的取消错误信息 +* feat(generic): support new thrift binary generic call api, server streaming generic call and unknown service or method handler by @jayantxie in [#1837](https://github.com/cloudwego/kitex/pull/1837) | [#1857](https://github.com/cloudwego/kitex/pull/1857) +> 特性:支持新的 thrift 二进制泛化调用 api,服务端流式泛化调用和 unknown service or method handler +* feat(grpc): support dump MaxConcurrentStreams of HTTP2 Client by @DMwangnima in [#1820](https://github.com/cloudwego/kitex/pull/1820) +> 特性:gRPC 支持导出 HTTP2 客户端的 MaxConcurrentStreams 配置 + +### Fix +* fix(retry): shallow copy response to avoid data race by @jayantxie in [#1799](https://github.com/cloudwego/kitex/pull/1799) | [#1814](https://github.com/cloudwego/kitex/pull/1814) +> 修复:浅拷贝 response 以避免数据竞争 +* fix(lbcache): check the existence before new Balancer to prevent leakage by @ppzqh in [#1825](https://github.com/cloudwego/kitex/pull/1825) +> 修复:负载均衡器缓存中创建新均衡器前检查存在性以防止泄漏 +* fix(generic): descriptor.HTTPRequest.GetParam nil pointer exception by @jayantxie in [#1827](https://github.com/cloudwego/kitex/pull/1827) +> 修复:描述符 HTTPRequest.GetParam 的空指针异常 +* fix(generic): fix generic write int range check by @HeyJavaBean in [#1861](https://github.com/cloudwego/kitex/pull/1861) +> 修复:泛化写入整数的范围检查 +* fix(rpcinfo): protect bizErr 
and extra field of ri.Invocation by lock by @jayantxie in [#1850](https://github.com/cloudwego/kitex/pull/1850) +> 修复:通过锁保护 ri.Invocation 的 bizErr 和 extra 字段 +* fix(timeout): remove timer pool to avoid timer race issue by @jayantxie in [#1858](https://github.com/cloudwego/kitex/pull/1858) +> 修复:移除计时器池以避免计时器竞争问题 +* fix(tool): disable fast api for protobuf by @DMwangnima in [#1807](https://github.com/cloudwego/kitex/pull/1807) +> 修复:工具中为 Protobuf 禁用 Fast API +* fix(tool): skip pb code gen for arg -use by @xiaost in [#1819](https://github.com/cloudwego/kitex/pull/1819) +> 修复:工具中为 -use 参数跳过 PB 代码生成 + +### Optimize +* optimize(grpc): access metadata.MD without ToLower by @xiaost in [#1806](https://github.com/cloudwego/kitex/pull/1806) +> 优化:gRPC 访问 metadata.MD 时不转换为小写 +* optimize(ttstream): lazy init cleaning task for ObjectPool to reduce the impact of lots of goroutines caused by creating too many Generic Client by @DMwangnima in [#1842](https://github.com/cloudwego/kitex/pull/1842) +> 优化:对象池延迟初始化清理任务,减少创建过多泛化客户端导致的大量 goroutine 影响 +* optimize(tool): remove string deepcopy because the string type is read-only in Go by @jayantxie in [#1832](https://github.com/cloudwego/kitex/pull/1832) +> 优化:移除字符串深拷贝,因为 Go 中字符串类型是只读的 + +### Refactor +* refactor(ttstream): remove ttstream provider by @jayantxie in [#1805](https://github.com/cloudwego/kitex/pull/1805) +> 重构:移除 TTStream provider 接口 +* refactor(rpcinfo): move service/method info from message to rpcinfo, remove protocol info from message and update min go version to 1.20 by @jayantxie in [#1818](https://github.com/cloudwego/kitex/pull/1818) | [#1855](https://github.com/cloudwego/kitex/pull/1855) +> 重构:将服务/方法信息从消息移至 rpcinfo,从消息中移除协议信息,并更新最低 Go 版本至 1.20 +* refactor(server): remove service middleware and SupportedTransportsFunc api by @jayantxie in [#1839](https://github.com/cloudwego/kitex/pull/1839) +> 重构:移除服务中间件和 SupportedTransportsFunc API +* refactor(server): remove useless TargetSvcInfo field by @jayantxie in 
[#1840](https://github.com/cloudwego/kitex/pull/1840) +> 重构:移除无用的 TargetSvcInfo 字段 + +### Chore +* chore: update dependencies of kitex to support go 1.25 and new features by @jayantxie @AsterDY in [#1848](https://github.com/cloudwego/kitex/pull/1848) | [#1834](https://github.com/cloudwego/kitex/pull/1834) | [#1862](https://github.com/cloudwego/kitex/pull/1862) | [#1836](https://github.com/cloudwego/kitex/pull/1836) +> chore:更新 kitex 依赖项以支持 go1.25 和新特性 +* chore: update version v0.15.0 by @jayantxie in [#1864](https://github.com/cloudwego/kitex/pull/1864) +> chore:更新版本至 v0.15.0 +* docs: fix broken link to blogs by @scientiacoder in [#1813](https://github.com/cloudwego/kitex/pull/1813) +> chore:修复博客的损坏链接 +* chore: support custom ctx key to pass to downstream in Service-Inline by @Duslia in [#1709](https://github.com/cloudwego/kitex/pull/1709) +> 特性:在合并编译场景中支持传递自定义上下文 key 到下游 diff --git a/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md b/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md index 53198e2e5f8..cdf5b840c88 100644 --- a/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md +++ b/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Error_Handling.md @@ -1,9 +1,9 @@ --- -title: "StreamX 流错误处理最佳实践" -date: 2025-01-10 -weight: 4 -keywords: ["流错误处理最佳实践"] -description: "" +title: "流错误处理最佳实践 | Stream Error Handling" +linkTitle: "流错误处理最佳实践" +weight: 3 +date: 2025-09-29 +description: "Kitex StreamX 流错误处理最佳实践,介绍 TTHeader Streaming 错误码和错误处理机制。" --- ## 前言 @@ -12,11 +12,46 @@ description: "" ## 错误类型 +### 框架异常 + +#### 错误描述含义 + +``` +[ttstream error, code=12007] [server-side stream] [canceled path: ServiceA] user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively +``` + +| 错误描述 | 含义 | 备注 | +|---------|------|------| +| [ttstream error, code=12007] | TTHeader Streaming 错误,错误码为 12007,对应上游主动 cancel 的场景 | | +| 
[server-side stream] | 表示该错误由 server 侧的 Stream 抛出 | | +| [canceled path: ServiceA] | 表示由 ServiceA 主动发起 cancel | | +| user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively | 具体的错误描述 | | + +#### 错误码汇总 + +TTHeader Streaming 错误汇总 + +| 错误码 | 错误描述 | 含义 | 备注 | +|--------|---------|------|------| +| 12001 | application exception | 业务异常,下游 handler 返回 err | | +| 12002 | unexpected header frame | Header Frame 相关的错误 | | +| 12003 | illegal biz err | 解析 Trailer Frame 中包含的业务异常失败 | | +| 12004 | illegal frame | 解析 Frame 的基础信息失败 | | +| 12005 | illegal operation | 使用 Stream 姿势不当报错,例如 Stream 已经 CloseSend 了,依然 Send | | +| 12006 | transport is closing | 连接异常,例如连接已被关闭 | | +| 12007 | user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively | 上游主动使用 cancel() | | +| 12008 | user code canceled with cancelCause(error) | 上游使用 context.WithCancelCause,并主动使用 cancel(err) | | +| 12009 | canceled by downstream | 被下游服务 cancel | | +| 12010 | canceled by upstream | 被上游服务 cancel | | +| 12011 | Internal canceled | 级联 cancel 场景,例如 gRPC handler ctx 被 cancel,级联 cancel TTHeader Streaming | | +| 12012 | canceled by business handler returning | Handler 提前退出,但仍有异步 goroutine 使用 Recv/Send | | +| 12013 | canceled by connection closed | 连接被关闭导致 Stream 生命周期结束,常见于 Server 侧服务迁移/更新 | | + ### 业务异常 -**使用范例**:例如 ChatGPT 场景,我们需要不停检查用户账户余额是否能继续调用大模型生成返回。 +使用范例:例如 ChatGPT 场景,我们需要不停检查用户账户余额是否能继续调用大模型生成返回。 -Server 实现: +**Server 实现:** ```go func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.Request, stream echo.TestService_ServerStreamWithErrServer) error { @@ -32,7 +67,7 @@ func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.R } ``` -Client 实现: +**Client 实现:** ```go stream, err = cli.ServerStreamWithErr(ctx, req) @@ -52,9 +87,9 @@ if ok { ### 其他错误 -如果 Server 返回的 Error 为非业务异常,框架会统一封装为 `(*thrift.ApplicationException)` 
。此时只能拿到错误的 Message 。 +如果 Server 返回的 Error 为非业务异常,框架会统一封装为 `(*thrift.ApplicationException)`。此时只能拿到错误的 Message。 -Server 实现: +**Server 实现:** ```go func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.Request, stream echo.TestService_ServerStreamWithErrServer) error { @@ -63,7 +98,7 @@ func (si *streamingService) ServerStreamWithErr(ctx context.Context, req *echo.R } ``` -Client 实现: +**Client 实现:** ```go stream, err = cli.ServerStreamWithErr(ctx, req) diff --git a/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md b/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md new file mode 100644 index 00000000000..16bdb8527fa --- /dev/null +++ b/content/zh/docs/kitex/Tutorials/basic-feature/streamx/StreamX_Lifecycle_Control.md @@ -0,0 +1,94 @@ +--- +title: "流生命周期控制最佳实践" +linkTitle: "流生命周期控制最佳实践" +weight: 4 +date: 2025-09-29 +description: "Kitex StreamX 流生命周期控制最佳实践,介绍如何使用 ctx cancel 控制流式调用生命周期。" +--- + +## 背景 + +与模型层直接进行流式交互时,需要调用方在某些场景直接通知模型层停止响应,从而节省模型资源。 + +在大模型应用场景如经典的 Chat,整个链路使用流式接口需要串联,需要感知端上用户断开的信号并快速让整条链路都结束。 + +以上场景本质上需要上游能够主动结束流式调用,常使用 ctx 来进行控制,ctx 被 cancel 那么 Stream 的生命周期也会结束。 + +Kitex gRPC 以及 TTHeader Streaming 都支持这种基于 ctx cancel 控制 Stream 生命周期的机制,并且 TTHeader Streaming 在 gRPC 的基础上优化了错误描述,能更好地应对级联 cancel 场景问题排查。 + +## TTHeader Streaming 支持基于 ctx cancel 控制 Stream 生命周期 + +**Kitex >= v0.15.1 支持该功能** + +### 上游主动 cancel 下游 + +此处以 ServerStreaming 为例,当上游接收到特殊的 resp 后,主动调用 cancel() 结束下游 Stream 生命周期。 + +#### 上游 - ServiceA + +```go +// ctx 一般来源于 handler +ctx, cancel := context.WithCancel(ctx) +defer cancel() +cliSt, err := cli.InvokeStreaming(ctx, req) +if err != nil { + // 打日志或执行其它操作 + return +} + +for { + resp, err := cliSt.Recv(cliSt.Context()) + if err != nil { + if err == io.EOF { + // 正常结束 + return + } + // 打日志或执行其它操作 + // 异常结束 + return + } + // 判断是否是业务上特殊的 resp,例如 resp 中定义了特殊 flag,表示结束 + if isBizSpecialResp(resp) { + // cancel 下游 Stream + cancel() + return + } +} +``` + +#### 下游 - 
ServiceB
+
+```go
+import (
+    "github.com/cloudwego/kitex/pkg/kerrors"
+)
+
+func (impl *ServiceImpl) InvokeStreaming(ctx context.Context, stream Service_InvokeStreamingServer) (err error) {
+    // The downstream keeps sending data; for demonstration only
+    for {
+        if err = stream.Send(ctx, resp); err != nil {
+            if errors.Is(err, kerrors.ErrStreamingCanceled) {
+                // Canceled by the upstream
+            }
+            // Log or perform other handling
+            return
+        }
+        time.Sleep(100 * time.Millisecond)
+    }
+}
+```
+
+The downstream then reports the following error:
+
+```
+[ttstream error, code=12007] [server-side stream] [canceled path: ServiceA] user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively
+```
+
+The parts of this error description mean the following:
+
+| Error description | Meaning | Notes |
+|---------|------|------|
+| [ttstream error, code=12007] | A TTHeader Streaming error with code 12007, corresponding to the scenario where the upstream actively cancels | |
+| [server-side stream] | The error is thrown by the Stream on the server side | |
+| [canceled path: ServiceA] | The cancel was initiated by ServiceA | |
+| user code invoking stream RPC with context processed by context.WithCancel or context.WithTimeout, then invoking cancel() actively | The specific error description | |
diff --git a/content/zh/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md b/content/zh/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md
new file mode 100644
index 00000000000..3aa58074505
--- /dev/null
+++ b/content/zh/docs/kitex/Tutorials/code-gen/idl_enumeration_type.md
@@ -0,0 +1,57 @@
+---
+title: "Kitex Tool Enum Type Check"
+linkTitle: "Enum Type Check"
+weight: 12
+date: 2025-09-29
+description: "Notes on the Kitex Tool enum type check, describing how overflowing enum values in Thrift IDL are detected."
+---
+
+## Background: Enum int32 Overflow
+
+In the Thrift protocol, enum values are actually transmitted as int32. If a Thrift IDL defines an enum value outside the int32 range, the value overflows in transit, so the peer receives a wrong value and cannot match it to the correct enum member.
+
+**Correctness is already compromised, which poses a significant risk to the service!**
+
+A common mistake looks like this (enum values are written like fixed-format error codes, but they overflow int32):
+
+```thrift
+enum MyEnum{
+    A = 3000000001000,
+    B = 3000000001001,
+    C = 3000000001002,
+}
+```
+
+## Tool Change: Strict Correctness Check
+
+The GoLand IDE generally does not flag this Thrift problem, but once an enum is written this way, using it will definitely produce wrong results.
+
+So, to guarantee correctness and avoid hidden risks, Kitex Tool, starting from v0.15.1 (Thriftgo 
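v0.4.3), checks for this enum case; if a value is out of range, code generation fails immediately and the location is reported:
+
+```
+[WARN] enum overflow: the value (3000000001000) of enum 'xxx/base.thrift MyEnum' exceeds the range of int32.
+Due to legacy implementation, thriftgo generates int64 for enums in Go code.
+However, during network, values undergo int64->int32->int64 conversion. Values outside int32 will overflow.
+Please adjust the enum value to fit within the int32 range [-2147483648, 2147483647].
+If you just want to define a very big constant, please use 'const i64 MyConst = xxx' instead.
+```
+
+This message means that the value `3000000001000` of the enum `MyEnum` in `xxx/base.thrift` overflows.
+
+## Solution: Fix the Invalid Enum Values
+
+The tool's error message includes the offending enum value:
+
+```
+enum overflow: the value (3000000001000) of enum 'xxx/base.thrift MyEnum'
+```
+
+Follow the message to locate the offending enum value and change it to fit within the int32 range [-2147483648, 2147483647].
+
+If the IDL belongs to another shared repository, you can blame the file history and contact the responsible person to fix it.
+
+**This check cannot currently be skipped.**
+
+## Indirect Impact via Dependencies
+
+We apologize for the inconvenience: if your IDL includes an invalid Enum defined by someone else, your code generation is affected as well. To eliminate this incorrect usage, all such cases fail uniformly; please contact the owner of the offending IDL to have it fixed.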
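
The int64 -> int32 -> int64 conversion named in the tool's message can be reproduced with a few lines of Go. This is a minimal sketch for illustration only: `roundTrip` and `fitsInt32` are hypothetical helper names, not part of Kitex or Thriftgo, and the real check inside the tool may be implemented differently.

```go
package main

import (
	"fmt"
	"math"
)

// roundTrip mimics what happens to an enum value on the wire:
// the Go field is int64, but the value is narrowed to int32 and widened again.
// (Hypothetical helper, not a Kitex or Thriftgo API.)
func roundTrip(v int64) int64 {
	return int64(int32(v))
}

// fitsInt32 reports whether v lies within [-2147483648, 2147483647].
// (Hypothetical helper, not a Kitex or Thriftgo API.)
func fitsInt32(v int64) bool {
	return v >= math.MinInt32 && v <= math.MaxInt32
}

func main() {
	// 1001 is a valid enum value; 3000000001000 is the value from the IDL above.
	for _, v := range []int64{1001, 3000000001000} {
		fmt.Printf("value=%d fitsInt32=%t afterRoundTrip=%d\n", v, fitsInt32(v), roundTrip(v))
	}
}
```

A value inside the int32 range survives the round trip unchanged, while `3000000001000` comes back as a different number, which is why the peer cannot match it to the intended enum member.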