feat: channel model permission and error rate monitor #518
Merged
zijiren233 merged 5 commits into labring:main on Apr 14, 2026
Pull request overview
This PR adds channel-level controls and UI/monitoring for temporarily excluding channel–model routes when they exhibit no-permission responses or high error rates, and refactors the backend monitor to return concrete error-rate values used by retry/alert logic.
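The exclusion rule described in the overview can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ChannelLimits`, `Action`, and `Evaluate` are hypothetical names, and treating a zero threshold as "disabled" is an assumption. Only the three setting names (`enabled_no_permission_ban`, `warn_error_rate`, `max_error_rate`) come from the PR.

```go
package main

import "fmt"

// Action is a hypothetical outcome of evaluating one channel-model route.
type Action int

const (
	ActionNone    Action = iota // route looks healthy
	ActionWarn                  // error rate reached the warn threshold
	ActionExclude               // route should be temporarily excluded
)

// ChannelLimits mirrors the channel-level settings named in the PR.
// Assumption: a threshold of 0 means the check is disabled.
type ChannelLimits struct {
	EnabledNoPermissionBan bool
	WarnErrorRate          float64
	MaxErrorRate           float64
}

// Evaluate decides what to do with a route given its current error rate
// and whether the latest response was a no-permission error.
func Evaluate(l ChannelLimits, errorRate float64, noPermission bool) Action {
	if noPermission && l.EnabledNoPermissionBan {
		return ActionExclude
	}
	if l.MaxErrorRate > 0 && errorRate >= l.MaxErrorRate {
		return ActionExclude
	}
	if l.WarnErrorRate > 0 && errorRate >= l.WarnErrorRate {
		return ActionWarn
	}
	return ActionNone
}

func main() {
	limits := ChannelLimits{EnabledNoPermissionBan: true, WarnErrorRate: 0.3, MaxErrorRate: 0.5}
	fmt.Println(Evaluate(limits, 0.35, false) == ActionWarn)
	fmt.Println(Evaluate(limits, 0.60, false) == ActionExclude)
	fmt.Println(Evaluate(limits, 0.00, true) == ActionExclude)
}
```

Keeping the decision in one pure function makes it easy to unit-test the warn/max and no-permission switch combinations independently of the monitor's storage.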
Changes:
- Move `warn_error_rate`/`max_error_rate` configuration from models to channels and surface the settings in the channel form/table plus i18n strings.
- Update the monitor implementation to return `errorRate` (and perform temporary exclusion based on channel settings / no-permission), including new local-cache helpers and tests.
- Improve operational behavior via TTL jitter (cache/store) and refactor request error mapping / retry selection logic.
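For readers unfamiliar with TTL jitter, here is a minimal sketch of the general technique. It assumes additive jitter of up to a fixed fraction of the base TTL; `JitterTTL` and its parameters are hypothetical, and the PR's actual jitter range and helper names are not shown in this summary.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// JitterTTL returns base plus a random extra in [0, base*fraction].
// Spreading expirations this way keeps entries written at the same moment
// from all expiring together and triggering a burst of reloads.
func JitterTTL(base time.Duration, fraction float64) time.Duration {
	if base <= 0 || fraction <= 0 {
		return base
	}
	maxExtra := int64(float64(base) * fraction)
	if maxExtra <= 0 {
		return base
	}
	// Int63n(n) returns a value in [0, n), so n = maxExtra+1 makes the
	// upper bound inclusive.
	return base + time.Duration(rand.Int63n(maxExtra+1))
}

func main() {
	// Each call yields a TTL between 5m and 5m30s (10% jitter).
	for i := 0; i < 3; i++ {
		fmt.Println(JitterTTL(5*time.Minute, 0.1))
	}
}
```

The same helper could be applied to ban durations as well as cache entries, since the goal in both cases is to avoid synchronized expiry.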
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| web/src/validation/model.ts | Removes model-level max_error_rate validation. |
| web/src/validation/channel.ts | Adds channel-level validation for no-permission ban + error-rate thresholds. |
| web/src/utils/runtime-metrics.ts | Adds helpers to read channel-model runtime metrics and exclusion state. |
| web/src/types/model.ts | Removes model-level warn/max error rate types. |
| web/src/types/channel.ts | Adds channel-level enabled_no_permission_ban, warn_error_rate, max_error_rate types. |
| web/src/feature/monitor/components/MonitorCharts.tsx | Uses metric helper and displays “temporarily excluded” state. |
| web/src/feature/model/components/ModelTable.tsx | Uses metric helper and localizes the banned/excluded badge label. |
| web/src/feature/model/components/ModelForm.tsx | Removes model-level max error rate form field and payload handling. |
| web/src/feature/channel/components/ChannelTable.tsx | Exports/imports new channel fields and adds “temporarily excluded models” column UI. |
| web/src/feature/channel/components/ChannelForm.tsx | Adds channel-level ban/error-rate controls and smarter “test saved vs preview” selection. |
| web/src/feature/channel/components/ChannelDialog.tsx | Wires new channel fields into dialog default values. |
| web/public/locales/zh/translation.json | Adds zh translations for temporary exclusion UI + new channel form fields. |
| web/public/locales/en/translation.json | Adds en translations for temporary exclusion UI + new channel form fields. |
| core/relay/plugin/monitor/monitor_test.go | Adds tests for channel-level warn/max error rate and no-permission ban switch behavior. |
| core/relay/plugin/monitor/monitor.go | Uses channel-level warn/max thresholds; adds notify predicate helpers and no-permission ban gating. |
| core/relay/plugin/cache/cache.go | Adds TTL jitter to reduce cache stampede risk. |
| core/relay/meta/meta.go | Adds channel meta fields for new ban/error-rate configuration and propagates from model.Channel. |
| core/relay/controller/dohelper_test.go | Adds tests for improved request error mapping behavior. |
| core/relay/controller/dohelper.go | Refactors request error mapping into mapRequestError. |
| core/relay/adaptor/aws/adaptor.go | Returns a structured relay error when an AWS sub-adaptor cannot be resolved. |
| core/monitor/model_integration_test.go | Updates tests for new AddRequest return shape and adds coverage for min-sample behavior + no-permission ban. |
| core/monitor/model.go | Changes AddRequest to return errorRate; updates Lua script contract; adds GetChannelModelErrorRate. |
| core/monitor/memmodel_test.go | Updates/extends tests for new AddRequest return shape and GetChannelModelErrorRate. |
| core/monitor/memmodel.go | Updates in-memory monitor API to return errorRate; adds ban-duration jitter; adds GetChannelModelErrorRate. |
| core/monitor/local_cache.go | Adds local cache for channel-model error rate (single lookup) and invalidation updates. |
| core/model/store_postgres_integration_test.go | Makes Postgres test startup more reliable with log-based readiness + connection retry. |
| core/model/store_cache.go | Adds local TTL jitter for store cache entries. |
| core/model/modelconfig.go | Removes model-level warn/max error rate fields. |
| core/model/channel.go | Adds channel-level fields and includes them in update field list. |
| core/docs/swagger.yaml | Updates API schema for channel warn/max error rate (but misses one new channel field; see comment). |
| core/docs/swagger.json | Regenerated swagger JSON to reflect schema changes. |
| core/docs/docs.go | Regenerated embedded swagger template to reflect schema changes. |
| core/controller/relay-controller.go | Tracks the lowest error-rate “has permission” channel during retry to improve fallback behavior. |
| core/controller/relay-channel_test.go | Updates retry channel tests and adds tests for lowest-error-rate fallback selection. |
| core/controller/relay-channel.go | Refactors channel picking/filtering pipeline and retry selection behavior. |
| core/controller/channel_test.go | Adds tests for auto-testing banned models concurrency and “model removed” clearing behavior. |
| core/controller/channel.go | Extends add-channel request payload to include new channel fields. |
| core/controller/channel-test.go | Refactors auto-test-banned-models into a worker pool with injectable deps and concurrency control. |
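The "lowest error-rate has-permission channel" fallback mentioned for `core/controller/relay-controller.go` can be illustrated with a small sketch. The `Candidate` type and `LowestErrorRate` function are hypothetical and only show the selection idea; the PR's real retry pipeline is not reproduced here.

```go
package main

import "fmt"

// Candidate is a hypothetical view of one channel seen during retries.
type Candidate struct {
	ChannelID     int
	ErrorRate     float64
	HasPermission bool
}

// LowestErrorRate returns the candidate with the lowest error rate among
// those that have permission. The boolean is false when none qualifies.
func LowestErrorRate(cs []Candidate) (Candidate, bool) {
	var best Candidate
	found := false
	for _, c := range cs {
		if !c.HasPermission {
			continue // no-permission channels are never a fallback
		}
		if !found || c.ErrorRate < best.ErrorRate {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	tried := []Candidate{
		{ChannelID: 1, ErrorRate: 0.40, HasPermission: true},
		{ChannelID: 2, ErrorRate: 0.05, HasPermission: false},
		{ChannelID: 3, ErrorRate: 0.20, HasPermission: true},
	}
	if best, ok := LowestErrorRate(tried); ok {
		fmt.Println("fallback channel:", best.ChannelID)
	}
}
```

In this example channel 2 has the lowest error rate but is skipped because it lacks permission, so channel 3 is chosen.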
Comment on lines +14 to +16:

    enabled_no_permission_ban: z.boolean().optional(),
    warn_error_rate: z.number().min(0, 'Error rate must be at least 0').max(1, 'Error rate must be at most 1').optional(),
    max_error_rate: z.number().min(0, 'Error rate must be at least 0').max(1, 'Error rate must be at most 1').optional(),
Comment on lines 76 to +80:

    'rpm',
    'tpm',
    'retry_times',
    'timeout_config',
    'max_error_rate',
    'force_save_detail',
    'tpm',
    'retry_times',
    'timeout_config',
    'force_save_detail',
Comment on lines 26 to +58:

    @@ -50,6 +54,8 @@ definitions:
            type: integer
          type:
            $ref: '#/definitions/model.ChannelType'
          warn_error_rate:
            type: number
No description provided.