feat: channel model permission and error rate monitor #518

Merged
zijiren233 merged 5 commits into labring:main from zijiren233:channel-ban-test
Apr 14, 2026
Conversation

@zijiren233
Member

No description provided.

Copilot AI left a comment

Pull request overview

This PR adds channel-level controls and UI/monitoring for temporarily excluding channel–model routes when they exhibit no-permission responses or high error rates, and refactors the backend monitor to return concrete error-rate values used by retry/alert logic.

Changes:

  • Move warn_error_rate / max_error_rate configuration from models to channels and surface the settings in the channel form/table plus i18n strings.
  • Update the monitor implementation to return errorRate (and perform temporary exclusion based on channel settings / no-permission), including new local-cache helpers and tests.
  • Improve operational behavior via TTL jitter (cache/store) and refactor request error mapping / retry selection logic.

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `web/src/validation/model.ts` | Removes model-level max_error_rate validation. |
| `web/src/validation/channel.ts` | Adds channel-level validation for no-permission ban + error-rate thresholds. |
| `web/src/utils/runtime-metrics.ts` | Adds helpers to read channel-model runtime metrics and exclusion state. |
| `web/src/types/model.ts` | Removes model-level warn/max error rate types. |
| `web/src/types/channel.ts` | Adds channel-level enabled_no_permission_ban, warn_error_rate, max_error_rate types. |
| `web/src/feature/monitor/components/MonitorCharts.tsx` | Uses metric helper and displays “temporarily excluded” state. |
| `web/src/feature/model/components/ModelTable.tsx` | Uses metric helper and localizes the banned/excluded badge label. |
| `web/src/feature/model/components/ModelForm.tsx` | Removes model-level max error rate form field and payload handling. |
| `web/src/feature/channel/components/ChannelTable.tsx` | Exports/imports new channel fields and adds “temporarily excluded models” column UI. |
| `web/src/feature/channel/components/ChannelForm.tsx` | Adds channel-level ban/error-rate controls and smarter “test saved vs preview” selection. |
| `web/src/feature/channel/components/ChannelDialog.tsx` | Wires new channel fields into dialog default values. |
| `web/public/locales/zh/translation.json` | Adds zh translations for temporary exclusion UI + new channel form fields. |
| `web/public/locales/en/translation.json` | Adds en translations for temporary exclusion UI + new channel form fields. |
| `core/relay/plugin/monitor/monitor_test.go` | Adds tests for channel-level warn/max error rate and no-permission ban switch behavior. |
| `core/relay/plugin/monitor/monitor.go` | Uses channel-level warn/max thresholds; adds notify predicate helpers and no-permission ban gating. |
| `core/relay/plugin/cache/cache.go` | Adds TTL jitter to reduce cache stampede risk. |
| `core/relay/meta/meta.go` | Adds channel meta fields for new ban/error-rate configuration and propagates from model.Channel. |
| `core/relay/controller/dohelper_test.go` | Adds tests for improved request error mapping behavior. |
| `core/relay/controller/dohelper.go` | Refactors request error mapping into mapRequestError. |
| `core/relay/adaptor/aws/adaptor.go` | Returns a structured relay error when an AWS sub-adaptor cannot be resolved. |
| `core/monitor/model_integration_test.go` | Updates tests for new AddRequest return shape and adds coverage for min-sample behavior + no-permission ban. |
| `core/monitor/model.go` | Changes AddRequest to return errorRate; updates Lua script contract; adds GetChannelModelErrorRate. |
| `core/monitor/memmodel_test.go` | Updates/extends tests for new AddRequest return shape and GetChannelModelErrorRate. |
| `core/monitor/memmodel.go` | Updates in-memory monitor API to return errorRate; adds ban-duration jitter; adds GetChannelModelErrorRate. |
| `core/monitor/local_cache.go` | Adds local cache for channel-model error rate (single lookup) and invalidation updates. |
| `core/model/store_postgres_integration_test.go` | Makes Postgres test startup more reliable with log-based readiness + connection retry. |
| `core/model/store_cache.go` | Adds local TTL jitter for store cache entries. |
| `core/model/modelconfig.go` | Removes model-level warn/max error rate fields. |
| `core/model/channel.go` | Adds channel-level fields and includes them in update field list. |
| `core/docs/swagger.yaml` | Updates API schema for channel warn/max error rate (but misses one new channel field; see comment). |
| `core/docs/swagger.json` | Regenerated swagger JSON to reflect schema changes. |
| `core/docs/docs.go` | Regenerated embedded swagger template to reflect schema changes. |
| `core/controller/relay-controller.go` | Tracks the lowest error-rate “has permission” channel during retry to improve fallback behavior. |
| `core/controller/relay-channel_test.go` | Updates retry channel tests and adds tests for lowest-error-rate fallback selection. |
| `core/controller/relay-channel.go` | Refactors channel picking/filtering pipeline and retry selection behavior. |
| `core/controller/channel_test.go` | Adds tests for auto-testing banned models concurrency and “model removed” clearing behavior. |
| `core/controller/channel.go` | Extends add-channel request payload to include new channel fields. |
| `core/controller/channel-test.go` | Refactors auto-test-banned-models into a worker pool with injectable deps and concurrency control. |


Comment on lines +14 to +16

```ts
enabled_no_permission_ban: z.boolean().optional(),
warn_error_rate: z.number().min(0, 'Error rate must be at least 0').max(1, 'Error rate must be at most 1').optional(),
max_error_rate: z.number().min(0, 'Error rate must be at least 0').max(1, 'Error rate must be at most 1').optional(),
```
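
For comparison, the same range rule could be enforced on the Go side. This is a sketch under assumptions: `ChannelErrorRateConfig` and its `Validate` method are hypothetical, pointer fields stand in for the schema's `.optional()`, and nothing here is confirmed to exist in the PR's backend.

```go
package main

import (
	"errors"
	"fmt"
)

// ChannelErrorRateConfig is a hypothetical mirror of the three optional
// fields in the front-end schema; nil means "not provided".
type ChannelErrorRateConfig struct {
	EnabledNoPermissionBan *bool
	WarnErrorRate          *float64
	MaxErrorRate           *float64
}

// validateRate accepts nil or a value in [0, 1]. The negated form also
// rejects NaN, which would slip through separate "< 0" and "> 1" checks.
func validateRate(name string, v *float64) error {
	if v == nil {
		return nil
	}
	if !(*v >= 0 && *v <= 1) {
		return fmt.Errorf("%s must be between 0 and 1, got %v", name, *v)
	}
	return nil
}

// Validate reports every out-of-range field; errors.Join returns nil
// when all of its arguments are nil.
func (c ChannelErrorRateConfig) Validate() error {
	return errors.Join(
		validateRate("warn_error_rate", c.WarnErrorRate),
		validateRate("max_error_rate", c.MaxErrorRate),
	)
}

func main() {
	warn, max := 0.3, 1.5
	cfg := ChannelErrorRateConfig{WarnErrorRate: &warn, MaxErrorRate: &max}
	fmt.Println(cfg.Validate())
}
```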
Comment on lines 76 to +80

```diff
 'rpm',
-'tpm',
-'retry_times',
-'timeout_config',
-'max_error_rate',
-'force_save_detail',
+'tpm',
+'retry_times',
+'timeout_config',
+'force_save_detail',
```
Comment on lines 26 to +58

```diff
@@ -50,6 +54,8 @@ definitions:
         type: integer
       type:
         $ref: '#/definitions/model.ChannelType'
+      warn_error_rate:
+        type: number
```
zijiren233 merged commit a8ad973 into labring:main on Apr 14, 2026
5 of 7 checks passed