Refactor usage polling to paged APIs with legacy fallback#138
Conversation
3ff8db6 to
07aceca
Compare
0ae6c22 to
d3ddf09
Compare
1d3000d to
1e2f1ec
Compare
1e2f1ec to
522d7f1
Compare
522d7f1 to
9417d6d
Compare
|
@seakee could u please review? |
Thanks for the PR. The current PR does help alleviate the problems of slow loading, timeouts, and high frontend aggregation pressure on the request monitoring page under large data volumes. The subsequent refresh logic fix is also valuable, as it prevents the However, this PR affects the core data path of CPA-Manager request monitoring, so I’m not recommending merging it directly just yet. A few things still need to be addressed:
|
|
thanks for your review, I will fix them |
d741d30 to
a34cfbb
Compare
|
I have implemented the suggestions above. I tested manually and it works very well, but I have not manually looked through the code, I will review them later. I have something to check with you, after using FTS 5, users cannot search with partial queries, e.g. if I want to search codex usages I cannot get it with "co" "code" etc, I have to input full query. Is this good enough for user experience? @seakee |
a34cfbb to
2f5b26c
Compare
|
First of all, thank you again for this PR and for the follow-up updates. I agree with the overall direction, and I can see that you have already addressed a number of earlier review points. Splitting the monitoring page from high-frequency polling of the full One extra note: if this PR is eventually merged into CPA-Manager, these usage aggregation/pagination APIs will very likely be referenced or adapted in CPA-Manager-Plus as well. CPA-Manager-Plus is also improving request monitoring, data summaries, caller API key statistics, and account-level statistics, so having this API design stabilized here would be directly useful there too. You are also very welcome to continue contributing PRs to CPA-Manager-Plus. Once the usage query protocol, pagination shape, FTS search behavior, and model cost aggregation are polished here, reusing them in Plus should be quite natural. That said, because this PR touches the core data path of request monitoring, I still think a few issues should be addressed before merge:
Currently the first request uses Even if the current caller always passes page 1, the helper contract is still unsafe. Please either force the first request to use
It currently merges only Also, when merging
Right now, if the summary request fails due to a network error, timeout, CORS issue, or connection failure, the status can be undefined and the code may still fall back to the legacy This can hide the real error and make refreshes slower. Please only fallback on 404/405, and rethrow all other errors.
After replacing I do not suggest reverting to
The FTS table could also use a prefix index such as
GitHub still shows a hidden/bidirectional Unicode warning. Unless there is a specific reason to keep those characters, please remove them before merge to avoid confusion in future reviews and maintenance.
The frontend payload is now paginated, which is a good improvement. However, the backend still appears to call This means large time ranges can still create backend memory and aggregation pressure. If this PR is intended as a phase-one optimization, that is acceptable, but please document the current boundary in the PR description and ideally add a larger dataset test. Longer term, accounts and api-keys should also be pushed down into SQL with
Please either add those triggers now, or at least add a clear comment that Overall, I am positive on this PR. The direction is correct, it has real value, and it aligns well with the future monitoring improvements in both CPA-Manager and CPA-Manager-Plus. However, because this may become a foundation for later request monitoring work and possible Plus-side reuse, I would prefer to polish the issues above before merging. |
|
tysm! I will search for some docs about prefix searching and try to improve it |
|
oh sorry I didn't notice that this repo was maintenance only. Do you want me to create a new PR in CPA-Manager-Plus? |
No worries, and thanks again for working on this. My plan is not to reject this PR just because CPA-Manager is mostly maintenance-only. In my view, this kind of monitoring performance improvement still falls within maintenance work, and it was already part of my original plan. This PR actually helps fill an important gap in that plan, so I still think it has value for CPA-Manager. That said, CPA-Manager-Plus is where I plan to continue more active development. It already has some optimizations around request monitoring and usage aggregation, but the implementation is still not thorough enough, and further performance work is also on the roadmap there. So my suggestion is:
|
|
Thanks for the follow-up updates. I reviewed the latest head again. This version looks much better, and most of the previous blocking issues have been addressed. Confirmed fixed:
At this point I no longer think this needs to be blocked on the previous review items. There are still a few follow-up items worth keeping in mind:
Overall, I am positive on this PR now. It fits the maintenance/performance optimization scope of CPA-Manager, and it can also serve as a useful reference for CPA-Manager-Plus monitoring optimization later. I think this is now mergeable. |
中文
变更内容
/v0/management/usage响应拆分为更小的分页/聚合接口:summary、accounts、api-keys、realtime和models。/v0/management/usage/summary只返回全局聚合数据和筛选所需facets,不再携带详细 breakdown;前端下拉选项优先使用 summary facets,旧后端仍从 legacy rows 推导。tokens.*、latency_sum_ms、latency_count和重复 endpoint/model 的 details。/v0/management/usage;网络错误、超时、500、401/403 等非兼容性错误不再被 legacy fallback 掩盖。page_size硬上限,并通过白名单校验sort_key/sort_direction,避免过大分页或用户输入直接影响排序逻辑。totalCost排序现在按已保存 model price 计算预估成本,不再错误地退化为按 token 数排序。items;这三类端点的usage.apisdetail aggregates 为空,page size 约束返回的分页项而不是继续传 endpoint 聚合明细。LIKE改为 SQLite FTS5,并覆盖 API Key alias;FTS 查询现在使用 token prefix 语义,例如co/cod可匹配codex,多词输入保持 AND 语义。usage_events只追加;以后如果引入 retention cleanup、delete 或 update,需要补充对应 FTS trigger。end_ms被初次 render 固定的问题,避免总 tokens 和预估花费在第一次刷新后不再更新。当前边界
accounts/api-keys接口当前是 phase-one 优化:HTTP 响应已分页,但后端仍先按当前过滤条件聚合 detail 后在 Go 内存中 group/sort/slice。PR 中补了较大数据集分页测试;后续如继续扩大保留周期或数据量,应把这两类聚合下推到 SQLGROUP BY + ORDER BY + LIMIT/OFFSET。验证
go test ./...go test ./internal/store ./internal/httpapinpm test -- src/features/monitoring/hooks/useUsageData.test.ts src/services/api/usageService.test.ts src/features/monitoring/hooks/useMonitoringData.test.ts src/pages/MonitoringCenterPage.test.tsxnpm run type-checknpm run lintnpm run builddpsk/ DeepSeek review 做了独立代码审查,并修复其中确认真实的totalCost后端排序问题。off:cpa-manager-office cpa-manager:pr138-no-page-aggregates-20260527-0314,/health返回{"ok":true,"service":"cpa-manager"}。/v0/management/usage/summary返回apis=0且 facets 非空:providers=2、accounts=2、models=13、channels=2、api_keys=6。search=co、search=cod、search=gem均返回匹配 summary,且 summary 仍不携带 detailapis。/v0/management/usage/accounts?page=1&page_size=2&sort_key=totalCost&sort_direction=desc、accounts、api-keys、realtime、models分页端点均返回 200 和分页数据。items且usage.apis=0;models 仍通过usage.apis返回 model aggregates。English
Changes
/v0/management/usagepayload into smaller paged/aggregate endpoints:summary,accounts,api-keys,realtime, andmodels./v0/management/usage/summarynow returns only global aggregate data plus filterfacets; detailed breakdowns are no longer included. The frontend builds dropdown options from summary facets first and keeps legacy row-derived options for old backends.tokens.*,latency_sum_ms,latency_count, and appends details for repeated endpoint/model keys./v0/management/usageon 404/405. Network errors, timeouts, 500s, 401/403s, and other non-compatibility failures are surfaced instead of being hidden by fallback.page_sizeand backend whitelist validation forsort_key/sort_direction, preventing oversized pages or user input from directly affecting sorting logic.totalCostsorting for account/API-key grouped pages so it uses saved model prices to estimate cost instead of silently sorting by token count.items; theirusage.apisdetail aggregates are empty, so page size controls the returned page items instead of shipping endpoint-grouped details.LIKEmatching with SQLite FTS5 and included API Key aliases. FTS now uses token prefix semantics, soco/codcan matchcodex, while multi-term input keeps AND semantics.usage_eventsis currently append-only; future retention cleanup, delete, or update paths must add matching FTS triggers.end_mswas frozen after the first render, which could keep total tokens and estimated cost unchanged after the first refresh.Current Boundary
accounts/api-keysare a phase-one optimization: the HTTP response is paginated, but the backend still aggregates details for the current filter and then groups/sorts/slices in Go memory. This PR adds a larger dataset pagination test; if retention or data volume grows further, these two aggregations should be pushed down to SQL withGROUP BY + ORDER BY + LIMIT/OFFSET.Verification
go test ./...go test ./internal/store ./internal/httpapinpm test -- src/features/monitoring/hooks/useUsageData.test.ts src/services/api/usageService.test.ts src/features/monitoring/hooks/useMonitoringData.test.ts src/pages/MonitoringCenterPage.test.tsxnpm run type-checknpm run lintnpm run builddpsk/ DeepSeek review and fixed the confirmed realtotalCostbackend sorting issue it found.off:cpa-manager-office cpa-manager:pr138-no-page-aggregates-20260527-0314, and/healthreturned{"ok":true,"service":"cpa-manager"}./v0/management/usage/summaryreturnsapis=0with non-empty facets: providers=2, accounts=2, models=13, channels=2, api_keys=6.search=co,search=cod, andsearch=gemreturn matching summaries while keeping summary detailapisempty./v0/management/usage/accounts?page=1&page_size=2&sort_key=totalCost&sort_direction=desc,accounts,api-keys,realtime, andmodelspaged endpoints return 200 with paged data.itemswithusage.apis=0; models still returns model aggregates throughusage.apis.