OpenClaw Performance Monitor Solution #45853
OpenClaw Performance Monitor Solution
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef02747d1f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
modelMetric.averageTokensPerRequest += tokens;
modelMetric.averageTokensPerSecond = modelMetric.averageTokensPerRequest / (modelMetric.totalResponseTime / 1000);
modelMetric.averageTokensPerRequest = modelMetric.averageTokensPerRequest / modelMetric.requestCount;
```
Stop re-averaging token totals in place
averageTokensPerRequest is treated as an accumulator and then divided by requestCount in place, which loses historical token totals after the second request (e.g., three 100-token requests end up at 66.7 instead of 100). This also skews averageTokensPerSecond because it is computed from the already-mutated value. Keep a separate total-token counter and derive averages from totals.
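A minimal sketch of the suggested fix, assuming a new `totalTokens` field is added to the per-model metrics record (field names mirror the snippet above, not the project's actual implementation):

```ts
if (tokens) {
  // Accumulate into a dedicated total instead of mutating the average in place.
  modelMetric.totalTokens = (modelMetric.totalTokens ?? 0) + tokens;
  modelMetric.averageTokensPerRequest = modelMetric.totalTokens / modelMetric.requestCount;
  modelMetric.averageTokensPerSecond =
    modelMetric.totalTokens / (modelMetric.totalResponseTime / 1000);
}
```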
```ts
for (const model of this.metrics.modelMetrics) {
  if (model.errorRate > this.options.errorRateThreshold) {
    this.createAlert('error_rate', 'high', `Model error rate exceeds threshold: ${model.modelId}`, model.errorRate, this.options.errorRateThreshold);
  }
}
```
Resolve error-rate alerts after recovery
Unlike CPU/memory/latency checks, the error-rate branch only creates alerts and never resolves them when error rate drops back below threshold, so transient failures leave permanent active alerts and stale report output. Add a recovery path for error_rate alerts so alert state matches current model health.
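A sketch of what that recovery path could look like, mirroring the CPU/memory/latency checks; the resolution logic here is an assumption about the surrounding class, not the PR's actual code:

```ts
for (const model of this.metrics.modelMetrics) {
  if (model.errorRate > this.options.errorRateThreshold) {
    this.createAlert('error_rate', 'high',
      `Model error rate exceeds threshold: ${model.modelId}`,
      model.errorRate, this.options.errorRateThreshold);
  } else {
    // Error rate has recovered: mark matching active alerts as resolved.
    // (A later review note recommends matching by exact model ID rather than substring.)
    this.alerts.forEach((alert) => {
      if (alert.type === 'error_rate' && !alert.resolved && alert.message.includes(model.modelId)) {
        alert.resolved = true;
      }
    });
  }
}
```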
```ts
report += `Agent: ${agent.agentName} (${agent.agentId})\n`;
report += `  Executions: ${agent.executionCount}\n`;
report += `  Avg Time: ${agent.averageExecutionTime.toFixed(2)}ms\n`;
report += `  Success Rate: ${(agent.successRate * 100).toFixed(2)}%\n`;
```
Report success rate without multiplying twice
successRate is already stored as a percentage in updateAgentMetrics (multiplied by 100 there), so multiplying it by 100 again in the report inflates values by two orders of magnitude (for example 99% prints as 9900%). This makes the generated report misleading for operators.
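A minimal sketch of the corrected report line, assuming `successRate` already holds a 0–100 value as stored by `updateAgentMetrics`:

```ts
// Format the stored 0–100 percentage directly instead of scaling it again.
report += `  Success Rate: ${agent.successRate.toFixed(2)}%\n`;
```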
Greptile Summary
This PR introduces a performance-monitoring implementation for OpenClaw. Key issues found:
Confidence Score: 0/5
Prompt To Fix All With AI
This is a comment left during a code review.
Path: COMPLETE_CODE.md
Line: 1-19
Comment:
**TypeScript source code placed in a Markdown file**
This file has a `.md` extension but contains raw TypeScript source code with no Markdown formatting, prose, or code fences. A `.md` file is treated as documentation — it will not be compiled or executed by TypeScript. This code needs to live in a `.ts` (or `.tsx`) file within the appropriate source directory (e.g. `src/monitoring/PerformanceMonitor.ts`) for it to work at all. As-is, none of this code is reachable by the rest of the codebase.
How can I resolve this? If you propose a fix, please make it concise.
Last reviewed commit: ef02747
```ts
report += `Agent: ${agent.agentName} (${agent.agentId})\n`;
report += `  Executions: ${agent.executionCount}\n`;
report += `  Avg Time: ${agent.averageExecutionTime.toFixed(2)}ms\n`;
report += `  Success Rate: ${(agent.successRate * 100).toFixed(2)}%\n`;
```
successRate and errorRate multiplied by 100 twice
successRate is already stored as a 0–100 percentage (see line 397: ... * 100), but here it is multiplied by 100 again — producing values like 9500.00% for a 95% success rate.
The same issue occurs on line 314 for errorRate: errorRate is tracked as a 0–100 value (see line 431: ... + 100) / requestCount), and then displayed with another * 100 multiplier.
Both lines should simply use .toFixed(2)% without the extra * 100.
```ts
private calculateCpuUsage(): number {
  const cpuUsage = process.cpuUsage(this.initialCpuUsage);
  const totalTime = cpuUsage.user + cpuUsage.system;
  // Rough estimate: convert to percentage
  return Math.min((totalTime / (this.options.interval * 1000)) * 100, 100);
}
```
CPU baseline never resets between intervals
this.initialCpuUsage is set once in the constructor and is never updated. process.cpuUsage(this.initialCpuUsage) therefore returns the cumulative CPU time since object construction, not the CPU time consumed during the most recent monitoring interval. As the monitor runs longer, the per-interval estimate drifts further and further from reality — eventually the denominator (this.options.interval * 1000 microseconds) becomes tiny relative to the ever-growing numerator, pushing the return value toward 100% on every collection cycle.
this.initialCpuUsage must be updated at the end of each collectMetrics call (i.e. this.initialCpuUsage = process.cpuUsage()) to measure only the work done in the most recent interval.
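A minimal sketch of the suggested reset, assuming `collectMetrics` runs once per monitoring interval (the rest of its body is elided):

```ts
private collectMetrics(): void {
  // ... gather memory, latency, and model metrics ...
  this.metrics.cpuUsage = this.calculateCpuUsage();
  // Reset the baseline so the next interval measures only newly consumed CPU time.
  this.initialCpuUsage = process.cpuUsage();
}
```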
```ts
if (tokens) {
  modelMetric.averageTokensPerRequest += tokens;
  modelMetric.averageTokensPerSecond = modelMetric.averageTokensPerRequest / (modelMetric.totalResponseTime / 1000);
  modelMetric.averageTokensPerRequest = modelMetric.averageTokensPerRequest / modelMetric.requestCount;
}
```
averageTokensPerRequest used as both accumulator and final average
On the first request with `tokens = 500`:
- `averageTokensPerRequest` starts at `0`, becomes `500` (accumulation step), then is divided by `requestCount` (1) → stored as `500` (correct).
On the second request with `tokens = 300`:
- `averageTokensPerRequest` is currently the *average* from last time (`500`), not the running total (`500`). Adding `300` gives `800`, and dividing by `requestCount` (2) gives `400` — but the correct average of `[500, 300]` is `400`. By coincidence this appears correct for the second call.
On the third request with `tokens = 200`:
- `averageTokensPerRequest` is `400` (previous average, not total `800`). Adding `200` gives `600`, divided by `3` = `200` — but the correct average of `[500, 300, 200]` is `333.3`.
The field is being used as if it holds a running total but is actually being overwritten with the average each iteration. A separate `totalTokens` field should be added to the metrics struct and used as the accumulator here.
```ts
// Extract agent ID from execution ID
const agentId = executionId.split('-')[0];
this.updateAgentMetrics(agentId, executionTime, success);
```
agentId extraction breaks for hyphenated IDs
executionId is constructed as `${agentId}-${Date.now()}-${randomString}`. If agentId itself contains hyphens (which is common for UUID-style IDs), then executionId.split('-')[0] returns only the first hyphen-separated segment, not the full original ID. This causes updateAgentMetrics to either create phantom metric entries under a truncated ID or silently merge metrics from entirely different agents that happen to share the same first segment.
The identical bug exists for modelId on line 237 inside endModelRequest — requestId.split('-')[0] has the same problem.
The original IDs should be stored alongside startTime in the agentExecutions / modelRequests maps and read back directly, rather than being re-derived from the composite execution ID.
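A sketch of the suggested approach: keep the original ID in the tracking map and read it back, rather than re-deriving it from the composite ID. The method and map shapes below are assumptions based on the snippets in this review, not the PR's exact code:

```ts
public startAgentExecution(agentId: string, agentName: string): string {
  const executionId = `${agentId}-${Date.now()}-${Math.random().toString(36).slice(2)}`;
  // Store the original agent ID alongside the start time.
  this.agentExecutions.set(executionId, { agentId, agentName, startTime: Date.now() });
  return executionId;
}

public endAgentExecution(executionId: string, success: boolean): void {
  const record = this.agentExecutions.get(executionId);
  if (!record) return;
  const executionTime = Date.now() - record.startTime;
  // Read the stored ID back directly; no split('-') needed.
  this.updateAgentMetrics(record.agentId, executionTime, success);
  this.agentExecutions.delete(executionId);
}
```

The same pattern applies to `modelRequests` in `endModelRequest`.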
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 69428802c4
| "main": "dist/monitoring/performance-monitor.js", | ||
| "types": "dist/monitoring/performance-monitor.d.ts", | ||
| "scripts": { | ||
| "build": "tsc", | ||
| "test": "ts-node src/monitoring/performance-monitor.test.ts", |
Point package entrypoints to files built by this repo
package.json now declares main, types, and test paths under `src/monitoring` / `dist/monitoring`, but this commit only adds `performance-monitor-fixed.ts` at the repo root and no `src/monitoring/performance-monitor*.ts` files. In this state, `npm run build` cannot produce the declared runtime entrypoint and `npm test` targets a non-existent test file, so the published package is broken immediately.
| "name": "@openclaw/performance-monitor", | ||
| "version": "1.0.3", | ||
| "description": "Performance monitoring tool for OpenClaw", | ||
| "main": "dist/monitoring/performance-monitor.js", | ||
| "types": "dist/monitoring/performance-monitor.d.ts", |
Keep the root manifest aligned with the openclaw CLI
This replaces the repository’s root manifest with a standalone @openclaw/performance-monitor package, which removes the root CLI packaging contract (openclaw bin/export surface and full app dependency graph). If merged, installs from this repo would no longer ship the OpenClaw CLI/application and would effectively publish a different product from the same root package.
```ts
this.alerts.set(alertId, alert);
this.emit('alert', alert);
this.log(`Alert created: ${message}`);
```
Enforce the maxAlertsHistory cap when creating alerts
createAlert always inserts a new entry into this.alerts but never prunes older records, even though maxAlertsHistory is exposed as an option. In sustained threshold violations (for example repeated error-rate alerts per interval), alert history will grow unbounded and progressively increase memory/iteration cost in getAlerts and resolver scans.
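One way to enforce the cap, sketched under the assumption that `this.alerts` is an insertion-ordered `Map` and `maxAlertsHistory` is a numeric option; `pruneAlerts` is a hypothetical helper that `createAlert` would call after inserting:

```ts
private pruneAlerts(): void {
  const excess = this.alerts.size - this.options.maxAlertsHistory;
  if (excess <= 0) return;
  // Map preserves insertion order, so the first keys are the oldest alerts.
  for (const id of Array.from(this.alerts.keys()).slice(0, excess)) {
    this.alerts.delete(id);
  }
}
```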
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cd8a10f470
```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
```
Restore ESM-compatible TypeScript module settings
Switching the root tsconfig to "module": "commonjs" breaks compilation for existing source files that use ESM-only constructs like import.meta.url (for example src/version.ts), because TypeScript only allows import.meta with ESM module targets. In this state, a normal build/typecheck fails with TS1343 before producing artifacts, so this change blocks the repository’s standard build pipeline.
```ts
private createAlert(type: 'cpu' | 'memory' | 'latency' | 'error_rate', severity: 'low' | 'medium' | 'high' | 'critical', message: string, value: number, threshold: number): void {
  const alertId = `${type}-${Date.now()}`;
```
Generate collision-safe alert IDs
Using ${type}-${Date.now()} as the alert key can overwrite alerts created in the same millisecond for the same type, which is especially likely in the per-model error_rate loop where several models can breach threshold in one collection pass. Because alerts are stored in a Map, later inserts replace earlier ones and silently drop active alerts, causing incomplete alert history/reporting under multi-model failures.
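A sketch of one collision-safe scheme, adding a hypothetical per-instance counter (`crypto.randomUUID()` would work equally well):

```ts
private alertSequence = 0;

// Unique even when several alerts of the same type are created in one millisecond.
private nextAlertId(type: string): string {
  return `${type}-${Date.now()}-${++this.alertSequence}`;
}
```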
Feel free to discuss solutions to this in the comments.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7aa87e6465
```ts
private resolveAlertsByTypeAndContext(type: string, context: string): void {
  this.alerts.forEach((alert, id) => {
    if (alert.type === type && !alert.resolved && alert.message.includes(context)) {
```
Match error-rate alert context by exact model ID
The resolver currently matches error_rate alerts with alert.message.includes(context), so model IDs that are prefixes of others (for example gpt-4 and gpt-4o-mini) can resolve each other’s alerts incorrectly. In that case, a healthy model can clear an unrelated active alert and hide a real ongoing failure, which makes alert state and reporting inaccurate under multi-model traffic.
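A sketch of exact matching, assuming a hypothetical `context` field is stored on each alert at creation time (e.g. the model ID for `error_rate` alerts):

```ts
private resolveAlertsByTypeAndContext(type: string, context: string): void {
  this.alerts.forEach((alert) => {
    // Strict equality on the stored context, so `gpt-4` can no longer
    // resolve alerts that belong to `gpt-4o-mini`.
    if (alert.type === type && !alert.resolved && alert.context === context) {
      alert.resolved = true;
    }
  });
}
```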
```ts
public getMetrics(): PerformanceMetrics {
  return { ...this.metrics };
```
Return deep snapshots from metrics accessor
getMetrics() only shallow-copies the root object, so callers receive live references to nested structures like agentMetrics, modelMetrics, and websocketMetrics. Any external mutation of the returned value (even accidental sorting/splicing) mutates monitor internals and can corrupt subsequent threshold checks and generated reports.
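A minimal sketch of a deep snapshot, assuming the metrics object is plain data; `structuredClone` is built into Node 17+ (a JSON round-trip is a common fallback):

```ts
public getMetrics(): PerformanceMetrics {
  // Deep copy so callers cannot mutate agentMetrics/modelMetrics internals.
  return structuredClone(this.metrics);
}
```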
A comprehensive data analysis toolkit for OpenClaw performance metrics.
OpenClaw Analytics: an innovative solution
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bdf207b626
```python
if 'metrics' in data:
    metrics_list = data['metrics']
```
Parse wrapped monitor exports as a single metrics sample
load_from_json treats any dictionary with a metrics key as an iterable metrics list, but the monitor exporter writes metrics as a single object (performance-monitor-fixed.ts exportMetrics). In that case this loop iterates object keys instead of metric objects, _parse_metric_point fails, and the analyzer silently loads zero data points from native OpenClaw exports.
```python
requests = points[i].model_requests + points[i].agent_executions
rate = (requests / time_diff) * 60
```
Compute throughput from interval deltas, not cumulative totals
Throughput currently uses the absolute model_requests and agent_executions values from the current snapshot, which are cumulative counters when parsed from monitor metrics, so each later sample appears to have higher requests/min even if the real per-interval load is constant. Subtract the previous sample’s totals before dividing by elapsed time to avoid inflated trend and capacity conclusions.
A comprehensive data analysis toolkit for OpenClaw performance metrics.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 28cf3ebb50
```python
memory_usage = data.get('memoryUsage', {})

return MetricPoint(
    timestamp=data.get('timestamp', 0),
```
Normalize monitor timestamps to seconds
The parser stores timestamp as-is, but the monitor exporter in this change emits Date.now() millisecond values, so downstream calls like datetime.fromtimestamp(...) in summary/anomaly paths can throw ValueError (year out of range) on real OpenClaw exports and time-based rates are scaled incorrectly. Convert millisecond epochs to seconds during parsing so native exported metrics files can be analyzed safely.
```python
if self.analysis_cache and not force:
    return self.analysis_cache
```
Invalidate cached analysis after data updates
analyze() returns analysis_cache whenever it is populated, but data mutators (add_metric_point, load_from_json, load_from_csv) never clear that cache, so any points added after the first analysis are silently ignored unless callers remember force=True. This causes stale metrics, anomaly lists, and recommendations in incremental or multi-file workflows.
```python
total_agents = sum(p.agent_executions for p in sorted_points)
total_models = sum(p.model_requests for p in sorted_points)
```
Compute summary totals from deltas or latest counters
Summary totals currently sum agent_executions and model_requests across every sampled point, but these values are parsed from monitor snapshots as cumulative counters, so totals are double-counted over time (for example, 10 then 15 executions is reported as 25). This inflates total_operations and skews overall_error_rate; derive totals from per-interval deltas or from the latest cumulative sample instead.
Closing this as not actionable after Codex review. The PR is not actionable as submitted: the body is an unfilled template, and the diff bundles multiple unrelated proposals that conflict with the current root package/compiler contract.

What I checked:

Review notes: reviewed against f03252aaf9cc.
OpenClaw Performance Monitor Solution
Summary
Describe the problem and fix in 2–5 bullets:
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
List user-visible changes (including defaults/config).
If none, write None.

Security Impact (required)
(Yes/No for each checklist item.) If Yes, explain risk + mitigation:

Repro + Verification
Environment
Steps
Expected
Actual
Evidence
Attach at least one:
Human Verification (required)
What you personally verified (not just CI), and how:
Review Conversations
If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.
Compatibility / Migration
(Yes/No for each checklist item.)

Failure Recovery (if this breaks)
Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write None.