
OpenClaw Performance Monitor Solution #45853

Closed
wktomo wants to merge 10 commits into openclaw:main from wktomo:main

Conversation


@wktomo wktomo commented Mar 14, 2026

OpenClaw Performance Monitor Solution

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem:
  • Why it matters:
  • What changed:
  • What did NOT change (scope boundary):

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

List user-visible changes (including defaults/config).
If none, write None.

Security Impact (required)

  • New permissions/capabilities? (Yes/No)
  • Secrets/tokens handling changed? (Yes/No)
  • New/changed network calls? (Yes/No)
  • Command/tool execution surface changed? (Yes/No)
  • Data access scope changed? (Yes/No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS:
  • Runtime/container:
  • Model/provider:
  • Integration/channel (if any):
  • Relevant config (redacted):

Steps

Expected

Actual

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
  • Edge cases checked:
  • What you did not verify:

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No)
  • Config/env changes? (Yes/No)
  • Migration needed? (Yes/No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly:
  • Files/config to restore:
  • Known bad symptoms reviewers should watch for:

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk:
    • Mitigation:

OpenClaw Performance Monitor Solution

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef02747d1f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread COMPLETE_CODE.md
Comment on lines +425 to +427
modelMetric.averageTokensPerRequest += tokens;
modelMetric.averageTokensPerSecond = modelMetric.averageTokensPerRequest / (modelMetric.totalResponseTime / 1000);
modelMetric.averageTokensPerRequest = modelMetric.averageTokensPerRequest / modelMetric.requestCount;


P2: Stop re-averaging token totals in place

averageTokensPerRequest is treated as an accumulator and then divided by requestCount in place, which loses historical token totals after the second request (e.g., three 100-token requests end up at 66.7 instead of 100). This also skews averageTokensPerSecond because it is computed from the already-mutated value. Keep a separate total-token counter and derive averages from totals.
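
A minimal sketch of that fix, assuming the per-model metrics record gains a dedicated totalTokens field (the field name and the recordTokens helper are illustrative, not existing code):

// Assumes the metrics record gains a dedicated accumulator field.
interface ModelMetric {
  requestCount: number;
  totalResponseTime: number; // ms
  totalTokens: number;       // new: running total, never overwritten
  averageTokensPerRequest: number;
  averageTokensPerSecond: number;
}

function recordTokens(modelMetric: ModelMetric, tokens: number): void {
  modelMetric.totalTokens += tokens;
  // Derive both averages from totals so history is never lost.
  modelMetric.averageTokensPerRequest = modelMetric.totalTokens / modelMetric.requestCount;
  modelMetric.averageTokensPerSecond =
    modelMetric.totalTokens / (modelMetric.totalResponseTime / 1000);
}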


Comment thread COMPLETE_CODE.md
Comment on lines +479 to +483
for (const model of this.metrics.modelMetrics) {
  if (model.errorRate > this.options.errorRateThreshold) {
    this.createAlert('error_rate', 'high', `Model error rate exceeds threshold: ${model.modelId}`, model.errorRate, this.options.errorRateThreshold);
  }
}

P2: Resolve error-rate alerts after recovery

Unlike CPU/memory/latency checks, the error-rate branch only creates alerts and never resolves them when error rate drops back below threshold, so transient failures leave permanent active alerts and stale report output. Add a recovery path for error_rate alerts so alert state matches current model health.
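
One possible recovery path, reusing the resolveAlertsByTypeAndContext helper that appears in a later commit of this PR (the else branch here is a sketch, not submitted code):

for (const model of this.metrics.modelMetrics) {
  if (model.errorRate > this.options.errorRateThreshold) {
    this.createAlert('error_rate', 'high', `Model error rate exceeds threshold: ${model.modelId}`, model.errorRate, this.options.errorRateThreshold);
  } else {
    // Recovery: clear any active error_rate alert for this model once it is healthy again.
    this.resolveAlertsByTypeAndContext('error_rate', model.modelId);
  }
}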


Comment thread COMPLETE_CODE.md
report += `Agent: ${agent.agentName} (${agent.agentId})\n`;
report += ` Executions: ${agent.executionCount}\n`;
report += ` Avg Time: ${agent.averageExecutionTime.toFixed(2)}ms\n`;
report += ` Success Rate: ${(agent.successRate * 100).toFixed(2)}%\n`;

P2: Report success rate without multiplying twice

successRate is already stored as a percentage in updateAgentMetrics (multiplied by 100 there), so multiplying it by 100 again in the report inflates values by two orders of magnitude (for example 99% prints as 9900%). This makes the generated report misleading for operators.
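
The fix is to print the stored 0–100 value directly, for example:

// successRate is already a percentage; no further scaling needed.
report += `  Success Rate: ${agent.successRate.toFixed(2)}%\n`;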



greptile-apps Bot commented Mar 14, 2026

Greptile Summary

This PR introduces a COMPLETE_CODE.md file containing a TypeScript PerformanceMonitor class intended to track CPU/memory usage, agent execution times, WebSocket latency, and model response metrics for OpenClaw. The PR template is entirely unfilled (problem, change type, security impact, verification steps, etc. are all blank), and the code itself has several blocking issues that must be resolved before merging.

Key issues found:

  • Wrong file extension: The code is placed in a .md file and will never be compiled or executed by TypeScript. It must live in a .ts source file in the appropriate package directory.
  • Hyphen-split ID extraction bug: endAgentExecution and endModelRequest both re-derive the original ID via split('-')[0], which silently truncates any ID that contains hyphens (e.g. UUID-style IDs), causing metrics to be attributed to the wrong agent/model.
  • Double-percent display bug: successRate and errorRate are each stored as 0–100 percentage values, but generateReport multiplies them by 100 again — printing values like 9500.00% for a 95% success rate.
  • Stale CPU baseline: this.initialCpuUsage is set only once at construction and never updated, so calculateCpuUsage measures cumulative CPU since startup rather than per-interval load, eventually saturating at 100%.
  • Token accumulator/average aliasing: averageTokensPerRequest is used as a running total accumulator between requests but is overwritten with the computed average each cycle, causing incorrect averages from the third tracked request onward.

Confidence Score: 0/5

  • Not safe to merge — the code cannot run in its current form and contains multiple correctness bugs.
  • The single changed file has the wrong extension (.md instead of .ts), meaning the code is dead on arrival. On top of that, there are five distinct logical bugs across ID extraction, percentage display, CPU measurement, and token tracking. The PR template is also completely unfilled with no description of the problem, change type, security assessment, or verification evidence.
  • COMPLETE_CODE.md — needs to be converted to a .ts source file and all five logic bugs corrected before this can be reviewed further.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: COMPLETE_CODE.md
Line: 1-19

Comment:
**TypeScript source code placed in a Markdown file**

This file has a `.md` extension but contains raw TypeScript source code with no Markdown formatting, prose, or code fences. A `.md` file is treated as documentation — it will not be compiled or executed by TypeScript. This code needs to live in a `.ts` (or `.tsx`) file within the appropriate source directory (e.g. `src/monitoring/PerformanceMonitor.ts`) for it to work at all. As-is, none of this code is reachable by the rest of the codebase.

How can I resolve this? If you propose a fix, please make it concise.


Last reviewed commit: ef02747

Comment thread COMPLETE_CODE.md
report += `Agent: ${agent.agentName} (${agent.agentId})\n`;
report += ` Executions: ${agent.executionCount}\n`;
report += ` Avg Time: ${agent.averageExecutionTime.toFixed(2)}ms\n`;
report += ` Success Rate: ${(agent.successRate * 100).toFixed(2)}%\n`;

successRate and errorRate multiplied by 100 twice

successRate is already stored as a 0–100 percentage (see line 397: ... * 100), but here it is multiplied by 100 again — producing values like 9500.00% for a 95% success rate.

The same issue occurs on line 314 for errorRate: errorRate is tracked as a 0–100 value (see line 431: ... + 100) / requestCount), and then displayed with another * 100 multiplier.

Both lines should simply use .toFixed(2)% without the extra * 100.


Comment thread COMPLETE_CODE.md
Comment on lines +365 to +370
private calculateCpuUsage(): number {
  const cpuUsage = process.cpuUsage(this.initialCpuUsage);
  const totalTime = cpuUsage.user + cpuUsage.system;
  // Rough estimate: convert to percentage
  return Math.min((totalTime / (this.options.interval * 1000)) * 100, 100);
}

CPU baseline never resets between intervals

this.initialCpuUsage is set once in the constructor and is never updated. process.cpuUsage(this.initialCpuUsage) therefore returns the cumulative CPU time since object construction, not the CPU time consumed during the most recent monitoring interval. As the monitor runs longer, the per-interval estimate drifts further and further from reality — eventually the denominator (this.options.interval * 1000 microseconds) becomes tiny relative to the ever-growing numerator, pushing the return value toward 100% on every collection cycle.

this.initialCpuUsage must be updated at the end of each collectMetrics call (i.e. this.initialCpuUsage = process.cpuUsage()) to measure only the work done in the most recent interval.
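
A sketch of that re-baselining inside the collection cycle (surrounding bookkeeping elided):

private collectMetrics(): void {
  // Delta since the last baseline, in microseconds of CPU time.
  const cpuUsage = process.cpuUsage(this.initialCpuUsage);
  const totalTime = cpuUsage.user + cpuUsage.system;
  const cpuPercent = Math.min((totalTime / (this.options.interval * 1000)) * 100, 100);
  // ... record cpuPercent and the other metrics ...
  // Reset the baseline so the next cycle measures only the next interval.
  this.initialCpuUsage = process.cpuUsage();
}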


Comment thread COMPLETE_CODE.md
Comment on lines +424 to +428
if (tokens) {
  modelMetric.averageTokensPerRequest += tokens;
  modelMetric.averageTokensPerSecond = modelMetric.averageTokensPerRequest / (modelMetric.totalResponseTime / 1000);
  modelMetric.averageTokensPerRequest = modelMetric.averageTokensPerRequest / modelMetric.requestCount;
}

averageTokensPerRequest used as both accumulator and final average

On the first request with tokens = 500:

  • averageTokensPerRequest starts at 0, becomes 500 (accumulation step), then is divided by requestCount (1) → stored as 500 (correct).

On the second request with tokens = 300:

  • averageTokensPerRequest is currently the average from last time (500), not the running total (500). Adding 300 gives 800, and dividing by requestCount (2) gives 400 — but the correct average of [500, 300] is 400. By coincidence this appears correct for the second call.

On the third request with tokens = 200:

  • averageTokensPerRequest is 400 (previous average, not total 800). Adding 200 gives 600, divided by 3 = 200 — but the correct average of [500, 300, 200] is 333.3.

The field is being used as if it holds a running total but is actually being overwritten with the average each iteration. A separate totalTokens field should be added to the metrics struct and used as the accumulator here.


Comment thread COMPLETE_CODE.md
Comment on lines +201 to +203
// Extract agent ID from execution ID
const agentId = executionId.split('-')[0];
this.updateAgentMetrics(agentId, executionTime, success);

agentId extraction breaks for hyphenated IDs

executionId is constructed as `${agentId}-${Date.now()}-${randomString}`. If agentId itself contains hyphens (which is common for UUID-style IDs), then executionId.split('-')[0] returns only the first hyphen-separated segment, not the full original ID. This causes updateAgentMetrics to either create phantom metric entries under a truncated ID or silently merge metrics from entirely different agents that happen to share the same first segment.

The identical bug exists for modelId on line 237 inside endModelRequest, where requestId.split('-')[0] has the same problem.

The original IDs should be stored alongside startTime in the agentExecutions / modelRequests maps and read back directly, rather than being re-derived from the composite execution ID.
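
A sketch of that approach as monitor-class methods (the stored record shape is illustrative):

// Store the original agentId with the start time instead of re-deriving it.
private agentExecutions = new Map<string, { agentId: string; startTime: number }>();

startAgentExecution(agentId: string): string {
  const executionId = `${agentId}-${Date.now()}-${Math.random().toString(36).slice(2)}`;
  this.agentExecutions.set(executionId, { agentId, startTime: Date.now() });
  return executionId;
}

endAgentExecution(executionId: string, success: boolean): void {
  const record = this.agentExecutions.get(executionId);
  if (!record) return;
  const executionTime = Date.now() - record.startTime;
  this.agentExecutions.delete(executionId);
  this.updateAgentMetrics(record.agentId, executionTime, success); // full ID, hyphens intact
}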



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 69428802c4


Comment thread package.json
Comment on lines +5 to +9
"main": "dist/monitoring/performance-monitor.js",
"types": "dist/monitoring/performance-monitor.d.ts",
"scripts": {
"build": "tsc",
"test": "ts-node src/monitoring/performance-monitor.test.ts",

P0: Point package entrypoints to files built by this repo

package.json now declares main, types, and test paths under dist/monitoring and src/monitoring, but this commit only adds performance-monitor-fixed.ts at the repo root and no src/monitoring/performance-monitor*.ts files. In this state, npm run build cannot produce the declared runtime entrypoint and npm test targets a non-existent test file, so the published package is broken immediately.


Comment thread package.json
Comment on lines +2 to +6
"name": "@openclaw/performance-monitor",
"version": "1.0.3",
"description": "Performance monitoring tool for OpenClaw",
"main": "dist/monitoring/performance-monitor.js",
"types": "dist/monitoring/performance-monitor.d.ts",

P0: Keep the root manifest aligned with the openclaw CLI

This replaces the repository’s root manifest with a standalone @openclaw/performance-monitor package, which removes the root CLI packaging contract (openclaw bin/export surface and full app dependency graph). If merged, installs from this repo would no longer ship the OpenClaw CLI/application and would effectively publish a different product from the same root package.


Comment thread performance-monitor-fixed.ts Outdated
Comment on lines +750 to +752
this.alerts.set(alertId, alert);
this.emit('alert', alert);
this.log(`Alert created: ${message}`);

P2: Enforce the maxAlertsHistory cap when creating alerts

createAlert always inserts a new entry into this.alerts but never prunes older records, even though maxAlertsHistory is exposed as an option. In sustained threshold violations (for example repeated error-rate alerts per interval), alert history will grow unbounded and progressively increase memory/iteration cost in getAlerts and resolver scans.
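
A minimal pruning step to run after each insert, assuming this.alerts is a Map (which iterates in insertion order) and maxAlertsHistory is the configured cap:

private pruneAlertHistory(): void {
  // The first key of a Map is the oldest insertion, so evict from the front.
  while (this.alerts.size > this.options.maxAlertsHistory) {
    const oldestId = this.alerts.keys().next().value;
    if (oldestId === undefined) break;
    this.alerts.delete(oldestId);
  }
}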



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cd8a10f470


Comment thread tsconfig.json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",

P0: Restore ESM-compatible TypeScript module settings

Switching the root tsconfig to "module": "commonjs" breaks compilation for existing source files that use ESM-only constructs like import.meta.url (for example src/version.ts), because TypeScript only allows import.meta with ESM module targets. In this state, a normal build/typecheck fails with TS1343 before producing artifacts, so this change blocks the repository’s standard build pipeline.
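
Restoring the settings the maintainer's closing note says main relies on would look roughly like this (only the relevant options shown; the rest of the config is assumed unchanged):

{
  "compilerOptions": {
    "module": "NodeNext",
    "moduleResolution": "NodeNext"
  }
}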


}

private createAlert(type: 'cpu' | 'memory' | 'latency' | 'error_rate', severity: 'low' | 'medium' | 'high' | 'critical', message: string, value: number, threshold: number): void {
  const alertId = `${type}-${Date.now()}`;

P2: Generate collision-safe alert IDs

Using ${type}-${Date.now()} as the alert key can overwrite alerts created in the same millisecond for the same type, which is especially likely in the per-model error_rate loop where several models can breach threshold in one collection pass. Because alerts are stored in a Map, later inserts replace earlier ones and silently drop active alerts, causing incomplete alert history/reporting under multi-model failures.
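
One collision-safe variant, adding a monotonic counter to break same-millisecond ties (the counter field and helper are assumed additions, not existing code):

private alertSequence = 0;

private nextAlertId(type: string): string {
  // Timestamp keeps IDs roughly sortable; the counter disambiguates same-ms alerts.
  return `${type}-${Date.now()}-${++this.alertSequence}`;
}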


wktomo commented Mar 14, 2026

Discussion of this solution is welcome in the comments.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7aa87e6465



private resolveAlertsByTypeAndContext(type: string, context: string): void {
  this.alerts.forEach((alert, id) => {
    if (alert.type === type && !alert.resolved && alert.message.includes(context)) {

P1: Match error-rate alert context by exact model ID

The resolver currently matches error_rate alerts with alert.message.includes(context), so model IDs that are prefixes of others (for example gpt-4 and gpt-4o-mini) can resolve each other’s alerts incorrectly. In that case, a healthy model can clear an unrelated active alert and hide a real ongoing failure, which makes alert state and reporting inaccurate under multi-model traffic.
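
A stricter sketch that carries the model ID as structured data rather than substring-matching the message (the context field is an assumed addition to the alert shape):

private resolveAlertsByTypeAndContext(type: string, context: string): void {
  this.alerts.forEach((alert) => {
    // Exact equality: 'gpt-4' no longer resolves alerts belonging to 'gpt-4o-mini'.
    if (alert.type === type && !alert.resolved && alert.context === context) {
      alert.resolved = true;
    }
  });
}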


}

public getMetrics(): PerformanceMetrics {
  return { ...this.metrics };

P2: Return deep snapshots from metrics accessor

getMetrics() only shallow-copies the root object, so callers receive live references to nested structures like agentMetrics, modelMetrics, and websocketMetrics. Any external mutation of the returned value (even accidental sorting/splicing) mutates monitor internals and can corrupt subsequent threshold checks and generated reports.
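
A minimal deep-copy sketch using structuredClone (a global in Node 17+; whether that runtime floor is acceptable here is an assumption):

public getMetrics(): PerformanceMetrics {
  // Deep clone: callers can sort/splice the result without mutating monitor state.
  return structuredClone(this.metrics);
}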


wktomo added 3 commits March 14, 2026 17:18
A comprehensive data analysis toolkit for OpenClaw performance metrics.
An innovative OpenClaw Analytics solution

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bdf207b626


Comment on lines +191 to +192
if 'metrics' in data:
    metrics_list = data['metrics']

P1: Parse wrapped monitor exports as a single metrics sample

load_from_json treats any dictionary with a metrics key as an iterable metrics list, but the monitor exporter writes metrics as a single object (performance-monitor-fixed.ts exportMetrics). In that case this loop iterates object keys instead of metric objects, _parse_metric_point fails, and the analyzer silently loads zero data points from native OpenClaw exports.


Comment on lines +654 to +655
requests = points[i].model_requests + points[i].agent_executions
rate = (requests / time_diff) * 60

P2: Compute throughput from interval deltas, not cumulative totals

Throughput currently uses the absolute model_requests and agent_executions values from the current snapshot, which are cumulative counters when parsed from monitor metrics, so each later sample appears to have higher requests/min even if the real per-interval load is constant. Subtract the previous sample’s totals before dividing by elapsed time to avoid inflated trend and capacity conclusions.



@wktomo wktomo left a comment


A data analysis platform for monitoring platform usage, with support for exporting analysis results.

A comprehensive data analysis toolkit for OpenClaw performance metrics.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28cf3ebb50


memory_usage = data.get('memoryUsage', {})

return MetricPoint(
    timestamp=data.get('timestamp', 0),

P1: Normalize monitor timestamps to seconds

The parser stores timestamp as-is, but the monitor exporter in this change emits Date.now() millisecond values, so downstream calls like datetime.fromtimestamp(...) in summary/anomaly paths can throw ValueError (year out of range) on real OpenClaw exports and time-based rates are scaled incorrectly. Convert millisecond epochs to seconds during parsing so native exported metrics files can be analyzed safely.


Comment on lines +256 to +257
if self.analysis_cache and not force:
    return self.analysis_cache

P2: Invalidate cached analysis after data updates

analyze() returns analysis_cache whenever it is populated, but data mutators (add_metric_point, load_from_json, load_from_csv) never clear that cache, so any points added after the first analysis are silently ignored unless callers remember force=True. This causes stale metrics, anomaly lists, and recommendations in incremental or multi-file workflows.


Comment on lines +458 to +459
total_agents = sum(p.agent_executions for p in sorted_points)
total_models = sum(p.model_requests for p in sorted_points)

P2: Compute summary totals from deltas or latest counters

Summary totals currently sum agent_executions and model_requests across every sampled point, but these values are parsed from monitor snapshots as cumulative counters, so totals are double-counted over time (for example, 10 then 15 executions is reported as 25). This inflates total_operations and skews overall_error_rate; derive totals from per-interval deltas or from the latest cumulative sample instead.


@steipete
Contributor

Closing this as not actionable after Codex review.

The PR is not actionable as submitted: the body is an unfilled template, and the diff bundles multiple unrelated proposals that conflict with the current root package/compiler contract on main. This is stronger evidence for an incoherent submission than for any code-level review outcome.

What I checked:

  • Repo contribution rules: VISION.md:36-39 requires one PR per issue/topic and notes PRs over ~5,000 changed lines are exceptional. This PR combines unrelated changes across root packaging, TypeScript code, and a separate Python toolkit. (VISION.md:36, f03252aaf9cc)
  • Current root package contract: package.json:2-18 and package.json:23-54 show the repo root is the openclaw CLI/package with the openclaw bin and broad root export surface. The PR context shows it rewrites the root manifest into a different standalone package, which is a separate product direction rather than a focused feature change. (package.json:2, f03252aaf9cc)
  • Current TypeScript module contract: tsconfig.json:2-31 keeps the repo on module: "NodeNext" / moduleResolution: "NodeNext". src/entry.ts:38-40 and src/version.ts:157-160 use import.meta, so the PR's CommonJS rewrite in the provided diff is incompatible with current main's module model. (tsconfig.json:2, f03252aaf9cc)
  • PR contents are internally contradictory: From the provided PR context: the body is still the blank PR template; the diff adds TypeScript in COMPLETE_CODE.md, a second implementation in performance-monitor-fixed.ts, a Python package under tmp/openclaw-analytics-package/**, and root rewrites to package.json and tsconfig.json. That is not one coherent, reviewable change.

Review notes: reviewed against f03252aaf9cc.

@steipete steipete closed this Apr 25, 2026