feat: add automatic config rollback on gateway startup failure#44238
feat: add automatic config rollback on gateway startup failure#44238jung-geun wants to merge 27 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR adds automatic config rollback on gateway startup failure, introducing a new Issues found:
Confidence Score: 2/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/config/config-backup-restore.ts
Line: 82
Comment:
**Timestamped backups are invisible to `listConfigBackups`**
The regex `^${base}\.bak\.(\d+)$` only matches files with purely numeric suffixes (e.g., `config.json.bak.1`, `config.json.bak.2`). However, `createConfigBackup` with `timestamp: true` (the default used by the CLI `backup` command) writes files in the format `config.json.bak.2024-01-15_00-00-00` or `config.json.bak.2024-01-15_00-00-00-label` (see lines 135–138).
These timestamp-based backup files will never match this regex. As a consequence:
- `openclaw config list-backups` will never show backups created via `openclaw config backup`
- `attemptConfigRollback` will never try these backups when rolling back on gateway startup failure
- The entire auto-rollback feature silently fails for CLI-created backups
The regex needs to be broadened to match all backup file variants:
```suggestion
const match = entry.match(new RegExp(`^${base}\\.bak\\.(.+)$`));
```
However, this alone also means the `failed-*` and primary `.bak` entries might collide, so consider restricting to non-`.failed-` suffixes.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/cli/config-cli.ts
Line: 571-573
Comment:
**`--last` doesn't actually restore from the most recent backup**
`listConfigBackups` is called above (line 564) to check that at least one backup exists and it returns results sorted newest-first. But the `backups[0].path` is never passed to `restoreConfigBackup`. Instead, `restoreConfigBackup` is called with no `backupPath`, causing it to default to `${configPath}.bak` (the hard-coded primary backup, see `config-backup-restore.ts` line 161).
If `.bak` exists but a newer backup like `.bak.3` is available, `--last` silently restores from the older `.bak` file — the opposite of what the user expects.
Fix by passing the most recent backup path explicitly:
```suggestion
const result = await restoreConfigBackup(snapshot.path, {
backupPath: backups[0].path,
dryRun: opts.dryRun,
});
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/config/config-backup-restore.ts
Line: 16-20
Comment:
**`keepBackups` option is accepted but never enforced**
`ConfigBackupOptions.keepBackups` (and `GatewayConfigBackup.keepBackups` in `types.gateway.ts`) is documented as "Number of backup files to keep (default: 10)", but `createConfigBackup` never reads this value or prunes old backups. Over time, numbered backups will accumulate indefinitely regardless of what the user configures.
Consider adding a pruning step at the end of `createConfigBackup` that lists existing numbered backups and deletes the oldest ones beyond the configured limit.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 5bd5377 |
| try { | ||
| const entries = await ioFs.readdir(dir); | ||
| for (const entry of entries) { | ||
| const match = entry.match(new RegExp(`^${base}\\.bak\\.(\\d+)$`)); |
There was a problem hiding this comment.
Timestamped backups are invisible to listConfigBackups
The regex ^${base}\.bak\.(\d+)$ only matches files with purely numeric suffixes (e.g., config.json.bak.1, config.json.bak.2). However, createConfigBackup with timestamp: true (the default used by the CLI backup command) writes files in the format config.json.bak.2024-01-15_00-00-00 or config.json.bak.2024-01-15_00-00-00-label (see lines 135–138).
These timestamp-based backup files will never match this regex. As a consequence:
openclaw config list-backupswill never show backups created viaopenclaw config backupattemptConfigRollbackwill never try these backups when rolling back on gateway startup failure- The entire auto-rollback feature silently fails for CLI-created backups
The regex needs to be broadened to match all backup file variants:
| const match = entry.match(new RegExp(`^${base}\\.bak\\.(\\d+)$`)); | |
| const match = entry.match(new RegExp(`^${base}\\.bak\\.(.+)$`)); |
However, this alone also means the failed-* and primary .bak entries might collide, so consider restricting to non-.failed- suffixes.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/config-backup-restore.ts
Line: 82
Comment:
**Timestamped backups are invisible to `listConfigBackups`**
The regex `^${base}\.bak\.(\d+)$` only matches files with purely numeric suffixes (e.g., `config.json.bak.1`, `config.json.bak.2`). However, `createConfigBackup` with `timestamp: true` (the default used by the CLI `backup` command) writes files in the format `config.json.bak.2024-01-15_00-00-00` or `config.json.bak.2024-01-15_00-00-00-label` (see lines 135–138).
These timestamp-based backup files will never match this regex. As a consequence:
- `openclaw config list-backups` will never show backups created via `openclaw config backup`
- `attemptConfigRollback` will never try these backups when rolling back on gateway startup failure
- The entire auto-rollback feature silently fails for CLI-created backups
The regex needs to be broadened to match all backup file variants:
```suggestion
const match = entry.match(new RegExp(`^${base}\\.bak\\.(.+)$`));
```
However, this alone also means the `failed-*` and primary `.bak` entries might collide, so consider restricting to non-`.failed-` suffixes.
How can I resolve this? If you propose a fix, please make it concise.| const result = await restoreConfigBackup(snapshot.path, { | ||
| dryRun: opts.dryRun, | ||
| }); |
There was a problem hiding this comment.
--last doesn't actually restore from the most recent backup
listConfigBackups is called above (line 564) to check that at least one backup exists and it returns results sorted newest-first. But the backups[0].path is never passed to restoreConfigBackup. Instead, restoreConfigBackup is called with no backupPath, causing it to default to ${configPath}.bak (the hard-coded primary backup, see config-backup-restore.ts line 161).
If .bak exists but a newer backup like .bak.3 is available, --last silently restores from the older .bak file — the opposite of what the user expects.
Fix by passing the most recent backup path explicitly:
| const result = await restoreConfigBackup(snapshot.path, { | |
| dryRun: opts.dryRun, | |
| }); | |
| const result = await restoreConfigBackup(snapshot.path, { | |
| backupPath: backups[0].path, | |
| dryRun: opts.dryRun, | |
| }); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cli/config-cli.ts
Line: 571-573
Comment:
**`--last` doesn't actually restore from the most recent backup**
`listConfigBackups` is called above (line 564) to check that at least one backup exists and it returns results sorted newest-first. But the `backups[0].path` is never passed to `restoreConfigBackup`. Instead, `restoreConfigBackup` is called with no `backupPath`, causing it to default to `${configPath}.bak` (the hard-coded primary backup, see `config-backup-restore.ts` line 161).
If `.bak` exists but a newer backup like `.bak.3` is available, `--last` silently restores from the older `.bak` file — the opposite of what the user expects.
Fix by passing the most recent backup path explicitly:
```suggestion
const result = await restoreConfigBackup(snapshot.path, {
backupPath: backups[0].path,
dryRun: opts.dryRun,
});
```
How can I resolve this? If you propose a fix, please make it concise.| export interface ConfigBackupOptions { | ||
| keepBackups?: number; | ||
| timestamp?: boolean; | ||
| label?: string; | ||
| } |
There was a problem hiding this comment.
keepBackups option is accepted but never enforced
ConfigBackupOptions.keepBackups (and GatewayConfigBackup.keepBackups in types.gateway.ts) is documented as "Number of backup files to keep (default: 10)", but createConfigBackup never reads this value or prunes old backups. Over time, numbered backups will accumulate indefinitely regardless of what the user configures.
Consider adding a pruning step at the end of createConfigBackup that lists existing numbered backups and deletes the oldest ones beyond the configured limit.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/config-backup-restore.ts
Line: 16-20
Comment:
**`keepBackups` option is accepted but never enforced**
`ConfigBackupOptions.keepBackups` (and `GatewayConfigBackup.keepBackups` in `types.gateway.ts`) is documented as "Number of backup files to keep (default: 10)", but `createConfigBackup` never reads this value or prunes old backups. Over time, numbered backups will accumulate indefinitely regardless of what the user configures.
Consider adding a pruning step at the end of `createConfigBackup` that lists existing numbered backups and deletes the oldest ones beyond the configured limit.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5bd53776b5
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Try numbered backups if primary doesn't exist | ||
| const backups = await listConfigBackups(configPath, ioFs); | ||
| if (backups.length === 0) { |
There was a problem hiding this comment.
Fail when requested backup path is missing
If options.backupPath is provided and that file does not exist, this branch falls back to listConfigBackups() and restores the newest backup instead of returning an error. In openclaw config restore --backup-path ..., a typo can therefore silently overwrite the active config with an unintended backup, which is destructive; fallback should only happen for the implicit default path.
Useful? React with 👍 / 👎.
| // Validate the backup content is valid JSON | ||
| try { | ||
| JSON.parse(content); | ||
| } catch { |
There was a problem hiding this comment.
Accept JSON5 backups during rollback validation
Rollback currently validates candidate backups with JSON.parse, but OpenClaw config files are JSON5 (comments/trailing commas are valid via readConfigFileSnapshot parsing). A perfectly valid backup containing JSON5 syntax is skipped as "invalid," which can make automatic rollback fail even when a recoverable backup exists.
Useful? React with 👍 / 👎.
| try { | ||
| const entries = await ioFs.readdir(dir); | ||
| for (const entry of entries) { | ||
| const match = entry.match(new RegExp(`^${base}\\.bak\\.(\\d+)$`)); |
There was a problem hiding this comment.
Match timestamped backup filenames in backup listing
This regex only matches numeric suffixes like .bak.1, but createConfigBackup(..., { timestamp: true }) (used by openclaw config backup) writes files like .bak.2026-03-12_16-39-40. Those backups are never returned by listConfigBackups, so list/restore paths that depend on this listing cannot see or restore manual timestamped backups.
Useful? React with 👍 / 👎.
| * When enabled, the gateway automatically backs up config before changes | ||
| * and can rollback to a known-good backup if startup fails due to config errors. | ||
| */ | ||
| configBackup?: GatewayConfigBackup; |
There was a problem hiding this comment.
Wire gateway.configBackup into config validation schema
This adds gateway.configBackup to the TypeScript config type, but the gateway validator still uses a strict schema (src/config/zod-schema.ts) that does not include this key. As a result, adding gateway.configBackup in user config makes the config invalid and can block startup, so the new setting cannot actually be used.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bb4b1d51c8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Validate the backup content is valid JSON | ||
| try { | ||
| JSON.parse(content); | ||
| } catch { |
There was a problem hiding this comment.
Try older backups when candidate fails schema validation
attemptConfigRollback treats any syntactically valid JSON backup as a success and returns immediately, but gateway startup re-validates against the full config schema afterward. If the newest backup parses as JSON but is still schema-invalid (for example wrong types or unsupported keys), startup aborts without trying older backups, even when a recoverable backup exists later in the list.
Useful? React with 👍 / 👎.
| const backupPath = options.timestamp | ||
| ? `${configPath}.bak.${timestamp}${label}` | ||
| : `${configPath}.bak`; |
There was a problem hiding this comment.
Keep manual timestamped backups compatible with rotation cleanup
Manual backups are written as *.bak.<timestamp><label>, but normal config writes run the existing backup maintenance cleanup that preserves only .bak plus numbered .bak.N files. In practice, a later config set/unset write will delete these timestamped manual backups, so the new config backup output is not durable for restore workflows.
Useful? React with 👍 / 👎.
c31c844 to
6e7456c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6e7456c8a0
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Restore from backup | ||
| try { | ||
| const content = await ioFs.readFile(backupPath); | ||
| await ioFs.writeFile(configPath, content); |
There was a problem hiding this comment.
Write restored config with owner-only permissions
restoreConfigBackup restores by calling ioFs.writeFile(configPath, content) without enforcing mode 0600. If the config file is missing (a common recovery path), Node creates a new file using the process umask (typically 0644), which can expose gateway tokens/API keys to other local users. This bypasses the normal secure write path (writeConfigFile) that hardens config permissions, so restore should also set owner-only permissions.
Useful? React with 👍 / 👎.
✅ 리뷰 피드백 반영 완료수정된 사항들commit 6e7456c8a - Greptile 리뷰 피드백 반영:
commit bef2dc58e - CI TypeScript 에러 수정:
CI 상태
모든 리뷰 피드백이 반영되었습니다. |
bef2dc5 to
2720dcf
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 07557494ce
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| log.warn("gateway: config validation failed, attempting automatic rollback..."); | ||
| const rollbackResult = await attemptConfigRollback(configSnapshot.path); |
There was a problem hiding this comment.
Move rollback ahead of gateway CLI config load
This rollback path is not reached in the main openclaw gateway run startup flow: src/cli/gateway-cli/run.ts:201 calls loadConfig() before startGatewayServer(...), and loadConfig fail-closes on invalid configs, so malformed/invalid config exits before this new rollback logic can execute. As shipped, users hitting startup failures from invalid config will still fail hard instead of getting automatic rollback.
Useful? React with 👍 / 👎.
| ? `${configPath}.bak.${timestamp}${label}` | ||
| : `${configPath}.bak`; | ||
|
|
||
| await ioFs.writeFile(backupPath, configContent); |
There was a problem hiding this comment.
Save manual backups with 0600 permissions
Manual backups are created with writeFile and no explicit mode, so new files use the process umask (commonly resulting in 0644). Because config backups contain gateway auth tokens and provider keys, this can expose secrets to other local users on shared systems; the manual backup write path should enforce owner-only permissions like the regular config write path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b64233c913
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Validate against the config schema | ||
| const schemaResult = OpenClawSchema.safeParse(parsed); | ||
| if (!schemaResult.success) { |
There was a problem hiding this comment.
Validate rollback backups with the same config pipeline
attemptConfigRollback accepts/rejects candidates with OpenClawSchema.safeParse on raw JSON5, but normal startup validation goes through readConfigFileSnapshot/validateConfigObjectWithPlugins after include resolution. This mismatch means valid backups that rely on supported features like $include (or plugin-aware validation paths) can be skipped as "invalid," causing rollback to report no recoverable backup even though one exists and would load normally after restore.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 652eba0d7a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| this.workspaceDir, | ||
| this.settings.extraPaths, | ||
| this.settings.multimodal, | ||
| this.userId, |
There was a problem hiding this comment.
Persist isolated memory indexes in separate sqlite files
When userId is passed here, each manager sync only indexes that user's files, but the underlying SQLite path is still shared per agent (settings.store.path), and syncMemoryFiles later prunes any source='memory' rows not in the current activePaths. With isolation enabled and two users on the same agent, one user's sync can delete the other user's indexed rows, leading to missing or incorrect memory search results until another full sync runs.
Useful? React with 👍 / 👎.
| /** | ||
| * User-specific memory isolation settings. | ||
| * When enabled, each user gets their own memory subdirectory. | ||
| */ | ||
| isolation?: MemoryIsolationConfig; |
There was a problem hiding this comment.
Allow memory.isolation in strict config schema
This adds memory.isolation to the public config type, but the strict MemorySchema validator still only accepts backend, citations, and qmd, so any user who sets memory.isolation.enabled will get a config validation failure at startup. Since runtime code now reads cfg.memory?.isolation, this setting is currently documented and consumed but not actually valid in config files.
Useful? React with 👍 / 👎.
| ? `${configPath}.bak.${timestamp}${label}` | ||
| : `${configPath}.bak`; | ||
|
|
||
| await fs.writeFile(backupPath, configContent, { mode: 0o600 }); |
There was a problem hiding this comment.
Route backup writes through the injected filesystem
createConfigBackup accepts an injected ioFs for filesystem abstraction, but this write bypasses it and always touches the real Node filesystem. In mocked/custom FS contexts, backup creation can write to disk while later stat/prune calls still use ioFs, causing inconsistent behavior and failing isolation in tests or alternate storage adapters.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d083884734
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const tokens = parsed.rest.split(":").filter(Boolean); | ||
| // direct:userId format check | ||
| const directIdx = tokens.indexOf("direct"); | ||
| if (directIdx !== -1 && directIdx + 1 < tokens.length) { | ||
| const userId = tokens[directIdx + 1]; |
There was a problem hiding this comment.
Parse full direct-session user IDs after
direct: marker
extractUserIdFromSessionKey splits parsed.rest on : and returns only the single token after direct, which truncates valid IDs that contain colons. Matrix direct peers use full Matrix user IDs (for example @alice:example.org from extensions/matrix/src/matrix/monitor/handler.ts), so different users can collapse to the same extracted ID (@alice), causing per-user memory isolation collisions and cross-user memory recall leakage.
Useful? React with 👍 / 👎.
| const configBackupSettings = preflightSnapshot.config?.gateway?.configBackup; | ||
| const autoRollbackEnabled = configBackupSettings?.autoRollback !== false; // default true | ||
|
|
||
| if (autoRollbackEnabled) { |
There was a problem hiding this comment.
Respect configBackup.enabled before attempting rollback
The startup preflight currently gates rollback only on configBackup.autoRollback and ignores configBackup.enabled. When users set gateway.configBackup.enabled: false, this path still runs attemptConfigRollback(...) and can overwrite the current config despite the feature being explicitly disabled, so the new disable switch is not actually enforced.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6e4766ea96
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Extract the user ID portion (everything after "direct:") | ||
| const userIdPart = rest.slice(directIdx + directMarker.length); |
There was a problem hiding this comment.
Strip thread/topic suffix from extracted direct user IDs
This extractor currently returns everything after direct: verbatim, so direct session keys that carry thread suffixes (for example agent:main:telegram:direct:123456:thread:99, which is exercised in existing tests) produce a userId of 123456:thread:99 instead of 123456. That causes memory isolation to shard one user across per-thread directories/cache keys, so recall in DMs becomes inconsistent across threads for the same person.
Useful? React with 👍 / 👎.
| const configBackupSettings = preflightSnapshot.config?.gateway?.configBackup; | ||
| const backupEnabled = configBackupSettings?.enabled !== false; // default true | ||
| const autoRollbackEnabled = configBackupSettings?.autoRollback !== false; // default true |
There was a problem hiding this comment.
Wire configBackup settings into the config write path
The new gateway.configBackup options are exposed as runtime config, but only enabled/autoRollback are read during startup rollback preflight here; the actual write path still always rotates backups with a fixed count (maintainConfigBackups + CONFIG_BACKUP_COUNT=5 in src/config/io.ts / src/config/backup-rotation.ts). As a result, gateway.configBackup.enabled: false does not stop automatic backup creation and keepBackups is ignored, so the new settings do not behave as configured.
Useful? React with 👍 / 👎.
6e4766e to
d9ed11b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d9ed11b73f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| tailscaleCleanup, | ||
| canvasHost, | ||
| canvasHostServer, |
There was a problem hiding this comment.
Pass plugin route cleanup callback to close handler
createGatewayRuntimeState() now returns a releasePluginRouteRegistry cleanup hook, but this startup path no longer forwards it into createGatewayCloseHandler. Because server-close.ts only releases pinned plugin HTTP routes when that callback is present, an in-process restart (runGatewayLoop fallback path) leaves the previous registry pinned, so subsequent startups in the same process can serve stale plugin routes and miss route changes until a full process restart.
Useful? React with 👍 / 👎.
When restoreConfigBackup is called with an explicit backupPath option that does not exist, return an error instead of falling back to the newest available backup. This prevents accidental restore from the wrong backup when a specific file was requested. The fallback to listConfigBackups() now only applies when using the default .bak path (no backupPath option provided). Addresses PR openclaw#44238 review P1-1.
Timestamped backups created with --timestamp flag were being deleted by cleanOrphanBackups because they didn't match the numbered backup pattern. Added TIMESTAMPED_BACKUP_PATTERN regex to identify and preserve timestamped backups (.bak.YYYY-MM-DD_HH-MM-SS[-label]) during cleanup. P2-2 from Codex review for PR openclaw#44238.
18e4cf4 to
883f9b3
Compare
✅ Rebase 완료 - Merge Conflict 해결브랜치를 main 기준으로 재정리했습니다. 변경 사항이전 문제: 브랜치에 memory isolation 커밋들이 섞여 있어서 main과 충돌 발생 수정 내용:
주요 기능 (유지됨)
CI 재실행 예정입니다. |
When restoreConfigBackup is called with an explicit backupPath option that does not exist, return an error instead of falling back to the newest available backup. This prevents accidental restore from the wrong backup when a specific file was requested. The fallback to listConfigBackups() now only applies when using the default .bak path (no backupPath option provided). Addresses PR openclaw#44238 review P1-1.
Wire gateway.configBackup into the config validation schema so users can set this option without triggering validation errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
attemptConfigRollback now validates backups against the OpenClaw Zod schema in addition to JSON5 parsing. This ensures that backups which pass JSON syntax validation but fail schema validation (e.g., invalid field types) are skipped in favor of older, valid backups. Fixes P1-4 from Codex review: Try older backups when candidate fails schema validation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add pre-flight config validation with automatic rollback support - Move rollback logic before loadConfig() call so invalid configs can be recovered - This ensures rollback happens before loadConfig throws on invalid config
Timestamped backups created with --timestamp flag were being deleted by cleanOrphanBackups because they didn't match the numbered backup pattern. Added TIMESTAMPED_BACKUP_PATTERN regex to identify and preserve timestamped backups (.bak.YYYY-MM-DD_HH-MM-SS[-label]) during cleanup. P2-2 from Codex review for PR openclaw#44238.
- Add null checks for issues array in preflight validation - Update test mocks to include isNixMode: false - Fix test snapshot to include valid: true
- Add null/undefined check in listConfigBackups to prevent TypeError - Add null/undefined check in attemptConfigRollback - Add null/undefined check in cleanupFailedBackups - Escape special regex characters in filenames for safety This fixes Windows CI failures where configPath was undefined during config snapshot validation in tests.
… settings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- fix run.option-collisions.test.ts: beforeEach reset was missing valid:true causing preflight config check to fire before auth validation in tests - regenerate schema.base.generated.ts to include configBackup schema Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- run.option-collisions.test.ts: add config:{} to snapshot so
effectiveCfg is not undefined when accessing .gateway
- providers.test.ts: revert vi.resetModules() pattern to beforeAll
import + beforeEach reset (matches main, fixes mock isolation)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ix snapshot valid flag
…117 duplicate key in merge
… fix merge conflict
…nflict with GatewayWebchatConfig
- MemoryIsolationConfig type + isolation field in MemoryConfig - extractUserIdFromSessionKey utility + export via memory-host-sdk - Platform-prefixed userId (e.g. discord:12345) via addPlatformPrefixToSenderId - sanitizeUserId strips path traversal chars before use in fs paths - isMemoryPath enforces per-user access control in memory/users/<uid>/ - listMemoryFiles scopes to user's directory when isolation enabled - walkRootFiles exposes shared MEMORY.md/memory/ to all users - MemoryIndexManager, QmdMemoryManager, getMemorySearchManager accept userId - Cache key includes userId to prevent cross-user cache hits - readMemoryFile passes userId for access-controlled reads - ctx.requesterSenderId wired into createMemorySearchTool/createMemoryGetTool - Fix Windows path comparison in multimodal reindex test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ck paths Codex P1: createMemoryGetTool was destructuring only cfg/agentId in the execute factory, so userId was never forwarded to readAgentMemoryFile or getMemoryManagerContextWithPurpose. With isolation enabled, any user could read another user's memory files via the builtin backend.
- Session key: recognize dm token + preserve colon-containing user IDs - tools.shared: replace hardcoded VALID_PLATFORMS with structure-based platform detection - packages/memory-host-sdk/host/internal: normalizeRelPath uses path.posix.normalize to block .. traversal; add sanitizeUserId; userId-scoped isMemoryPath and listMemoryFiles; walkDirSkipping helper - src/memory-host-sdk/host/internal,read-file: mirror userId support from packages counterpart - zod-schema: add MemoryIsolationSchema and isolation field to MemorySchema - search-manager: only pass userId to QMD when isolation.enabled - config-baseline: regenerate sha256 for schema changes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove duplicate GatewayConfigBackup type declaration - Restore missing webchat field in GatewayConfig - Fix non-async beforeEach using await in providers.test.ts - Fix OpenClawConfig | undefined assignment in applyPluginAutoEnable mock - Regenerate config-baseline.sha256 to reflect correct schema Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
21f6d7c to
701b55e
Compare
Rebase & Fix CompleteRebased onto Conflicts resolved:
TypeScript fixes applied:
All TypeScript checks pass for touched surfaces. Pre-existing failures in |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 701b55e09c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| command: z.string().optional(), | ||
| mcporter: MemoryQmdMcporterSchema.optional(), | ||
| searchMode: z.union([z.literal("query"), z.literal("search"), z.literal("vsearch")]).optional(), | ||
| searchTool: z.string().trim().min(1).optional(), | ||
| includeDefaultMemory: z.boolean().optional(), |
There was a problem hiding this comment.
Restore
memory.qmd.searchTool in config validator
MemoryQmdSchema is strict, and this change removed the searchTool property, so configs that already use memory.qmd.searchTool now fail validation at startup as an unknown key. That is a regression because searchTool is still part of the public config type (src/config/types.memory.ts) and is still consumed by the QMD runtime (extensions/memory-core/src/memory/qmd-manager.ts) to choose the tool path.
Useful? React with 👍 / 👎.
| static async create(params: { | ||
| cfg: OpenClawConfig; | ||
| agentId: string; | ||
| userId?: string; | ||
| resolved: ResolvedMemoryBackendConfig; |
There was a problem hiding this comment.
Apply user isolation to QMD manager instances
The new userId plumbing is dropped in QmdMemoryManager.create: it is accepted but never used when constructing manager state, so QMD still operates on shared per-agent state. In the same flow, getMemorySearchManager caches QMD managers by agent/config only, which means different DM users can reuse the same manager and see cross-user memory in isolation mode.
Useful? React with 👍 / 👎.
Summary
Implements automatic config rollback on gateway startup failure, addressing #44132.
Changes
1. Config Backup Utilities (
src/config/config-backup-restore.ts)listConfigBackups()- List all available backupscreateConfigBackup()- Create manual backup with optional labelrestoreConfigBackup()- Restore from backupattemptConfigRollback()- Auto-rollback on validation failurecleanupFailedBackups()- Cleanup old failed backupsgetBackupStats()- Get backup statistics2. Gateway Config Option (
src/config/types.gateway.ts)3. Gateway Startup Integration (
src/gateway/server.impl.ts)4. CLI Commands (
src/cli/config-cli.ts)openclaw config backup [--label NAME]- Create manual backupopenclaw config restore --last- Restore from most recent backupopenclaw config restore --backup-path PATH- Restore from specific fileopenclaw config restore --dry-run- Preview without changesopenclaw config list-backups [--json]- List available backups5. Tests (
src/config/config-backup-restore.test.ts)How It Works
autoRollbackis enabled (default):Testing
All 25 tests pass.
Fixes #44132