OPS-08: Backup/restore automation and DR drill playbook #663
Conversation
- `scripts/backup.sh`: timestamped hot backup via `sqlite3 .backup` (safe with active writers); configurable retention (default 7); integrity check on output; `chmod 600` on backup files; `cp` fallback with warning
- `scripts/restore.sh`: magic-bytes + `integrity_check` verification before overwrite; automatic safety copy; post-restore integrity verification; interactive confirmation (`--yes` to skip)
- `scripts/backup.ps1` / `restore.ps1`: PowerShell equivalents for Windows with equivalent logic and ACL restriction instead of chmod
- `docs/ops/DISASTER_RECOVERY_RUNBOOK.md`: RTO < 30 min / RPO < 24 h targets; backup schedule and cron examples; step-by-step restore procedure; backup verification checks; access control requirements (chmod 600 / restricted ACL); DR drill schedule; evidence table template; escalation path
- `docs/ops/rehearsal-scenarios/backup-restore-drill.md`: three injection options (delete, destructive query, Docker volume); pass criteria; evidence checklist; registered in scenario library
sqlite3 dot-commands (.backup, .restore) pass file paths as single-quoted string literals. Reject paths containing single-quote characters with a clear error message to prevent truncation or misrouting of the dot-command. Both backup.sh and restore.sh now validate --db-path, --output-dir, and --backup-file before proceeding.
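As a quick illustration of the guard (the function name and error wording below are a sketch, not the exact script code):

```shell
# Sketch of the single-quote guard described above; the function name and
# error message are illustrative, not copied from the scripts.
validate_path() {
  local label="$1" path="$2"
  case "$path" in
    *"'"*)
      echo "ERROR: ${label} must not contain single-quote characters: ${path}" >&2
      return 1
      ;;
  esac
  return 0
}

validate_path "--db-path" "/tmp/o'brien.db" 2>/dev/null || echo "rejected"
validate_path "--db-path" "/tmp/taskdeck.db" && echo "accepted"
```

Rejecting the path outright is simpler and safer than trying to escape it for sqlite3's dot-command parser.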
Self-Review Findings

- Backup hot-copy safety (active writes): Safe.
- Restore integrity check: Fully verified before overwrite — (1) SQLite magic bytes checked, (2) `PRAGMA integrity_check` run on the backup file.
- Retention logic correctness: Correct.
- Path injection safety: Was a concern — fix pushed (see comment above).
- RTO/RPO realism: Realistic for a SQLite local-first app. RTO < 30 min is achievable: the restore script itself completes in seconds; the bulk of the window is API restart + WAL checkpoint. The runbook notes Docker/hosted instances may need up to 60 min. RPO < 24 h under daily cron; < 1 h with hourly scheduling — both documented.
- Drill template usability: Actionable — the evidence table has Date, Operator, Backup Age, Backup File, Restore Duration, integrity_check result, Row Count Match, Pass/Fail, and Notes columns.
- Critical issues: None remaining. Path injection fix pushed; all other checks passed initial review.
Code Review
This pull request establishes a disaster recovery framework for Taskdeck, including a comprehensive runbook, rehearsal scenarios, and automation scripts for SQLite backups and restores on both Linux/macOS and Windows. The review feedback identifies opportunities to harden connection string parsing against multi-parameter strings, ensure Bash compatibility for macOS users by avoiding Bash 4.0 features, and prevent potential memory issues when validating large database files in PowerShell.
scripts/backup.sh
Outdated
```bash
if [[ -z "$DB_PATH" ]]; then
  # Try to extract from the ConnectionStrings env var (e.g. "Data Source=/app/data/taskdeck.db")
  if [[ -n "${ConnectionStrings__DefaultConnection:-}" ]]; then
    DB_PATH="${ConnectionStrings__DefaultConnection#*=}"
```
The connection string parsing is fragile. Using ${VAR#*=} assumes the path is the only parameter and follows the first =. If the connection string contains multiple parameters (e.g., Data Source=taskdeck.db;Version=3), this will result in an invalid file path.
```diff
-DB_PATH="${ConnectionStrings__DefaultConnection#*=}"
+DB_PATH=$(echo "${ConnectionStrings__DefaultConnection}" | sed -n 's/.*Data Source=\([^;]*\).*/\1/p')
```
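For illustration, here is how the suggested `sed` extraction behaves against a multi-parameter string (the `Version`/`Cache` parameters are made up for the demo):

```shell
# Demo of the suggested sed extraction; the Version/Cache parameters are
# synthetic, added only to exercise the multi-parameter case.
cs="Data Source=/app/data/taskdeck.db;Version=3;Cache=Shared"
DB_PATH=$(echo "$cs" | sed -n 's/.*Data Source=\([^;]*\).*/\1/p')
echo "$DB_PATH"   # -> /app/data/taskdeck.db

# Single-parameter strings still extract cleanly:
cs2="Data Source=/app/data/taskdeck.db"
DB_PATH2=$(echo "$cs2" | sed -n 's/.*Data Source=\([^;]*\).*/\1/p')
echo "$DB_PATH2"  # -> /app/data/taskdeck.db
```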
```bash
echo "ERROR: backup integrity check failed: $INTEGRITY" >&2
rm -f "$BACKUP_FILE"
exit 1
```
mapfile (or readarray) was introduced in Bash 4.0. Default installations of macOS still use Bash 3.2, where this command is unavailable. Using a while loop with read provides better compatibility for local-first deployments.
```diff
-echo "ERROR: backup integrity check failed: $INTEGRITY" >&2
-rm -f "$BACKUP_FILE"
-exit 1
+ALL_BACKUPS=()
+while IFS= read -r line; do
+  ALL_BACKUPS+=("$line")
+done < <(ls -1t "${OUTPUT_DIR}/taskdeck-backup-"*.db 2>/dev/null)
```
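A self-contained sketch of the Bash 3.2-compatible pattern (the temp directory and filenames are synthetic):

```shell
# Bash 3.2-compatible replacement for mapfile: populate an array from
# ls -1t output via a while-read loop. Directory and filenames are synthetic.
dir="$(mktemp -d)"
touch "$dir/taskdeck-backup-old.db"
sleep 1
touch "$dir/taskdeck-backup-new.db"

ALL_BACKUPS=()
while IFS= read -r line; do
  ALL_BACKUPS+=("$line")
done < <(ls -1t "$dir"/taskdeck-backup-*.db 2>/dev/null)

echo "${#ALL_BACKUPS[@]} backups, newest: ${ALL_BACKUPS[0]##*/}"
```

Process substitution (`< <(…)`) works on Bash 3.2, unlike `mapfile`/`readarray`, which require 4.0.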
scripts/restore.sh
Outdated
```bash
# ---------------------------------------------------------------------------
if [[ -z "$DB_PATH" ]]; then
  if [[ -n "${ConnectionStrings__DefaultConnection:-}" ]]; then
    DB_PATH="${ConnectionStrings__DefaultConnection#*=}"
```
The connection string parsing is fragile. Using ${VAR#*=} assumes the path is the only parameter and follows the first =. If the connection string contains multiple parameters (e.g., Data Source=taskdeck.db;Version=3), this will result in an invalid file path.
```diff
-DB_PATH="${ConnectionStrings__DefaultConnection#*=}"
+DB_PATH=$(echo "${ConnectionStrings__DefaultConnection}" | sed -n 's/.*Data Source=\([^;]*\).*/\1/p')
```
scripts/backup.ps1
Outdated
```powershell
$csEnv = $env:ConnectionStrings__DefaultConnection
if (-not [string]::IsNullOrWhiteSpace($csEnv)) {
    # Parse "Data Source=/path/to/taskdeck.db"
    $DbPath = ($csEnv -replace '^(?i)data\s+source\s*=\s*', '').Trim()
```
The connection string parsing using -replace is fragile and will fail if the connection string contains multiple parameters. It's safer to split the string and extract the specific key.
```powershell
$DbPath = ($csEnv -split ';' | Where-Object { $_ -match 'Data Source=' } | ForEach-Object { $_ -replace '.*Data Source=', '' }).Trim()
```
scripts/restore.ps1
Outdated
```powershell
if ([string]::IsNullOrWhiteSpace($DbPath)) {
    $csEnv = $env:ConnectionStrings__DefaultConnection
    if (-not [string]::IsNullOrWhiteSpace($csEnv)) {
        $DbPath = ($csEnv -replace '^(?i)data\s+source\s*=\s*', '').Trim()
```
The connection string parsing using -replace is fragile and will fail if the connection string contains multiple parameters. It's safer to split the string and extract the specific key.
```powershell
$DbPath = ($csEnv -split ';' | Where-Object { $_ -match 'Data Source=' } | ForEach-Object { $_ -replace '.*Data Source=', '' }).Trim()
```
scripts/restore.ps1
Outdated
```powershell
# Check SQLite magic bytes (first 16 bytes = "SQLite format 3\0")
$SqliteMagic = [System.Text.Encoding]::ASCII.GetBytes("SQLite format 3")
$FileBytes = [System.IO.File]::ReadAllBytes($BackupFile)
```
[System.IO.File]::ReadAllBytes($BackupFile) reads the entire file into memory. For large database backups, this can lead to high memory consumption or OutOfMemoryException. Since only the first 16 bytes are needed for the magic check, it's more efficient to read only the required header.
```powershell
$FileBytes = Get-Content -Path $BackupFile -Encoding Byte -TotalCount 16
```

(Note: `-Encoding Byte` exists only in Windows PowerShell 5.1; on PowerShell 7+ the equivalent is `Get-Content -Path $BackupFile -AsByteStream -TotalCount 16`.)
Pull request overview
Adds cross-platform (bash + PowerShell) automation and documentation to support SQLite backup/restore workflows and to formalize a DR drill playbook for Taskdeck’s local-first database.
Changes:
- Introduces backup/restore scripts for Linux/macOS/WSL and Windows, including integrity checks and permission hardening.
- Adds a Disaster Recovery runbook documenting RTO/RPO targets, scheduling guidance, verification steps, and escalation.
- Adds a rehearsal scenario for running a backup/restore DR drill and registers it in the rehearsal cadence index.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| scripts/backup.sh | Hot-backup + retention automation with integrity check and chmod hardening. |
| scripts/restore.sh | Restore automation with pre/post integrity checks, schema sanity checks, and safety copy creation. |
| scripts/backup.ps1 | Windows backup equivalent with retention and owner-only ACL hardening. |
| scripts/restore.ps1 | Windows restore equivalent with validation, safety copy, and owner-only ACL hardening. |
| docs/ops/DISASTER_RECOVERY_RUNBOOK.md | DR runbook describing backup/restore procedure, scheduling, verification, and evidence requirements. |
| docs/ops/rehearsal-scenarios/backup-restore-drill.md | Step-by-step DR drill scenario including injections, recovery steps, and pass criteria. |
| docs/ops/INCIDENT_REHEARSAL_CADENCE.md | Registers the new backup/restore drill scenario in the scenario library list. |
scripts/backup.sh
Outdated
```bash
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

DEFAULT_DB_PATH="${HOME}/.taskdeck/taskdeck.db"
DEFAULT_OUTPUT_DIR="${HOME}/.taskdeck/backups"
DEFAULT_RETAIN=7
```
SCRIPT_DIR is assigned but never used, which adds dead code/noise. Remove it or use it (e.g., to resolve repo-relative defaults) so the script stays minimal and easier to maintain.
```bash
# Restrict backup directory permissions (owner read/write only)
chmod 700 "$OUTPUT_DIR" 2>/dev/null || true

# ---------------------------------------------------------------------------
# Build backup filename
# ---------------------------------------------------------------------------
```
The sqlite3 dot-command is built with single-quoted interpolation (.backup '${BACKUP_FILE}'). If the output path contains a single quote, the backup will fail; worse, a crafted path could inject additional sqlite3 commands. Prefer passing a safely escaped path (e.g., escape single quotes for sqlite3) and avoid constructing dot-commands that can be interpreted as multiple commands.
```bash
    exit 1
fi
if [[ "$DB_PATH" == *"'"* ]]; then
    echo "ERROR: --db-path must not contain single-quote characters: $DB_PATH" >&2
```
The header check comment says the first 16 bytes must match SQLite format 3\0, but the script reads 15 bytes and compares against the hex for SQLite format 3 (no null terminator). Either read/compare 16 bytes including the trailing 00, or update the comment/expected value so the check and documentation align.
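A sketch of the full 16-byte variant, including the trailing NUL (the sample file below stands in for a real backup):

```shell
# Validate the full 16-byte SQLite header including the trailing NUL,
# instead of only the 15 ASCII characters. The sample file is a stand-in
# for a real backup file.
BACKUP_FILE="$(mktemp)"
printf 'SQLite format 3\0...page data...' > "$BACKUP_FILE"

expected="53514c69746520666f726d6174203300"  # hex of "SQLite format 3\0"
actual="$(dd if="$BACKUP_FILE" bs=1 count=16 2>/dev/null | xxd -p)"

if [[ "$actual" == "$expected" ]]; then
  echo "header ok"
else
  echo "ERROR: not a SQLite database header" >&2
fi
rm -f "$BACKUP_FILE"
```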
scripts/restore.sh
Outdated
```bash
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# ---------------------------------------------------------------------------
# Defaults
# ---------------------------------------------------------------------------
DEFAULT_DB_PATH="${HOME}/.taskdeck/taskdeck.db"
```
SCRIPT_DIR is assigned but never used. Please remove it to reduce dead code, or use it to resolve repo-relative paths if that was the intent.
```bash
    [[ "$CONFIRM" =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; }
fi

# ---------------------------------------------------------------------------
# Step 3: Create safety copy of the current live database
# ---------------------------------------------------------------------------
mkdir -p "$SAFETY_DIR"
```
The sqlite3 dot-command string uses single-quoted interpolation (.backup '${SAFETY_FILE}'). A safety-copy path containing a single quote will break the command; additionally, constructing dot-commands from user-controlled paths can allow sqlite3 command injection. Escape single quotes for sqlite3 (or otherwise sanitize/encode the filename) before embedding it into a dot-command.
```powershell
# Create a daily backup task
$action = New-ScheduledTaskAction -Execute "pwsh.exe" `
    -Argument "-NonInteractive -File C:\taskdeck\scripts\backup.ps1 -Yes"
$trigger = New-ScheduledTaskTrigger -Daily -At "02:00"
Register-ScheduledTask -TaskName "Taskdeck-Daily-Backup" `
    -Action $action -Trigger $trigger -RunLevel Highest
```
The Task Scheduler example passes -Yes to backup.ps1, but scripts/backup.ps1 does not define a Yes parameter. This will cause scheduled backups to fail. Remove -Yes from the example (or add a corresponding switch parameter in the script if intended).
```bash
# 4. Restore via the restore script (exec into a temp container)
docker run --rm \
  -v taskdeck_taskdeck-db:/data \
  -v "$(pwd):/repo" \
  alpine:3 \
  sh -c "cp /data/backups/taskdeck-backup-<timestamp>.db /data/taskdeck.db"
```
In Option C, Step 4 says "Restore via the restore script" but the command shown does a raw cp into the volume. This bypasses the restore script’s validations (magic bytes/integrity check) and safety copy behavior. Either update the text to say this is a manual copy (and why it's acceptable), or change the instructions to use scripts/restore.sh inside a container that has sqlite3 available.
```diff
-# 4. Restore via the restore script (exec into a temp container)
-docker run --rm \
-  -v taskdeck_taskdeck-db:/data \
-  -v "$(pwd):/repo" \
-  alpine:3 \
-  sh -c "cp /data/backups/taskdeck-backup-<timestamp>.db /data/taskdeck.db"
+# 4. Restore via the restore script (exec into a temp container with sqlite3 available)
+docker run --rm \
+  -v taskdeck_taskdeck-db:/data \
+  -v "$(pwd):/repo" \
+  alpine:3 \
+  sh -c "apk add --no-cache bash sqlite && bash /repo/scripts/restore.sh --db-path /data/taskdeck.db --backup-path /data/backups/taskdeck-backup-<timestamp>.db"
```
```powershell
$Sqlite3 = Get-Command sqlite3.exe -ErrorAction SilentlyContinue

if ($Sqlite3) {
    # sqlite3 .backup is a hot backup: copies pages under a shared lock,
    # flushing WAL frames first. Safe with active readers and writers.
    & $Sqlite3.Source $DbPath ".backup '$BackupFile'"
    if ($LASTEXITCODE -ne 0) {
        Write-Error "sqlite3 .backup failed (exit code $LASTEXITCODE)."
    }
```
The sqlite3 dot-command is constructed with single-quoted interpolation (.backup '$BackupFile'). If the path contains a single quote, sqlite3 parsing will fail; additionally, embedding unsanitized input into dot-commands can allow command injection. Escape/sanitize the filename before constructing the .backup command.
scripts/restore.ps1
Outdated
```powershell
# Check SQLite magic bytes (first 16 bytes = "SQLite format 3\0")
$SqliteMagic = [System.Text.Encoding]::ASCII.GetBytes("SQLite format 3")
$FileBytes = [System.IO.File]::ReadAllBytes($BackupFile)

if ($FileBytes.Length -lt 16) {
    Write-Error "Backup file is too small to be a valid SQLite database."
}
```
Comment says the first 16 bytes are SQLite format 3\0, but the code compares only the 15-byte ASCII string SQLite format 3 (and never validates the trailing null byte). Either include the null terminator in the expected bytes, or update the comment/length check to match what’s being validated.
```bash
# Quick integrity check on the backup
# ---------------------------------------------------------------------------
if command -v sqlite3 &>/dev/null; then
    INTEGRITY="$(sqlite3 "$BACKUP_FILE" 'PRAGMA integrity_check;' 2>&1)"
    if [[ "$INTEGRITY" != "ok" ]]; then
        echo "ERROR: backup integrity check failed: $INTEGRITY" >&2
        rm -f "$BACKUP_FILE"
        exit 1
```
Retention comment says backups are sorted oldest-first, but ls -1t sorts newest-first (and the deletion loop relies on newest-first ordering). Update the comment to match the actual ordering to avoid confusion when maintaining the retention logic.
- Fix fragile connection string parsing: use sed/split to extract Data Source from multi-parameter connection strings in all 4 scripts
- Replace mapfile with while-read loop for macOS Bash 3.2 compatibility
- Remove dead SCRIPT_DIR variables from backup.sh and restore.sh
- Add sqlite3 single-quote escaping as defense-in-depth against command injection in all .backup/.restore dot-commands
- Fix header byte count comment: code reads 15 bytes (text only), not 16 (which includes the null terminator)
- Replace ReadAllBytes with stream read of first 16 bytes in restore.ps1 to avoid loading entire file into memory
- Fix retention comment: ls -t sorts newest-first, not oldest-first
- Remove nonexistent -Yes flag from Task Scheduler example in runbook
- Update Option C Docker restore to use restore.sh instead of raw cp
Chris0Jeky left a comment
Adversarial Review — PR #663 (backup/restore scripts + DR runbook)
The two fix-up commits addressed the bot findings well (connection string parsing, mapfile compatibility, ReadAllBytes, single-quote escaping, dead SCRIPT_DIR, comment corrections, Docker Option C). Good follow-through. The scripts are generally well-structured with solid defensive checks.
Below are the remaining findings I could confirm through code analysis and local shell testing.
1. [HIGH] Restore scripts do not clean up stale WAL/SHM files — risk of silent corruption
Files: scripts/restore.sh, scripts/restore.ps1
When the live database is in WAL mode (which EF Core enables by default), the database directory contains taskdeck.db-wal and taskdeck.db-shm companion files. The restore scripts replace the main .db file (via sqlite3 .restore or cp) but leave any existing -wal and -shm files untouched.
On the next application startup, SQLite will discover the old WAL file and attempt to replay it against the newly restored database. Since the WAL contains page references from the pre-restore database, this can silently corrupt the restored data or cause an immediate SQLITE_CORRUPT error.
Fix: Before overwriting DB_PATH, delete (or rename alongside the safety copy) the companion files:
```bash
# In restore.sh, before the restore step:
rm -f "${DB_PATH}-wal" "${DB_PATH}-shm"
```

```powershell
# In restore.ps1, before the restore step:
Remove-Item "${DbPath}-wal" -Force -ErrorAction SilentlyContinue
Remove-Item "${DbPath}-shm" -Force -ErrorAction SilentlyContinue
```

This is the most important finding. Without it, a restore of a WAL-mode database followed by an API restart can produce data corruption that defeats the purpose of the backup.
2. [MEDIUM] cp fallback in backup.sh does not copy WAL file — backup may be incomplete
File: scripts/backup.sh (and backup.ps1 Copy-Item fallback)
When sqlite3 is not available, the script falls back to cp "$DB_PATH" "$BACKUP_FILE". If the database is in WAL mode, the main .db file alone is an incomplete snapshot — committed transactions in the -wal file are not included, and the backup will appear to have rolled back those transactions.
The script already warns that cp is unsafe with active writers. Consider strengthening the warning to mention WAL explicitly, or copying the -wal file alongside the main database in the fallback path (with a caveat that it is still not crash-safe). This would at least give the operator a complete set of files to work with.
Not a blocker since the warning is already present, but worth a one-line clarification.
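One possible shape for the strengthened fallback (paths below are synthetic stand-ins for the script's `DB_PATH`/`BACKUP_FILE` variables):

```shell
# Sketch: in the cp fallback, also copy the -wal companion when present and
# warn explicitly. Paths are synthetic stand-ins for the script variables.
dir="$(mktemp -d)"
DB_PATH="$dir/taskdeck.db"
BACKUP_FILE="$dir/taskdeck-backup.db"
printf 'main' > "$DB_PATH"
printf 'wal'  > "${DB_PATH}-wal"

cp "$DB_PATH" "$BACKUP_FILE"
if [[ -f "${DB_PATH}-wal" ]]; then
  echo "WARNING: -wal file present; cp fallback is not crash-safe for WAL-mode databases" >&2
  cp "${DB_PATH}-wal" "${BACKUP_FILE}-wal"
fi
```

This at least hands the operator a complete set of files, though it remains non-atomic under active writes.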
3. [MEDIUM] restore.sh magic-byte validation logic: file is checked before using the computed xxd result
File: scripts/restore.sh, lines ~131-157
The script computes MAGIC_ACTUAL via dd | xxd unconditionally, then checks if command -v file first. If file is available, MAGIC_ACTUAL is computed but never used. More importantly, the file command is less precise — it checks heuristics, not exact bytes. A file that happens to contain "SQLite" in the right place but is not a valid SQLite database could pass the file check.
Consider reversing the priority: prefer the exact xxd magic-byte check when available, fall back to file only when xxd is not present. Or just skip the file fallback entirely since xxd is more common in production environments (it ships with vim which is nearly universal on Linux).
4. [LOW] No disk-space pre-check — partial backup files on full disk
Files: scripts/backup.sh, scripts/backup.ps1
If the disk fills during sqlite3 .backup, the command will fail and leave a partial .db file at the backup path. The subsequent PRAGMA integrity_check will catch it and delete the partial file, which is good. However, the partial file itself may consume the remaining scarce disk space, making recovery harder. A pre-flight check (df on Linux, Get-PSDrive on Windows) comparing available space to the source database size would give an early, actionable error message instead of a cryptic sqlite3 failure.
Not a blocker — the integrity check is a good safety net — but a nice-to-have for operational clarity.
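A minimal pre-flight sketch (the 10% margin and the 1 KiB stand-in database are assumptions for the demo):

```shell
# Pre-flight disk-space check sketch: compare free KB in the output dir to
# the source database size plus a ~10% margin (the margin is an assumption).
DB_PATH="$(mktemp)"
OUTPUT_DIR="$(dirname "$DB_PATH")"
head -c 1024 /dev/zero > "$DB_PATH"   # 1 KiB stand-in database

db_bytes=$(wc -c < "$DB_PATH")
free_kb=$(df -Pk "$OUTPUT_DIR" | awk 'NR==2 {print $4}')
need_kb=$(( db_bytes / 1024 * 11 / 10 + 1 ))

if [ "$free_kb" -lt "$need_kb" ]; then
  echo "ERROR: need ~${need_kb} KB free in ${OUTPUT_DIR}, have ${free_kb} KB" >&2
else
  echo "space ok (need ~${need_kb} KB, have ${free_kb} KB)"
fi
rm -f "$DB_PATH"
```

`df -P` keeps the output POSIX-formatted so the `awk` field index is stable across Linux and macOS.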
5. [LOW] Sub-second re-runs silently overwrite the previous backup
File: scripts/backup.sh, scripts/backup.ps1
The backup filename uses second-granularity timestamps (%Y-%m-%d-%H%M%S). If the script is invoked twice within the same second (e.g., automated retry on transient failure), the second run overwrites the first backup without warning. Consider appending a random suffix or checking for filename collision, or simply documenting this as a known limitation.
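One low-cost mitigation (the PID suffix is an assumption; a random suffix or an explicit collision check would work equally well):

```shell
# Sketch: suffix the second-granularity timestamp with the PID so two runs
# in the same second cannot collide. Demonstrated with two child shells,
# which have distinct PIDs even within the same second.
stamp_a="$(bash -c 'echo "taskdeck-backup-$(date +%Y-%m-%d-%H%M%S)-$$.db"')"
stamp_b="$(bash -c 'echo "taskdeck-backup-$(date +%Y-%m-%d-%H%M%S)-$$.db"')"
echo "$stamp_a"
echo "$stamp_b"
```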
Summary
The PR is in good shape after the fix-up commits. Finding #1 (WAL/SHM cleanup on restore) is the only one I would call a blocker — it is a real data corruption vector for the default EF Core SQLite configuration. The rest are improvements that could ship in a follow-up.
EF Core uses WAL mode by default. Both restore scripts replaced the
main .db file but left stale -wal and -shm companion files in place.
On next API startup the stale WAL would be replayed against the
restored database, risking silent data corruption.
Delete ${DB_PATH}-wal and ${DB_PATH}-shm (bash) and their PowerShell
equivalents immediately before the sqlite3 .restore / cp step.
The cp fallback (used when sqlite3 is unavailable) only copied the main .db file, producing an incomplete snapshot for WAL-mode databases. Now also copies the -wal file if it exists and warns that the backup may be incomplete for WAL-mode databases.
Closes #86
Summary
- `scripts/backup.sh` — timestamped SQLite hot backup via `sqlite3 .backup` (safe with active writers); configurable retention (default 7 days); `PRAGMA integrity_check` on output; `chmod 600` on backup files; `cp` fallback with explicit warning
- `scripts/restore.sh` — SQLite magic-bytes + `PRAGMA integrity_check` verification before overwrite; automatic safety copy of live DB; post-restore integrity verification; interactive confirmation (`--yes` to skip)
- `scripts/backup.ps1` / `restore.ps1` — PowerShell equivalents for Windows with equivalent logic and restricted ACL (owner-only) instead of chmod
- `docs/ops/DISASTER_RECOVERY_RUNBOOK.md` — RTO/RPO targets, backup scheduling (cron + Task Scheduler + Docker), step-by-step restore procedure, backup verification checks, access control requirements, DR drill schedule, evidence table template, escalation path
- `docs/ops/rehearsal-scenarios/backup-restore-drill.md` — three injection options (delete, destructive query, Docker volume); pass criteria table; evidence checklist
- `docs/ops/INCIDENT_REHEARSAL_CADENCE.md` — registered new scenario in scenario library

RTO/RPO
Usage
Test plan
- Run `bash scripts/backup.sh` on a live database and confirm `PRAGMA integrity_check` passes on the output
- Run `bash scripts/restore.sh` against the backup and confirm the post-restore integrity check passes
- Confirm the `--yes` flag skips the confirmation prompt
- Run the `backup-restore-drill` rehearsal scenario per the runbook