feat: periodic soak test mode for RAC fuzz framework#22
Conversation
Phase A - Storage cleanup (works with existing finite runs): - Add created_at column to all FUZZ_* tables for TTL-based purge - Add FUZZ_WKL.cleanup() procedure + DBMS_SCHEDULER job (every 30min) - Add archive log cleanup loop in fuzz-test.sh up (hourly) - Add seq dict pruning in kafka-consumer.py (every 10min, 24h TTL) - Add SQLite event purge in validator.py after each validation cycle Phase B - Continuous soak operation: - Add FUZZ_WKL.run_forever() with rate limiting via DBMS_SESSION.SLEEP - Add SOAK_MODE to validator.py: continuous validate-purge cycles - Add stall detection (exit if no new events for 5min) - Add fuzz-test.sh soak subcommand with health monitoring - Add SQLite busy_timeout to handle concurrent writer contention Tested: 2-min finite run (11,238 events, 0 mismatches) and 5-min soak run (2,407 events validated in cycle 1, 0 mismatches) on RAC VM.
Phase B (continuous soak) is being redesigned as periodic fuzz runs; remove the streaming implementation: - Drop run_forever() in FUZZ_WKL package - Drop SOAK_MODE path in validator.py (stall detection, continuous loop) - Drop soak subcommand and action_soak from fuzz-test.sh - Remove SOAK-TEST.md design doc Replace host-side archive cleanup loop with a docker-compose service: - New archive-cleanup/ directory (Dockerfile + crontab) - Alpine + openssh-client + supercronic v0.2.44 (SHA1 pinned) - Hourly find -mtime +1 -print -delete, deletions logged to stdout - Lifecycle tied to docker-compose up/down, visible via docker logs Phase A cleanup mechanisms retained: - created_at columns + FUZZ_WKL.cleanup() + DBMS_SCHEDULER job - Consumer seq dict pruning (10min interval, 24h TTL) - validator.py purge_old_events() called after each one-shot cycle - SQLite busy_timeout=30000 on both consumer and validator
- Move FUZZ_* table cleanup into workload PL/SQL (every 5min) so DELETEs flow through CDC and get validated like normal DML. - Remove DBMS_SCHEDULER FUZZ_CLEANUP job from up/down (replaced by in-workload cleanup). - Fix grep -q + pipefail SIGPIPE false-failure in OLR/Debezium readiness waits.
…ator - fuzz-workload.sql: make FUZZ_WKL.run() idempotent across cycles. Repopulate per-node tracked-ID arrays from existing rows (parity- filtered), continue g_next_id from MAX(id) preserving parity, and continue g_event_seq from MAX numeric tail of existing event_ids so cycle N+1 event_ids never collide with cycle N. Seed INSERTs only run on cold start (empty tables). - validator.py: accept START_CURSOR env and emit '[validator] final_cursor=...' (safe frontier only). Lets a soak loop resume past already-validated events without re-scanning. - fuzz-test.sh: forward START_CURSOR to the validator container; dump sqlplus output when FUZZ_DONE summary is missing (diagnostic). Verified: 3 back-to-back cycles with no teardown, 32k events, 0 mismatches, cursor advances each cycle.
|
Warning Rate limit exceeded
To continue reviewing without waiting, purchase usage credits in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a soak orchestration script that repeatedly runs fuzz cycles, validates results with resumable cursors, aggregates per-cycle logs/metrics, and advances state until a time deadline or a cycle failure. ChangesSoak orchestration
Sequence DiagramsequenceDiagram
participant soak as soak.sh
participant run as fuzz-test.sh (run)
participant validate as fuzz-test.sh (validate) / validator.py
participant logs as cycle-logs
soak->>run: start run for CYCLE_MIN, write cycle-###-run.log
run-->>logs: combined run output
soak->>validate: start validate with START_CURSOR, write cycle-###-validate.log
validate-->>logs: validation output + final_cursor
soak->>soak: extract final_cursor, update cursor for next cycle
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
tests/dbz-twin/rac/archive-cleanup/Dockerfile (1)
1-14: Consider adding a non-root user for security best practices.The container runs as root by default. While this is acceptable for test infrastructure, adding a non-root user would align with container security best practices.
🛡️ Optional: Add non-root user
RUN apk add --no-cache openssh-client curl \ && curl -fsSL -o /usr/local/bin/supercronic "${SUPERCRONIC_URL}" \ && echo "${SUPERCRONIC_SHA} /usr/local/bin/supercronic" | sha1sum -c - \ && chmod +x /usr/local/bin/supercronic \ - && apk del curl + && apk del curl \ + && adduser -D -u 1000 cleanup +USER cleanup ENTRYPOINT ["/usr/local/bin/supercronic", "-passthrough-logs"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/archive-cleanup/Dockerfile` around lines 1 - 14, Add a non-root user and switch to it at the end of the Dockerfile: create a group/user (e.g., addgroup -S super && adduser -S super -G super), change ownership of /usr/local/bin/supercronic (and any other runtime paths like /etc/crontab if needed) to that user, and add a USER super line before ENTRYPOINT/CMD; ensure permissions allow execution of /usr/local/bin/supercronic and that ENTRYPOINT ("/usr/local/bin/supercronic", "-passthrough-logs") continues to work under the non-root user.tests/dbz-twin/rac/fuzz-test.sh (1)
463-463: Quote$_SSH_OPTSto prevent word splitting.Shellcheck SC2086 correctly identifies that
$_SSH_OPTSshould be quoted. While it likely works in practice, quoting is safer.🔧 Suggested fix
- ssh $_SSH_OPTS "${VM_USER}@${VM_HOST}" \ + ssh "$_SSH_OPTS" "${VM_USER}@${VM_HOST}" \Note: If
$_SSH_OPTScontains multiple space-separated options, you may need to use an array instead:ssh "${_SSH_OPTS[@]}" ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/fuzz-test.sh` at line 463, The ssh invocation in fuzz-test.sh uses an unquoted variable $_SSH_OPTS which can undergo word splitting (SC2086); update the ssh call to quote the variable (use "${_SSH_OPTS}" or, if _SSH_OPTS is intended as an array, use "${_SSH_OPTS[@]}") so options are preserved correctly when invoking ssh "${_SSH_OPTS}" "${VM_USER}@${VM_HOST}" ...; ensure references to _SSH_OPTS, VM_USER, VM_HOST and the ssh command are updated accordingly.tests/dbz-twin/rac/validator.py (2)
434-434: Minor: Unused variablenfin unpacking.Prefix with underscore to indicate it's intentionally unused.
🧹 Fix unused variable warning
- (v, m, mm, mo, ml, to_, tl, lmc, oc, nf) = result + (v, m, mm, mo, ml, to_, tl, lmc, oc, _nf) = result🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/validator.py` at line 434, The tuple unpacking currently binds an unused variable named `nf`; change the unpack target `nf` to `_nf` (or `_`) in the unpack expression `(v, m, mm, mo, ml, to_, tl, lmc, oc, nf) = result` so the unused value is clearly marked and silences the warning while leaving the rest of the bindings (`v, m, mm, mo, ml, to_, tl, lmc, oc`) unchanged.
358-361: Minor: Remove unnecessary f-string prefixes.Lines 359 and 382 use f-strings without placeholders.
🧹 Fix linting warnings
- print(f"\n{'='*60}", flush=True) - print(f" Fuzz Test Validation Summary", flush=True) - print(f"{'='*60}", flush=True) + print(f"\n{'='*60}", flush=True) + print(" Fuzz Test Validation Summary", flush=True) + print(f"{'='*60}", flush=True)And at line 382:
- print(f"Validator starting", flush=True) + print("Validator starting", flush=True)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/validator.py` around lines 358 - 361, Several print statements use f-strings without placeholders (for example the header prints using {'='*60} and the print that outputs total_validated), which is unnecessary; update those prints to plain string literals by removing the leading f (e.g., change print(f"\n{'='*60}", ...) and print(f" Total validated: {total_validated}", ...) to use non-f-prefixed strings where appropriate and only keep f-strings for prints that actually interpolate variables like total_validated). Ensure you only remove the f-prefix on lines that have no Python expression interpolation so formatting remains identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/dbz-twin/rac/soak.sh`:
- Line 6: The hardcoded absolute path in the source command inside
tests/dbz-twin/rac/soak.sh should be replaced with a path computed relative to
the script location (the source invocation in soak.sh), e.g., compute the script
directory via $0 or ${BASH_SOURCE[0]} and source the vm-env.sh from
../environments/rac/vm-env.sh relative to that directory so the script is
portable across machines.
---
Nitpick comments:
In `@tests/dbz-twin/rac/archive-cleanup/Dockerfile`:
- Around line 1-14: Add a non-root user and switch to it at the end of the
Dockerfile: create a group/user (e.g., addgroup -S super && adduser -S super -G
super), change ownership of /usr/local/bin/supercronic (and any other runtime
paths like /etc/crontab if needed) to that user, and add a USER super line
before ENTRYPOINT/CMD; ensure permissions allow execution of
/usr/local/bin/supercronic and that ENTRYPOINT ("/usr/local/bin/supercronic",
"-passthrough-logs") continues to work under the non-root user.
In `@tests/dbz-twin/rac/fuzz-test.sh`:
- Line 463: The ssh invocation in fuzz-test.sh uses an unquoted variable
$_SSH_OPTS which can undergo word splitting (SC2086); update the ssh call to
quote the variable (use "${_SSH_OPTS}" or, if _SSH_OPTS is intended as an array,
use "${_SSH_OPTS[@]}") so options are preserved correctly when invoking ssh
"${_SSH_OPTS}" "${VM_USER}@${VM_HOST}" ...; ensure references to _SSH_OPTS,
VM_USER, VM_HOST and the ssh command are updated accordingly.
In `@tests/dbz-twin/rac/validator.py`:
- Line 434: The tuple unpacking currently binds an unused variable named `nf`;
change the unpack target `nf` to `_nf` (or `_`) in the unpack expression `(v, m,
mm, mo, ml, to_, tl, lmc, oc, nf) = result` so the unused value is clearly
marked and silences the warning while leaving the rest of the bindings (`v, m,
mm, mo, ml, to_, tl, lmc, oc`) unchanged.
- Around line 358-361: Several print statements use f-strings without
placeholders (for example the header prints using {'='*60} and the print that
outputs total_validated), which is unnecessary; update those prints to plain
string literals by removing the leading f (e.g., change print(f"\n{'='*60}",
...) and print(f" Total validated: {total_validated}", ...) to use
non-f-prefixed strings where appropriate and only keep f-strings for prints that
actually interpolate variables like total_validated). Ensure you only remove the
f-prefix on lines that have no Python expression interpolation so formatting
remains identical.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ab34e29b-3e2d-4124-a533-7cdad113d68b
📒 Files selected for processing (8)
tests/dbz-twin/rac/archive-cleanup/Dockerfiletests/dbz-twin/rac/archive-cleanup/crontabtests/dbz-twin/rac/docker-compose-fuzz.yamltests/dbz-twin/rac/fuzz-test.shtests/dbz-twin/rac/kafka-consumer.pytests/dbz-twin/rac/perf/fuzz-workload.sqltests/dbz-twin/rac/soak.shtests/dbz-twin/rac/validator.py
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/dbz-twin/rac/soak.sh (1)
4-4: ⚡ Quick winConsider enabling undefined-variable protection.
Adding
set -ualongsideset -ewill catch typos in variable names and prevent silent failures from undefined variables.🛡️ Proposed enhancement
-set -e +set -euo pipefail🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/dbz-twin/rac/soak.sh` at line 4, The script currently has "set -e" which stops on errors but doesn't catch undefined variables; update the shell options by replacing or augmenting that line to enable undefined-variable protection (e.g., change "set -e" to "set -eu" or add "set -u" alongside "set -e") so typos in variable names will cause immediate failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/dbz-twin/rac/soak.sh`:
- Around line 36-39: Detect and handle a missing cursor after validation: after
extracting new_cursor from "$LOG_DIR/cycle-${num}-validate.log" (the new_cursor
variable), check if new_cursor is empty; if it is, write an error to the
log/console including the cycle number and the validate log path (use
"$LOG_DIR/cycle-${num}-validate.log"), append the summary line as context, and
exit with a non-zero status instead of assigning cursor="$new_cursor" so the
script fails fast rather than reprocessing from the beginning.
---
Nitpick comments:
In `@tests/dbz-twin/rac/soak.sh`:
- Line 4: The script currently has "set -e" which stops on errors but doesn't
catch undefined variables; update the shell options by replacing or augmenting
that line to enable undefined-variable protection (e.g., change "set -e" to "set
-eu" or add "set -u" alongside "set -e") so typos in variable names will cause
immediate failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 289dd34b-533c-43a2-b1cb-222a03665a0b
📒 Files selected for processing (1)
tests/dbz-twin/rac/soak.sh
- set -euo pipefail for undefined-variable protection - fail fast if validator cursor is missing
Summary
Test plan
run N+validate) still worksSummary by CodeRabbit
New Features
Improvements
Tests