Finished up the polishing of the EEE to finish up preparing for ver 0.1.4 #31

Merged
OppaAI merged 3 commits into main from vcs
Feb 4, 2026

Conversation

@OppaAI (Owner) commented Feb 4, 2026

Updated the following in the scs branch (pushed under the wrong branch name, vcs):

  • Initial Implementation of SCS
  • Added Igniter (Temporary) as a placeholder for the future robot ignition sequence (bootstrap)
  • Added EEEAggregator TALLE error logger to centralize logging of emergency and exception events:
    • Injector: Collects log entries from various modules and ingests them into the logger
    • Filtrator: Standardizes log entry format
    • Propagator: Propagates log entries to multiple loggers/handlers based on severity level
  • TALLE uses 3 channels + 1 console output (for debugging) for logging and ledgering of emergency and exception events:
    • Master log: All logs (DEBUG and above)
    • Error log: Errors only (ERROR and above)
    • System log files: Separate log files for each system (DEBUG and above)
    • Console output: INFO and above
  • Added Deduplication mechanism to prevent log floods
  • Added Token Bucket Throttle to rate limit logging
  • Added plugin system to EEEAggregator to allow for easy extension
  • Added EOS ROS Bridge plugin to bridge EEE to ROS 2 diagnostics and rosout
  • Added Reflex Plugin to bridge EEE to ROS 2 diagnostics
  • Added Awareness Plugin to bridge EEE to ROS 2 rosout
  • Added Health query service to check the health of the robot
  • Added File rotation + gzip, Batch Ledger inserts, Richer health response, Per-proc throttling in Reflex, and STALE support
  • Updated to allow more robust CID in Awareness (/rosout) and added metrics topic (/eee/metrics)
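
For readers unfamiliar with the "Token Bucket Throttle" item above, here is a minimal sketch of the general technique. It is not the code from this PR: the class name `TokenBucket`, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket rate limiter (illustration only)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self._clock = clock         # injectable so the bucket can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True if a log entry may pass, consuming `cost` tokens."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A logger would call `allow()` before emitting and drop or count the entry when it returns False. The capacity sets how large a burst gets through before the steady rate applies.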

Summary by CodeRabbit

  • New Features

    • Added system-wide metrics reporting for performance monitoring
    • Introduced stale component detection to identify unresponsive modules
    • Enhanced health checks with detailed status categories and richer diagnostic messages
  • Bug Fixes

    • Improved error and warning tracking in system health monitoring
    • Refined log parsing for better data integrity
  • Documentation

    • Updated system documentation to reflect semantic cognitive framework
    • Refined module naming and architecture descriptions for clarity
    • Enhanced logging with automatic file compression and batching

coderabbitai bot commented Feb 4, 2026

Walkthrough

System renamed from "Vital Circulatory System (VCS)" to "Semantic Cognitive System (SCS)" throughout the changelog. New logging infrastructure added with gzip-enabled rotating file handlers and batch ledger persistence. Metrics publishing plugin introduced alongside stale module detection, publish throttling, and enhanced health query reporting in the ROS bridge.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation: AuRoRA/src/scs/scs/CHANGELOG.rst | System rename from VCS to SCS with updated module names, descriptions, roadmap sections, and future implementation planning; renamed module group from VTC to Igniter with new modules (Bio-Logic Clock, Memory) replacing legacy ones. |
| Logging Infrastructure: AuRoRA/src/scs/scs/eee.py | New GzipRotatingFileHandler for automatic compression of rotated logs; asynchronous ledger batching via deque, lock, and timer-driven flush mechanism; replacement of direct logging writes with batched handler emissions. |
| ROS Bridge & Monitoring: AuRoRA/src/scs/scs/eee_ros_bridge.py | New MetricsPlugin class for publishing numeric metrics; ReflexPlugin enhanced with publish throttling (PUBLISH_COOLDOWN), stale module detection (STALE_THRESHOLD), and periodic staleness checks; health query expanded to report ERRORS, WARNs, and STALE statuses with affected module lists; timestamp tracking in status messages for staleness detection. |
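
As a rough illustration of the stale module detection summarized above, the sketch below tracks a last-seen timestamp per module and reports those older than a threshold. The class `StaleTracker`, its methods, and the 30-second default are assumptions for this example. Only the constant name STALE_THRESHOLD comes from the summary, and its real value is not given here.

```python
import time

STALE_THRESHOLD = 30.0  # seconds; hypothetical value, not taken from the PR


class StaleTracker:
    """Hypothetical helper: remembers when each module last reported."""

    def __init__(self, threshold=STALE_THRESHOLD, clock=time.monotonic):
        self._threshold = threshold
        self._clock = clock
        self._last_seen = {}

    def touch(self, module):
        """Record that `module` just published a status message."""
        self._last_seen[module] = self._clock()

    def stale_modules(self):
        """Return modules whose last report is older than the threshold."""
        now = self._clock()
        return sorted(
            name for name, seen in self._last_seen.items()
            if now - seen > self._threshold
        )
```

In a ROS 2 node, a periodic timer callback could call `stale_modules()` and publish a STALE diagnostic for each returned name.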

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant EEEROSBridge
    participant HealthHandler
    participant ReflexPlugin
    participant MetricsPlugin
    participant EEEAggregator

    Client->>EEEROSBridge: query health status
    EEEROSBridge->>HealthHandler: compute health state
    HealthHandler->>ReflexPlugin: check for ERRORS, WARNs, STALE modules
    ReflexPlugin->>ReflexPlugin: _check_stale_modules() via timer
    ReflexPlugin-->>HealthHandler: list of affected modules per category
    HealthHandler->>MetricsPlugin: extract numeric metrics
    MetricsPlugin-->>HealthHandler: metric values
    HealthHandler->>EEEAggregator: log enriched health response
    EEEAggregator->>EEEAggregator: batch to ledger queue
    EEEAggregator-->>Client: detailed health report (ERRORS, WARNs, STALE, metrics)
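
To make the health-query flow in the diagram above concrete, here is a small sketch of how per-module statuses might be folded into one report string. The function name `build_health_report`, the status strings, and the output format are assumptions. The PR's actual response format is not shown on this page.

```python
from typing import Dict, List


def build_health_report(status_by_module: Dict[str, str]) -> str:
    """Group modules by status and render a one-line summary (hypothetical format)."""
    buckets: Dict[str, List[str]] = {"ERROR": [], "WARN": [], "STALE": []}
    for module, status in sorted(status_by_module.items()):
        if status in buckets:  # anything else (e.g. "OK") is healthy
            buckets[status].append(module)

    parts = []
    for label, key in (("ERRORS", "ERROR"), ("WARNs", "WARN"), ("STALE", "STALE")):
        if buckets[key]:
            parts.append(f"{label}: {', '.join(buckets[key])}")
    return " | ".join(parts) if parts else "OK"
```

For example, a robot with two failing modules and one silent module would report each category with its affected module list, and a fully healthy robot would report `OK`.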
sequenceDiagram
    participant EEEAggregator
    participant LedgerQueue
    participant FlushTimer
    participant SQLiteDB

    EEEAggregator->>EEEAggregator: startup - init _ledger_flush_timer
    loop On log event
        EEEAggregator->>LedgerQueue: enqueue ledger entry
        LedgerQueue->>LedgerQueue: increment batch counter
        alt batch size reached
            LedgerQueue->>EEEAggregator: trigger immediate flush
        end
    end
    loop Timer interval (FLUSH_INTERVAL)
        FlushTimer->>EEEAggregator: periodic check
        EEEAggregator->>EEEAggregator: _flush_ledger_batch()
        EEEAggregator->>SQLiteDB: insert batch of ledger entries
    end
    EEEAggregator->>EEEAggregator: shutdown - final flush + cancel timer
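
The batching flow in the diagram above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the class `LedgerBatcher`, the two-column `ledger` table, and the batch size of 3 are invented for the example. A periodic timer (not shown) would simply call `flush()` on each tick and once more at shutdown.

```python
import sqlite3
import threading
from collections import deque

BATCH_SIZE = 3  # hypothetical; the real BATCH_SIZE is not shown on this page


class LedgerBatcher:
    """Hypothetical batcher: queue rows in memory, insert them in one transaction."""

    def __init__(self, conn, batch_size=BATCH_SIZE):
        self._conn = conn
        self._batch_size = batch_size
        self._queue = deque()
        self._lock = threading.Lock()
        conn.execute("CREATE TABLE IF NOT EXISTS ledger (level TEXT, message TEXT)")

    def enqueue(self, level, message):
        """Queue one entry; flush immediately once the batch is full."""
        with self._lock:
            self._queue.append((level, message))
            full = len(self._queue) >= self._batch_size
        if full:  # the lock is already released here
            self.flush()

    def flush(self):
        """Write all queued rows with a single executemany; return the row count."""
        with self._lock:
            batch = list(self._queue)
            self._queue.clear()
        if not batch:
            return 0
        with self._conn:  # commits on success, rolls back on error
            self._conn.executemany("INSERT INTO ledger VALUES (?, ?)", batch)
        return len(batch)
```

Note that a `sqlite3` connection is bound to its creating thread by default, so a multi-threaded version would need its own connection handling.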

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

documentation

Poem

🐰 A system reborn, from vital veins to cognitive dreams,
Batching logs with gzip gleams, metrics flow in streams,
Stale watchers stand tall, health checks tell all,
SCS awakens, no more missteps when we call!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 64.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Finished up the polishing of the EEE to finish up preparing for ver 0.1.4' is vague and generic, using non-descriptive terms like 'polishing' and 'finish up' that don't convey meaningful information about the specific changes made. | Revise the title to specifically describe the main changes, such as 'Add gzip rotating logs, batch ledger inserts, metrics plugin, and stale module detection' or similar that reflects the actual technical improvements implemented. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1578585fb6

Comment on lines +437 to +441
with EEEAggregator._ledger_lock:
    EEEAggregator._ledger_queue.append(ledger_args)
    # Immediate flush if batch full (for low-volume bursts)
    if len(EEEAggregator._ledger_queue) >= EEEAggregator.BATCH_SIZE:
        EEEAggregator._flush_ledger_batch()

P1: Avoid re-locking ledger mutex during batch flush

When the queue hits BATCH_SIZE, the worker thread calls _flush_ledger_batch() while holding _ledger_lock. _flush_ledger_batch() immediately tries to acquire the same non‑reentrant lock again, which deadlocks the logging thread the moment a batch fills. That stops further log processing and can stall shutdown. This only happens under bursty logging that reaches the batch threshold, but that’s exactly when batching is needed. Consider releasing the lock before flushing, or refactor _flush_ledger_batch() to assume the caller already holds the lock (or use an RLock).
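
The locking behaviour described above can be checked in isolation. The sketch below uses only the standard `threading` module. The helper names `can_reacquire` and `drain` are made up for this illustration and do not appear in the PR.

```python
import threading
from collections import deque


def can_reacquire(lock) -> bool:
    """Return True if the thread already holding `lock` can take it again."""
    with lock:
        got = lock.acquire(blocking=False)  # non-blocking, so no real deadlock
        if got:
            lock.release()
        return got


def drain(queue: deque, lock) -> list:
    """Copy and clear the queue under the lock; the caller writes the batch afterwards."""
    with lock:
        batch = list(queue)
        queue.clear()
    return batch
```

`can_reacquire(threading.Lock())` is False, which is why a nested blocking acquire would hang forever, while `can_reacquire(threading.RLock())` is True. Draining under the lock and doing the slow database write after releasing it is one way to avoid the nested acquire altogether.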

Owner Author

I will make a GitHub issue for this item

@OppaAI self-assigned this Feb 4, 2026
@OppaAI added the bug and enhancement labels Feb 4, 2026
@OppaAI added this to the Semantic Cognitive System milestone Feb 4, 2026
@OppaAI merged commit 56aa978 into main Feb 4, 2026 (1 check passed)
@OppaAI deleted the vcs branch February 4, 2026 18:01
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@AuRoRA/src/scs/scs/CHANGELOG.rst`:
- Around line 7-8: Update the system description sentence in CHANGELOG.rst by
replacing the plural "feedbacks" with the correct uncountable noun "feedback" in
the string "**Description:** This system is the
thinking/reasoning/cognitive/guardrail, basically the command centre, of AGi
robots, that receive inputs from users, environments and sensors, and provides
feedbacks to different systems in the robots and to users." so it reads "...and
provides feedback to different systems in the robots and to users."
- Around line 66-69: Update the roadmap section heading and any matching
references from "Core VCS Architecture" to "SCS" for consistency; locate the
heading text "Core VCS Architecture" in CHANGELOG.rst and replace it with "SCS",
and scan the surrounding lines for other occurrences of "VCS" within that
section and update them accordingly to "SCS".
- Around line 12-14: The nested bullet uses "* -" which breaks RST rendering;
update the list under the "Add the following modules to Igniter:" entry so child
items are indented and use proper RST bullets (e.g., indent the module lines and
use "-" or "*" consistently), replacing the "* - Bio-Logic Clock: ..." and "* -
Memory: ..." lines with properly indented child bullets; edit the lines in
CHANGELOG.rst near the Igniter entry so the top-level line stays as "Add the
following modules to Igniter:" and the two module descriptions are indented
beneath it as nested bullets (keeping the same text for the module names and
descriptions).
- Around line 30-37: Fix the typos in the CHANGELOG entry: change "Awarenes
sPlugin" to "Awareness Plugin" in the line that begins "Added Awarenes sPlugin
to bridge EEE to ROS 2 rosout" (locate by that phrase), and remove the extra
space after "gzip," in the line that starts "Add File rotation + gzip,  Batch
Ledger inserts..." so it reads "gzip, Batch Ledger inserts, ...".

In `@AuRoRA/src/scs/scs/eee.py`:
- Around line 35-52: The doRollover implementation in GzipRotatingFileHandler
skips compressing the oldest rotated file because the loop uses
range(self.backupCount - 1, 0, -1) and never targets the .{backupCount} file;
update doRollover to explicitly target and compress the oldest backup
(baseFilename + f".{self.backupCount}") or change the loop to include
self.backupCount, check existence, gzip it to .gz using gzip and
shutil.copyfileobj, then unlink the original; reference
GzipRotatingFileHandler.doRollover, RotatingFileHandler, self.backupCount and
self.baseFilename when making the change.
- Around line 231-244: The code holds EEEAggregator._ledger_lock while calling
EEEAggregator._flush_ledger_batch(), causing a deadlock because threading.Lock
is not reentrant; modify _start_ledger_flush (flush_loop) so it does not call
_flush_ledger_batch() while holding EEEAggregator._ledger_lock (release the lock
before invoking _flush_ledger_batch()), and likewise update the batch-full path
that currently calls _flush_ledger_batch() inside the _ledger_lock context to
instead perform the flush after exiting the lock; keep the Timer setup
(threading.Timer, daemon, start) unchanged but ensure actual flush calls occur
outside the _ledger_lock scope.

Comment on lines +7 to +8
**Description:** This system is the thinking/reasoning/cognitive/guardrail, basically the command centre, of AGi robots, that receive inputs from users, environments and sensors, and provides feedbacks to different
systems in the robots and to users.

⚠️ Potential issue | 🟡 Minor

Fix grammar in the system description (“feedbacks” → “feedback”).

This is user‑facing documentation, so the grammar should be clean.

✏️ Proposed fix
-**Description:** This system is the thinking/reasoning/cognitive/guardrail, basically the command centre, of AGi robots, that receive inputs from users, environments and sensors, and provides feedbacks to different
+**Description:** This system is the thinking/reasoning/cognitive/guardrail, basically the command centre, of AGi robots, that receive inputs from users, environments and sensors, and provides feedback to different
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
**Description:** This system is the thinking/reasoning/cognitive/guardrail, basically the command centre, of AGi robots, that receive inputs from users, environments and sensors, and provides feedbacks to different
systems in the robots and to users.
**Description:** This system is the thinking/reasoning/cognitive/guardrail, basically the command centre, of AGi robots, that receive inputs from users, environments and sensors, and provides feedback to different
systems in the robots and to users.
🤖 Prompt for AI Agents
In `@AuRoRA/src/scs/scs/CHANGELOG.rst` around lines 7 - 8, Update the system
description sentence in CHANGELOG.rst by replacing the plural "feedbacks" with
the correct uncountable noun "feedback" in the string "**Description:** This
system is the thinking/reasoning/cognitive/guardrail, basically the command
centre, of AGi robots, that receive inputs from users, environments and sensors,
and provides feedbacks to different systems in the robots and to users." so it
reads "...and provides feedback to different systems in the robots and to
users."

Comment on lines +12 to +14
* Add the following modules to Igniter:
* - Bio-Logic Clock: Use a centralized ROS timer for all modules and nodes in all systems in the whole robot
* - Memory: Add robot specs and user/admin configurations to YAML files and load during startup

⚠️ Potential issue | 🟡 Minor

Fix nested bullet formatting for Igniter modules.

The current * - nesting renders incorrectly in RST.

✏️ Proposed fix
-* Add the following modules to Igniter:
-* - Bio-Logic Clock: Use a centralized ROS timer for all modules and nodes in all systems in the whole robot
-* - Memory: Add robot specs and user/admin configurations to YAML files and load during startup
+* Add the following modules to Igniter:
+  - Bio-Logic Clock: Use a centralized ROS timer for all modules and nodes in all systems in the whole robot
+  - Memory: Add robot specs and user/admin configurations to YAML files and load during startup
Suggested change
* Add the following modules to Igniter:
* - Bio-Logic Clock: Use a centralized ROS timer for all modules and nodes in all systems in the whole robot
* - Memory: Add robot specs and user/admin configurations to YAML files and load during startup
* Add the following modules to Igniter:
  - Bio-Logic Clock: Use a centralized ROS timer for all modules and nodes in all systems in the whole robot
  - Memory: Add robot specs and user/admin configurations to YAML files and load during startup
🤖 Prompt for AI Agents
In `@AuRoRA/src/scs/scs/CHANGELOG.rst` around lines 12 - 14, The nested bullet
uses "* -" which breaks RST rendering; update the list under the "Add the
following modules to Igniter:" entry so child items are indented and use proper
RST bullets (e.g., indent the module lines and use "-" or "*" consistently),
replacing the "* - Bio-Logic Clock: ..." and "* - Memory: ..." lines with
properly indented child bullets; edit the lines in CHANGELOG.rst near the
Igniter entry so the top-level line stays as "Add the following modules to
Igniter:" and the two module descriptions are indented beneath it as nested
bullets (keeping the same text for the module names and descriptions).

Comment on lines +30 to +37
* Added Token Bucket Throttle to rate limit logging
* Added plugin system to EEEAggregator to allow for easy extension
* Added EOS ROS Bridge plugin to bridge EEE to ROS 2 diagnostics and rosout
* Added ReflexPlugin to bridge EEE to ROS 2 diagnostics
* Added AwarenessPlugin to bridge EEE to ROS 2 rosout
* Added Reflex Plugin to bridge EEE to ROS 2 diagnostics
* Added Awarenes sPlugin to bridge EEE to ROS 2 rosout
* Added Health query service to check the health of the robot
* Add File rotation + gzip, Batch Ledger inserts, Richer health response, Per-proc throttling in Reflex, and STALE support
* Updated to allow more robust CID in Awareness (/rosout) and added metrics topic (/eee/metrics)

⚠️ Potential issue | 🟡 Minor

Correct the plugin name typo and extra spacing.

“Awarenes sPlugin” and the double space after “gzip,” should be fixed for clarity.

✏️ Proposed fix
-* Added Awarenes sPlugin to bridge EEE to ROS 2 rosout
+* Added Awareness Plugin to bridge EEE to ROS 2 rosout
-* Add File rotation + gzip,  Batch Ledger inserts, Richer health response, Per-proc throttling in Reflex, and STALE support
+* Add File rotation + gzip, Batch Ledger inserts, Richer health response, Per-proc throttling in Reflex, and STALE support
🤖 Prompt for AI Agents
In `@AuRoRA/src/scs/scs/CHANGELOG.rst` around lines 30 - 37, Fix the typos in the
CHANGELOG entry: change "Awarenes sPlugin" to "Awareness Plugin" in the line
that begins "Added Awarenes sPlugin to bridge EEE to ROS 2 rosout" (locate by
that phrase), and remove the extra space after "gzip," in the line that starts
"Add File rotation + gzip,  Batch Ledger inserts..." so it reads "gzip, Batch
Ledger inserts, ...".

Comment on lines 66 to +69
1. Core VCS Architecture
--------------------------------

- Formalize VCS as a distributed heartbeat and network-awareness system
- Maintain a biologically-inspired design:
- Pump
- Oscillator
- Regulator
- Pacemaker
- Ensure deterministic fast-path for network liveness and feedback
- Separate fast diagnostics from deep historical analysis

--------------------------------
2. Robot-side VCS (VTC Node)
--------------------------------

Planned features for the Vital Terminal Core (robot side):

- Unified VTC node containing:
- Pump (raw metric gathering)
- Oscillator (tag generation, protobuf wrapping)
- Regulator (ROS publish / subscribe)
- Pacemaker (OPM calculation and network liveness)
- Baseline OPM (60) with adaptive scaling based on RTT
- Automatic fallback to baseline OPM when:
- Network timeout occurs
- RTT exceeds threshold
- Negative OPM persists for defined grace cycles
- Local lightweight model activation when server is offline
- Maximum OPM cap to avoid resource saturation
- Separate server_linked / network_state flag in feedback message

--------------------------------
3. Server-side VCS (VCC Nodes)
--------------------------------

Planned server architecture:

- Fast-path Analyzer Node:
- Decrypt incoming Vital Pulse
- Compute RTT
- Determine OPM and color state
- Send immediate Vital Feedback
- Slow Diagnostics Node (future):
- Historical analysis
- Statistical aggregation
- Network quality profiling
- Long-term health reporting

--------------------------------
4. Messaging and Data Model
--------------------------------

- ROS messages used only as transport envelopes
- Vital data payload encoded as Protobuf
- Encrypted payload carried inside ROS msg
- Stable ROS message fields:
- tag
- timestamp(s)
- seq
- encrypted protobuf payload
- OPM
- server_linked
- Tag generation based on:
- ROS time
- Sequence number
- Tag usable for:
- Request/response matching
- Encryption key derivation
- Logging index
- Visualization and replay

--------------------------------
5. Metrics Collection (Pump)
--------------------------------

Incremental metric rollout strategy:

- Initial metrics:
- CPU temperature
- GPU temperature
- Future metrics:
- Memory usage
- Battery level
- Voltage
- Sensor health flags
- Per-metric adjustable sampling frequency
- Pump acts as a data wrapper only (no decision logic)

--------------------------------
6. Configuration and Parameters
--------------------------------

- All constants externalized into YAML
- Separation of configuration domains:

Admin (Critical, reboot required):
- baseline_opm
- max_opm
- timeout thresholds
- jitter tolerance
- network classification logic

User (Non-critical, runtime adjustable):
- display colors
- refresh rates
- UI toggles

- ROS parameter enforcement:
- Critical parameters reject runtime modification
- UI parameters support live update
- Future WebUI support:
- User mode for UI tuning
- Admin mode for core tuning with reboot

--------------------------------
7. Logging and Storage Strategy
--------------------------------

Robot-side logging:

- Short-term memory (STM):
- ROS 2 bag with MCAP + Protobuf
- High-speed read for visualization and replay
- Logging includes:
- tag
- timestamps
- encrypted protobuf payload

Server-side logging:

- Short-term memory:
- Optional MCAP recording
- Long-term memory (LTM):
- PostgreSQL for aggregated and administrative data
- NAS usage:
- Backup and archival only
- Not used for live database storage

--------------------------------
8. Visualization and Replay
--------------------------------

Planned visualization features:

- Vital Display Console (VCS Heart View)
- Real-time OPM and network state visualization
- Historical replay:
- Animate Vital Pulse travel path
- Visualize RTT and pulse speed
- Doctor-style inspection mode:
- Live status
- Past time windows (hourly, daily, weekly)

--------------------------------
9. Security and Encryption
--------------------------------

- Dedicated CryptoCore node (future)
- Encryption applied after protobuf serialization
- Tag used as part of encryption context
- Encrypted data logged without decryption
- Decryption only performed by authorized analysis components

--------------------------------
10. Adaptive Intelligence (Future)
--------------------------------

- Adaptive Small Language Model (SLM) integration
- Learning acceptable RTT / jitter patterns
- Dynamic tuning of:
- OPM scaling
- Timeout thresholds
- Gradual migration from static rules to learned behavior
- Integration after LLM and core VCS stabilization

--------------------------------
11. System Evolution Principles
--------------------------------

- Architecture-first, implementation-second
- Fast-path determinism over feature density
- Clear separation of:
- Control vs visualization
- Runtime vs persistent configuration
- Short-term vs long-term memory
- Backward compatibility of recorded data
- Incremental, non-breaking extensions
- TBD

⚠️ Potential issue | 🟡 Minor

Rename “Core VCS Architecture” to SCS for consistency.

The roadmap section still references VCS.

✏️ Proposed fix
-1. Core VCS Architecture
+1. Core SCS Architecture
Suggested change
1. Core SCS Architecture
--------------------------------
- TBD
🤖 Prompt for AI Agents
In `@AuRoRA/src/scs/scs/CHANGELOG.rst` around lines 66 - 69, Update the roadmap
section heading and any matching references from "Core VCS Architecture" to
"SCS" for consistency; locate the heading text "Core VCS Architecture" in
CHANGELOG.rst and replace it with "SCS", and scan the surrounding lines for
other occurrences of "VCS" within that section and update them accordingly to
"SCS".

Comment on lines +35 to +52
class GzipRotatingFileHandler(RotatingFileHandler):
    """
    RotatingFileHandler that gzips old backups on rotate.
    """
    def doRollover(self):
        super().doRollover()
        # Gzip the oldest backup (backupCount keeps N files)
        if self.backupCount > 0:
            # Find the oldest .1 file (or whichever is now the "extra" one)
            log_path = Path(self.baseFilename)
            for i in range(self.backupCount - 1, 0, -1):
                old_file = log_path.parent / f"{log_path.name}.{i}"
                if old_file.exists():
                    with open(old_file, 'rb') as f_in:
                        with gzip.open(f"{old_file}.gz", 'wb') as f_out:
                            shutil.copyfileobj(f_in, f_out)
                    old_file.unlink()  # Delete uncompressed
                    break

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import logging, tempfile, pathlib
from logging.handlers import RotatingFileHandler

tmp = tempfile.mkdtemp()
p = pathlib.Path(tmp) / "test.log"
h = RotatingFileHandler(p, maxBytes=1, backupCount=5)
log = logging.getLogger("t"); log.setLevel(logging.DEBUG); log.addHandler(h)

for _ in range(30):
    log.debug("x"*50)

h.close()
print("Files created:")
for f in sorted(pathlib.Path(tmp).iterdir()):
    print(f"  {f.name}")
PY

Repository: OppaAI/AGi-ROS

Length of output: 148


Compress the actual oldest rotated backup.

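RotatingFileHandler keeps backups numbered .1 to .backupCount, where .1 is the newest and .backupCount is the oldest. The current loop (range(self.backupCount - 1, 0, -1)) walks from .(backupCount-1) down to .1 and stops at the first file that exists, so it never reaches the oldest file, .backupCount; with backupCount == 1 the range is empty and nothing is compressed at all. Compress .backupCount directly instead.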

🛠️ Proposed fix
-        if self.backupCount > 0:
-            # Find the oldest .1 file (or whichever is now the "extra" one)
-            log_path = Path(self.baseFilename)
-            for i in range(self.backupCount - 1, 0, -1):
-                old_file = log_path.parent / f"{log_path.name}.{i}"
-                if old_file.exists():
-                    with open(old_file, 'rb') as f_in:
-                        with gzip.open(f"{old_file}.gz", 'wb') as f_out:
-                            shutil.copyfileobj(f_in, f_out)
-                    old_file.unlink()  # Delete uncompressed
-                    break
+        if self.backupCount > 0:
+            log_path = Path(self.baseFilename)
+            old_file = log_path.parent / f"{log_path.name}.{self.backupCount}"
+            if old_file.exists():
+                with open(old_file, "rb") as f_in, gzip.open(f"{old_file}.gz", "wb") as f_out:
+                    shutil.copyfileobj(f_in, f_out)
+                old_file.unlink()  # Delete uncompressed
🤖 Prompt for AI Agents
In `@AuRoRA/src/scs/scs/eee.py` around lines 35 - 52, The doRollover
implementation in GzipRotatingFileHandler skips compressing the oldest rotated
file because the loop uses range(self.backupCount - 1, 0, -1) and never targets
the .{backupCount} file; update doRollover to explicitly target and compress the
oldest backup (baseFilename + f".{self.backupCount}") or change the loop to
include self.backupCount, check existence, gzip it to .gz using gzip and
shutil.copyfileobj, then unlink the original; reference
GzipRotatingFileHandler.doRollover, RotatingFileHandler, self.backupCount and
self.baseFilename when making the change.
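
As an alternative to patching doRollover, the standard library lets a handler delegate naming and rotation through its `namer` and `rotator` attributes. The sketch below is not the code in this PR; `gzip_namer`, `gzip_rotator`, `make_gzip_handler`, and `demo` are hypothetical names chosen for illustration, and the size limits are arbitrary. It assumes it is acceptable to compress every backup at rollover time rather than only the oldest one.

```python
import gzip
import logging
import os
import shutil
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


def gzip_namer(default_name: str) -> str:
    # The handler calls this for every backup name, e.g. "app.log.1" -> "app.log.1.gz".
    return default_name + ".gz"


def gzip_rotator(source: str, dest: str) -> None:
    # The handler calls this to move the active log into its first backup slot.
    with open(source, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    os.remove(source)  # drop the uncompressed original


def make_gzip_handler(path: Path, max_bytes: int, backup_count: int) -> RotatingFileHandler:
    # Hypothetical helper: a stock RotatingFileHandler with gzip hooks attached.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.namer = gzip_namer
    handler.rotator = gzip_rotator
    return handler


def demo() -> list[str]:
    # Write enough records to force several rollovers, then list the files left behind.
    tmp = Path(tempfile.mkdtemp())
    handler = make_gzip_handler(tmp / "app.log", max_bytes=200, backup_count=3)
    log = logging.getLogger("gzip_rotation_demo")
    log.setLevel(logging.DEBUG)
    log.propagate = False  # keep the demo records away from the root logger
    log.addHandler(handler)
    for _ in range(30):
        log.debug("x" * 50)
    log.removeHandler(handler)
    handler.close()
    return sorted(p.name for p in tmp.iterdir())


if __name__ == "__main__":
    print(demo())
```

With these hooks the stock rollover logic renames the already compressed backups when it shifts them, so no uncompressed numbered file is left on disk and no custom doRollover is needed.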

Comment on lines +231 to +244
        # Start periodic flush timer
        def _start_ledger_flush():
            with EEEAggregator._ledger_lock:
                if EEEAggregator._ledger_flush_timer is None:
                    def flush_loop():
                        EEEAggregator._flush_ledger_batch()
                        EEEAggregator._ledger_flush_timer = threading.Timer(
                            EEEAggregator.FLUSH_INTERVAL, flush_loop
                        )
                        EEEAggregator._ledger_flush_timer.daemon = True
                        EEEAggregator._ledger_flush_timer.start()
                    flush_loop()

        _start_ledger_flush()

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd AuRoRA/src/scs/scs && wc -l eee.py

Repository: OppaAI/AGi-ROS

Length of output: 68


🏁 Script executed:

python - <<'PY'
import threading
lock = threading.Lock()
acquired_first = lock.acquire(timeout=0.1)
print(f"First acquire succeeds: {acquired_first}")
if acquired_first:
    acquired_second = lock.acquire(timeout=0.1)
    print(f"Second acquire from same thread succeeds (non-blocking): {acquired_second}")
    lock.release()
PY

Repository: OppaAI/AGi-ROS

Length of output: 149


🏁 Script executed:

cd AuRoRA/src/scs/scs && sed -n '220,250p' eee.py

Repository: OppaAI/AGi-ROS

Length of output: 1519


🏁 Script executed:

cd AuRoRA/src/scs/scs && sed -n '430,445p' eee.py

Repository: OppaAI/AGi-ROS

Length of output: 770


🏁 Script executed:

cd AuRoRA/src/scs/scs && grep -n "_flush_ledger_batch" eee.py

Repository: OppaAI/AGi-ROS

Length of output: 287


🏁 Script executed:

cd AuRoRA/src/scs/scs && sed -n '450,475p' eee.py

Repository: OppaAI/AGi-ROS

Length of output: 1195


Deadlock risk: _flush_ledger_batch() acquires _ledger_lock while it's already held.

threading.Lock is non-reentrant. The code calls _flush_ledger_batch() in two places while already holding _ledger_lock:

  1. Lines 231-244: _start_ledger_flush() holds the lock, then flush_loop() immediately calls _flush_ledger_batch()
  2. Lines 437-441: the batch-full path calls _flush_ledger_batch() inside the lock context

Both will deadlock. Release the lock before calling _flush_ledger_batch().

🔧 Proposed fix
-        def _start_ledger_flush():
-            with EEEAggregator._ledger_lock:
-                if EEEAggregator._ledger_flush_timer is None:
-                    def flush_loop():
-                        EEEAggregator._flush_ledger_batch()
-                        EEEAggregator._ledger_flush_timer = threading.Timer(
-                            EEEAggregator.FLUSH_INTERVAL, flush_loop
-                        )
-                        EEEAggregator._ledger_flush_timer.daemon = True
-                        EEEAggregator._ledger_flush_timer.start()
-                    flush_loop()
+        def _start_ledger_flush():
+            with EEEAggregator._ledger_lock:
+                if EEEAggregator._ledger_flush_timer is not None:
+                    return
+
+                def flush_loop():
+                    EEEAggregator._flush_ledger_batch()
+                    with EEEAggregator._ledger_lock:
+                        if EEEAggregator._ledger_flush_timer is None:
+                            return
+                        EEEAggregator._ledger_flush_timer = threading.Timer(
+                            EEEAggregator.FLUSH_INTERVAL, flush_loop
+                        )
+                        EEEAggregator._ledger_flush_timer.daemon = True
+                        EEEAggregator._ledger_flush_timer.start()
+
+                EEEAggregator._ledger_flush_timer = threading.Timer(
+                    EEEAggregator.FLUSH_INTERVAL, flush_loop
+                )
+                EEEAggregator._ledger_flush_timer.daemon = True
+                EEEAggregator._ledger_flush_timer.start()
...
-                with EEEAggregator._ledger_lock:
-                    EEEAggregator._ledger_queue.append(ledger_args)
-                    # Immediate flush if batch full (for low-volume bursts)
-                    if len(EEEAggregator._ledger_queue) >= EEEAggregator.BATCH_SIZE:
-                        EEEAggregator._flush_ledger_batch()
+                with EEEAggregator._ledger_lock:
+                    EEEAggregator._ledger_queue.append(ledger_args)
+                    should_flush = len(EEEAggregator._ledger_queue) >= EEEAggregator.BATCH_SIZE
+                # Immediate flush if batch full (for low-volume bursts)
+                if should_flush:
+                    EEEAggregator._flush_ledger_batch()
🤖 Prompt for AI Agents
In `@AuRoRA/src/scs/scs/eee.py` around lines 231 - 244, The code holds
EEEAggregator._ledger_lock while calling EEEAggregator._flush_ledger_batch(),
causing a deadlock because threading.Lock is not reentrant; modify
_start_ledger_flush (flush_loop) so it does not call _flush_ledger_batch() while
holding EEEAggregator._ledger_lock (release the lock before invoking
_flush_ledger_batch()), and likewise update the batch-full path that currently
calls _flush_ledger_batch() inside the _ledger_lock context to instead perform
the flush after exiting the lock; keep the Timer setup (threading.Timer, daemon,
start) unchanged but ensure actual flush calls occur outside the _ledger_lock
scope.
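
The pattern described above (decide under the lock, flush after releasing it) can be sketched in isolation. This is a minimal illustration, not the EEEAggregator implementation: `BatchLedger`, its `sink` callback, and the default sizes are hypothetical stand-ins, and the real `_flush_ledger_batch()` is assumed to take `_ledger_lock` itself, as the analysis indicates.

```python
import threading
from typing import Callable, List, Optional


class BatchLedger:
    """Hypothetical stand-in for the ledger queue; not the EEEAggregator code."""

    def __init__(self, sink: Callable[[List[tuple]], None],
                 batch_size: int = 3, flush_interval: float = 5.0) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._lock = threading.Lock()  # non-reentrant, like the lock in the review
        self._queue: List[tuple] = []
        self._timer: Optional[threading.Timer] = None
        self._running = False

    def flush(self) -> None:
        # Swap the queue out under the lock, then write with the lock released.
        with self._lock:
            batch, self._queue = self._queue, []
        if batch:
            self._sink(batch)

    def append(self, entry: tuple) -> None:
        with self._lock:
            self._queue.append(entry)
            should_flush = len(self._queue) >= self._batch_size
        # The lock is released here, so flush() can acquire it again safely.
        if should_flush:
            self.flush()

    def _schedule(self) -> None:
        # Caller must hold self._lock.
        self._timer = threading.Timer(self._flush_interval, self._tick)
        self._timer.daemon = True
        self._timer.start()

    def _tick(self) -> None:
        self.flush()  # runs on the timer thread with the lock not held
        with self._lock:
            if self._running:
                self._schedule()

    def start(self) -> None:
        with self._lock:
            if not self._running:
                self._running = True
                self._schedule()

    def stop(self) -> None:
        with self._lock:
            self._running = False
            timer, self._timer = self._timer, None
        if timer is not None:
            timer.cancel()
        self.flush()  # drain whatever is still queued


if __name__ == "__main__":
    batches: List[List[tuple]] = []
    ledger = BatchLedger(batches.append, batch_size=3, flush_interval=60.0)
    ledger.start()
    for i in range(7):
        ledger.append((i,))
    ledger.stop()
    print(batches)
```

Switching to `threading.RLock` would also avoid the self-deadlock, but it would keep the write inside the critical section; releasing the lock before flushing keeps the time spent holding it short.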

@OppaAI OppaAI linked an issue Feb 4, 2026 that may be closed by this pull request
Labels

bug Something isn't working enhancement New feature or request

Projects

Development

Successfully merging this pull request may close these issues.

❤️ SCS Phase 1 - Skeletal Framework

1 participant