ai safety and hallucination
Note
👋 Hey there! Siyarix is a personal passion project built by a single developer that is growing and under active development. Some of the architectural components and features described on this page might currently be Planned, Work in Progress, or basic implementations. Stay tuned as it evolves! 🚀
Welcome to the core of Siyarix's defense mechanism! When operating an autonomous or semi-autonomous AI system, safety and security are paramount.
Important
Siyarix implements a multi-layered safety architecture designed to protect your host system, secure operator data, and keep your audit trails tamper-evident. Every command passes through validation, danger classification, secret redaction, and interactive review before it executes.
Think of the safety pipeline as a series of rigorous checkpoints. Every tool and command must successfully pass through these stages before execution:
Note
Workflow: User input ➔ InputValidator ➔ PermissionGate ➔ DangerAnalyzer ➔ DLPEngine ➔ ShellReview ➔ AuditLogger
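As a rough illustration of the checkpoint idea, the sketch below chains a few stand-in stages and stops at the first one that rejects a command. The stage functions and `run_pipeline` are hypothetical names invented for this example; they are not Siyarix's actual classes or signatures.

```python
from typing import Callable, Optional

# A stage returns None to let the command continue,
# or a short string explaining why it was stopped.
Stage = Callable[[str], Optional[str]]


def not_empty(command: str) -> Optional[str]:
    return None if command.strip() else "empty command"


def no_command_substitution(command: str) -> Optional[str]:
    return "command substitution" if "$(" in command else None


def run_pipeline(command: str, stages: list[Stage]) -> tuple[bool, str]:
    """Run the stages in order and stop at the first rejection."""
    for stage in stages:
        reason = stage(command)
        if reason is not None:
            return False, f"{stage.__name__}: {reason}"
    return True, "passed all stages"


STAGES = [not_empty, no_command_substitution]
print(run_pipeline("nmap -sV example.com", STAGES))
print(run_pipeline("echo $(whoami)", STAGES))
```

Keeping each stage a small function with the same signature makes the order explicit and lets a stage be added or removed without touching the others.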
Located in src/siyarix/security_hardening.py, the InputValidator is your first line of defense. It thoroughly validates and sanitizes any user-supplied targets to ensure they are safe before reaching the executor.
The validator automatically detects and verifies the format of your targets:
```python
from siyarix.security_hardening import validator

# Validates an IP, hostname, or URL
valid, msg = validator.validate_target("10.0.0.1")
valid, msg = validator.validate_ip("192.168.1.0/24")
valid, msg = validator.validate_hostname("example.com")
valid, msg = validator.validate_url("https://example.com")
```

To prevent malicious activity, the validator actively looks for and blocks shell injection patterns:
Warning
Any command containing the following patterns will be immediately blocked to prevent shell injection attacks.
| Pattern | Example | Severity |
|---|---|---|
| Shell pipe/redirection | `\|`, `;`, `&`, `<`, `>` | ⛔ Blocked |
| Command substitution | `$(...)` | ⛔ Blocked |
| Path traversal | `../`, `..\\`, `%2e%2e` | ⛔ Blocked |
| Null byte | `\x00` | ⛔ Blocked |
| Newline injection | `\r\n` | ⛔ Blocked |
| Format string | `%x`, `%n` | ⛔ Blocked |
| SQL injection keywords | `SELECT`, `DROP`, `UNION` + `'` or `"` | ⛔ Blocked |
| Backtick execution | `` `cmd` `` | ⛔ Blocked |
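To make the table above concrete, here is a minimal sketch of pattern-based detection using only the standard library. The pattern table and the `find_injection` helper are assumptions written for illustration; the real rules in `security_hardening.py` may differ in both coverage and strictness.

```python
import re

# Hypothetical pattern table modelled on a few rows of the table above.
INJECTION_PATTERNS = {
    "shell metacharacter": re.compile(r"[|;&<>]"),
    "command substitution": re.compile(r"\$\("),
    "backtick execution": re.compile(r"`"),
    "path traversal": re.compile(r"\.\./|\.\.\\|%2e%2e", re.IGNORECASE),
    "null byte": re.compile(r"\x00"),
    "newline injection": re.compile(r"[\r\n]"),
}


def find_injection(value: str) -> list[str]:
    """Return the names of every pattern that matches the input."""
    return [name for name, pattern in INJECTION_PATTERNS.items()
            if pattern.search(value)]


print(find_injection("example.com"))
print(find_injection("target; rm -rf /"))
```

Returning the list of matched pattern names, rather than a bare boolean, makes it easier to tell the operator why an input was rejected.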
If you need to clean up an argument safely, the validator can strip out dangerous characters:
```python
safe = validator.sanitize_arg("target; rm -rf /")
# Returns: "target rm -rf " (shell metacharacters stripped)
```

Tip
This sanitisation removes null bytes, carriage returns, newlines, ANSI escape sequences, backticks, $(), ${}, |, ;, &, <, >, and collapses dangerous ../ path traversals.
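The following sketch shows one way such stripping could be written. It is an assumption-laden approximation: `sanitize_arg_sketch` is a hypothetical name, it removes only the `$(` and `${` openers rather than whole expressions, and its output is not guaranteed to match `validator.sanitize_arg`.

```python
import re

# ANSI CSI sequences such as "\x1b[31m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# "$(" and "${" are listed first so they are removed as units.
SHELL_META = re.compile(r"\$\(|\$\{|[`|;&<>\x00\r\n]")
TRAVERSAL = re.compile(r"(\.\./)+")


def sanitize_arg_sketch(value: str) -> str:
    value = ANSI_ESCAPE.sub("", value)
    value = SHELL_META.sub("", value)
    # Repeat until stable: removing "../" from "....//" would otherwise
    # leave a fresh "../" behind.
    previous = None
    while previous != value:
        previous = value
        value = TRAVERSAL.sub("", value)
    return value


print(sanitize_arg_sketch("\x1b[31mhost\x1b[0m"))
print(sanitize_arg_sketch("....//etc"))
```

The loop around the traversal pattern matters: a single pass of removal can reassemble the very sequence it was meant to strip.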
Also found in src/siyarix/security_hardening.py, the DangerAnalyzer evaluates commands to determine how destructive they might be before they are allowed to run.
Commands are classified into six severity levels, guiding how the system responds:
| Severity | Recommendation | Example Patterns |
|---|---|---|
| Critical | ⛔ Blocked | `sudo rm -rf /`, `mkfs`, `dd if=`, fork bombs, format drive, `chmod 777 /`, credential exfiltration |
| High | ✋ Confirm | `shutdown`, `reboot`, `halt`, pipe curl/wget to shell, SQL DROP/DELETE without WHERE, `Remove-Item -Recurse` |
| Medium | ⚡ Caution | `rm`, `killall`, `iptables -F`, netcat listener, crontab edit, PowerShell encoded command |
| Low | ℹ️ Info | `chmod`, `chown`, `crontab` |
| Info | 📝 Note | `sudo` |
| Safe | ✅ None | No patterns matched |
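The severity table can be approximated with an ordered rule list where the first match wins. This is a sketch covering only a handful of the patterns above; `RULES`, `DangerReportSketch`, and `classify` are hypothetical names, and treating medium and above as "dangerous" is an assumption.

```python
import re
from dataclasses import dataclass

# Ordered from most to least severe; the first matching rule wins.
RULES = [
    ("critical", re.compile(r"\brm\s+-rf\s+/(\s|$)|\bmkfs\b|\bdd\s+if=")),
    ("high", re.compile(r"\b(shutdown|reboot|halt)\b")),
    ("medium", re.compile(r"\b(rm|killall)\b|\biptables\s+-F\b")),
    ("low", re.compile(r"\b(chmod|chown|crontab)\b")),
    ("info", re.compile(r"\bsudo\b")),
]


@dataclass
class DangerReportSketch:
    severity: str
    is_dangerous: bool


def classify(command: str) -> DangerReportSketch:
    for severity, pattern in RULES:
        if pattern.search(command):
            dangerous = severity in {"critical", "high", "medium"}
            return DangerReportSketch(severity, dangerous)
    return DangerReportSketch("safe", False)


print(classify("rm -rf /tmp").severity)    # medium
print(classify("sudo rm -rf /").severity)  # critical
```

Checking the most severe rules first means a command such as `sudo rm -rf /` is reported as critical rather than being caught by the milder `rm` or `sudo` rules.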
Using the analyzer is straightforward:
```python
from siyarix.security_hardening import danger_analyzer

report = danger_analyzer.analyze("rm -rf /tmp")
print(report.severity)        # "medium"
print(report.is_dangerous)    # True
print(report.recommendation)  # "⚡ CAUTION — Review this command before execution."
```

Note
The analyzer protects both Linux and Windows environments, covering destructive patterns like registry manipulation, shadow copy deletion, event log clearing, and scheduled task abuse.
You can also output these warnings directly to the console with beautiful, color-coded formatting:
```python
from rich.console import Console

danger_analyzer.format_warning(report, Console())
```

Located in src/siyarix/permission_gate.py, the PermissionGate acts as the bouncer for your runtime environment, enforcing safety in two stages:

- Stage 1 (Syntax Check): Ensures the command is not empty and is syntactically valid.
- Stage 2 (Danger Analysis): Consults the DangerAnalyzer and decides what action to take based on the severity:
| Danger Severity | Gate Result | Action |
|---|---|---|
| `critical` | `FORBIDDEN` | Blocked with a clear reason. |
| `high` / `medium` | `REVIEW` | Allowed, but flagged with `requires_review=True`. |
| `low` / `info` / `safe` | `APPROVED` | Approved for execution. |
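The two stages and the mapping in the table can be sketched as a small lookup. `GateDecision` and `decide` are hypothetical stand-ins written for this example, not the real `GateResult` or gate API, and the severity is passed in directly instead of being computed.

```python
from dataclasses import dataclass

SEVERITY_TO_STAGE = {
    "critical": "FORBIDDEN",
    "high": "REVIEW",
    "medium": "REVIEW",
    "low": "APPROVED",
    "info": "APPROVED",
    "safe": "APPROVED",
}


@dataclass
class GateDecision:  # hypothetical stand-in for GateResult
    stage: str
    requires_review: bool
    reason: str


def decide(command: str, severity: str) -> GateDecision:
    # Stage 1: syntax check (only the emptiness test is modelled here).
    if not command.strip():
        return GateDecision("SYNTAX", False, "empty command")
    # Stage 2: map the danger severity to a gate result.
    stage = SEVERITY_TO_STAGE[severity]
    return GateDecision(stage, stage == "REVIEW", f"{severity} severity")


print(decide("rm -rf /tmp", "medium"))
```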
To prevent abuse or runaway scripts, the gate limits how often commands can be called:
```python
from siyarix.permission_gate import PermissionGate

gate = PermissionGate(rate_limit_calls=100, rate_limit_period=60.0)
```

Tip
By default, the limit is 100 calls per 60 seconds. The state is saved in rate_limit.json in your config directory. Exceeding this limit results in a FORBIDDEN action.
If you pass context={"restricted_payload": True}, the gate proactively checks for highly destructive patterns (like rm -rf, mkfs, dd if=) before even applying the rate limit.
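For intuition, a call limit of this kind can be modelled as a sliding window over recent timestamps. The class below is a sketch under that assumption: the page does not say which algorithm the gate uses, `SlidingWindowLimiter` is a hypothetical name, and persistence to `rate_limit.json` is not reproduced.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, calls=100, period=60.0, clock=time.monotonic):
        self.calls = calls
        self.period = period
        self.clock = clock      # injectable so tests can control time
        self._stamps = deque()  # timestamps of accepted calls

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.calls:
            return False        # over the limit, caller should refuse
        self._stamps.append(now)
        return True


limiter = SlidingWindowLimiter(calls=2, period=60.0)
print([limiter.allow() for _ in range(3)])  # [True, True, False]
```

Passing the clock in as a parameter keeps the limiter deterministic under test, since no real waiting is needed to simulate the window expiring.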
The gate returns a GateResult dataclass indicating the command's status:
| Stage | What it Means |
|---|---|
| `SYNTAX` | Failed basic syntax validation. |
| `FORBIDDEN` | Blocked either by danger analysis or rate limiting. |
| `PERMISSION` | Currently under permission evaluation. |
| `REVIEW` | Passed syntax checks, but requires manual user review. |
| `APPROVED` | Fully approved and ready for execution. |
Found in src/siyarix/dlp.py, the DLPEngine scans tool outputs and automatically redacts sensitive information to prevent leaks.
| Category | Patterns Handled |
|---|---|
| Secrets | AWS keys (AKIA...), GCP keys (AIza...), Slack tokens (xoxb-...), GitHub tokens (ghp_...), Bearer tokens, Private keys (PEM) |
| PII (Optional) | Email addresses, US Social Security numbers |
```python
from siyarix.dlp import DLPEngine

dlp = DLPEngine(redact_secrets=True, redact_pii=False)
safe_output = dlp.redact("API key: AKIAIOSFODNN7EXAMPLE")
# Returns: "API key: [REDACTED AWS_KEY]"
safe_dict = dlp.redact_dict({"token": "ghp_xxxxxxxxxxxxxxxxxxxx"})
```

Note
Secrets aren't just hidden; they are clearly labeled with their category name (e.g., [REDACTED AWS_KEY]) so you know exactly what was removed.
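Labelled redaction of this kind can be sketched with a dictionary of regular expressions. The patterns below follow the publicly documented shapes of AWS access key IDs and classic GitHub tokens; the `GITHUB_TOKEN` and `SLACK_TOKEN` labels and both helper functions are assumptions, and only the `AWS_KEY` label comes from this page.

```python
import re

SECRET_PATTERNS = {
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GITHUB_TOKEN": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "SLACK_TOKEN": re.compile(r"\bxoxb-[A-Za-z0-9-]{10,}"),
}


def redact_text(text: str) -> str:
    """Replace each detected secret with a labelled placeholder."""
    for label, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text


def redact_mapping(data: dict) -> dict:
    """Redact string values, recursing into nested dictionaries."""
    result = {}
    for key, value in data.items():
        if isinstance(value, dict):
            result[key] = redact_mapping(value)
        elif isinstance(value, str):
            result[key] = redact_text(value)
        else:
            result[key] = value
    return result


print(redact_text("API key: AKIAIOSFODNN7EXAMPLE"))
# API key: [REDACTED AWS_KEY]
```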
For even stricter redaction, security_hardening.py provides a SecretRedactor that covers more than 20 patterns, including AI API keys (OpenAI, Anthropic, DeepSeek, xAI, Mistral), cloud credentials, and generic password=value pairs.
```python
from siyarix.security_hardening import redactor

safe = redactor.redact("Key: sk-ant-xxxxxxxxxxxxxxxxxxxx")
safe_env = redactor.redact_env()  # Automatically masks secrets in os.environ
```

Located in src/siyarix/shell_review.py, the ShellReview module pauses execution to let the human operator review what the AI wants to run.
When a command needs review, the operator sees a clean, interactive prompt:
```text
╭──────────────── Command Execution Review ─────────────────╮
│ Tool: raw                                                 │
│ Reason: Raw shell command from LLM plan                   │
│                                                           │
│ nmap -sS -sV -O -Pn example.com                           │
╰───────────────────────────────────────────────────────────╯
Review command [edit/run/step/cancel] (run):
```
The operator has four choices:
| Decision | Behavior |
|---|---|
| `run` | Execute the command exactly as shown. |
| `edit` | Interactively modify the command before running it. |
| `step` | Execute, but step through subsequent commands one by one. |
| `cancel` | Skip or cancel this command entirely. |
Tip
CI / Non-TTY Mode: If the system detects it's running in a non-interactive environment (like a CI pipeline), it will automatically approve commands to prevent the process from hanging indefinitely.
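Detecting a non-interactive environment usually comes down to asking whether standard input is a terminal. The sketch below shows that check with the standard library; `is_interactive` and `review` are hypothetical helpers, and mapping an unrecognised answer to `cancel` is an assumption rather than documented behaviour.

```python
import io
import sys

CHOICES = {"edit", "run", "step", "cancel"}


def is_interactive(stream=None) -> bool:
    """True when the stream (stdin by default) is attached to a terminal."""
    stream = sys.stdin if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def review(command: str, stream=None, ask=input) -> str:
    if not is_interactive(stream):
        return "run"  # no human available, so do not block waiting for input
    answer = ask("Review command [edit/run/step/cancel] (run): ").strip().lower()
    if not answer:
        return "run"  # empty input takes the default shown in the prompt
    return answer if answer in CHOICES else "cancel"


# An in-memory stream is never a TTY, so this takes the automatic path.
print(review("nmap -sV example.com", stream=io.StringIO()))  # run
```

Because automatic approval removes the human checkpoint, the earlier stages of the pipeline carry the full safety burden in CI runs.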
Located in src/siyarix/audit_log.py, the AuditLogger provides a solid audit trail with a tamper-evident chain of custody, so every action can be traced back to its origin.
Every single action generates a structured AuditEvent:
```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AuditEvent:
    event_id: str             # Unique UUID hex
    timestamp: datetime       # UTC timestamp
    event_type: str           # Category of the event
    severity: str             # info / low / medium / high / critical
    user: str                 # The user who triggered the event
    session_id: str           # Unique session identifier
    source_ip: str            # Originating IP address
    target: str               # What resource was targeted
    action: str               # The action performed
    result: str               # success / failure / denied
    details: dict             # Any extra structured data
    hash_prev: str | None     # Link to the previous event's hash
    hash_current: str | None  # This event's hash
```

To make alterations detectable, each event's hash incorporates the hash of the previous event, so changing any past entry invalidates every hash that follows it.
Important
You can verify the integrity of your entire audit chain at any time. If someone tries to modify a past log entry, the chain will break, and the system will alert you.
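The chaining idea can be demonstrated in a few lines. This is a sketch, not the AuditLogger implementation: SHA-256, the JSON serialisation, and the function names are all assumptions chosen for the example.

```python
import hashlib
import json


def event_hash(event: dict, hash_prev) -> str:
    # Serialise deterministically so the same event always hashes the same.
    payload = json.dumps({"event": event, "hash_prev": hash_prev}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(chain: list, event: dict) -> None:
    hash_prev = chain[-1]["hash_current"] if chain else None
    chain.append({
        "event": event,
        "hash_prev": hash_prev,
        "hash_current": event_hash(event, hash_prev),
    })


def verify(chain: list) -> bool:
    expected_prev = None
    for record in chain:
        if record["hash_prev"] != expected_prev:
            return False  # link to the previous record is broken
        if record["hash_current"] != event_hash(record["event"], record["hash_prev"]):
            return False  # record content no longer matches its hash
        expected_prev = record["hash_current"]
    return True


chain = []
for action in ("system_start", "auth_login", "config_change"):
    append_event(chain, {"action": action})
print(verify(chain))                      # True
chain[1]["event"]["action"] = "tampered"
print(verify(chain))                      # False
```

A chain like this detects edits to stored records, but it cannot by itself stop someone who rewrites every subsequent hash too, which is why the latest hash is often also stored somewhere the writer cannot reach.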
```python
from siyarix.audit_log import AuditLogger

audit = AuditLogger()
result = audit.verify_chain()  # Returns a validation dictionary
```

There are 87 defined event types spanning multiple categories, including:

- Authentication: `auth_login`, `auth_logout`
- Security: `security_approval`, `dlp_violation`, `rate_limit_hit`
- System: `system_start`, `config_change`
You can easily export your logs or check status via code or CLI commands:
| Command | Purpose |
|---|---|
| `/audit status` | View audit statistics and check chain integrity. |
| `/audit export` | Export logs to JSON or CSV formats. |
| `/audit verify` | Manually verify the tamper-evident chain. |
Note
By default, logs are retained for 365 days. In highly sensitive "OpSec Memory-Only" mode, events are tracked in memory and never written to disk.
Found in security_hardening.py, SeccompProfile generates Docker-compatible seccomp profiles that sandbox command execution.
It proactively blocks over 50 dangerous system calls (like mount, ptrace, reboot, and add_key) while still allowing standard tools to function normally.
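To show what such a profile looks like on disk, the sketch below builds a small deny-list profile in Docker's seccomp JSON layout (`defaultAction` plus a `syscalls` list of `names` and `action`). Only the four syscalls named above are included, and the allow-by-default policy and both helper functions are assumptions for illustration, not the output of `SeccompProfile`.

```python
import json
import tempfile

# Only the syscalls named on this page; the real list is much longer.
BLOCKED_SYSCALLS = ["mount", "ptrace", "reboot", "add_key"]


def build_profile(blocked) -> dict:
    return {
        "defaultAction": "SCMP_ACT_ALLOW",  # allow anything not listed
        "syscalls": [
            {"names": sorted(blocked), "action": "SCMP_ACT_ERRNO"},
        ],
    }


def write_profile(profile: dict) -> str:
    """Write the profile to a temporary JSON file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as handle:
        json.dump(profile, handle, indent=2)
        return handle.name


profile = build_profile(BLOCKED_SYSCALLS)
print(json.dumps(profile, indent=2))
```

A file written this way can be passed to a container with `docker run --security-opt seccomp=<path>`.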
```python
from siyarix.security_hardening import SeccompProfile

profile_path = SeccompProfile.generate_docker_seccomp()
# Returns the path to your secure JSON profile
```

Located in src/siyarix/validators.py, the Validator class focuses on ensuring that inputs and AI-generated plans are formatted correctly and make sense.
It handles strict formatting checks for elements like:
- IP Addresses (IPv4/IPv6) & CIDR blocks
- RFC-compliant Hostnames & URLs
- Ports, Port Ranges, and Emails
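Several of these format checks can be done with the standard library alone. The helpers below are a sketch with hypothetical names; in particular, treating port 0 as invalid and writing ranges as `start-end` are assumptions, not documented Validator behaviour.

```python
import ipaddress


def is_ip_or_cidr(value: str) -> bool:
    """Accept IPv4/IPv6 addresses and CIDR blocks."""
    try:
        ipaddress.ip_network(value, strict=False)
        return True
    except ValueError:
        return False


def is_port(value: str) -> bool:
    # isascii() rules out non-ASCII digits that int() would reject.
    return value.isascii() and value.isdigit() and 1 <= int(value) <= 65535


def is_port_range(value: str) -> bool:
    start, sep, end = value.partition("-")
    return bool(sep) and is_port(start) and is_port(end) and int(start) <= int(end)


for candidate in ("10.0.0.1", "192.168.1.0/24", "::1", "example.com"):
    print(candidate, is_ip_or_cidr(candidate))
```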
Before the AI executes a plan, the validator checks every PlanStep (e.g., verifying it has a tool, arguments, and a valid timeout).
Tip
If a command fails, plan_recovery() steps in to suggest smart, automated fixes. For example, if nmap reports all filtered ports, the recovery planner might automatically suggest adding the -Pn flag to try again.
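The plan checks and the nmap recovery example can be pictured with a small sketch. Everything here is hypothetical: `PlanStepSketch`, `validate_step`, and `suggest_recovery` are illustrative names, the field layout is assumed from the description above, and matching on the word "filtered" is a deliberately crude heuristic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanStepSketch:
    tool: str
    args: list = field(default_factory=list)
    timeout: float = 60.0


def validate_step(step: PlanStepSketch) -> list:
    """Return a list of problems; an empty list means the step looks valid."""
    problems = []
    if not step.tool.strip():
        problems.append("missing tool")
    if not isinstance(step.args, list):
        problems.append("args must be a list")
    if step.timeout <= 0:
        problems.append("timeout must be positive")
    return problems


def suggest_recovery(step: PlanStepSketch, output: str) -> Optional[PlanStepSketch]:
    """Return a modified step to retry, or None when no fix is known."""
    if step.tool == "nmap" and "filtered" in output.lower() and "-Pn" not in step.args:
        return PlanStepSketch(step.tool, [*step.args, "-Pn"], step.timeout)
    return None


step = PlanStepSketch("nmap", ["-sV", "example.com"])
retry = suggest_recovery(step, "All 1000 scanned ports on example.com are filtered")
print(retry.args if retry else "no suggestion")
```

Returning a new step instead of mutating the failed one keeps the original plan intact for the audit trail.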
Need to dive into the code? Here's where to find everything:
| Module | Location | Purpose |
|---|---|---|
| InputValidator | `src/siyarix/security_hardening.py:88` | Validates targets and detects shell injection. |
| DangerAnalyzer | `src/siyarix/security_hardening.py:650` | Classifies the danger level of commands. |
| SecretRedactor | `src/siyarix/security_hardening.py:328` | Masks API keys, tokens, and passwords. |
| PermissionGate | `src/siyarix/permission_gate.py:49` | Enforces syntax checks and danger policies. |
| DLPEngine | `src/siyarix/dlp.py:29` | Prevents sensitive data loss in outputs. |
| ShellReview | `src/siyarix/shell_review.py:48` | Human-in-the-loop interactive reviews. |
| AuditLogger | `src/siyarix/audit_log.py:194` | Tamper-evident, personal auditing. |
| Validator | `src/siyarix/validators.py:598` | Format validation and AI plan recovery. |
| SeccompProfile | `src/siyarix/security_hardening.py:771` | Docker syscall restriction profiles. |
This document serves as your guide to maintaining a secure, hallucination-resistant, and auditable Siyarix environment.
Note
👋 Welcome to Siyarix! This is a personal passion project built by a single developer. It's currently under active development and growing fast. Expect rough edges, but lots of love! ❤️
Welcome to the Siyarix Documentation Map! This page serves as your master compass for navigating the extensive documentation we have built for the platform.
Whether you are a brand new user, a seasoned security operator, or a developer looking to contribute to the core engine, you can find exactly what you need here.
Not sure where to start? Pick the path that best describes you:
Just getting started? We highly recommend following these guides in order:
- Installation Guide — Get Siyarix running on your machine.
- Onboarding Wizard — Let our interactive wizard help you set up your API keys and environment.
- Setup & Configuration — A deeper dive into customizing your setup.
- Your First Run — A gentle walkthrough of your very first Siyarix command.
Ready to put Siyarix to work? Dive into our operational guides:
- Interactive Chat (REPL) — Learn how to use the powerful interactive terminal.
- Security Workflows — Best practices for recon, vulnerability assessment, and incident response.
- Cloud & IaC Scanning — How to secure your cloud environments and infrastructure code.
- Compliance Frameworks — Map your scans to SOC 2, HIPAA, ISO 27001, and more.
Looking under the hood or wanting to write some code? Start here:
- Contribution Guide — Our workflow, standards, and how you can help!
- Codebase Overview — A comprehensive map of our 82+ source modules.
- Testing Standards — How we ensure reliability with pytest and CI/CD.
- Module Architecture — Component design and responsibilities.
If you prefer to browse the raw structure, here is a complete layout of the docs/ folder:
```text
docs/
├── 🚀 getting-started/  # Installation, onboarding, and configuration
│   ├── installation.md  # Multi-platform install (pip, brew, winget, docker)
│   ├── onboarding.md  # The interactive 11-step setup wizard
│   ├── setup.md  # Managing API keys, credentials, and settings
│   ├── first-run.md  # A walkthrough of your first session
│   ├── configuration.md  # A deep-dive into advanced settings
│   └── troubleshooting.md  # Common issues and how to fix them instantly
│
├── 📖 user/  # Daily operations and workflows
│   ├── cli-commands.md  # Reference for 50+ CLI commands across 12 groups
│   ├── interactive-chat.md  # Mastering the AI REPL and 54+ slash commands
│   ├── security-workflows.md  # Recon, vulnerability assessment, incident response
│   ├── cloud-scanning.md  # Multi-cloud security scanning (under development)
│   ├── compliance.md  # Framework mapping (SOC 2, NIST, GDPR, PCI-DSS)
│   ├── threat-intelligence.md  # Integrations with OTX, NVD, and MITRE ATT&CK
│   ├── playbooks.md  # Building automated YAML-based IR playbooks
│   ├── workflow-files.md  # DAG workflow reference (programmatic API)
│   ├── reporting.md  # Multi-format report generation
│   ├── offline-registry.md  # Running without AI (Offline/Registry execution mode)
│   └── ai-workflows.md  # Advanced AI-driven autonomous operations
│
├── 💻 developer/  # Building, testing, and extending Siyarix
│   ├── codebase-overview.md  # Full module structure mapping
│   ├── contribution-guide.md  # How to submit PRs and our coding standards
│   ├── module-architecture.md  # Component design and responsibilities
│   ├── testing.md  # Writing tests (pytest), coverage, and CI/CD
│   └── building.md  # Packaging, distribution, and Docker builds
│
├── 🏗️ architecture/  # System design and core internals
│   ├── overview.md  # High-level data flow and layered orchestration
│   ├── ai-agent-pipeline.md  # The AgentCore reasoning and execution pipeline
│   ├── provider-abstraction.md  # How we unify 26 different AI providers
│   ├── execution-engine.md  # Plan-based step orchestration
│   ├── memory-and-state.md  # Knowledge graph, session persistence, and learning
│   ├── security-model.md  # The Permission Gate, DLP, audit logging, and OPSEC
│   └── intent-routing.md  # Semantic intent classification and routing
│
├── 🧠 ai/  # Deep dive into the AI provider & agent systems
│   ├── routing.md  # Managing 26 providers, failovers, and circuit breakers
│   ├── persona-system.md  # Overview of our 10 security personas
│   ├── agent-reasoning.md  # The Observe-Reason-Act loop and tool call repair
│   ├── tool-execution.md  # The tool registry, capability graph, and parsers
│   ├── ensemble.md  # Parallel LLM voting strategies
│   ├── multi-wave.md  # Iterative goal execution with context carry-over
│   ├── prompt-architecture.md  # System prompt design and management
│   └── safety.md  # Our rigorous 8-layer hallucination mitigation system
│
├── 🛡️ security/  # Safety, ethics, and threat models
│   ├── reporting.md  # How to safely report vulnerabilities to us
│   ├── threat-model.md  # System threat model and our mitigations
│   ├── operational-security.md  # TOR routing, stealth modes, and OPSEC controls
│   ├── ethical-policy.md  # Mandatory rules of engagement for all users
│   └── abuse-prevention.md  # How we prevent misuse of the AI engine
│
└── ⚖️ legal/  # Licensing and governance
    ├── agpl-guide.md  # A plain-English overview of the AGPL-3.0-or-later license
    ├── why-agpl.md  # The philosophy behind our license choice
    ├── trademark-policy.md  # Branding and trademark guidelines
    ├── responsible-ai.md  # Our framework for ethical AI usage
    ├── disclaimer.md  # Important legal disclaimers
    └── plugin-exception.md  # The license exception for building custom plugins
```
As you read through the documentation, you might encounter some specific terms. Here is a quick cheat sheet:
| Term | What It Means |
|---|---|
| Provider | The backend AI engine powering Siyarix (e.g., OpenAI, Anthropic, Ollama). |
| Tool | A traditional security executable installed on your system (e.g., nmap, nuclei). |
| Plan | A step-by-step sequence of tool commands intelligently generated by the AI. |
| Workflow | A hardcoded, predefined execution path (usually defined in YAML/JSON) that doesn't require AI generation. |
| Persona | A specialized behavioral profile given to the AI (e.g., instructing it to act specifically as a "Network Recon Specialist"). |
| Knowledge Graph | Siyarix's internal memory where it stores findings (like IP addresses, open ports) to contextually inform future steps. |
Need help finding something specific? Feel free to use the search bar at the top of the documentation site, or open a discussion on our GitHub!