Merged
3 changes: 3 additions & 0 deletions .github/workflows/secret-scan.yml
@@ -8,6 +8,9 @@ on:
  pull_request:
    branches: [ main ]

permissions:
  contents: read

jobs:
  secret-scan:
    name: Native GitHub Secret Scan Proxy
21 changes: 21 additions & 0 deletions CHANGELOG.it.md
@@ -11,6 +11,27 @@ Le versioni seguono il [Semantic Versioning](https://semver.org/).

## [Non rilasciato]

## [0.6.1rc2] — 2026-04-16 — Obsidian Bastion (Hardened)

### SICUREZZA: Risultati Operation Obsidian Stress

- **Shield: bypass tramite caratteri Unicode di formato (ZRT-006).** Caratteri
Unicode invisibili (ZWJ U+200D, ZWNJ U+200C, ZWSP U+200B) inseriti all'interno
di un token potevano eludere il pattern matching. Il normalizzatore ora rimuove
tutti i caratteri Unicode di categoria Cf prima della scansione.
- **Shield: bypass tramite offuscamento con entità HTML (ZRT-006).** I riferimenti
a caratteri HTML (`&#65;&#75;` → `AK`) potevano nascondere i prefissi delle
credenziali. Il normalizzatore ora decodifica le entità `&#NNN;`/`&#xHH;`
tramite `html.unescape()`.
- **Shield: bypass tramite interleaving di commenti (ZRT-007).** Commenti HTML
(`<!-- -->`) e commenti MDX (`{/* */}`) inseriti all'interno di un token
potevano interrompere il pattern matching. Il normalizzatore ora rimuove
entrambe le forme di commento.
- **Shield: rilevamento token spezzati tra righe (ZRT-007).** Aggiunto un buffer
lookback di 1 riga tramite `scan_lines_with_lookback()` per rilevare segreti
suddivisi su due righe consecutive (es. scalari YAML folded). I duplicati sono
soppressi tramite il set di tipi già rilevati sulla riga precedente.

### Aggiunto

- **`--format json` sui comandi di controllo singoli.** `check links`, `check orphans`,
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,25 @@ Versions follow [Semantic Versioning](https://semver.org/).

## [Unreleased]

## [0.6.1rc2] — 2026-04-16 — Obsidian Bastion (Hardened)

### SECURITY: Operation Obsidian Stress Findings

- **Shield: Unicode format character bypass (ZRT-006).** Zero-width Unicode
characters (ZWJ U+200D, ZWNJ U+200C, ZWSP U+200B) inserted mid-token could
break regex matching. The normalizer now strips all Unicode category Cf
characters before scanning.
- **Shield: HTML entity obfuscation bypass (ZRT-006).** HTML character
references (`&#65;&#75;` → `AK`) could hide credential prefixes. The
normalizer now decodes `&#NNN;`/`&#xHH;` entities via `html.unescape()`.
- **Shield: comment-interleaving bypass (ZRT-007).** HTML comments
(`<!-- -->`) and MDX comments (`{/* */}`) inserted mid-token could break
pattern matching. The normalizer now strips both comment forms.
- **Shield: cross-line split-token detection (ZRT-007).** Added a 1-line
lookback buffer via `scan_lines_with_lookback()` to detect secrets split
across two consecutive lines (e.g. YAML folded scalars). Suppresses duplicates
via previous-line seen set.

### Added

- **`--format json` on individual check commands.** `check links`, `check orphans`,
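The normalizer steps named in the CHANGELOG.md entries above (entity decoding via `html.unescape()`, comment stripping, and removal of category Cf characters) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Zenzic's actual code: the function name `normalize_for_scan`, the two regex constants, and the order of the steps are all hypothetical.

```python
import html
import re
import unicodedata

# Hypothetical helpers for illustration; not taken from zenzic.core.shield.
_HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
_MDX_COMMENT = re.compile(r"\{/\*.*?\*/\}", re.DOTALL)


def normalize_for_scan(line: str) -> str:
    """Undo common token obfuscations before credential patterns are applied."""
    # 1. Decode character references such as "&#65;&#75;" or "&#x41;".
    line = html.unescape(line)
    # 2. Drop HTML and MDX comments wedged inside a token.
    line = _HTML_COMMENT.sub("", line)
    line = _MDX_COMMENT.sub("", line)
    # 3. Remove every Unicode format character (category Cf),
    #    which covers ZWSP U+200B, ZWNJ U+200C and ZWJ U+200D.
    return "".join(ch for ch in line if unicodedata.category(ch) != "Cf")
```

Running the credential regexes on the normalized string rather than the raw line is what lets a pattern written for `AKIA...` still match when the prefix arrives as `&#65;&#75;IA...` or with a zero-width character in the middle.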
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -14,8 +14,8 @@ abstract: >-
Markdown-based documentation. Zenzic introduces Universal Discovery,
VCS-aware exclusion mapping, and the Sentinel Shield middleware to provide
a deterministic Safe Harbor for complex documentation lifecycles.
version: 0.6.1rc1
date-released: 2026-04-15
version: 0.6.1rc2
date-released: 2026-04-16
url: "https://zenzic.dev"
repository-code: "https://github.com/PythonWoods/zenzic"
repository-artifact: "https://pypi.org/project/zenzic/"
1 change: 1 addition & 0 deletions CONTRIBUTING.md
@@ -361,6 +361,7 @@ a follow-up issue for the refactor.
## Security & Compliance

- **Security First:** Any new path resolution MUST be tested against Path Traversal. Use `PathTraversal` logic from `core`.
- **Shield Obfuscation Tests:** Every new Shield pattern or normalizer rule MUST include obfuscation regression tests: Unicode format characters (category Cf), HTML entity encoding, comment interleaving (HTML `<!-- -->` and MDX `{/* */}`), and cross-line split tokens. See `tests/test_shield_obfuscation.py` for reference.
- **Bilingual Parity:** Documentation lives in [zenzic-doc](https://github.com/PythonWoods/zenzic-doc). Refer documentation contributors there.

---
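The four obfuscation classes that the new CONTRIBUTING.md rule requires regression tests for can be generated mechanically from a single sample token. The helper below is a hedged sketch of that idea: `obfuscated_variants` is a hypothetical name, and it is not claimed to match what `tests/test_shield_obfuscation.py` actually contains.

```python
def obfuscated_variants(token: str) -> dict[str, str]:
    """Return one obfuscated form of `token` per bypass class (illustrative only)."""
    mid = len(token) // 2
    head, tail = token[:mid], token[mid:]
    return {
        # Unicode format character (category Cf) inserted mid-token.
        "unicode_cf": head + "\u200b" + tail,
        # First half written as decimal HTML character references.
        "html_entity": "".join(f"&#{ord(c)};" for c in head) + tail,
        # HTML and MDX comments interleaved mid-token.
        "html_comment": head + "<!-- -->" + tail,
        "mdx_comment": head + "{/* */}" + tail,
        # Token split across two consecutive lines.
        "cross_line": head + "\n" + tail,
    }
```

A parametrized test could then feed each variant to the scanner and assert that the same credential family is reported as for the plain token.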
8 changes: 4 additions & 4 deletions README.it.md
@@ -44,7 +44,7 @@ SPDX-License-Identifier: Apache-2.0
</p>

```bash
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc1 ───────────────────────╮
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc2 ───────────────────────╮
│ │
│ docusaurus • 38 files (18 docs, 20 assets) • 0.9s │
│ │
@@ -77,12 +77,12 @@ dimostrabile, e la CLI è 100% subprocess-free.

## Capacità Principali

- **Sicurezza** — Shield (8 famiglie di credenziali, Exit 2) & Sentinella di Sangue (path traversal verso directory di sistema, Exit 3). Regex ReDoS-safe (F2-1), protezione jailbreak (F4-1). Nessuno dei due è sopprimibile con `--exit-zero`.
- **Sicurezza** — Shield (9 famiglie di credenziali, Exit 2) con resistenza all'offuscamento Unicode, decodifica entità HTML, difesa da comment-interleaving e lookback per token spezzati tra righe. Sentinella di Sangue (path traversal verso directory di sistema, Exit 3). Regex ReDoS-safe (F2-1), protezione jailbreak (F4-1). Nessuno dei due è sopprimibile con `--exit-zero`.
- **Integrità** — Rilevamento link circolari O(V+E), Virtual Site Map con cache content-addressable, punteggio qualità deterministico 0–100.
- **Intelligenza** — Multi-engine: MkDocs, Docusaurus v3, Zensical e Vanilla. Cache adapter a livello di modulo. Gli adapter di terze parti si installano come pacchetti Python tramite entry point.
- **Discovery** — Iterazione file universale VCS-aware (zero `rglob`), `ExclusionManager` obbligatorio su ogni entry point, gerarchia di Esclusione a 4 livelli, parser `.gitignore` pure-Python.

> 🚀 **Ultima Release: v0.6.1rc1 "Obsidian Bastion"** — vedi [CHANGELOG.md](CHANGELOG.md) per i dettagli.
> 🚀 **Ultima Release: v0.6.1rc2 "Obsidian Bastion"** — vedi [CHANGELOG.md](CHANGELOG.md) per i dettagli.

---

@@ -642,7 +642,7 @@ nox -s preflight # pipeline CI completa (lint + test + self-check)
L'audit completo della Sentinella — banner, rilevamento engine e verdetto:

```bash
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc1 ───────────────────────╮
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc2 ───────────────────────╮
│ │
│ docusaurus • 38 files (18 docs, 20 assets) • 0.9s │
│ │
8 changes: 4 additions & 4 deletions README.md
@@ -44,7 +44,7 @@ SPDX-License-Identifier: Apache-2.0
</p>

```bash
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc1 ───────────────────────╮
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc2 ───────────────────────╮
│ │
│ docusaurus • 38 files (18 docs, 20 assets) • 0.9s │
│ │
@@ -75,12 +75,12 @@ engine identity must be provable, and the CLI is 100% subprocess-free.

## Core Capabilities

- **Security** — Shield (8 credential families, Exit 2) & Blood Sentinel (host-path traversal, Exit 3). ReDoS-safe regex (F2-1), jailbreak protection (F4-1). Neither is suppressed by `--exit-zero`.
- **Security** — Shield (9 credential families, Exit 2) with Unicode obfuscation resistance, HTML entity decoding, comment-interleaving defense, and cross-line split-token lookback. Blood Sentinel (host-path traversal, Exit 3). ReDoS-safe regex (F2-1), jailbreak protection (F4-1). Neither is suppressed by `--exit-zero`.
- **Integrity** — O(V+E) circular link detection, Virtual Site Map with content-addressable cache, deterministic 0–100 quality score.
- **Intelligence** — Multi-engine: MkDocs, Docusaurus v3, Zensical, and Vanilla. Module-level adapter cache. Third-party adapters install as Python packages via entry points.
- **Discovery** — Universal VCS-aware file iteration (zero `rglob`), mandatory `ExclusionManager` on every entry point, 4-level Layered Exclusion hierarchy, pure-Python `.gitignore` parser.

> 🚀 **Latest Release: v0.6.1rc1 "Obsidian Bastion"** — see [CHANGELOG.md](CHANGELOG.md) for details.
> 🚀 **Latest Release: v0.6.1rc2 "Obsidian Bastion"** — see [CHANGELOG.md](CHANGELOG.md) for details.

---

@@ -634,7 +634,7 @@ nox -s preflight # full CI pipeline (lint + test + self-check)
The full Sentinel audit — banner, engine detection, and pass/fail verdict:

```bash
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc1 ───────────────────────╮
╭─────────────────────── 🛡 ZENZIC SENTINEL v0.6.1rc2 ───────────────────────╮
│ │
│ docusaurus • 38 files (18 docs, 20 assets) • 0.9s │
│ │
48 changes: 28 additions & 20 deletions RELEASE.md
@@ -1,31 +1,32 @@
<!-- SPDX-FileCopyrightText: 2026 PythonWoods <dev@pythonwoods.dev> -->
<!-- SPDX-License-Identifier: Apache-2.0 -->

# Zenzic v0.6.1rc1 — Obsidian Bastion Release Protocol
# Zenzic v0.6.1rc2 — Obsidian Bastion (Hardened) Release Protocol

**Prepared by:** S-1 (Auditor)
**Date:** 2026-04-15
**Status:** RELEASE CANDIDATE — All gates passed
**Date:** 2026-04-16
**Status:** RELEASE CANDIDATE 2 — Security audit completed
**Branch:** `main`
**Codename:** Obsidian Bastion — The Fortress Architecture
**Codename:** Obsidian Bastion (Hardened) — Post-Stress-Test Seal

> **Tech Lead note:** This RC1 marks the culmination of 5 alpha releases since
> The Sentinel (v0.5.0a4). Zenzic has evolved from a MkDocs-specific linter into
> an **engine-agnostic Documentation Platform Analyser** with 4 adapters, Layered
> Exclusion, and zero subprocesses. All gates below have been verified.
> **Tech Lead note:** RC2 follows Operation Obsidian Stress — a controlled
> siege by Red/Blue/Purple teams. The Red Team found 4 Shield bypass vectors
> (Unicode Cf, HTML entities, comment-interleaving, cross-line split). All
> have been sealed. The Purple Team identified 6 documentation drift items
> including a phantom `serve` command. All corrected. 1046 tests pass.

---

## 1. Version Anchors

| Location | Expected | Status |
| :--- | :--- | :---: |
| `src/zenzic/__init__.py` | `0.6.1rc1` | ✅ |
| `pyproject.toml` `[project]` | `0.6.1rc1` | ✅ |
| `pyproject.toml` `[tool.bumpversion]` | `0.6.1rc1` | ✅ |
| `CITATION.cff` | `0.6.1rc1` | ✅ |
| `CHANGELOG.md` top entry | `[0.6.1rc1]` | ✅ |
| `CHANGELOG.it.md` top entry | `[0.6.1rc1]` | ✅ |
| `src/zenzic/__init__.py` | `0.6.1rc2` | ✅ |
| `pyproject.toml` `[project]` | `0.6.1rc2` | ✅ |
| `pyproject.toml` `[tool.bumpversion]` | `0.6.1rc2` | ✅ |
| `CITATION.cff` | `0.6.1rc2` | ✅ |
| `CHANGELOG.md` top entry | `[0.6.1rc2]` | ✅ |
| `CHANGELOG.it.md` top entry | `[0.6.1rc2]` | ✅ |

**Not tracked** (Clean Harbor):

@@ -95,6 +96,12 @@
- [x] **F4-1:** `_validate_docs_root()` rejects `docs_dir` escaping repo root (Exit Code 3)
- [x] **Adapter Cache:** Module-level dict keyed by `(engine, docs_root, repo_root)`, thread-safe
- [x] **Shield IO Middleware:** Frontmatter lines scanned before any parser processes them
- [x] **ZRT-006:** Unicode Cf character stripping in Shield normalizer (zero-width bypass)
- [x] **ZRT-006:** HTML entity decoding in Shield normalizer (`&#NNN;` bypass)
- [x] **ZRT-007:** HTML/MDX comment stripping in Shield normalizer (interleaving bypass)
- [x] **ZRT-007:** 1-line lookback buffer `scan_lines_with_lookback()` (split-token bypass)
- [x] **Red Team:** 11 Blood Sentinel jailbreak vectors tested — all blocked
- [x] **Red Team:** DoS resilience verified (10MB lines, 5000 files, 50-level nesting)

---

@@ -125,11 +132,11 @@

## 8. Quality Gates

- [x] `pytest` — 929 tests passing, 0 failed
- [x] `pytest` — 1046 tests passing, 0 failed
- [x] `ruff check src/` → 0 violations
- [x] `reuse lint` → compliant
- [x] `pip install -e .` → `zenzic --help` outputs usage
- [x] `uv run zenzic --version` → `Zenzic v0.6.1rc1`
- [x] `uv run zenzic --version` → `Zenzic v0.6.1rc2`

---

@@ -151,16 +158,17 @@

---

## 11. RC1 Gate Decision
## 11. RC2 Gate Decision

- [x] All gates (§§ 2–9) verified
- [x] Benchmark § 10 within acceptable thresholds
- [x] No open blocking issues
- [x] Operation Obsidian Stress completed — 4 Shield bypasses sealed
- [x] Documentation Reality Sync — 6 drift items corrected
- [x] CI pipeline green on `main`

**Decision:** ✅ RC1 approved — `v0.6.1rc1` tagged and published to PyPI
**Decision:** ✅ RC2 approved — `v0.6.1rc2` tagged and published to PyPI

---

*"La Sentinella non rilascia sulla fiducia, rilascia sull'evidenza."*
*"Il Bastione non si fida dell'assenza di attacchi — si fida della resistenza verificata."*
— Senior Tech Lead
2 changes: 1 addition & 1 deletion examples/vcs-aware-project/README.md
@@ -6,7 +6,7 @@ SPDX-License-Identifier: Apache-2.0
# VCS-Aware Project Example

This example demonstrates Zenzic's **VCS-aware exclusion** features introduced
in v0.6.1rc1 "Obsidian Bastion".
in v0.6.1rc2 "Obsidian Bastion".

## What this example shows

20 changes: 14 additions & 6 deletions pyproject.toml
@@ -7,7 +7,7 @@ build-backend = "hatchling.build"

[project]
name = "zenzic"
version = "0.6.1rc1"
version = "0.6.1rc2"
description = "Engineering-grade, engine-agnostic linter and security shield for Markdown documentation"
readme = "README.md"
requires-python = ">=3.11"
@@ -183,7 +183,7 @@ pytest_add_cli_args = ["--import-mode=prepend"]
# ─── Version bumping ───────────────────────────────────────────────────────────

[tool.bumpversion]
current_version = "0.6.1rc1"
current_version = "0.6.1rc2"
commit = true
tag = true
tag_name = "v{new_version}"
@@ -216,18 +216,26 @@ filename = "CITATION.cff"
search = "version: {current_version}"
replace = "version: {new_version}"

[[tool.bumpversion.files]]
filename = "CITATION.cff"
search = "date-released: \\d{{4}}-\\d{{2}}-\\d{{2}}"
replace = "date-released: {now:%Y-%m-%d}"
regex = true

[[tool.bumpversion.files]]
# CHANGELOG uses PEP 440 normalized form: 0.5.0-a2 (hyphen before pre-release label).
# The serialize pattern below produces that form; pyproject uses 0.5.0a2 (no hyphen).
filename = "CHANGELOG.md"
search = "[{current_version}]"
replace = "[{new_version}]"
# Match only the second-level header; avoids matches in historical sections and reference links
search = "## [{current_version}]"
replace = "## [{new_version}]"
serialize = ["{major}.{minor}.{patch}{pre_l}{pre_n}", "{major}.{minor}.{patch}"]

[[tool.bumpversion.files]]
filename = "CHANGELOG.it.md"
search = "[{current_version}]"
replace = "[{new_version}]"
# Match only the second-level header; avoids matches in historical sections and reference links
search = "## [{current_version}]"
replace = "## [{new_version}]"
serialize = ["{major}.{minor}.{patch}{pre_l}{pre_n}", "{major}.{minor}.{patch}"]

[[tool.bumpversion.files]]
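The effect of the two bumpversion changes in this hunk (a regex replacement for the `date-released:` line, and a search string narrowed to the `## [...]` header) can be approximated in plain Python. The sketch below only mirrors the intent of the config; the function names are hypothetical and it does not reproduce how the bumpversion tool itself processes `search`/`replace`.

```python
import re
from datetime import date

# Same pattern as the config entry, without the doubled braces
# (which are presumably escaping for the tool's own templating).
DATE_LINE = re.compile(r"date-released: \d{4}-\d{2}-\d{2}")


def bump_release_date(text: str, today: date) -> str:
    """Rewrite the date-released line to the given day."""
    return DATE_LINE.sub(f"date-released: {today:%Y-%m-%d}", text)


def bump_changelog_header(text: str, current: str, new: str) -> str:
    """Replace only the second-level header, leaving reference links alone."""
    return text.replace(f"## [{current}]", f"## [{new}]")
```

With the narrower search string, a line such as `[0.6.1rc1]: https://...` at the bottom of a changelog is left untouched, whereas a bare `[{current_version}]` search would have rewritten it as well.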
2 changes: 1 addition & 1 deletion src/zenzic/__init__.py
@@ -2,4 +2,4 @@
# SPDX-License-Identifier: Apache-2.0
"""Zenzic — engine-agnostic linter and security shield for Markdown documentation."""

__version__ = "0.6.1rc1"
__version__ = "0.6.1rc2"
2 changes: 1 addition & 1 deletion src/zenzic/cli.py
@@ -1595,7 +1595,7 @@ def _init_standalone(repo_root: Path, force: bool) -> None:
"# Zenzic Shield — built-in credential scanner (always active, no config required).\n"
"# Detected pattern families: openai-api-key, github-token, aws-access-key,\n"
"# stripe-live-key, slack-token, google-api-key, private-key,\n"
"# hex-encoded-payload (3+ consecutive \\xNN sequences).\n"
"# hex-encoded-payload (3+ consecutive \\xNN sequences), gitlab-pat.\n"
"# All lines including fenced code blocks are scanned. Exit code 2 on detection.\n"
"\n"
"# Declare project-specific lint rules (no Python required):\n"
9 changes: 4 additions & 5 deletions src/zenzic/core/scanner.py
@@ -32,7 +32,7 @@
)
from zenzic.core.reporter import Finding
from zenzic.core.rules import AdaptiveRuleEngine, BaseRule
from zenzic.core.shield import SecurityFinding, scan_line_for_secrets, scan_url_for_secrets
from zenzic.core.shield import SecurityFinding, scan_lines_with_lookback, scan_url_for_secrets
from zenzic.core.validator import LinkValidator
from zenzic.models.config import ZenzicConfig
from zenzic.models.references import IntegrityReport, ReferenceFinding, ReferenceMap
@@ -637,10 +637,9 @@ def harvest(self) -> Generator[HarvestEvent, None, None]:
        secret_line_nos: set[int] = set()
        shield_events: list[HarvestEvent] = []
        with self.file_path.open(encoding="utf-8") as fh:
            for lineno, line in enumerate(fh, start=1):  # ALL lines, no filter
                for finding in scan_line_for_secrets(line, self.file_path, lineno):
                    shield_events.append((lineno, "SECRET", finding))
                    secret_line_nos.add(lineno)
            for finding in scan_lines_with_lookback(enumerate(fh, start=1), self.file_path):
                shield_events.append((finding.line_no, "SECRET", finding))
                secret_line_nos.add(finding.line_no)

        # ── 1.b Content pass: harvest ref-defs and alt-text (fences skipped) ─
        content_events: list[HarvestEvent] = []
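The scanner.py change above swaps a per-line call for `scan_lines_with_lookback()`, which the changelog describes as a 1-line lookback buffer with duplicate suppression. The following is one plausible shape for that technique, written as a standalone sketch. Everything in it is an assumption for illustration: the `Hit` type, the single sample pattern, joining stripped lines without a separator, and the function name `scan_with_lookback` are not taken from `zenzic.core.shield`.

```python
import re
from collections.abc import Iterable, Iterator
from typing import NamedTuple

# Illustrative pattern only: "AKIA" followed by 16 uppercase letters or digits.
PATTERNS = {"aws-access-key": re.compile(r"AKIA[0-9A-Z]{16}")}


class Hit(NamedTuple):
    line_no: int
    kind: str


def scan_with_lookback(lines: Iterable[tuple[int, str]]) -> Iterator[Hit]:
    """Scan each line alone, then joined with the previous line."""
    prev_text = ""
    prev_kinds: set[str] = set()
    for line_no, raw in lines:
        text = raw.strip()
        # Pattern families found on this line by itself.
        own = {k for k, rx in PATTERNS.items() if rx.search(text)}
        # Families that only appear once the previous line is prepended.
        # Those already reported for the previous line are suppressed.
        joined = prev_text + text
        split = {k for k, rx in PATTERNS.items() if rx.search(joined)} - own - prev_kinds
        for kind in sorted(own | split):
            yield Hit(line_no, kind)
        prev_text, prev_kinds = text, own | split
```

Called as `scan_with_lookback(enumerate(fh, start=1))`, this reports a split token on the line where it completes and reports a token that fits on one line exactly once. Because the finding carries its own line number, the caller no longer needs the loop variable, which matches the move to `finding.line_no` in the diff.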