Security-native LLM system for the AI-generated software era.
Nullsec S1 is being built to analyze, explain, patch, and gate AI-generated applications before they reach production.
It combines:
- a curated corpus of AI-generated application failures
- a specialized code-security model training pipeline
- structured JSON security verdicts
- a deterministic Security Alignment Layer
- secure patch generation
- CI / API / CLI enforcement
- a benchmark and release-validation framework
Current status: training-ready, corpus-complete, safety-layer-enforced. See Current verified state for the full, honest breakdown.
This is not a scanner. It is a full security system for AI-generated software — a specialized security model wrapped in a non-bypassable, deterministic enforcement layer.
A note on names. Nullsec S1 is the system. Its reference implementation ships as the
nullsec1 Python package and CLI, and the model release identity is Nullsec-1.0 (python -m nullsec.core.version). You will see all three; they refer to the same project.
- Architecture
- Why Nullsec S1 exists
- What Nullsec S1 does
- How S1 is different
- Core system components
- Current verified state
- Training runs: RC1 → RC2
- The Security Alignment Layer
- Quickstart
- Corpus status
- Training workflow
- Training on GPU
- Benchmark workflow
- Release pipeline
- Repo structure
- Roadmap
- What Nullsec S1 does not claim
- Contributing
- Security
- License
Nullsec S1 is a pipeline, not a single model call. A security-tuned model proposes a verdict; two deterministic layers align and enforce it before anything is trusted.
flowchart TD
A["AI-generated app / repo / PR / MCP tool / wallet flow"] --> B["Nullsec S1 reasoning pipeline<br/>(security-tuned model: detect · classify · explain · patch)"]
B -->|raw output| C["Structured JSON verdict<br/>(verdict schema)"]
C --> D["Security Alignment Layer<br/>parse · schema-validate · normalize severities"]
D --> E["Nullsec Safety Layer<br/>deterministic enforcement R1–R6"]
E --> F["Enforced verdict<br/>(production_ready computed, never trusted from the model)"]
F --> G["Patch · Report · CI gate · API response"]
Plain-text view of the same flow:
AI-generated app / repo / PR / MCP tool / wallet flow
│
▼
Nullsec S1 reasoning pipeline (security-tuned model: detect · classify · explain · patch)
│ raw output
▼
structured JSON verdict (data/schemas/verdict.schema.json)
│
▼
Security Alignment Layer (parse · schema-validate · type-check · normalize severities)
│ structurally-valid verdict
▼
Nullsec Safety Layer (deterministic enforcement: rules R1–R6, severity/risk flooring)
│
▼
enforced verdict (production_ready recomputed, never trusted from the model)
│
▼
patch · report · CI gate · API response
The model's own production_ready claim is advisory only. The Safety Layer recomputes it and allows true only when all eight check dimensions pass with no HIGH/CRITICAL finding:
auth · secrets · input_validation · rate_limits · permissions · dangerous_exec · dependency_risk · environment_exposure
Full design: docs/SYSTEM_OVERVIEW.md and docs/architecture/nullsec1_architecture.md.
AI tools generate software faster than humans can review it. The bottleneck is no longer generation. The bottleneck is trust.
Generated apps routinely ship with:
- broken or missing authentication
- exposed secrets and credentials
- unsafe admin routes
- missing rate limits
- insecure file uploads
- unsafe wallet approvals and unbounded transactions
- over-permissioned MCP tools
- prompt-injection-to-tool-execution paths
- environment / configuration exposure
- production paths nobody actually verified
A general code model can describe these problems in prose. What it cannot do is give you a structured, schema-checked verdict whose production_ready decision is enforced deterministically rather than asserted by a model that can be prompt-injected. That gap — between opinion and enforced verdict — is the reason Nullsec S1 exists.
Given source code (a file, a directory, a PR, an MCP tool definition, or a generated Web3 flow), Nullsec S1:
- Detects vulnerabilities across a 16-category security taxonomy.
- Classifies severity and confidence per finding.
- Explains a concrete exploit scenario for each finding.
- Patches — emits a secure patch target (unified diff or corrected snippet) per finding.
- Reports on all eight check dimensions explicitly (pass | fail | not_applicable | not_checked).
- Gates production-readiness through the deterministic Safety Layer, never the model alone.
- Enforces in CI / API / CLI — the CLI exits non-zero when code is not production-ready, so it doubles as a build gate.
Output is always a single JSON object defined by data/schemas/verdict.schema.json, and it always passes through the same two deterministic layers — whether invoked through the server, the CLI, the benchmark suite, or the training data builder. There is exactly one enforcement path.
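A minimal sketch of how a CI step might turn the enforced verdict into an exit code. Only the production_ready field comes from the description above; the function name gate_exit_code is hypothetical, and the real nullsec1 CLI may behave differently.

```python
def gate_exit_code(verdict: dict) -> int:
    """Return 0 only when the enforced verdict is explicitly production-ready.

    Hypothetical helper: the name and signature are illustrative, not the
    project's actual CLI code.
    """
    # Fail closed: a missing key, null, or the string "true" all block the build.
    return 0 if verdict.get("production_ready") is True else 1


print(gate_exit_code({"production_ready": True}))   # 0
print(gate_exit_code({"production_ready": False}))  # 1
print(gate_exit_code({}))                           # 1
```

Comparing with `is True` rather than truthiness keeps the gate strict: any value other than a JSON boolean true results in a non-zero exit.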
| Capability | Nullsec S1 | GPT / Claude / generic coding LLMs |
|---|---|---|
| Objective | Security verdicts for AI-generated apps | General code generation/reasoning |
| Corpus | AI-generated app failure corpus | Broad general-purpose data |
| Output | Structured JSON security verdicts | Free-form responses |
| Enforcement | Deterministic Safety Layer | User-defined prompting |
| Production readiness | Built-in release gating | Manual interpretation |
| MCP / tool abuse | First-class category | General reasoning |
| Wallet / Web3 risk | First-class category | General reasoning |
| Secure patches | Security-native patch targets | General code edits |
| Claim validation | Built-in honesty checks | External / manual |
Nullsec S1 is not trying to be better at everything. It is being built to be better at one job: verifying whether AI-generated software is safe to ship. General-purpose models remain excellent at generation and broad reasoning; Nullsec S1 is the specialized, enforceable reviewer that sits in front of them.
| Path | What it is |
|---|---|
| corpus/ | Curated training corpus: the single source of truth (authored/ + opt-in ingested/ + synthetic/). |
| taxonomy/ | The 16-category security taxonomy mapped to 8 check dimensions (taxonomy.json). |
| nullsec/safety/ | The Security Alignment Layer (alignment.py) + Nullsec Safety Layer (enforcement.py). |
| nullsec/core/ | Reasoning pipeline (engine.py), verdict models, canonical prompts, version/fingerprint. |
| nullsec/ingest/ | CVE/NVD, Semgrep, SARIF/CodeQL ingestion into the verdict contract. |
| training/ | Dataset prep, QLoRA training, corpus validation, release threshold, preflight. |
| benchmarks/ | Evaluation runners + adversarial Safety Layer probes. |
| scripts/validate_claims.py | Public claim validator: the honesty gate. |
| scripts/release_candidate.py | Release gate: builds a bundle only from real artifacts. |
| serving/ | FastAPI serving layer (/v1/model, /v1/analyze, /v1/patch, streaming). |
| cli/ | nullsec1 command-line analyzer + CI gate. |
| reports/ | Corpus curation sprint reports (auditable provenance). |
| docs/ | Technical documentation (system overview, safety layer, corpus, roadmap, non-claims). |
The corpus exceeds the v1.0 training threshold and the deterministic Safety Layer is fully enforced and tested. This source checkout contains no committed trained adapter or benchmark report — trained weights are published as GitHub Release assets, not committed to source (see Training runs). The repo enforces that distinction automatically through claim validation.
This snapshot reflects the artifacts on disk right now. Every number below is produced by a command in this repo — none are hand-entered. Run the commands in Quickstart to reproduce them.
| Fact | Value | Source command |
|---|---|---|
| Curated corpus | 556 examples (119 hand-authored + 437 curated-ingested) | training/dataset_stats.py --include-ingested |
| Train / eval split | 445 train / 111 eval (eval_frac 0.2, seed 42) | training/prepare_dataset.py --include-ingested |
| Taxonomy categories | 16 categories → 8 check dimensions | taxonomy/taxonomy.json |
| Per-category coverage | every category ≥ 31 curated (v1.0 needs ≥ 25; v1.1 target ≥ 60) | training/release_threshold.py --include-ingested |
| Safety Layer consistency | 100% (556 / 556) | training/dataset_stats.py --include-ingested |
| Benchmark suite | 111 labeled cases across all 16 categories | benchmarks/datasets/detection.json |
| Adversarial safety probes | 8 / 8 blocked, 0 bypassed | python -m benchmarks.safety_probes |
| Test suite | passing | pytest -q |
| Release threshold (v1.0) | PASS | training/release_threshold.py --include-ingested |
| Release threshold (v1.1 / RC2) | BLOCKED (corpus 556/1000) — honest gap | training/release_threshold.py --include-ingested --profile rc2 |
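The train / eval row can be checked by hand: with eval_frac 0.2, a floor of 556 × 0.2 gives 111 eval examples and 445 train examples. The sketch below shows one way a seeded split could produce those counts. It is an assumption about the approach, not a copy of training/prepare_dataset.py, which may stratify by category or shuffle differently.

```python
import random


def split_examples(examples: list, eval_frac: float = 0.2, seed: int = 42):
    """Shuffle deterministically, then cut off the eval slice (hypothetical helper)."""
    shuffled = list(examples)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)     # same seed -> same order on every run
    n_eval = int(len(shuffled) * eval_frac)   # floor: 556 * 0.2 -> 111
    return shuffled[n_eval:], shuffled[:n_eval]


train, eval_set = split_examples(list(range(556)))
print(len(train), len(eval_set))  # 445 111
```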
What the claim validator permits for this checkout (python scripts/validate_claims.py):
- ✅ training-ready: pipeline, schema, and a non-empty curated corpus are present
- ✅ safety-layer-enforced: all adversarial Safety Layer probes are blocked
- ⛔ trained model, gated here: no trained adapter is committed to this source repo (RC1's is a GitHub Release asset)
- ⛔ benchmarked, gated here: no benchmark report is committed to this checkout
- ⛔ real-model evaluation, gated here: same reason
- ⛔ release candidate, gated here: requires the trained adapter, benchmark, probes, and bundle to be present
- ⛔ production-ready (the model), gated here: the v1.0 bar; RC2 adds a stricter, separate gate (≥ 100 benchmark cases, F1 ≥ 0.90, zero false-safe)
- ⛔ first / only / best: never auto-permitted from local artifacts
The honesty gate (scripts/validate_claims.py --check) scans this README and fails CI if it asserts anything the on-disk artifacts do not support. The wording here is written to pass that check truthfully — and it stays conservative by design even though RC1 has run, because RC1's artifacts are distributed as a Release rather than committed here.
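The idea behind the honesty gate can be illustrated with a deliberately naive sketch: map each public claim to the artifact that must exist on disk, then flag any claim whose artifact is absent. The mapping, the artifact names, and the function unsupported_claims are all hypothetical. The real scripts/validate_claims.py is presumably more careful, for example about negated wording such as "not yet benchmarked".

```python
# Hypothetical mapping: claim phrase -> artifact that must be present on disk.
CLAIM_REQUIREMENTS = {
    "trained model": "adapter",
    "benchmarked": "benchmark_report",
    "release candidate": "release_bundle",
}


def unsupported_claims(text: str, artifacts: set[str]) -> list[str]:
    """Return claim phrases that appear in text without their backing artifact.

    Naive substring matching only: it does not understand negation or context.
    """
    lowered = text.lower()
    return [
        phrase
        for phrase, needed in CLAIM_REQUIREMENTS.items()
        if phrase in lowered and needed not in artifacts
    ]


print(unsupported_claims("Nullsec S1 is benchmarked.", set()))                 # ['benchmarked']
print(unsupported_claims("Nullsec S1 is benchmarked.", {"benchmark_report"}))  # []
```

A CI job built on this shape would fail whenever the returned list is non-empty.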
RC1 — completed. The first real QLoRA training run was carried out on a RunPod
A100 80GB box against this corpus, using the hardened CUDA 12.1 stack
(torch==2.5.1+cu121). Its trained adapter, real-model benchmark report, and
adversarial safety-probe results are published as a GitHub Release, not
committed to this source repo — the repository stays lightweight and trained
weights ship as release assets. The RC1 stack fixes (the reason train_qlora.py
uses max_length, drops the legacy completion-only collator, and pins the cu121
torch build) are documented in RELEASE_TRAINING.md,
RUNPOD.md, and docs/ROADMAP.md.
Because those artifacts are intentionally absent from this checkout, the in-repo claim validator stays conservative here — it gates trained/benchmarked wording on what is present on disk. That is the point: the README never asserts more than this checkout can prove.
RC2 / v1.1 — in preparation. A stronger run: a larger curated corpus (target
1,000), the benchmark expanded to 100+ cases across all 16 categories (done: 111),
and a hardened, reproducible CUDA 12.1 training stack. RC2 has its own stricter
data gate (training/release_threshold.py --profile rc2) and a stricter,
separate production-ready claim gate (real-model release candidate, ≥ 100 benchmark
cases, zero false-safe rate, detection F1 ≥ 0.90). Both report BLOCKED today until
the corpus reaches the v1.1 target — see docs/ROADMAP.md.
The deterministic layer is the reason Nullsec S1 is a security system rather than a code model that emits opinions. It runs in two stages.
Stage 1 — Security Alignment Layer (nullsec/safety/alignment.py): extract the JSON object (tolerant of code fences, preamble, and trailing prose), validate it against the verdict schema, type-check it into the Verdict model, and normalize finding severities up to each category's taxonomy floor. Anything that cannot be aligned raises VerdictParseError instead of being guessed at.
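Two of the Stage 1 steps, tolerant JSON extraction and severity flooring, can be sketched with the standard library alone. This is an illustration under assumptions, not the contents of alignment.py: the helper names, the LOW and MEDIUM severity levels, and the ordering list are hypothetical, and schema validation and type-checking are omitted.

```python
import json

# Assumed ordering; only HIGH and CRITICAL are named in this README.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


class VerdictParseError(ValueError):
    """Raised when no usable JSON verdict can be recovered."""


def extract_json_object(raw: str) -> dict:
    """Return the first JSON object in raw model output, skipping preamble and fences."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores what follows.
            obj, _end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = raw.find("{", start + 1)   # try the next candidate brace
    raise VerdictParseError("no JSON object found in model output")


def floor_severity(severity: str, category_floor: str) -> str:
    """Raise severity to the category floor; never lower it."""
    return max(severity, category_floor, key=SEVERITY_ORDER.index)


raw = 'Here is the verdict: {"production_ready": false, "findings": []} Hope that helps.'
print(extract_json_object(raw))        # {'production_ready': False, 'findings': []}
print(floor_severity("LOW", "HIGH"))   # HIGH
```

Raising an error instead of returning a default keeps the "never guessed at" property: unparseable output cannot silently become a verdict.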
Stage 2 — Nullsec Safety Layer (nullsec/safety/enforcement.py): take the structurally-valid verdict and deny production_ready if any of these hold:
| Rule | production_ready is denied when… |
|---|---|
| R1 | any required dimension is not_checked |
| R2 | any required dimension is fail |
| R3 | any finding is HIGH or CRITICAL |
| R4 | risk_score exceeds the production threshold (default 20) |
| R5 | a finding contradicts a dimension reported as pass |
| R6 | overall severity is HIGH or CRITICAL |
It also raises (never lowers) severity and risk_score to match the worst finding, so the model cannot under-report. Because enforcement is deterministic and independent of the model, an attacker who manipulates the model — e.g. via prompt injection embedded in the code under review — still cannot obtain a false production_ready: true. This is verified by adversarial probes in benchmarks/safety_probes.py, including a prompt-injection-in-prose probe.
Deep dive: docs/SECURITY_ALIGNMENT_LAYER.md.
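The rule table can be read as a pure function from a structurally valid verdict to the list of rules it violates. The sketch below follows that reading. The field names findings and dimension are assumptions (production_ready, risk_score, severity, and checks_performed appear in this README), and the severity and risk_score flooring step is left out for brevity. It is not the code in enforcement.py.

```python
REQUIRED_DIMENSIONS = (
    "auth", "secrets", "input_validation", "rate_limits",
    "permissions", "dangerous_exec", "dependency_risk", "environment_exposure",
)
BLOCKING = {"HIGH", "CRITICAL"}


def violated_rules(verdict: dict, risk_threshold: int = 20) -> list[str]:
    """Return the R1-R6 rules that deny production_ready (hypothetical field names)."""
    checks = verdict["checks_performed"]
    findings = verdict["findings"]
    # A dimension the model did not report at all is treated as not_checked.
    statuses = [checks.get(dim, "not_checked") for dim in REQUIRED_DIMENSIONS]
    rules = []
    if "not_checked" in statuses:
        rules.append("R1")
    if "fail" in statuses:
        rules.append("R2")
    if any(f.get("severity") in BLOCKING for f in findings):
        rules.append("R3")
    if verdict["risk_score"] > risk_threshold:
        rules.append("R4")
    if any(checks.get(f.get("dimension")) == "pass" for f in findings):
        rules.append("R5")   # a finding contradicts a dimension reported as pass
    if verdict["severity"] in BLOCKING:
        rules.append("R6")
    return rules


def enforce(verdict: dict) -> dict:
    """Recompute production_ready; the model's own value is ignored."""
    enforced = dict(verdict)
    enforced["production_ready"] = not violated_rules(verdict)
    return enforced


clean = {
    "checks_performed": {dim: "pass" for dim in REQUIRED_DIMENSIONS},
    "findings": [], "risk_score": 0, "severity": "LOW", "production_ready": False,
}
# A manipulated model claims production_ready despite a CRITICAL auth finding.
injected = dict(clean, production_ready=True,
                findings=[{"severity": "CRITICAL", "dimension": "auth"}])
print(violated_rules(injected))              # ['R3', 'R5']
print(enforce(injected)["production_ready"])  # False
```

Because the function never reads the incoming production_ready value, a manipulated model output cannot influence the result.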
Local CPU machines can verify the corpus, the deterministic layers, and the safety probes — no GPU required.
python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[dev]"
python training/prepare_dataset.py --include-ingested --out data/processed
pytest -q
python training/validate_corpus.py --include-ingested
python training/release_threshold.py --include-ingested
python scripts/validate_claims.py --check
Inspect model identity and the reproducible fingerprint at any time:
python -m nullsec.core.version
corpus/ is the single source of truth for training data. The current curated corpus is 556 examples (119 hand-authored + 437 curated-ingested), every taxonomy category has ≥ 31 curated examples, and Safety Layer consistency is 100%, so the v1.0 training threshold (training/release_threshold.py) passes.
Provenance is tracked explicitly and never blurred:
- hand_authored: original examples written for this repo (counts as curated).
- curated_ingested: CVE / scanner / real-failure records that passed human review and source-provenance enforcement (counts as curated).
- synthetic_variant: labeled, structure-preserving augmentations; never counts toward curated thresholds.
Raw and rejected candidates are tracked separately and are never training-eligible. The curation workflow, schema, and provenance rules are documented in docs/CORPUS.md, with auditable sprint reports in reports/.
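The provenance rule translates directly into how a per-category threshold could be counted: only hand_authored and curated_ingested records contribute. The following is a sketch with hypothetical record fields (category, provenance) and helper names. The default of 25 mirrors the v1.0 per-category minimum quoted in this README.

```python
from collections import Counter

# Provenance values that count toward curated thresholds.
CURATED = {"hand_authored", "curated_ingested"}


def curated_counts(examples: list[dict]) -> Counter:
    """Count curated examples per category; synthetic_variant records are ignored."""
    return Counter(e["category"] for e in examples if e["provenance"] in CURATED)


def categories_below(examples: list[dict], categories: list[str], minimum: int = 25) -> list[str]:
    """Return the categories that have fewer than `minimum` curated examples."""
    counts = curated_counts(examples)
    return sorted(c for c in categories if counts[c] < minimum)


examples = (
    [{"category": "auth", "provenance": "hand_authored"}] * 25
    + [{"category": "secrets", "provenance": "synthetic_variant"}] * 40
)
print(categories_below(examples, ["auth", "secrets"]))  # ['secrets']
```

Forty synthetic examples do not rescue the secrets category here, which is the behaviour the provenance rules describe.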
The training targets are built from the corpus through the same alignment + safety layers used at serving time, so no malformed or gate-inconsistent verdict ever enters training.
# 1. build chat-formatted train/eval JSONL from the curated corpus
python training/prepare_dataset.py --include-ingested --out data/processed
# 2. confirm the corpus is genuinely v1.0-ready (exits non-zero until it is)
python training/release_threshold.py --include-ingested
# 3. (on a GPU box) preflight, then train the QLoRA adapter
python training/preflight_train.py
python training/train_qlora.py --config training/config.yaml
The default config targets a single 24GB GPU (e.g. RTX 3090/4090) with 4-bit NF4 QLoRA, completion-only loss on the verdict tokens. A 14B/A100 configuration is a commented config-only swap in training/config.yaml. Base model: Qwen/Qwen2.5-Coder-7B-Instruct (Apache 2.0).
Local CPU machines can verify the corpus and the safety layer, but cannot realistically train the model. QLoRA training requires a CUDA-capable NVIDIA GPU.
The end-to-end pipeline (prepare → preflight → train → merge → benchmark → release → validate) is wrapped in one script:
bash scripts/run_training_pipeline.sh
A complete, beginner-friendly walkthrough (choosing a GPU box, disk requirements, environment setup, expected artifacts, and how to collect outputs) is in GPU_QUICKSTART.md.
training/preflight_train.py checks the GPU stack before you spend money: it exits 2 when no CUDA GPU is available (the expected result on a laptop), so you never start a doomed run.
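A preflight of this kind could be as small as the sketch below. It assumes PyTorch is the CUDA probe and treats a missing torch install the same as a missing GPU. The helper names are hypothetical, and the real preflight_train.py likely checks more of the stack.

```python
import importlib.util


def cuda_available() -> bool:
    """True only when PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False          # no torch at all: treat as "no GPU"
    import torch              # imported lazily so CPU-only machines can run this file
    return torch.cuda.is_available()


def preflight_exit_code(has_cuda: bool) -> int:
    """Map the probe result to the exit code described above: 2 means no CUDA GPU."""
    return 0 if has_cuda else 2


print(preflight_exit_code(False))  # 2
```

A wrapper script would call `sys.exit(preflight_exit_code(cuda_available()))` before launching any training job.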
The benchmark suite measures detection accuracy, false-safe rate, hallucination rate, OWASP coverage, patch correctness (structural), and a secure-generation score. It reports numbers only from real runs — this repo ships none.
# against the live model (GPU):
python benchmarks/run_all.py --mode model --adapter outputs/nullsec-s1-qlora
# against captured real outputs (no GPU); reports are marked replay-only:
python benchmarks/run_all.py --mode replay --replay path/to/captured.jsonl
A case with no model output is recorded as a real miss, never a synthetic pass. Until a real run exists, the claim validator forbids any "benchmarked" or "evaluated" wording.
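To make the "real miss" rule concrete, here is a sketch of how detection F1 and a false-safe rate might be computed when some cases have no model output. The case fields (vulnerable, predicted_vulnerable) and the exact false-safe definition, a vulnerable case the model explicitly called safe, are assumptions. The benchmark runners may define these metrics differently.

```python
def detection_metrics(cases: list[dict]) -> dict:
    """Compute precision, recall, F1, and false-safe rate over labeled cases."""
    tp = fp = fn = false_safe = vulnerable = 0
    for case in cases:
        predicted = case.get("predicted_vulnerable")  # None means no model output
        if case["vulnerable"]:
            vulnerable += 1
            if predicted is True:
                tp += 1
            else:
                fn += 1                 # a missing output is a miss, never a pass
                if predicted is False:
                    false_safe += 1     # model explicitly called vulnerable code safe
        elif predicted is True:
            fp += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_safe_rate": false_safe / vulnerable if vulnerable else 0.0,
    }


cases = [
    {"vulnerable": True, "predicted_vulnerable": True},
    {"vulnerable": True, "predicted_vulnerable": None},    # no output captured
    {"vulnerable": True, "predicted_vulnerable": False},
    {"vulnerable": False, "predicted_vulnerable": True},
    {"vulnerable": False, "predicted_vulnerable": False},
]
metrics = detection_metrics(cases)
```

In this toy set the missing output lowers recall exactly as an explicit miss would, so skipping cases cannot inflate the score.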
A release bundle is produced only when the model, its outputs, and the Safety Layer are all real:
python scripts/release_candidate.py --adapter outputs/nullsec-s1-qlora --dataset detection.json
python scripts/validate_claims.py --adapter outputs/nullsec-s1-qlora \
  --report releases/nullsec-1.0/benchmark/SUITE.json --check
release_candidate.py aborts (writing nothing) if the adapter is missing, the model fails to load, no outputs are produced, any report section is empty, or any Safety Layer probe is bypassed. Only after it succeeds with a trained adapter and a real-model report will validate_claims.py permit the corresponding public claims. The full path is documented in RELEASE_TRAINING.md.
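The abort conditions can be pictured as a single check that collects every blocking reason before anything is written. The sketch below is hypothetical in its function name, arguments, and messages. It skips the "model fails to load" condition, which needs a GPU stack, and it does not reflect the actual internals of release_candidate.py.

```python
from pathlib import Path


def release_blockers(adapter_dir: Path, outputs: list, report: dict,
                     probes_bypassed: int) -> list[str]:
    """List every reason the bundle must not be written (empty list means proceed)."""
    blockers = []
    if not adapter_dir.is_dir():
        blockers.append("adapter missing")
    if not outputs:
        blockers.append("no model outputs")
    # An absent report, or any empty section inside it, blocks the release.
    if not report or any(not section for section in report.values()):
        blockers.append("empty report")
    if probes_bypassed > 0:
        blockers.append("safety probe bypassed")
    return blockers


print(release_blockers(Path("no-such-adapter-dir-xyz"), [], {}, 1))
```

Collecting all reasons at once, rather than stopping at the first, gives the operator a complete picture of what is still missing.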
README.md you are here
GPU_QUICKSTART.md beginner-friendly GPU training walkthrough
RELEASE_TRAINING.md training-to-release runbook
CONTRIBUTING.md how to contribute (corpus, taxonomy, probes, docs)
SECURITY.md vulnerability reporting & responsible disclosure
model_card/ Nullsec-1 model card (identity, intended use, limits)
taxonomy/ 16-category security taxonomy — single source of truth
corpus/ curated training corpus (authored/ + ingested/ + synthetic/)
data/ verdict schema (data/schemas) + processed datasets
training/ prepare_dataset · train_qlora · merge_adapter · validate_corpus
· release_threshold · preflight_train · config.yaml
nullsec/
core/ reasoning pipeline, verdict models, prompts, version/fingerprint
safety/ Security Alignment Layer + Nullsec Safety Layer
ingest/ CVE/NVD, Semgrep, SARIF/CodeQL ingestion
serving/ FastAPI serving layer
benchmarks/ benchmark suite + adversarial Safety Layer probes
scripts/ release_candidate.py · validate_claims.py · run_training_pipeline.sh
examples/ worked vulnerable cases + expected verdicts
releases/ generated release bundles (real artifacts only; ships empty)
cli/ nullsec1 CLI analyzer + CI gate
tests/ deterministic-layer test suite (no GPU)
docs/ architecture · system overview · safety layer · corpus · roadmap
.github/ CI security gate · issue templates · PR template
| Phase | Milestone | State |
|---|---|---|
| 0 | Corpus + safety foundation | ✅ done |
| 1 | First QLoRA training run | ✅ done (RC1; artifacts published as a GitHub Release) |
| 2 | Real model evaluation | planned |
| 3 | Release candidate | planned |
| 4 | CLI / GitHub Action hardening | planned |
| 5 | Hosted Nullsec S1 API | planned |
| 6 | Dashboard / audit reports | planned |
| 7 | Larger corpus + stronger model variants | planned |
| 8 | Base/Web3-specific eval suites | planned |
| 9 | MCP / tool-risk runtime integration | planned |
Full detail and exit criteria per phase: docs/ROADMAP.md.
To keep this project rigorous rather than exaggerated, the following are explicitly not claimed today:
- Nullsec S1 is not trained from scratch (it fine-tunes an open code model).
- It is not yet a trained model release in this checkout: no trained adapter is committed here (RC1's adapter is a GitHub Release asset).
- It is not yet benchmarked with real model outputs in this checkout: no benchmark report is committed here.
- It is not a replacement for human security review, SAST/DAST, or a security team.
- It is not guaranteed to catch every vulnerability; a clean verdict reduces but does not eliminate risk.
- "First / only / best" claims cannot be validated from repo artifacts and are not made here.
Full statement: docs/NON_CLAIMS.md. These constraints are enforced in code by scripts/validate_claims.py.
Contributions to the corpus, taxonomy, safety probes, benchmark runners, docs, and CLI/API are welcome. Corpus examples must include vulnerable code, an exploit scenario, a taxonomy category and severity, a real secure patch, complete checks_performed, the expected Safety Layer behavior, and an auditable provenance reference. See CONTRIBUTING.md for the full requirements and the curated-ingestion workflow.
Please report vulnerabilities responsibly and never submit real secrets — use placeholders for any credential in examples or reports. See SECURITY.md.
Apache 2.0 — matching the Qwen2.5-Coder base model. See the license note in model_card/NULLSEC1.md.
