cstack

Multi-tenant Microsoft 365 operations toolkit for MSP engineers. Self-hosted, ML-augmented, open source.

cstack treats a fleet of M365 tenants the way an SRE treats a fleet of services: audit them with rules, score their traffic with per-tenant ML models, and surface findings with engineer-grade narratives instead of vendor portal screenshots.

What it does

The first tool in the toolkit, SignalGuard, ships two halves.

The CA audit half evaluates every tenant against a 15-rule catalogue (block legacy auth, MFA on admins, risk-based sign-in, break-glass exclusions, and others), plus a coverage-matrix layer that flags weak (user-segment x app-segment) cells, plus an exclusion-hygiene analyser that catches stale, orphaned, or undocumented CA policy exclusions. Every finding is deduplicated by content hash and persisted to DuckDB.
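As an illustration of deduplication by content hash, here is a minimal sketch. It assumes a finding can be represented as a JSON-serialisable dict and that the hash is SHA-256 over a key-sorted JSON encoding; the field names and the `finding_hash` and `dedupe` helpers are hypothetical, not cstack's actual schema or API, and the DuckDB write is left out.

```python
import hashlib
import json


def finding_hash(finding: dict) -> str:
    """Hash a finding's content; sorting keys makes key order irrelevant."""
    canonical = json.dumps(finding, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dedupe(findings: list[dict]) -> list[dict]:
    """Keep the first occurrence of each distinct finding, preserving order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for finding in findings:
        digest = finding_hash(finding)
        if digest not in seen:
            seen.add(digest)
            unique.append(finding)
    return unique


# Hypothetical findings: the first two differ only in key order.
findings = [
    {"tenant": "tenant-a", "rule_id": "example-rule", "policy": "p1"},
    {"policy": "p1", "rule_id": "example-rule", "tenant": "tenant-a"},
    {"tenant": "tenant-b", "rule_id": "example-rule", "policy": "p1"},
]
unique_findings = dedupe(findings)
print(len(unique_findings))
```

Using the digest as a primary key would make a repeated audit run idempotent, since re-inserting an identical finding maps to the same row.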

The anomaly half watches Entra sign-in events. A pooled Isolation Forest per tenant flags rows that look unlike the tenant's normal pattern, layered with four hybrid attack-pattern rules. SHAP attributions explain the top three contributing features on every flagged row. MLflow tracks every training run; a champion/challenger alias system gates promotion. Calibration against synthetic fixtures yields precision 0.245-0.275 and recall 0.889-0.926 on injected attack scenarios. The detector targets SMB-tier tenants without Entra ID P2 licensing, where Microsoft's Identity Protection is unavailable.
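To make "top three contributing features" concrete, the sketch below ranks per-feature attributions for one flagged row. It assumes the ranking is by absolute attribution value, which is a common convention but not confirmed by this README; the feature names and numbers are invented for illustration and the SHAP computation itself is not shown.

```python
def top_contributors(
    attributions: dict[str, float], k: int = 3
) -> list[tuple[str, float]]:
    """Return the k features with the largest absolute attribution."""
    ranked = sorted(attributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical attributions for a single flagged sign-in row.
row_attributions = {
    "hour_of_day": -0.42,
    "asn_is_new": 0.31,
    "country_is_new": 0.05,
    "device_is_managed": -0.12,
}
top3 = top_contributors(row_attributions)
for name, value in top3:
    print(f"{name}: {value:+.2f}")
```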

A per-user IF topology and a per-user-anchored off-hours-admin rule ship as feature-flagged opt-ins (CSTACK_ML_TRAINING_TOPOLOGY=per_user, CSTACK_ML_OFF_HOURS_ADMIN_ENABLED=true). Sprint 3.5 added the infrastructure; Sprint 3.5b gated the activation pending real-tenant calibration in Sprint 7 because synthetic data could not demonstrate their precision lift.

Every finding (audit or anomaly) gets a four-section LLM narrative explaining why it fired, what it means, how to remediate, and when it might be a false positive. Narratives are cached by content address, so two tenants with the same finding share a single generation. The provider layer abstracts Anthropic, OpenAI, and local Ollama behind one Protocol; tests register fakes via the same factory.
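The "one Protocol, fakes via the same factory" pattern might look roughly like the following. This is a sketch under assumptions: the `NarrativeProvider` name, its single `generate` method, and the `register_provider` / `get_provider` helpers are hypothetical and do not reflect cstack's real interface.

```python
from typing import Callable, Protocol


class NarrativeProvider(Protocol):
    """Structural interface every adapter (real or fake) is assumed to satisfy."""

    def generate(self, prompt: str) -> str: ...


_REGISTRY: dict[str, Callable[[], NarrativeProvider]] = {}


def register_provider(name: str, factory: Callable[[], NarrativeProvider]) -> None:
    """Associate a provider name with a zero-argument factory."""
    _REGISTRY[name] = factory


def get_provider(name: str) -> NarrativeProvider:
    """Build a provider by name, failing loudly on unknown names."""
    if name not in _REGISTRY:
        raise ValueError(f"unknown provider: {name!r}")
    return _REGISTRY[name]()


class FakeProvider:
    """Test double: matches the Protocol structurally and makes no network calls."""

    def generate(self, prompt: str) -> str:
        return f"[fake narrative for {len(prompt)} chars]"


# Tests register the fake through the same factory the real adapters would use.
register_provider("fake", FakeProvider)
provider = get_provider("fake")
narrative = provider.generate("why did this rule fire?")
print(narrative)
```

Because `Protocol` relies on structural typing, the fake needs no inheritance from the interface, which keeps test doubles decoupled from the adapter code.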

Status

V0.6.0-alpha.1 baseline (containerized stack, pre-live-tenant). See CHANGELOG.md for the per-sprint summary. Live tenant integration is Sprint 7, paused pending tenant access. Today the codebase ships:

266 Python tests across 8 packages and 2 apps
78 web tests across 28 files (Vitest + RTL, jsdom)
19 HTTP endpoints (15 read, 4 action), OpenAPI 3.1 contract committed
7 dashboard screens (Next.js 15 + Tailwind 4), tablet responsive at 768px
20-example hand-curated golden set + rubric-based LLM-as-judge eval harness
$4.30 of real Anthropic API spend during Sprint 6 calibration

Everything runs against three synthetic fixture tenants. None of it has touched a production Microsoft 365 tenant yet.

Tools

The cstack toolkit currently ships one tool with more planned.

SignalGuard: identity security (CA audit + per-tenant behavioural sign-in anomaly detection with explainable ML scoring and LLM-narrated findings). Status: complete (against fixtures).
Future: LicenseLens, Driftwatch, ChangeRadar, CompliancePulse. Status: planned for V1.

Architecture at a glance

                   +----------------------- cstack monorepo -----------------------+
                   |                                                              |
fixtures load-all  |   +-- packages/ ----------------------------------+          |
or live extract -->|   | schemas, storage, graph-client, fixtures      |          |
                   |   | audit-{core,coverage,rules,exclusions}        |          |
                   |   | ml-{features,mlops,anomaly}                   |          |
                   |   | llm-{provider,narrative,eval}                 |          |
                   |   +----------------------+------------------------+          |
                   |                          |                                   |
                   |                          v                                   |
                   |   +--- DuckDB ---+   +--- mlruns ---+                        |
                   |   | tenants,     |   | per-tenant   |                        |
                   |   | ca_policies, |   | IF + SHAP    |                        |
                   |   | findings,    |   | @champion /  |                        |
                   |   | signins,     |   | @challenger  |                        |
                   |   | anomaly_     |   +--------------+                        |
                   |   | scores,      |                                           |
                   |   | narrative_   |                                           |
                   |   | cache        |                                           |
                   |   +------+-------+                                           |
                   |          |                                                   |
                   |          v                                                   |
                   |   +------+-----------------+   +----- Anthropic / OpenAI /   |
                   |   | apps/signalguard-api   |--+      Ollama (LLM provider)   |
                   |   | (FastAPI, X-API-Key)   |                                 |
                   |   +------+-----------------+                                 |
                   |          |                                                   |
                   |          v                                                   |
                   |   +------+-----------------+                                 |
                   |   | apps/signalguard-web   |                                 |
                   |   | (Next.js 15 + Tailwind)|                                 |
                   |   +------------------------+                                 |
                   |                                                              |
                   +--------------------------------------------------------------+

For the deeper data flow including LLM cache lookup and bias mitigation, see docs/ARCHITECTURE.md.

Screenshots

Eight screens captured against fixture tenants. Full reference with captions: docs/SCREENSHOTS.md.

How it's built

Python 3.12 via uv workspaces. 8 internal packages, 2 apps (CLI + API).
Next.js 15 + Tailwind 4 for the dashboard. Server Components first, TanStack Query for client interactions, typed @hey-api client generated from OpenAPI 3.1.
FastAPI + DuckDB for the backend. Per-request DuckDB connections, RFC 7807 problem-details on every error, correlation ids on every request and log line.
scikit-learn + SHAP + MLflow for the anomaly detector. Pipeline of StandardScaler + IsolationForest, SHAP only on flagged rows for runtime budget, MLflow registry aliases for promotion gating.
Provider-agnostic LLM layer with adapters for Anthropic Claude, OpenAI, and Ollama behind a single Protocol. Content-addressed prompt cache, budget caps, pointwise + pairwise eval harness with position-swap bias mitigation.
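A content-addressed cache key of the kind described here can be sketched as below. The key composition (rule id, canonicalised evidence, prompt version, model, with tenant id excluded) follows the description under "Engineering decisions worth calling out"; the canonicalisation via key-sorted JSON, the function name, and the example values are assumptions.

```python
import hashlib
import json


def narrative_cache_key(
    rule_id: str, evidence: dict, prompt_version: str, model: str
) -> str:
    """SHA-256 over the finding's content. Tenant id is deliberately not an input."""
    # Assumed canonical form: JSON with sorted keys and no extra whitespace.
    canonical_evidence = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    # Encoding the parts as a JSON list keeps field boundaries unambiguous.
    payload = json.dumps([rule_id, canonical_evidence, prompt_version, model])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two tenants with the same finding content (hypothetical values).
key_tenant_a = narrative_cache_key(
    "example-rule", {"policy_state": "disabled", "scope": "admins"}, "v1", "model-x"
)
key_tenant_b = narrative_cache_key(
    "example-rule", {"scope": "admins", "policy_state": "disabled"}, "v1", "model-x"
)
print(key_tenant_a == key_tenant_b)
```

Including the prompt version and model in the key means a prompt change or model switch naturally invalidates old narratives without an explicit purge.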

The full stack rationale lives in docs/ARCHITECTURE.md.

Running locally

The fastest path is the Docker stack. Prerequisites: Docker Desktop (or any Docker Compose v2 runtime).

git clone <this repo>
cd cstack
docker compose -f infra/docker/compose.yaml up --build

First run takes 60 to 90 seconds (fixtures load, audit runs on all three tenants, anomaly model trains on tenant-a). Subsequent runs come up in seconds.

Visit http://localhost:3000 and enter dev-secret when the API key gate prompts. See infra/docker/README.md for troubleshooting, the fixtures-only override, and how to enable the LLM narrative pass.

Running from source

For local hacking on the Python or TypeScript code without rebuilding the container on every change. Prerequisites: Python 3.12, Node 22 LTS, uv, pnpm.

uv sync
pnpm install

uv run cstack fixtures load-all
uv run cstack audit all --tenant tenant-b --no-narratives

echo 'SIGNALGUARD_API_DEV_API_KEY=dev-secret' >> .env
uv run signalguard-api --port 8000

# In a second shell.
pnpm --filter signalguard-web dev
# Visit http://localhost:3000/dashboard, enter "dev-secret".

Optional: drop --no-narratives and add ANTHROPIC_API_KEY=sk-ant-... to .env to generate LLM narratives during the audit. Default budget is $1 per run.
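A per-run budget cap can be pictured as a guard that refuses any charge that would push spend past the limit. The sketch below uses the $1 default mentioned above; the `RunBudget` and `BudgetExceeded` names and the way costs are reported are hypothetical, and cstack's actual enforcement may differ.

```python
from decimal import Decimal


class BudgetExceeded(RuntimeError):
    """Raised when a charge would take the run over its cap."""


class RunBudget:
    def __init__(self, limit_usd: Decimal = Decimal("1.00")) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = Decimal("0")

    def charge(self, cost_usd: Decimal) -> None:
        """Record a cost, rejecting it if the cap would be exceeded."""
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"charge of ${cost_usd} would exceed the ${self.limit_usd} cap"
            )
        self.spent_usd += cost_usd


budget = RunBudget()
budget.charge(Decimal("0.60"))  # accepted: 0.60 <= 1.00
blocked = False
try:
    budget.charge(Decimal("0.50"))  # rejected: 1.10 > 1.00
except BudgetExceeded:
    blocked = True
print(blocked, budget.spent_usd)
```

`Decimal` is used instead of `float` so that repeated small charges do not accumulate rounding error near the limit.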

Optional: ASN feature extraction uses a MaxMind GeoLite2-ASN database when CSTACK_GEOIP_ASN_DB points at a valid .mmdb file. The Docker stack handles this automatically via the geoipupdate service; for local-from-source dev, either download the database manually (free account at https://www.maxmind.com/) and point the env var at it, or skip it — the ASN lookup falls back to a deterministic prefix table that the synthesizer's fixture IPs already use.
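The fallback behaviour can be sketched as follows: use the database only when the env var points at an existing .mmdb file, otherwise consult a fixed prefix table. Only the `CSTACK_GEOIP_ASN_DB` variable comes from this README; the function names, the `db_lookup` hook, and the table contents (documentation IP ranges mapped to private-use ASNs) are hypothetical, and the actual MaxMind query is not shown.

```python
from __future__ import annotations

import ipaddress
import os
from collections.abc import Callable, Mapping
from pathlib import Path

# Hypothetical prefix table: documentation ranges mapped to private-use ASNs.
FALLBACK_PREFIXES = [
    (ipaddress.ip_network("192.0.2.0/24"), 64512),
    (ipaddress.ip_network("198.51.100.0/24"), 64513),
]


def geoip_db_path(env: Mapping[str, str] = os.environ) -> Path | None:
    """Return the configured database path only if it is an existing .mmdb file."""
    raw = env.get("CSTACK_GEOIP_ASN_DB")
    if not raw:
        return None
    path = Path(raw)
    return path if path.suffix == ".mmdb" and path.is_file() else None


def fallback_asn(ip: str) -> int | None:
    """Deterministic lookup against the static prefix table."""
    addr = ipaddress.ip_address(ip)
    for network, asn in FALLBACK_PREFIXES:
        if addr in network:
            return asn
    return None


def lookup_asn(ip: str, db_lookup: Callable[[str], int | None] | None = None) -> int | None:
    """Prefer the database-backed lookup when one is supplied, else fall back."""
    if db_lookup is not None:
        return db_lookup(ip)
    return fallback_asn(ip)


print(geoip_db_path({}), lookup_asn("192.0.2.10"))
```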

To run the anomaly detector end-to-end against fixtures:

uv run cstack signins extract --tenant tenant-a --scenario baseline
uv run cstack anomaly train --tenant tenant-a --lookback-days 365
uv run cstack anomaly promote --tenant tenant-a --force
uv run cstack signins extract --tenant tenant-a --scenario replay-attacks
uv run cstack anomaly score --tenant tenant-a
uv run cstack anomaly alerts --tenant tenant-a --n 20

The CLI is a thin layer over the same packages the API uses; see apps/cstack-cli/ for the full subcommand catalogue.

Documentation

Start at docs/INDEX.md. The major docs:

ARCHITECTURE.md: system design, repo layout, data flow
API.md: REST API, auth model, error format, OpenAPI pointer
MLOPS.md: anomaly detection lifecycle, calibration results
LLM_OPS.md: narrative generation and eval harness
DESIGN_TOKENS.md: visual decisions, single source of truth
DESIGN_SYSTEM.md: component patterns, screen blueprints
RULES.md: CA audit rule catalogue
SCREENSHOTS.md: UI reference
CONTRIBUTING.md: local dev, conventions
SPRINT_NOTES.md: per-sprint calibration outcomes
BACKLOG.md: parked work

Engineering decisions worth calling out

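Per-tenant pooled IF with planned cold-start fallback for sub-P2 tenants. Per-user models would be more sensitive, but every newly joined user would start cold with no history. Sprint 3.5 added the infrastructure to layer per-user models over a pooled fallback for users below the sample threshold; it stays a feature-flagged opt-in until real-tenant calibration in Sprint 7.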
Content-addressed prompt cache for cross-tenant narrative reuse. Cache key is SHA-256(rule_id, canonicalised(evidence), prompt_version, model), excluding tenant id; identical findings across tenants share one generation.
Custom LLM provider abstraction (not LiteLLM). One Protocol, three adapters, ~250 lines. LiteLLM ships its own opinions about retries and observability that conflict with ours; owning the abstraction means we can ship adapter-level fixes immediately (Claude 4.7 deprecating temperature mid-sprint was a real test of this).
Pairwise LLM-as-judge eval harness with bias mitigation. Different judge model from generator (sonnet judges opus output), position-swap on every pairwise comparison with the result downgraded to tie when the judge flips on swap, low-temperature judging. Pointwise scoring alone misled us in Sprint 6 calibration; the pairwise check caught it.
OpenAPI-first contract. The web client is generated from apps/signalguard-api/openapi.json and CI fails on drift. The web app cannot ship a request shape the backend does not support.
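The position-swap check in the pairwise eval harness can be illustrated with a small sketch: each pair is judged twice with the order reversed, and the result is downgraded to a tie if the two verdicts disagree. The `judge_with_swap` name, the verdict vocabulary, and the two stand-in judges are hypothetical; a real judge would be an LLM call.

```python
from typing import Callable, Literal

Verdict = Literal["first", "second", "tie"]
Judge = Callable[[str, str], Verdict]


def judge_with_swap(judge: Judge, a: str, b: str) -> Literal["a", "b", "tie"]:
    """Judge (a, b) and (b, a); keep the winner only if both orderings agree."""
    forward = judge(a, b)  # a is shown first
    reverse = judge(b, a)  # b is shown first
    forward_winner = {"first": "a", "second": "b", "tie": "tie"}[forward]
    reverse_winner = {"first": "b", "second": "a", "tie": "tie"}[reverse]
    return forward_winner if forward_winner == reverse_winner else "tie"


def always_first(first: str, second: str) -> Verdict:
    """Stand-in for a position-biased judge: whatever is shown first wins."""
    return "first"


def prefers_longer(first: str, second: str) -> Verdict:
    """Stand-in for a content-based judge: the longer text wins."""
    if len(first) == len(second):
        return "tie"
    return "first" if len(first) > len(second) else "second"


print(judge_with_swap(always_first, "x", "y"))
print(judge_with_swap(prefers_longer, "short", "much longer"))
```

The biased judge flips its pick when the order is swapped, so its verdict collapses to a tie, while the content-based judge picks the same candidate in both orderings and its verdict survives.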

License

MIT.

Contributing

cstack is a personal portfolio project that welcomes external contribution. See docs/CONTRIBUTING.md for the local dev workflow, conventional-commit rules, and the project's hard rules on code style and tone.

Security

Security issues go to leunis@vanlabs.dev per SECURITY.md, not GitHub issues.
