Add CWE ingestion prompt and skill for per-domain taxonomy generation by Alan-Jowett · Pull Request #229 · microsoft/PromptKit

Alan-Jowett · 2026-04-08T15:17:41Z

Summary

Adds a reusable /ingest-cwe-taxonomies slash command (prompt file) and matching CLI skill that implements the CWE taxonomy ingestion pipeline from #228.

Files Added

| File | Purpose |
| --- | --- |
| `.github/prompts/ingest-cwe-taxonomies.prompt.md` | Copilot Chat slash command (`/ingest-cwe-taxonomies`) |
| `.github/skills/ingest-cwe-taxonomies/SKILL.md` | Copilot CLI skill (same name) |

What the Prompt Does

When invoked, the prompt guides an agent through a 6-phase pipeline:

1. Acquisition: Downloads or locates the official MITRE CWE XML (version 4.x)
2. Normalization: Writes `scripts/ingest-cwe.py` to parse the XML into structured JSON
3. Domain Mapping: Maps 900+ CWEs to 13 audit domains using a 4-priority algorithm:
   - Priority 1: CWE View membership (language-specific views like View 658/C, View 919/Mobile)
   - Priority 2: `Applicable_Platforms` field matching
   - Priority 3: Consequence and context analysis
   - Priority 4: PromptKit manual override rules
4. Taxonomy Generation: Produces 13 `taxonomies/cwe-<domain>.md` files in PromptKit format
5. Integration: Updates `manifest.yaml`, generates version diffs
6. Verification: Domain exclusion sanity checks (no XSS in kernel, no IRQL in web, etc.)
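
The Domain Mapping phase can be pictured with a minimal Python sketch. This is an illustration, not the logic in `scripts/ingest-cwe.py`: the lookup tables, the `Cwe` record, and `context_domains` are hypothetical, and it assumes every tier may contribute domains, with the lowest-numbered tier recorded as the reason for each assignment.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables; the real view and platform rules live in the prompt file.
VIEW_TO_DOMAINS = {
    "658": ["native-user-mode-c-cpp", "kernel-mode-c-cpp"],  # View 658: C
    "919": ["mobile-app"],                                   # View 919: Mobile
}
PLATFORM_TO_DOMAINS = {"JavaScript": ["web-js-ts"], "C#": ["managed-dotnet"]}
MANUAL_OVERRIDES = {}  # cwe_id -> list of domains (Priority 4)


@dataclass
class Cwe:
    cwe_id: str
    views: list = field(default_factory=list)
    platforms: list = field(default_factory=list)
    text: str = ""


def context_domains(cwe):
    """Stand-in for Priority 3; a real heuristic would be far richer."""
    return ["web-backend"] if "sql" in cwe.text.lower() else []


def map_domains(cwe):
    """Return {domain: priority}, keeping the first tier that assigned each domain."""
    tiers = [
        (1, [d for v in cwe.views for d in VIEW_TO_DOMAINS.get(v, [])]),
        (2, [d for p in cwe.platforms for d in PLATFORM_TO_DOMAINS.get(p, [])]),
        (3, context_domains(cwe)),
        (4, MANUAL_OVERRIDES.get(cwe.cwe_id, [])),
    ]
    assigned = {}
    for priority, domains in tiers:
        for domain in domains:
            assigned.setdefault(domain, priority)
    return assigned
```

Recording the tier alongside each domain keeps an assignment explainable when a reviewer later asks why a CWE landed in a given taxonomy.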

Domain Registry (13 domains)

kernel-mode-c-cpp, native-user-mode-c-cpp, managed-dotnet, web-js-ts, web-backend, cloud-service, iac, firmware-embedded, crypto-protocols, data-processing, cli-tools, mobile-app, container-k8s
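
With the domains listed above, the Verification phase's exclusion checks could be sketched roughly as follows. The `EXCLUSIONS` table and the `(cwe_id, name)` entry shape are assumptions made for illustration; the prompt file defines the actual rules.

```python
# Hypothetical exclusion rules: domain -> substrings that must not appear
# in the name of any CWE assigned to that domain.
EXCLUSIONS = {
    "kernel-mode-c-cpp": ["cross-site scripting"],
    "web-js-ts": ["irql"],
}


def find_violations(taxonomies):
    """taxonomies maps domain -> list of (cwe_id, name) pairs.

    Returns a list of (domain, cwe_id, term) tuples for every excluded
    term found in a domain's entries.
    """
    violations = []
    for domain, entries in taxonomies.items():
        for term in EXCLUSIONS.get(domain, []):
            for cwe_id, name in entries:
                if term in name.lower():
                    violations.append((domain, cwe_id, term))
    return violations
```

An empty result means no excluded term leaked into a domain; anything else points at a mapping rule worth revisiting.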

Components Assembled

Persona: security-auditor (CWE/vulnerability expertise)
Protocols: anti-hallucination + self-verification + operational-constraints (all verbatim)
Format: multi-artifact (adapted for taxonomy + metadata output)
Template: Custom 6-phase CWE ingestion workflow

Re-run Workflow

Download new CWE XML → invoke /ingest-cwe-taxonomies → review diff report → commit.
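
The diff report reviewed in this workflow might be computed along these lines. This is a sketch under the assumption that each CWE release has been reduced to a mapping of CWE ID to name; the function name and the report shape are hypothetical.

```python
def diff_versions(old, new):
    """Compare two {cwe_id: name} mappings from consecutive CWE releases."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    renamed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "renamed": renamed}
```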

Testing

python tests/validate-manifest.py passes ✅
Prompt file and skill follow existing conventions (see boot.prompt.md, promptkit/SKILL.md)

Partially addresses #228 — template integration (domain parameter in security audit templates) is tracked as a follow-up.

Copilot

Pull request overview

Adds a new Copilot Chat slash command and matching Copilot CLI skill to run a reproducible, multi-phase workflow for ingesting the official MITRE CWE XML corpus and generating per-domain PromptKit CWE taxonomy artifacts (per Issue #228’s ingestion pipeline concept).

Changes:

- Added `/ingest-cwe-taxonomies` prompt file that specifies a 6-phase ingestion → normalization → mapping → taxonomy generation → integration → verification workflow.
- Added a corresponding Copilot CLI skill that delegates execution to the prompt file.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `.github/skills/ingest-cwe-taxonomies/SKILL.md` | New CLI skill entrypoint that directs users/agents to run the CWE ingestion prompt. |
| `.github/prompts/ingest-cwe-taxonomies.prompt.md` | New slash-command prompt implementing the end-to-end CWE ingestion and taxonomy generation pipeline. |

.github/prompts/ingest-cwe-taxonomies.prompt.md

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

.github/prompts/ingest-cwe-taxonomies.prompt.md

Adds a reusable /ingest-cwe-taxonomies slash command (prompt file) and matching CLI skill that implements the CWE taxonomy ingestion pipeline from Issue microsoft#228.

The prompt guides an agent through:
- Downloading and parsing official MITRE CWE XML
- Mapping 900+ CWEs to 13 audit domains using a 4-priority algorithm
- Generating per-domain PromptKit taxonomy files
- Updating manifest.yaml and running sanity checks
- Producing version diffs for incremental CWE updates

Closes microsoft#228 (partially; template integration is a follow-up)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Replace hardcoded CWE XML namespace with runtime namespace detection
- Resolve CWE views by name first, use IDs as optional hints only
- Remove Java-to-.NET analogy from core view table (now requires explicit override rule if desired)
- Replace underspecified Priority 3 context analysis with deterministic keyword-based scoring heuristic with exact term lists and thresholds
- Scope .gitignore pattern from *.xml.zip to cwec_*.xml.zip

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
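
Runtime namespace detection, as named in the first item of this commit, can be done with the standard library alone. The sketch below is one possible approach using `xml.etree.ElementTree`, not necessarily what the ingestion script does; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def detect_namespace(root):
    """Return the namespace URI of the root element, or '' if it has none."""
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def iter_weaknesses(xml_text):
    """Yield (ID, Name) for each Weakness element, whatever the namespace is."""
    root = ET.fromstring(xml_text)
    ns = detect_namespace(root)
    prefix = f"{{{ns}}}" if ns else ""
    for node in root.iter(f"{prefix}Weakness"):
        yield node.get("ID"), node.get("Name")
```

Reading the namespace from the root tag means a future CWE schema revision that changes the URI would not require editing the parser.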

- Replace 'header comment or frontmatter' with provenance metadata guidance: top-level _meta object for JSON, comment/frontmatter for Markdown/YAML
- Drop Rarely/Undetermined prevalence as secondary/optional entries; simply ignore them for domain assignment (deterministic, no schema ambiguity)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
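
Both items in this commit are small enough to sketch. In the Python below, the field names inside `_meta` and the platform record shape (`name`, `prevalence`) are assumptions; only the top-level `_meta` key and the two ignored prevalence values come from the commit message.

```python
IGNORED_PREVALENCE = {"Rarely", "Undetermined"}


def platforms_for_assignment(platforms):
    """Keep only platform names whose prevalence counts toward domain assignment."""
    return [p["name"] for p in platforms if p.get("prevalence") not in IGNORED_PREVALENCE]


def with_provenance(records, cwe_version, source):
    """Wrap normalized records in a JSON-ready object with a top-level _meta block."""
    return {
        "_meta": {"cwe_version": cwe_version, "source": source},  # hypothetical fields
        "weaknesses": records,
    }
```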

Phase 2 now instructs the agent to run the existing script first, with the full parsing/mapping specification preserved as a reference for validation and regeneration. Phase 3 reframed as algorithm documentation rather than imperative instructions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-08T16:10:39Z

.github/prompts/ingest-cwe-taxonomies.prompt.md

+description: 'Ingest CWE XML from MITRE and generate per-domain security audit taxonomies for PromptKit'
+agent: 'agent'
+tools: ['search/codebase', 'edit']
+argument-hint: 'Path to CWE XML file, or "latest" to download cwec_latest.xml.zip from MITRE'


Frontmatter limits tools to ['search/codebase', 'edit'], but the workflow supports latest (download from MITRE) and instructs running python scripts/ingest-cwe.py and extracting a zip. With these tools, an agent typically can’t fetch remote content or execute shell commands, so the latest path (and “run the script” steps) may be non-actionable. Consider either (a) adding the appropriate toolset(s) (e.g., web/fetch) and explicitly describing how to handle zip extraction/execution, or (b) removing/softening the latest automation and requiring the user to provide a local XML path + run commands themselves.
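
To make the zip-handling step concrete, here is a minimal sketch of what extraction could look like once the archive bytes are available. It assumes the archive holds exactly one `.xml` member; `extract_cwe_xml` is a hypothetical helper, not something the PR defines.

```python
import io
import zipfile


def extract_cwe_xml(zip_bytes):
    """Return (member_name, xml_bytes) for the single .xml member of the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = [n for n in archive.namelist() if n.lower().endswith(".xml")]
        if len(names) != 1:
            raise ValueError(f"expected exactly one XML member, found {names}")
        return names[0], archive.read(names[0])
```

In an environment with network access, the bytes could come from `urllib.request.urlopen(url).read()`; without it, the user would supply a local file, which is the trade-off this comment raises.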

Copilot · 2026-04-08T16:10:40Z

.github/prompts/ingest-cwe-taxonomies.prompt.md

+2. **Update `.gitignore`.** Add an entry to exclude the raw downloaded
+   CWE XML file (but NOT the `data/cwe/` processed artifacts):
+
+   ```
+   # Raw CWE XML downloads (large, available from MITRE)
+   cwec_*.xml.zip
+   cwec_*.xml
+   ```


The .gitignore instruction says to “add” cwec_*.xml.zip / cwec_*.xml, but this repo already has a broad *.xml.zip ignore (see .gitignore:8). If a user follows these steps, they’ll likely end up with both patterns and still ignore all *.xml.zip artifacts. Consider updating the instruction to replace any existing *.xml.zip rule with the CWE-scoped patterns to avoid unintentionally ignoring unrelated XML zip files.
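
The replacement suggested here could be automated with a few lines of Python. This is a sketch only: `scope_gitignore` is a hypothetical helper, and it assumes the broad rule appears on its own line exactly as `*.xml.zip`.

```python
SCOPED_PATTERNS = ["cwec_*.xml.zip", "cwec_*.xml"]


def scope_gitignore(text):
    """Drop a bare '*.xml.zip' rule and make sure the CWE-scoped patterns appear once."""
    lines = [ln for ln in text.splitlines() if ln.strip() != "*.xml.zip"]
    present = {ln.strip() for ln in lines}
    lines += [p for p in SCOPED_PATTERNS if p not in present]
    return "\n".join(lines) + "\n"
```

The function is idempotent, so re-running the ingestion workflow would not keep appending duplicate patterns.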

Copilot · 2026-04-08T16:10:40Z

.github/skills/ingest-cwe-taxonomies/SKILL.md

+- **CWE source**: A local path to a CWE XML file, or `latest` to download
+  the current release from `https://cwe.mitre.org/data/xml/cwec_latest.xml.zip`.
+- Any overrides to the domain registry or mapping rules the user specifies.
+
+## Output
+
+- Per-domain taxonomy files at `taxonomies/cwe-<domain>.md` (13 domains)
+- Normalized CWE data at `data/cwe/<version>/`
+- Updated `manifest.yaml` with new taxonomy entries
+- Reusable ingestion script at `scripts/ingest-cwe.py`
+- Diff report if a previous CWE version exists


This skill advertises a latest mode that downloads from cwe.mitre.org and outputs multiple repo modifications (new/updated taxonomy files, data/cwe/..., manifest.yaml, and scripts/ingest-cwe.py). It would be helpful to explicitly call out the required permissions/capabilities (network access for download, filesystem write access, and the ability to run Python) so users aren’t surprised when invoking the skill in constrained environments.

Copilot AI review requested due to automatic review settings April 8, 2026 15:17

Copilot started reviewing on behalf of Alan-Jowett April 8, 2026 15:19

Copilot AI reviewed Apr 8, 2026

abeltrano reviewed Apr 8, 2026

abeltrano reviewed Apr 8, 2026

Alan-Jowett requested a review from Copilot April 8, 2026 15:40

Copilot started reviewing on behalf of Alan-Jowett April 8, 2026 15:42

Copilot AI reviewed Apr 8, 2026

Alan-Jowett mentioned this pull request Apr 8, 2026

One-pass CAPEC ingestion → generate per-domain attack pattern taxonomies for PromptKit security audits #231

Open

Alan Jowett and others added 4 commits April 8, 2026 08:53

Alan-Jowett force-pushed the cwe-ingestion-prompt branch from ad0d505 to 4714930 Compare April 8, 2026 15:57

Copilot AI review requested due to automatic review settings April 8, 2026 15:57

Alan-Jowett merged commit f300bb1 into microsoft:main Apr 8, 2026
3 checks passed

Alan-Jowett deleted the cwe-ingestion-prompt branch April 8, 2026 15:59

Copilot started reviewing on behalf of Alan-Jowett April 8, 2026 16:00

Copilot AI reviewed Apr 8, 2026
