[REVIEW] secure-code-review: add multi-language XXE parser evidence gates

## Skill Being Reviewed
**Skill name:** `secure-code-review`
**Skill path:** `skills/appsec/secure-code-review/`

## False Positive Analysis

**Benign XML parsing that can be over-reported if every XML parser use is flagged:**
```python
from defusedxml.ElementTree import fromstring

def parse_metadata(xml_body: bytes):
    return fromstring(xml_body)
```

**Why this is a false positive:**

The parser is from `defusedxml`, which is designed to disable dangerous XML features such as external entities and entity expansion attacks. XML parsing itself is not automatically XXE. A review needs evidence that untrusted XML reaches a parser with DTDs, external entities, external schemas, or network/entity resolvers enabled before reporting CWE-611.

## Coverage Gaps

**Missed variant 1: Java parser leaves external entities and DTD loading enabled**
```java
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(request.getInputStream());
```

**Why it should be caught:**

The main secure-code-review skill does not require reviewers to inspect XML parser configuration. In Java, Python, .NET, PHP, Ruby, and Node, unsafe parser defaults or explicit resolver settings can expose local files, internal HTTP endpoints, or trigger denial-of-service through entity expansion.

**Missed variant 2: external schema or XSLT resolution is enabled even when entities appear disabled**
```java
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(new StreamSource(request.getInputStream()));
```

**Why it should be caught:**

XXE-style risk is not limited to `<!DOCTYPE>` in ordinary document parsing. External schema, XSLT, XInclude, and resolver APIs can still perform filesystem or network access. The skill should require parser, schema, transformer, and resolver evidence rather than only looking for `DocumentBuilderFactory`.

## Edge Cases

- JSON-only APIs do not need XML parser findings unless XML is accepted through alternate content types, uploads, SOAP/SAML, webhooks, or import jobs.
- Internal-only XML processing is still risky when files originate from partners, tenants, support bundles, SAML responses, or uploaded documents.
- A parser may disable external entities but still permit large entity expansion or external schema fetches unless those features are separately disabled.
- Some legacy integrations require DTDs. The finding should then document source trust, resolver allowlists, network isolation, size limits, and why the exception is needed.

## Remediation Quality

- [x] Fix resolves the vulnerability
- [x] Fix doesn't introduce new security issues
- [x] Fix doesn't break functionality
- **Issues found:** Add a multi-language XXE evidence gate to the main secure-code-review skill. Require reviewers to record XML input source, parser type, DTD/entity settings, external resolver/schema/XSLT settings, size limits, and false-positive rationale for safe parsers such as `defusedxml` or explicitly hardened framework settings.

## Comparison to Other Tools

| Tool | Catches this? | Notes |
|------|:---:|-------|
| Semgrep | Partial | Can match unsafe parser constructors and missing features, but needs language-specific safe-parser exceptions. |
| CodeQL | Partial | Can model unsafe XML parser configuration in supported languages, but reviewers still need source trust and resolver evidence. |
| OWASP Cheat Sheet Series | Yes | Provides language-specific guidance for disabling XXE and related external resolution features. |
| CWE | Yes | CWE-611 tracks improper restriction of XML external entity references. |

## Overall Assessment

**Strengths:**

The skill has good coverage of common injection, authentication, authorization, logging, file upload, deserialization, and SSRF paths.

**Needs improvement:**

The main multi-language secure-code-review process does not include an XML parser evidence gate. That can cause false negatives for unsafe XML parsers and false positives for hardened parser libraries or configurations.

**Priority recommendations:**

1. Add CWE-611 to Step 2 or Step 8 coverage.
2. Add vulnerable and safe examples for XML parser configuration.
3. Add language-aware search patterns for Java, Python, .NET, PHP, Ruby, and Node XML parsers.
4. Add report evidence fields for parser type, source, DTD/entity state, external resolver/schema/XSLT state, size limits, and false-positive rationale.

## References

- CWE-611: https://cwe.mitre.org/data/definitions/611.html
- OWASP XML External Entity Prevention Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
- OWASP WSTG XML Injection: https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/07-Input_Validation_Testing/07-Testing_for_XML_Injection

## Bounty Info
- [x] I have read and agree to the [CONTRIBUTING.md](../../CONTRIBUTING.md) bounty terms
- **Preferred payment method:** Payment details can be provided privately after maintainer acceptance.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REVIEW] secure-code-review: add multi-language XXE parser evidence gates #882

Skill Being Reviewed

False Positive Analysis

Coverage Gaps

Edge Cases

Remediation Quality

Comparison to Other Tools

Overall Assessment

References

Bounty Info

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Tool	Catches this?	Notes
Semgrep	Partial	Can match unsafe parser constructors and missing features, but needs language-specific safe-parser exceptions.
CodeQL	Partial	Can model unsafe XML parser configuration in supported languages, but reviewers still need source trust and resolver evidence.
OWASP Cheat Sheet Series	Yes	Provides language-specific guidance for disabling XXE and related external resolution features.
CWE	Yes	CWE-611 tracks improper restriction of XML external entity references.

[REVIEW] secure-code-review: add multi-language XXE parser evidence gates #882

Description

Skill Being Reviewed

False Positive Analysis

Coverage Gaps

Edge Cases

Remediation Quality

Comparison to Other Tools

Overall Assessment

References

Bounty Info

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions