feat: XSD substitution groups, complexContent extension, and cross-namespace resolution #23

Open
AlexanderWillner wants to merge 5 commits into jonwiggins:main from AlexanderWillner:pr/substitution-groups

@AlexanderWillner

Summary

Adds three related XSD validation features that enable correct validation of real-world multi-namespace schemas (tested against German AAA/ALKIS cadastral data with WFS and GML namespaces).

1. Element substitution groups (d285d4d)

Parses <xsd:element substitutionGroup="..."> declarations and builds a substitution group index. During element validation, if a child element name does not match the declaration directly, it is checked against the substitution group headed by that declaration (including transitive members).
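The transitive lookup described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `SubstitutionIndex` alias and the `is_member` function are hypothetical names, and elements are identified by plain local names for brevity.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical index shape: substitution group head -> direct members.
type SubstitutionIndex = HashMap<String, Vec<String>>;

/// Returns true if `candidate` may stand in for `head`, either directly or
/// through a chain of substitution groups. A direct name match against the
/// head itself is assumed to be handled by the caller.
fn is_member(index: &SubstitutionIndex, head: &str, candidate: &str) -> bool {
    let mut visited: HashSet<&str> = HashSet::new();
    let mut stack: Vec<&str> = vec![head];
    while let Some(current) = stack.pop() {
        // Skip heads that were already expanded; this guards against cycles.
        if !visited.insert(current) {
            continue;
        }
        if let Some(members) = index.get(current) {
            for member in members {
                if member.as_str() == candidate {
                    return true;
                }
                // A member can itself head another group, so expand it too.
                stack.push(member.as_str());
            }
        }
    }
    false
}
```

With `A -> [B]` and `B -> [C]` in the index, both `B` and `C` are accepted where `A` is expected, while `A` is not accepted where `B` is expected.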

2. complexContent extension base merging (78f4b7b, 2733b7c)

Implements merge_extension_bases(), which flattens complexContent extension chains: when a type extends a base type, the base type's particles are merged in before the extension's own particles. Also fixes a bug where resolve_type_name() matched types by local name only, ignoring the namespace; it now checks the resolved namespace against the schema's own targetNamespace.

3. Cross-namespace substitution + element ref resolution (9f15149, d453c29)

  • build_substitution_index() scans imported schemas for substitution group members
  • is_substitution_member() looks up transitive members across imports
  • element_matches_decl() resolves namespace of ref= attributes (e.g. ref="wfs:FeatureCollection") via prefix map lookup (resolve_element_namespace())
  • Unqualified child elements are accepted for element_ref declarations
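To illustrate the prefix map lookup for ref= attributes, here is a minimal sketch. The `resolve_ref` function, its signature, and the `default_ns` parameter are assumptions made for this example; the actual resolve_element_namespace() in this PR may be shaped differently. The WFS namespace URI in the usage note is only an example value.

```rust
use std::collections::HashMap;

/// Resolves a QName such as "wfs:FeatureCollection" to (namespace, local name).
///
/// `prefixes` maps in-scope prefixes to namespace URIs. `default_ns` is the
/// namespace applied to unprefixed names (an assumption of this sketch).
fn resolve_ref<'a>(
    qname: &'a str,
    prefixes: &'a HashMap<String, String>,
    default_ns: Option<&'a str>,
) -> Option<(Option<&'a str>, &'a str)> {
    match qname.split_once(':') {
        // Prefixed name: the prefix must be declared, otherwise give up.
        Some((prefix, local)) => {
            let ns = prefixes.get(prefix)?;
            Some((Some(ns.as_str()), local))
        }
        // Unprefixed name: fall back to the default namespace, if any.
        None => Some((default_ns, qname)),
    }
}
```

For a map containing `wfs -> http://www.opengis.net/wfs/2.0`, the reference `wfs:FeatureCollection` resolves to that URI plus the local name `FeatureCollection`, and a reference with an undeclared prefix yields `None` instead of silently matching the main schema's namespace.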

Files changed

  • src/validation/xsd.rs — ~400 lines added across 5 commits

Testing

  • All 1071 existing tests pass
  • Validated against real NAS/ALKIS AAA files with WFS+GML cross-namespace substitution groups

Add XSD 1.0 section 3.3.6 substitution group support to the XSD validator.

When element B declares substitutionGroup='A', B can appear anywhere A is
expected in a content model. This is transitive: if C substitutes for B,
C also substitutes for A.

Changes:
- Add substitution_group and is_abstract fields to XsdElement
- Add substitution_groups index to XsdSchema (head -> members map)
- Parse substitutionGroup/abstract attributes in parse_element_decl
- Build substitution index after schema parse via build_substitution_index
- Extend element_matches_decl to accept substitution group members
- Add is_substitution_member for transitive chain resolution
- Resolve instance element type in validate_sequence_element for correct
  content validation of substituted elements

Parse <xs:complexContent><xs:extension base='...'> in complex type
definitions. After all schemas are loaded, merge base-type content
model particles with extension particles in derivation order.

Post-processing step merge_extension_bases() resolves the full
inheritance chain recursively (with cycle detection) and prepends
base-type particles to the derived type's sequence.
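The recursive merge with a visited-set guard might look roughly like this. It is a simplified sketch under stated assumptions: `ComplexType` here is a hypothetical stand-in with particles reduced to element names, and `resolve_particles` is an illustrative name rather than the resolve_base_particles_impl() from this commit.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for the validator's complex type.
#[derive(Debug, Clone)]
struct ComplexType {
    /// Name of the base type from `<xs:extension base="...">`, if any.
    extension_base: Option<String>,
    /// This type's own sequence, with particles simplified to element names.
    particles: Vec<String>,
}

/// Collects the particles of `name`, with all inherited particles first.
fn resolve_particles(
    types: &HashMap<String, ComplexType>,
    name: &str,
    visited: &mut HashSet<String>,
) -> Vec<String> {
    // Seeing a name twice means the derivation chain loops; stop descending.
    if !visited.insert(name.to_string()) {
        return Vec::new();
    }
    let ty = match types.get(name) {
        Some(ty) => ty,
        None => return Vec::new(),
    };
    // Base particles come first, in derivation order.
    let mut merged = match &ty.extension_base {
        Some(base) => resolve_particles(types, base, visited),
        None => Vec::new(),
    };
    merged.extend(ty.particles.iter().cloned());
    merged
}
```

For a chain `Leaf` extends `Mid` extends `Base`, the result lists the particles of `Base`, then `Mid`, then `Leaf`. A circular chain terminates because each type name is expanded at most once.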

Adds parse_complex_content() handler, extension_base field on
ComplexType, resolve_base_particles_impl() with visited-set guard,
and 3 unit tests covering simple extension, multi-level chains,
and empty-base extension.

When a schema uses targetNamespace and elementFormDefault='qualified',
type references like adv:DerivedType now correctly resolve to local
types instead of only searching imported namespaces.

Adds targetNamespace self-check in resolve_type_name and
resolve_element_ref, plus a last-resort local-name fallback in
resolve_type_name. Also adds find_complex_type helper that searches
both local and imported types for base particle resolution.
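One way to picture the lookup order is the sketch below. The `Schema` struct, the `Resolved` enum, and the `resolve_type` function are hypothetical and heavily simplified, and the exact ordering of the self-check relative to the imported-namespace search is an assumption; only the existence of a targetNamespace self-check and a last-resort local-name fallback comes from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified schema holding only what this sketch needs.
struct Schema {
    target_namespace: Option<String>,
    /// Names of types declared in this schema.
    types: Vec<String>,
    /// Imported namespace URI -> names of types declared there.
    imported: HashMap<String, Vec<String>>,
}

/// Records which lookup step found the type.
#[derive(Debug, PartialEq)]
enum Resolved {
    Local,
    Imported(String),
    LocalNameFallback,
}

fn resolve_type(schema: &Schema, namespace: Option<&str>, local: &str) -> Option<Resolved> {
    let has_local = schema.types.iter().any(|t| t == local);
    // 1. Self-check: the reference targets the schema's own targetNamespace.
    if namespace == schema.target_namespace.as_deref() && has_local {
        return Some(Resolved::Local);
    }
    // 2. The reference targets an imported namespace.
    if let Some(ns) = namespace {
        if let Some(types) = schema.imported.get(ns) {
            if types.iter().any(|t| t == local) {
                return Some(Resolved::Imported(ns.to_string()));
            }
        }
    }
    // 3. Last resort: match by local name only.
    if has_local {
        return Some(Resolved::LocalNameFallback);
    }
    None
}
```

With a target namespace bound to the `adv` prefix, a reference such as adv:DerivedType is found in step 1 instead of falling through to the imported namespaces.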

New tests: complex content extension with targetNamespace,
optional element ordering detection.

Three bugs prevented substitution group members declared in imported
schemas from being recognized during XSD validation:

1. build_substitution_index() only scanned local schema.elements,
   missing imported elements that declare substitutionGroup membership.
   Fix: also iterate imported_namespaces.*.elements.

2. element_matches_decl() rejected same-named elements from different
   namespaces without checking substitution group membership.
   Fix: when namespace differs but local name matches, fall back to
   is_substitution_member() check.

3. is_substitution_member() only looked up transitive member
   declarations in local schema.elements.
   Fix: also search imported_namespaces.*.elements for member decls.
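The first fix, scanning imported declarations as well as local ones, can be sketched like this. `ElementDecl` and `build_index` are hypothetical simplifications (names are plain strings without namespaces), not the actual types or function from xsd.rs.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified element declaration.
struct ElementDecl {
    name: String,
    /// Head named by the `substitutionGroup` attribute, if present.
    substitution_group: Option<String>,
}

/// Builds a head -> members map from local and imported declarations.
fn build_index(
    local: &[ElementDecl],
    imported: &HashMap<String, Vec<ElementDecl>>,
) -> HashMap<String, Vec<String>> {
    let mut index: HashMap<String, Vec<String>> = HashMap::new();
    // Chain the local declarations with those of every imported namespace,
    // so that members declared in imports are no longer missed.
    let all = local.iter().chain(imported.values().flatten());
    for decl in all {
        if let Some(head) = &decl.substitution_group {
            index.entry(head.clone()).or_default().push(decl.name.clone());
        }
    }
    index
}
```

Declarations without a substitutionGroup attribute contribute nothing to the index, and a head receives members regardless of whether they were declared locally or in an import.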

Fixes: FeatureCollection substitution group, AbstractCRS abstract element.

element_matches_decl() now resolves the namespace of element
declarations referenced via ref= attributes (e.g. ref="wfs:FeatureCollection")
instead of always checking against the main schema's targetNamespace.

This fixes validation of documents where imported elements have
different namespaces than the main schema, such as WFS FeatureCollection
in NAS/AAA schemas.

Also:
- Allow unqualified child elements for element_ref declarations
- build_substitution_index scans imported elements
- is_substitution_member looks up transitive members in imports