22 changes: 22 additions & 0 deletions data/README.md
@@ -0,0 +1,22 @@
# data/

MusicXML test inputs and expected-output fixtures.

## `.invalid` marker convention

A file `foo.xml` (or `foo.musicxml`) that is **not** valid MusicXML must be
accompanied by a sibling marker file `foo.xml.invalid` (or
`foo.musicxml.invalid`) whose contents describe, in prose, why the file is
invalid.

The marker tells schema-strict consumers (notably the **core roundtrip**
suite at `src/private/mxtest/corert/`) to skip the file. `mx::core` is a
strongly-typed DOM generated from the MusicXML XSD and cannot round-trip
elements or attributes that are not in the schema; running such a file
through `fromXDoc` → `toXDoc` is guaranteed to lose content.
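
As a rough illustration of why such a round trip loses content, here is a
minimal Python sketch. It is not `mx::core` (which is C++): `KNOWN_CHILDREN`
and `typed_round_trip` are hypothetical names, and the one-entry "schema" is
invented for the example. It only mimics the idea that a model with fixed
slots has nowhere to keep an element the schema does not define.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: the children each parent element may hold.
KNOWN_CHILDREN = {
    "identification": {"creator", "rights"},
}


def typed_round_trip(xml_text: str) -> str:
    """Parse, keep only schema-known direct children of the root, serialize."""
    root = ET.fromstring(xml_text)
    allowed = KNOWN_CHILDREN.get(root.tag, set())
    for child in list(root):
        if child.tag not in allowed:
            # A fixed-slot model has no place to store this element.
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


valid = "<identification><creator>Adam</creator></identification>"
invalid = (
    "<identification><creator>Adam</creator>"
    "<invalid_element /></identification>"
)

print(typed_round_trip(valid) == valid)      # schema-valid input survives
print(typed_round_trip(invalid) == invalid)  # unknown element is dropped
```

A text comparison of input and output therefore fails for the second
document even though nothing went wrong in the model itself.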

The marker's body is descriptive only and is meant for humans. The presence
of the file is the signal.

Other suites (e.g. **api import** under `src/private/mxtest/import/`) may
still include invalid files when they exercise reader leniency on
malformed input. Those suites ignore the marker.
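
The convention above could be honored by a test-discovery helper along these
lines. This is a sketch in Python under stated assumptions, not the actual
harness code: `has_invalid_marker`, `collect_inputs`, and the `honor_marker`
flag are hypothetical names, and the real suites may discover files
differently.

```python
from pathlib import Path
from typing import List


def has_invalid_marker(xml_path: Path) -> bool:
    """True if a sibling `<name>.invalid` file exists; its body is never read."""
    return xml_path.with_name(xml_path.name + ".invalid").exists()


def collect_inputs(data_dir: Path, honor_marker: bool) -> List[Path]:
    """Return MusicXML inputs under data_dir, optionally skipping marked files."""
    candidates = sorted(
        p
        for p in data_dir.rglob("*")
        if p.is_file() and p.suffix in {".xml", ".musicxml"}
    )
    if not honor_marker:
        # Leniency-oriented suites take every input, marked or not.
        return candidates
    # Schema-strict suites skip any input that has a marker next to it.
    return [p for p in candidates if not has_invalid_marker(p)]
```

A schema-strict suite would call `collect_inputs(Path("data"), honor_marker=True)`,
while a suite that exercises reader leniency would pass `honor_marker=False`.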
14 changes: 14 additions & 0 deletions data/foundsuite/Deutscher Tanz D.820.1.xml.invalid
@@ -0,0 +1,14 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations:

- midi-program with value 0 (schema midi-128: must be 1..128) at line 49.
- sound/@dynamics='-1.11' (minInclusive 0) at lines 167 and 469.

The core roundtrip suite trips on the midi-program=0 case: mx::core clamps
to the schema-legal range (writes 1), causing a text mismatch against the
original 0. Excluding this file from the schema-strict core roundtrip is
the right call — the library is conforming to the schema and the input is
the deviation.

See data/README.md for the .invalid marker convention.
15 changes: 15 additions & 0 deletions data/foundsuite/O_Holy_Night-Adam-1871.xml.invalid
@@ -0,0 +1,15 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations include:

- midi-channel with value 0 (schema midi-16: must be 1..16) at line 23.
- Many duration elements with value 0 (minExclusive 0), starting at line 9706
and continuing for dozens of occurrences.

The core roundtrip suite trips on midi-channel=0: mx::core clamps to the
schema-legal range (writes 1), causing a text mismatch against the
original 0. Excluding this file from the schema-strict core roundtrip is
the right call — the library is conforming to the schema and the input is
the deviation.

See data/README.md for the .invalid marker convention.
20 changes: 20 additions & 0 deletions data/foundsuite/O_Holy_Night.xml.invalid
@@ -0,0 +1,20 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations include:

- midi-channel with value 0 (schema midi-16: must be 1..16) at lines 22, 32,
42, 52.
- midi-program with value 0 (schema midi-128: must be 1..128) at lines 23,
33, 43, 53.
- display-step with value '=' (enum is A..G) at lines 408, 711, 1337, 1791,
2025.
- display-octave with value -104 (minInclusive 0) at adjacent lines.
- Many duration elements with value 0 (minExclusive 0) starting at line 2584.

The core roundtrip suite trips on midi-channel=0: mx::core clamps to the
schema-legal range (writes 1), causing a text mismatch against the
original 0. Excluding this file from the schema-strict core roundtrip is
the right call — the library is conforming to the schema and the input is
the deviation.

See data/README.md for the .invalid marker convention.
14 changes: 14 additions & 0 deletions data/foundsuite/Rimsky-Korsakov Op11 No4.xml.invalid
@@ -0,0 +1,14 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations:

- sound/@dynamics with value '-1.11' (minInclusive 0) at lines 6224, 8231,
11057, 12922, 13151, 13389, 14002.

The core roundtrip suite trips on this attribute: mx::core clamps the
negative value to the schema-legal range, causing an attribute mismatch
against the original. Excluding this file from the schema-strict core
roundtrip is the right call — the library is conforming to the schema and
the input is the deviation.

See data/README.md for the .invalid marker convention.
14 changes: 14 additions & 0 deletions data/lysuite/ly01e_Pitches_ParenthesizedAccidentals.xml.invalid
@@ -0,0 +1,14 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations:

- accidental element text 'double-flat' is not in the schema enumeration at
lines 138, 149, 160, 171. The schema-correct spelling is 'flat-flat'.

The core roundtrip suite trips on this: mx::core sees an unknown enum value
and falls back to a default ('sharp'), producing a text mismatch against the
original 'double-flat'. Excluding this file from the schema-strict core
roundtrip is the right call — the input is using a name that is not in the
schema enumeration.

See data/README.md for the .invalid marker convention.
13 changes: 13 additions & 0 deletions data/lysuite/ly32a_Notations.xml.invalid
@@ -0,0 +1,13 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations:

- <fret/> empty at line 900; schema requires xs:nonNegativeInteger.
- <string/> empty at line 924; schema requires string-number.

The core roundtrip suite trips on the empty <fret/>: mx::core defaults the
missing integer to 0, producing a text mismatch ('' vs '0') against the
original empty element. Excluding this file from the schema-strict core
roundtrip is the right call — the input is omitting required content.

See data/README.md for the .invalid marker convention.
12 changes: 12 additions & 0 deletions data/lysuite/ly41g_PartNoId.xml.invalid
@@ -0,0 +1,12 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violation:

- <part> at line 16 is missing the required 'id' attribute.

The filename literally says PartNoId. mx::core's fromXDoc rejects the
document with "PartAttributes: 'id' is a required attribute but was not
found" — i.e. the parser is correctly enforcing the schema. Excluding this
file from the schema-strict core roundtrip is the right call.

See data/README.md for the .invalid marker convention.
13 changes: 13 additions & 0 deletions data/lysuite/ly74a_FiguredBass.xml.invalid
@@ -0,0 +1,13 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violation:

- <figured-bass> at line 90 is missing the required child element <figure>
(figured-bass requires at least one figure child per schema).

The core roundtrip suite trips on this: the empty figured-bass round-trips
to a different child count than the (canonicalized) input. Excluding this
file from the schema-strict core roundtrip is the right call — the input
is omitting a required child.

See data/README.md for the .invalid marker convention.
15 changes: 15 additions & 0 deletions data/lysuite/ly75a_AccordionRegistrations.xml.invalid
@@ -0,0 +1,15 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations:

- <accordion-middle/> empty at line 291 (must be 1..3).
- <accordion-middle>test</accordion-middle> at line 307 (must be 1..3).
- <accordion-middle>0</accordion-middle> at line 323 (minInclusive 1).
- <accordion-middle>5</accordion-middle> at line 339 (maxInclusive 3).

The core roundtrip suite trips on the empty accordion-middle: mx::core
defaults the missing integer to 0, producing a text mismatch ('' vs '0').
Excluding this file from the schema-strict core roundtrip is the right
call — the input is using values outside the schema-allowed range.

See data/README.md for the .invalid marker convention.
15 changes: 15 additions & 0 deletions data/musuite/testInvalid.xml.invalid
@@ -0,0 +1,15 @@
testInvalid.xml is intentionally not a valid MusicXML file.

Per its embedded <miscellaneous-field name="description">: "This file is almost
identical to testHello.xml, but contains two additional invalid elements to
test MusicXML validation." Specifically:

- <invalid_element/> appears inside <identification>
- <another_invalid_element/> appears inside <attributes>

Neither element exists in the MusicXML XSD. mx::core is a strongly-typed DOM
generated from the schema and has no slot for unknown elements, so fromXDoc
discards them. The file therefore cannot round-trip losslessly through
mx::core and is excluded from the core roundtrip suite via this marker.

See data/README.md for the .invalid marker convention.
18 changes: 18 additions & 0 deletions data/musuite/test_harmony.xml.invalid
@@ -0,0 +1,18 @@
Not valid against the MusicXML XSD.

Confirmed by xmllint --schema docs/musicxml.xsd. Violations include:

- <kind> values not in the schema enumeration: '' (empty, line 73), ' '
(whitespace, line 82), 'minor-major' (195), 'dominant-seventh' (204, 488,
502, 522, 546), 'neapolitan' (316), 'italian' (326), 'french' (335),
'german' (344), 'maj69' (355), 'augmented-ninth' (364), 'altered' (374).
- <degree-type> child appearing where <degree-alter> is expected, at lines
388, 401, 597, 610, 614 — element order violates the schema sequence.

The core roundtrip suite trips on the empty <kind/>: mx::core defaults the
missing enum to 'major', producing a text mismatch ('' vs 'major').
Excluding this file from the schema-strict core roundtrip is the right
call — the input is full of values and orderings that are not in the
schema.

See data/README.md for the .invalid marker convention.
65 changes: 37 additions & 28 deletions docs/ai/projects/gen/index.md
@@ -2,21 +2,28 @@
created: 2026-05-18
m1_completed: 2026-05-21
m2_completed: 2026-05-22
m3_completed: 2026-05-22
---

# gen

## Goal

Reverse engineer the codegen process that produced `mx/core` from MusicXML XSD. Build a generator
that re-produces the existing C++ code from `docs/musicxml.xsd`, then point it at MusicXML 4.0 to
generate updated types.
Reverse engineer the codegen process that produced `mx/core` from MusicXML XSD. Build a
generator that re-produces the existing C++ code from `docs/musicxml.xsd`, then point it
at MusicXML 4.0 to generate updated types.

## Files

- `plan.md` - milestones and exit criteria
- `state.md` - current session state and next-session instructions
- `log.md` - append-only session log
Standard project layout (see the `/project` skill).

## Repo conventions introduced by this project

- `{file}.invalid` marker (sibling file, human-readable body) marks a MusicXML input that
is known not to be XSD-valid. Honored by the corert suite, ignored by api import.
Documented in `data/README.md`. Introduced in M3.
- The `src/private/mxtest/corert/` core-roundtrip harness (built out during M2/M3) is
part of `make test-all` and is the daily driver via `make test-core-dev`.

## Generator location and how to run

@@ -27,40 +34,42 @@ python3 gen/generate.py # regenerates C++ into src/private/mx/core/elem
python3 gen/eval.py # scores diff against checked-in mx/core (secondary signal)
```

Workflow: `python3 gen/generate.py && make fmt && make test-all`, then reset:
Workflow: `python3 gen/generate.py && make fmt && make test-all`. To reset:
`git checkout -- src/private/mx/core/ && git clean -fd src/private/mx/core/`.

Quality gates: `make fmt && make check && make test-all`.

## Fitness function
## Fitness function and gates

`make test-all` pass/fail. **Always use `make test-all`, never `make test`** — the latter builds
`dev` and skips the slow `mxtest/file/` round-trippers, under-reporting failures. `make test-all`
takes >10 minutes. Every iteration ends with a recorded `make test-all` result.
`make test-all` pass/fail is authoritative. **Use `make test-all`, never `make test`** —
the latter builds `dev` and skips the slow `mxtest/file/` round-trippers. Full gate:
`make fmt && make check && make test-all`. Every iteration that touches
`src/private/mx/core/*` ends with a recorded `make test-all` result.
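
One way to keep the "recorded result" habit is a small wrapper around the
full gate. This is a sketch only: `GATE`, `run_gate`, and the injectable
`run` parameter are hypothetical names, and nothing like this is known to
exist in the repo. It mirrors the `&&` chain by stopping at the first
failing step.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# The full gate, in the order given above.
GATE: List[List[str]] = [["make", "fmt"], ["make", "check"], ["make", "test-all"]]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_gate(run: Callable[[Sequence[str]], int] = _run) -> List[Tuple[str, int]]:
    """Run each gate step in order; stop after the first non-zero exit code."""
    results: List[Tuple[str, int]] = []
    for cmd in GATE:
        code = run(cmd)
        results.append((" ".join(cmd), code))
        if code != 0:
            break  # same short-circuit behavior as `a && b && c`
    return results
```

The returned list of `(command, exit_code)` pairs can be pasted into the
session log. Passing a fake `run` function makes the wrapper testable without
invoking `make`.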

## Cardinal rules

- Never change tests.
- Never change test cases. Test harness code is fair game with user authorization.
- Never change `mx/api`.
- Minimize changes to `mx/impl`; prefer fixing the generator.
- Do not autonomously edit `gen/eval_config.yaml`. Flag patterns to the user with sample diff + reasoning.
- Reset generated `mx/core/` before commit.
- Do not autonomously edit `gen/eval_config.yaml`. Flag patterns to the user with sample
diff + reasoning.
- Reset generated `mx/core/` before commit if a generator change was tried and reverted.

## Bespoke generator functions
## Bespoke generator handlers

Seven bespoke handlers exist (credit, lyric, part-list, harmony, score-wrapper-family, note,
direction), registered in `BESPOKE_ELEMENTS` in `gen/generate.py`. They are acceptable when an
element's shape cannot be expressed by the shared rule-based path, but they must still read the
parsed XSD model so spec changes propagate. Pattern: "custom algorithm, schema-driven data" — not
a hardcoded string dump.
Seven bespoke handlers are registered in `BESPOKE_ELEMENTS` (`gen/generate.py`): credit,
lyric, part-list, harmony, score-wrapper-family, note, and direction. They are acceptable
when an element's shape cannot be expressed by the shared rule-based path, but they must
still read the parsed XSD model so spec changes propagate. Pattern: "custom algorithm,
schema-driven data", not a hardcoded string dump.

Always prefer extending a reusable rule-based path (`TREE_ELEMENTS`, `TREE_ELEMENT_CONFIG` flags,
shared helpers) over a new bespoke handler. If a pattern could plausibly recur for another
element, the fix belongs in the shared path with a config-driven flag.
Always prefer extending a reusable rule-based path (`TREE_ELEMENTS`,
`TREE_ELEMENT_CONFIG` flags, shared helpers) over a new bespoke handler. If a pattern
could plausibly recur for another element, the fix belongs in the shared path with a
config-driven flag.
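
The split between the two paths might look roughly like the sketch below.
Only the names `BESPOKE_ELEMENTS` and `TREE_ELEMENT_CONFIG` come from this
document; their actual shapes in `gen/generate.py` are not shown here, so the
dict layouts, the `Model` type, the `optional_members` flag, and every
function name are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical parsed-XSD model: element name -> child element names in schema order.
Model = Dict[str, List[str]]


def emit_rule_based(name: str, model: Model, flags: Dict[str, bool]) -> List[str]:
    """Shared path: one member per schema child, adjusted by config flags."""
    members = [child.replace("-", "_") for child in model[name]]
    if flags.get("optional_members"):  # hypothetical flag name
        members = [f"optional<{m}>" for m in members]
    return members


def emit_note(name: str, model: Model) -> List[str]:
    """Bespoke path: custom ordering, but the children still come from the model."""
    # Custom algorithm: hoist 'pitch' to the front; the stable sort keeps the
    # remaining children in schema order.
    ordered = sorted(model[name], key=lambda child: child != "pitch")
    return [child.replace("-", "_") for child in ordered]


BESPOKE_ELEMENTS: Dict[str, Callable[[str, Model], List[str]]] = {"note": emit_note}
TREE_ELEMENT_CONFIG: Dict[str, Dict[str, bool]] = {"backup": {"optional_members": True}}


def generate(name: str, model: Model) -> List[str]:
    """Dispatch to a bespoke handler if one is registered, else the shared path."""
    handler = BESPOKE_ELEMENTS.get(name)
    if handler is not None:
        return handler(name, model)
    return emit_rule_based(name, model, TREE_ELEMENT_CONFIG.get(name, {}))
```

Because `emit_note` reads `model["note"]` instead of a hardcoded member list,
a child added to the schema shows up in its output without touching the
handler, which is the property the "schema-driven data" rule is meant to
protect.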

## Key external files

- `gen/generate.py` — the generator (~2800 lines of Python)
- `gen/generate.py` — the generator (~14k lines of Python)
- `gen/eval.py`, `gen/eval_config.yaml` — diff scoring
- `docs/musicxml.xsd` — input schema (currently MusicXML 3.x; swapped to 4.0 in M5)
- `src/private/mx/core/elements/` — target output (591 .h/.cpp pairs)
- `docs/musicxml.xsd` — input schema (currently MusicXML 3.x; swap to 4.0 in M6)
- `src/private/mx/core/elements/` — target output (~590 .h/.cpp pairs)
- `src/private/mxtest/corert/` — core-roundtrip harness