Default Latin table glyph-faithful to EN 300 468 Figure A.1 by fishloa · Pull Request #2 · fishloa/rust-dvb

fishloa · 2026-06-04T06:18:59Z

Resolves the deferred audit finding on text/mod.rs 0xA8 by reading the vendored PDF directly (Annex A Figure A.1, p. 159, V1.19.1 — the 2025 edition includes Unicode equivalents in the figure).

Verdict on the disputed byte: 0xA8 = ¤ U+00A4 (existing code was right; the auditor conflated it with the combining diaeresis prefix 0xC8). The real bug was at 0xA4, which should decode to € U+20AC (DVB's superset addition) but was decoding as ¤.
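To make the verdict concrete, here is a minimal sketch of the two byte positions in question. The function names decode_disputed and latin1_fallback are hypothetical and are not taken from the crate; only the two mappings stated above are encoded, and every other byte is deliberately left undecided.

```rust
/// Hypothetical helper (not the crate's API): decodes only the two byte
/// positions discussed in this PR and returns None for everything else.
fn decode_disputed(byte: u8) -> Option<char> {
    match byte {
        0xA4 => Some('\u{20AC}'), // euro sign, the DVB superset addition
        0xA8 => Some('\u{00A4}'), // currency sign, as confirmed by Figure A.1
        _ => None,
    }
}

/// Illustration of a Latin-1 style fallback: casting a u8 to char
/// reinterprets the byte value as the Unicode code point of the same number.
fn latin1_fallback(byte: u8) -> char {
    byte as char
}
```

With these definitions, decode_disputed(0xA4) yields the euro sign, while latin1_fallback(0xA4) yields U+00A4, the currency sign, which is the wrong decoding described above.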

Scope: full GR-area rewrite. The old Latin-1 fallback (other as char) was wrong across the A/B/D/E/F rows. The full non-spacing diacritic row is now handled, with precomposed forms plus a base-plus-combining-mark fallback. The figure_a1_* tests pin every defined position. Figure A.1 is hand-transcribed into docs/en_300_468.md with proper citation.

🤖 Generated with Claude Code

…e A.1

Verified against the vendored PDF (specs/etsi_en_300_468_v01.19.01_dvb_si.pdf, Annex A Figure A.1, p. 159, 'Character code table 00 - Latin alphabet with Unicode equivalents'), which resolves the audit dispute:

- 0xA8 = U+00A4 currency sign: existing mapping CONFIRMED correct (the auditor's 'diaeresis' claim conflated 0xA8 with combining prefix 0xC8)
- 0xA4 = U+20AC €: the actual bug in that pair (DVB superset addition; was decoding as ¤)

Full GR-area rewrite (the old Latin-1 fallback was wrong across the A/B/D/E/F rows: quotes, arrows, ×/÷, ™/♪, fractions, Ø/Œ/Þ/ŧ/ŋ/SHY…):

- iso_6937_single: exhaustive per-byte table with Unicode codepoints, undefined (grey) positions → U+FFFD
- combining_mark + extended combine(): full non-spacing row (grave, acute, circumflex, tilde, macron, breve, dot, diaeresis, ring, cedilla, double acute, ogonek, caron) with precomposed forms; unmatched pairs emit base + Unicode combining mark (canonically equivalent); undefined prefixes 0xC0/0xC9/0xCC and dangling prefixes → U+FFFD

TDD: figure_a1_* tests pin every defined GR position to its Figure A.1 codepoint (written first, RED, then implemented).

Docs: Figure A.1 hand-transcribed into dvb-si/docs/en_300_468.md with PDF page cite + verbatim superset note; README + CHANGELOG updated.

All tests pass (stable + MSRV 1.75), clippy -D warnings clean.
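The combining behaviour described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR. The sketch_* names are hypothetical. The prefix byte assignments are inferred from the order of marks listed above with 0xC0, 0xC9 and 0xCC skipped, and should be checked against Figure A.1. The precomposed table holds only three sample pairs. Returning a lone U+FFFD for an undefined prefix, even when a base character follows, is also an assumption.

```rust
/// Hypothetical sketch: map a non-spacing prefix byte to a Unicode combining mark.
/// Byte assignments are assumed from the listed order; 0xC0, 0xC9, 0xCC stay undefined.
fn sketch_combining_mark(prefix: u8) -> Option<char> {
    match prefix {
        0xC1 => Some('\u{0300}'), // grave
        0xC2 => Some('\u{0301}'), // acute
        0xC3 => Some('\u{0302}'), // circumflex
        0xC4 => Some('\u{0303}'), // tilde
        0xC5 => Some('\u{0304}'), // macron
        0xC6 => Some('\u{0306}'), // breve
        0xC7 => Some('\u{0307}'), // dot above
        0xC8 => Some('\u{0308}'), // diaeresis
        0xCA => Some('\u{030A}'), // ring above
        0xCB => Some('\u{0327}'), // cedilla
        0xCD => Some('\u{030B}'), // double acute
        0xCE => Some('\u{0328}'), // ogonek
        0xCF => Some('\u{030C}'), // caron
        _ => None,                // undefined prefix or not a prefix at all
    }
}

/// Hypothetical, deliberately tiny precomposed table (three sample pairs only).
fn sketch_precomposed(prefix: u8, base: char) -> Option<char> {
    match (prefix, base) {
        (0xC2, 'e') => Some('\u{00E9}'), // e + acute
        (0xC8, 'u') => Some('\u{00FC}'), // u + diaeresis
        (0xCF, 's') => Some('\u{0161}'), // s + caron
        _ => None,
    }
}

/// Combine a prefix byte with the following base character, if any.
/// `base` is None when the prefix is the last byte of the input (dangling).
fn sketch_combine(prefix: u8, base: Option<char>) -> String {
    match (sketch_combining_mark(prefix), base) {
        (Some(mark), Some(base)) => match sketch_precomposed(prefix, base) {
            // Prefer the precomposed form when the pair is in the table.
            Some(c) => c.to_string(),
            // Otherwise emit base followed by the combining mark.
            None => [base, mark].iter().collect(),
        },
        // Undefined or dangling prefix: replacement character (assumed policy).
        _ => '\u{FFFD}'.to_string(),
    }
}
```

For example, sketch_combine(0xC8, Some('u')) returns the precomposed ü, while sketch_combine(0xC1, Some('x')) falls back to x followed by U+0300, since that pair is not in the sample table.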

fishloa merged commit 9f1474f into main Jun 4, 2026
4 checks passed

fishloa deleted the text-figure-a1 branch June 4, 2026 06:19

fishloa merged 1 commit into main from text-figure-a1
