Nemeth#1

Merged
NSoiffer merged 29 commits into main from
Nemeth
Sep 22, 2021

@NSoiffer
Collaborator

Merge in first attempt at Nemeth generation. Still plenty to do, but the results aren't too bad with this merge.

This gets a surprising amount of Nemeth generated correctly. It is not ready for someone to use though.

Still much work to do:
  various indicators (numeric, punctuation, ...) need work
  modifiers (munder/mover) and enclosures are not implemented
  typeface changes (should add to unicode.yaml)
  tables
Getting the indicators right is probably the hardest part.

At some point, a useful speedup for loading would be to split the Unicode tables (there are now two: one for speech and one for braille) into primary and secondary parts. The primary table would hold the 300 - 500 chars that are most used and the secondary one the rest of the chars. Loading unicode.yaml takes about 50ms, and doing the split should take it down to under 10ms, maybe as low as 5ms. With two tables to load, that could save up to 90ms on startup. A flag would need to be added to SpeechRules to indicate whether the secondary table has been loaded, along with a check on lookup: if no match is found in the primary table and the secondary table is not yet loaded, load it and retry.
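
A minimal sketch of the primary/secondary split described above, assuming a plain `HashMap` per tier. `TieredTable`, `load_secondary`, and `demo_secondary` are hypothetical names for illustration; they are not part of this PR, and the real change would live in SpeechRules and read the YAML files rather than build a map in code.

```rust
use std::collections::HashMap;

/// Hypothetical two-tier character table: the small primary map is loaded
/// eagerly, the larger secondary map only on the first lookup miss.
struct TieredTable {
    primary: HashMap<char, String>,
    secondary: HashMap<char, String>,
    /// The flag mentioned above: has the secondary table been loaded yet?
    secondary_loaded: bool,
    /// Stand-in for reading and parsing the secondary YAML file.
    load_secondary: fn() -> HashMap<char, String>,
}

impl TieredTable {
    fn new(
        primary: HashMap<char, String>,
        load_secondary: fn() -> HashMap<char, String>,
    ) -> Self {
        TieredTable {
            primary,
            secondary: HashMap::new(),
            secondary_loaded: false,
            load_secondary,
        }
    }

    /// Look in the primary table first; on a miss, load the secondary
    /// table (once) and look there.
    fn lookup(&mut self, ch: char) -> Option<&str> {
        if self.primary.contains_key(&ch) {
            return self.primary.get(&ch).map(String::as_str);
        }
        if !self.secondary_loaded {
            self.secondary = (self.load_secondary)();
            self.secondary_loaded = true;
        }
        self.secondary.get(&ch).map(String::as_str)
    }
}

/// Demo data only: a one-entry "secondary" speech table.
fn demo_secondary() -> HashMap<char, String> {
    let mut table = HashMap::new();
    table.insert('ℵ', "aleph".to_string());
    table
}
```

With this shape, expressions that only use common characters never pay for the secondary load; the cost moves to the first rare character encountered.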

Added some basic tests for tags along with SRE tests.

The SRE tests are dubious in many cases in that some are geared towards text (footnotes, etc) and also contain bugs. They need trimming.

Probably should add a test for every rule that matters for math. The green book has about 200 rules, although many are not about math. On the other hand, many of the rules have subparts, so there might need to be 500 or so tests.
The tests from SRE (AataNemeth, SRE_Nemeth72, SRE_NemethBase) might have errors in them. Also, some of the tests deal with text/math changes and are not really appropriate for translating MathML. These files need cleanup.

There is still much work to be done to get the tests to all work!
Added mixed fractions and munder, etc.
Added lots more Unicode chars (from SRE), but they need work/vetting
Now using ASCII indicators for braille chars which get cleaned in a pass after the rules are run. Regenerated chars to make use of this and fixed up existing chars from MS.
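
As a rough illustration of the "ASCII indicators cleaned in a later pass" idea, here is a sketch under stated assumptions: rules emit a hypothetical ASCII marker `N` before every digit, and the cleanup pass keeps a real numeric indicator only at the start of each run of digits. The marker letter, the function name `clean_indicators`, and the cleanup policy are all assumptions for illustration; the actual markers and the rules applied in this PR may differ.

```rust
/// Hypothetical ASCII marker emitted by rules wherever a numeric
/// indicator *might* be needed.
const NUMERIC_MARKER: char = 'N';
/// Braille numeric indicator (dots 3456).
const NUMERIC_INDICATOR: char = '⠼';
/// Nemeth digit cells for 0 through 9 (digits in the lower part of the cell).
const DIGIT_CELLS: &str = "⠴⠂⠆⠒⠲⠢⠖⠶⠦⠔";

/// Replace ASCII markers with real indicators, keeping only the first
/// marker in each uninterrupted run of digits.
fn clean_indicators(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut in_number = false;
    for ch in raw.chars() {
        if ch == NUMERIC_MARKER {
            // Only the first marker of a digit run becomes an indicator.
            if !in_number {
                out.push(NUMERIC_INDICATOR);
                in_number = true;
            }
        } else {
            // Any non-digit cell ends the current number.
            if !DIGIT_CELLS.contains(ch) {
                in_number = false;
            }
            out.push(ch);
        }
    }
    out
}
```

Deferring the decision to a single pass means individual character rules do not need to know their context; they can always emit the marker and let the cleanup decide.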

Added (commented out) a list of Nemeth functions to definitions.yaml to be used when I redo definitions.rs
Fix a character that was being overwritten by MS translation
@NSoiffer NSoiffer merged commit 1c78c77 into main Sep 22, 2021
NSoiffer pushed a commit that referenced this pull request Sep 11, 2025