Unicode equivalents of ASCII characters #690
Comments
Hi,
@NewmanJ1987 thanks for offering! I added this character in commit 36654b3, but there are others that would be good to have if you want to help. For example, there are many characters that look like
Sure, that looks like a good entry task. Thanks.
It looks like String.normalize can do a lot of the work.
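As a rough illustration of the String.normalize suggestion, the sketch below applies the NFKC form to a fullwidth expression. The sample string and variable names are made up for the example and are not taken from Numbas.

```javascript
// NFKC ("compatibility composition") folds many lookalike characters
// onto their plain equivalents. The input is a hypothetical student answer.
const typed = "\uFF08\uFF12\uFF0B\uFF13\uFF09"; // （２＋３） in fullwidth forms
const normalized = typed.normalize("NFKC");

console.log(normalized); // (2+3)
```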
Hi, I am new to open source and would like to contribute to this issue if it is unassigned. However, I am unsure how you would like me to implement String.normalize, given that synonyms are currently hard-coded in a dictionary in the variable opSynonyms.
@abhijeetsharma200
Hi there! I am looking to contribute to open source for Hacktoberfest 2020. Is this issue still up for grabs?
We should also accept the mathematical alphanumeric symbols, which are used by MathJax.
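One possible way to accept these symbols is to normalize only characters in the Mathematical Alphanumeric Symbols block (U+1D400 to U+1D7FF) and leave everything else alone. This is a sketch under that assumption; the function name foldMathAlphanumerics is hypothetical and not part of Numbas.

```javascript
// Fold characters from the Mathematical Alphanumeric Symbols block onto
// their plain letters and digits, leaving all other characters untouched.
function foldMathAlphanumerics(str) {
  // Array.from iterates by code point, so astral characters stay whole.
  return Array.from(str, (ch) => {
    const cp = ch.codePointAt(0);
    const inBlock = cp >= 0x1D400 && cp <= 0x1D7FF;
    return inBlock ? ch.normalize("NFKC") : ch;
  }).join("");
}

// U+1D465 is MATHEMATICAL ITALIC SMALL X; the superscript two is preserved.
console.log(foldMathAlphanumerics("\u{1D465}\u00B2")); // x²
```

Restricting the normalization to one block keeps characters such as superscripts from being flattened as a side effect.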
@grplyler sorry I didn't reply earlier - yes, this issue is still open, and there are plenty of easy things you can add in. Please submit a pull request!
I've added all the left and right parenthesis characters that I could find on graphemica to Parser.(left|right)_parentheses. They're all synonyms for the ASCII parenthesis characters. See #690, and thanks to maths/moodle-qtype_stack#860 for pointing them out!
There are fullwidth equivalents of lots of punctuation marks, such as U+FF0C "Fullwidth Comma"
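The fullwidth forms U+FF01 to U+FF5E mirror ASCII U+0021 to U+007E at a fixed offset of 0xFEE0, so one way to handle the whole class at once is plain code point arithmetic. The helper name foldFullwidth is an assumption made for this sketch.

```javascript
// Map each fullwidth form (U+FF01..U+FF5E) to its ASCII counterpart by
// subtracting the fixed offset 0xFEE0; other characters pass through.
function foldFullwidth(str) {
  return Array.from(str, (ch) => {
    const cp = ch.codePointAt(0);
    const isFullwidth = cp >= 0xFF01 && cp <= 0xFF5E;
    return isFullwidth ? String.fromCodePoint(cp - 0xFEE0) : ch;
  }).join("");
}

// Fullwidth parentheses and comma around ordinary ASCII letters.
console.log(foldFullwidth("f\uFF08x\uFF0C y\uFF09")); // f(x, y)
```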
We had a student using a fullwidth parenthesis: ( (But that's already supported, so that's not the bug I was looking for!)
We have had a spate of students who wrote Greek letters as unicode and whose expressions were marked incorrect, because Numbas doesn't consider e.g.
@christianp, we already have this issue and I plan to implement something this summer: maths/moodle-qtype_stack#860 Would you be interested in sharing lists of unicode between our projects on this issue, perhaps with a JSON file of "known equivalents"?
@sangwinc - good idea! I'm now looking at typing out a big list of character mappings. The generic Unicode normalisation algorithms don't really help, because they ignore some differences that are mathematically significant, or don't consider equivalent some things that would be convenient for us. I think we have to do it character by character (or character class by character class, at least).
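Both failure modes can be seen with a few characters. In this sketch, NFKC flattens a superscript two into an ordinary digit, which changes the mathematical meaning, while the modifier circumflex U+02C6 and the minus sign U+2212 have no decomposition and so are left as they are. The variable names are illustrative only.

```javascript
// Difference that matters mathematically, but is erased by NFKC:
const squared = "x\u00B2";                 // x followed by SUPERSCRIPT TWO
const flattened = squared.normalize("NFKC");
console.log(flattened);                    // x2

// Lookalikes it would be convenient to fold, but which NFKC leaves alone:
const caret = "2\u02C63";                  // MODIFIER LETTER CIRCUMFLEX ACCENT
const minus = "5\u22122";                  // MINUS SIGN
console.log(caret.normalize("NFKC") === caret); // true
console.log(minus.normalize("NFKC") === minus); // true
```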
This big list of LaTeX to unicode mappings that I made for mathstodon, based on unicodeit.net, might help: https://github.com/christianp/mastodon/blob/mathstodon-4.1.0/app/javascript/mastodon/features/compose/util/autolatex/data.js
I'm working on this at https://github.com/numbas/unicode-math-normalization. I've produced a set of files giving explicit mappings from some Unicode characters to JME syntax, and identified some things that can be normalized using the standard normalization algorithm. Tomorrow I'll try to integrate this with the JME parser.
A student typed 2ˆ3, which was not interpreted as valid. The character ˆ is a modifier, but looks very similar to the 'real' character, ^.

Numbas should consider ˆ to be a synonym of ^.

The dictionaries opSynonyms and funcSynonyms in runtime/scripts/jme.js map alternative names for operators and functions onto their canonical names. Add entries to these dictionaries mapping unicode symbols onto their equivalents.

A good place to find unicode symbols is graphemica.com
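A synonym dictionary of this kind could look roughly like the sketch below. It is a standalone stand-in, not the real contents of runtime/scripts/jme.js: the names opSynonymsSketch and canonicalOp, and every entry other than the ˆ mapping requested above, are assumptions made for illustration.

```javascript
// Hypothetical stand-in for a synonym dictionary: keys are alternative
// spellings, values are the canonical operator names.
const opSynonymsSketch = {
  "\u02C6": "^", // ˆ MODIFIER LETTER CIRCUMFLEX ACCENT
  "\u00D7": "*", // × MULTIPLICATION SIGN (illustrative entry)
  "\u00F7": "/", // ÷ DIVISION SIGN (illustrative entry)
  "\u2212": "-", // − MINUS SIGN (illustrative entry)
};

// Look up an operator name, falling back to the name itself when no
// synonym is registered. hasOwnProperty avoids inherited keys.
function canonicalOp(name) {
  return Object.prototype.hasOwnProperty.call(opSynonymsSketch, name)
    ? opSynonymsSketch[name]
    : name;
}

console.log(canonicalOp("\u02C6")); // ^
```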