Clean up chapter 'Unit Expressions' #3013
chapters/unitexpressions.tex
unit_prefix:
Y | Z | E | P | T | G | M | k | h | da | d | c | m | u | n | p | f | a | z | y
UNIT-PREFIX = "Y" | "Z" | "E" | "P" | "T" | "G" | "M" | "k" | "h" | "da" |
I don't see why UNIT-PREFIX must be written in capital letters.
And similarly for UNIT-SYMBOL.
It's not that it is strictly necessary. It's just matching the style of how we define similar lexical units for the Modelica language.
For me, the unit syntax becomes easier to read if I can tell by the capitalization which production rules correspond to the lexical units I have in mind.
Of course, none of this really makes complete sense anyway, since the separation into UNIT-PREFIX and UNIT-SYMBOL is just conceptual in the grammar; in reality we all know they will be parsed as one lexical unit which is later split in tool-specific ways based on currently available unit definitions.
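To make the "parsed as one lexical unit, split later" point concrete, here is a rough Python sketch. It is not how any particular tool does it; the function name `split_unit` and the `known` symbol set are made up for illustration, and only the prefix list is taken from the rule quoted above.

```python
# Prefixes copied from the unit_prefix rule quoted at the top of this thread.
UNIT_PREFIXES = ["Y", "Z", "E", "P", "T", "G", "M", "k", "h", "da",
                 "d", "c", "m", "u", "n", "p", "f", "a", "z", "y"]


def split_unit(token, known_symbols):
    """Return every (prefix, symbol) reading of one lexical unit.

    An empty prefix means the whole token is itself a known symbol.
    The result depends entirely on which symbols are currently known.
    """
    readings = []
    if token in known_symbols:
        readings.append(("", token))
    for prefix in UNIT_PREFIXES:
        rest = token[len(prefix):]
        if token.startswith(prefix) and rest in known_symbols:
            readings.append((prefix, rest))
    return readings


# Hypothetical set of unit symbols known to one tool.
known = {"m", "s", "g", "Pa", "Ohm"}

print(split_unit("mm", known))   # [('m', 'm')]
print(split_unit("kPa", known))  # [('k', 'Pa')]
print(split_unit("m", known))    # [('', 'm')]
```

The grammar alone cannot decide between these readings; the split only becomes determinate once a set of known symbols is fixed.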
Yes, in the other grammar it is used to distinguish lexical units (in all caps) from grammar constructs (in all lower case).
That's important for a number of reasons, including white-space handling. But here there is no need for such a separation, so I don't see that treating UNIT-PREFIX (similar to type-prefix in some weird sense) specially gives us anything, and the downside is that it seems we are shouting.
OK, to begin with I just changed UNIT-PREFIX to unit-prefix.
The reason I hesitate more regarding UNIT-SYMBOL is that this one should really have been expressed using the lexing language. Wouldn't it be easiest to just write out the rule for UNIT-SYMBOL, which would also make clear precisely what is the set of legal characters?
I think it would be good if we could avoid that tools introduce other prefixes than the predefined ones, but I realize it will be hard to verify in practice. I mean, ideally tools would only extend the set of known unit symbols, but how can you tell that "μOhm" is a sign of a tool-defined prefix and not just the tool-defined unit symbol "μOhm" that happens to equal "1uOhm"?
I am also a bit scared of the possibility that different tools allow units to be introduced in ways that cause conflicts when Modelica code is moved from one tool to another. For example, say tool A introduces the symbol "metre" and tool B introduces "etre". What is a valid "milli-etre" in tool B will then be mistaken for a "metre" in tool A. Unfortunately, I guess it's too late for this as it would close the door on units such as the "mile" or "pt". A more realistic alternative would probably be to require additional units to be defined somewhere in a standardized annotation, which would allow a tool to take proper action when having custom definitions of both "metre" and "etre" in scope for the same unit string.
What I want to say with all of this is really just that I don't think we should formulate the specification as if tools can introduce custom unit prefixes as well as custom unit symbols.
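As an illustration of the "metre" / "etre" clash, here is a small Python sketch of the kind of check a tool could run once custom symbols are declared in one place. The helper `prefix_conflicts` is hypothetical and no such check is prescribed anywhere; the prefix list is the predefined one from the grammar.

```python
# Predefined prefixes from the grammar rule under review.
PREFIXES = ("Y", "Z", "E", "P", "T", "G", "M", "k", "h", "da",
            "d", "c", "m", "u", "n", "p", "f", "a", "z", "y")


def prefix_conflicts(symbols):
    """Find symbols that can also be read as prefix + another symbol.

    Returns (symbol, prefix, shorter_symbol) triples, in sorted symbol order.
    """
    found = []
    for symbol in sorted(symbols):
        for prefix in PREFIXES:
            rest = symbol[len(prefix):]
            if symbol.startswith(prefix) and rest in symbols:
                found.append((symbol, prefix, rest))
    return found


tool_a = {"metre"}  # custom symbol from hypothetical tool A
tool_b = {"etre"}   # custom symbol from hypothetical tool B

print(prefix_conflicts(tool_a))           # []
print(prefix_conflicts(tool_b))           # []
print(prefix_conflicts(tool_a | tool_b))  # [('metre', 'm', 'etre')]
```

Neither tool sees a problem on its own; the conflict only shows up when both definitions are in scope for the same unit string.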
But a common request would be currency symbols like: €£$ (and the Yen-symbol that I don't have on my keyboard).
Right, these are good examples of potentially useful unit symbols.
I think there are good technical reasons to take the same care of unit symbols as we do of identifiers. That is, play conservative in order to stay away from Unicode canonicalization issues etc, and give an explicit list of allowed characters similar to Q-CHAR.
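For anyone wondering what the canonicalization issues look like in practice, the following Python snippet uses only the standard `unicodedata` module. The code points are chosen by me as examples of look-alike characters relevant to units; nothing here is taken from the specification.

```python
import unicodedata

micro_sign = "\u00b5"   # MICRO SIGN
greek_mu = "\u03bc"     # GREEK SMALL LETTER MU
ohm_sign = "\u2126"     # OHM SIGN
greek_omega = "\u03a9"  # GREEK CAPITAL LETTER OMEGA

# The two "mu" characters usually render alike but compare unequal.
print(micro_sign == greek_mu)                                    # False

# Canonical normalization (NFC) keeps them distinct ...
print(unicodedata.normalize("NFC", micro_sign) == greek_mu)      # False

# ... while compatibility normalization (NFKC) folds one into the other.
print(unicodedata.normalize("NFKC", micro_sign) == greek_mu)     # True

# The ohm sign is already folded into capital omega by NFC.
print(unicodedata.normalize("NFC", ohm_sign) == greek_omega)     # True
```

So whether two unit strings are "the same" can depend on which normalization form, if any, a tool applies before comparing them.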
Considering that just showing the glyph of a Unicode symbol isn't very helpful in all cases, it would probably be best to make a table with all the characters allowed in addition to NON-DIGIT:
unit-char : NON-DIGIT | UNIT-UNICODE-CHAR
Initial content of the UNIT-UNICODE-CHAR would include:
- °
- Currency symbols: €, £, $, ¢, ¥, ₽, ₨, … (just to mention a few from a list of about 40 symbols I have here)
- Anything else?
Some symbols I would like to not see in the UNIT-UNICODE-CHAR because I'd like them to be reserved for future use:
- Single quote: '
- Double quote: "
- Basic arithmetic operators: +, -, *
- Grouping constructs and structure: [ ] { } < > : ; ,
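A rough Python sketch of what such an explicit character list could look like as a check. Assumptions on my side: NON-DIGIT is taken to be the ASCII letters plus underscore, `UNIT_UNICODE_CHAR` contains only the examples named above (the full list is not given here), and a symbol is treated as one or more allowed characters. The function name `is_unit_symbol` is made up.

```python
import string

# Assumption: NON-DIGIT is "_" plus the ASCII letters.
NON_DIGIT = set(string.ascii_letters + "_")

# Only the example characters named in this comment, not a complete list.
UNIT_UNICODE_CHAR = set("°€£$¢¥₽₨")

# Characters proposed above as reserved for future use.
RESERVED = set("'\"+-*[]{}<>:;,")


def is_unit_symbol(text):
    """True if text is non-empty and every character is an allowed unit-char."""
    allowed = NON_DIGIT | UNIT_UNICODE_CHAR
    return len(text) > 0 and all(ch in allowed for ch in text)


# The reserved characters must never overlap with the allowed ones.
assert RESERVED.isdisjoint(NON_DIGIT | UNIT_UNICODE_CHAR)

print(is_unit_symbol("Ohm"))  # True
print(is_unit_symbol("°C"))   # True
print(is_unit_symbol("$"))    # True
print(is_unit_symbol("m2"))   # False (digit)
print(is_unit_symbol("a-b"))  # False (reserved operator)
```

With an explicit allow-list, anything not listed is rejected by default, so the reserved set is only needed as a guard against accidentally adding those characters later.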
I would prefer to not go into such details on these symbols; possibly just saying that only NON-DIGIT is allowed. The reason is that even if we sort of accepted unit="$", I don't think we should encourage it by listing it in the standard. (But the unit-symbol : unit-char { unit-char } sort of makes sense.)
I think we should even leave the restriction to NON-DIGIT out of this PR, so I opened #3020 for what isn't merely cleanup.
We now at least have a (lowercase) unit-symbol defined in terms of some undefined unit-char.
Will this do for now?
Yes. I think so.
I've now also reformulated the part which was previously stated regarding some base version, which sounded like a trace of how it may have been formulated when unit strings were discussed for their original inclusion in the specification.
Looks good.
Just some stuff I encountered while preparing #3012.