Improve frac parsing #23694
✅ Hi, I am the SymPy bot (v167). I'm here to help you write a release notes entry. Please read the guide on how to write release notes. Your release notes are in good order. Here is what the release notes will look like:
This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.11.
Update: The release notes on the wiki have been updated.
sympy/parsing/latex/LaTeX.g4
-frac:
-    CMD_FRAC L_BRACE upper = expr R_BRACE L_BRACE lower = expr R_BRACE;
+frac: CMD_FRAC L_BRACE upper = expr R_BRACE L_BRACE lower = expr R_BRACE
+    | CMD_FRAC NUMBER ;
The purpose of this PR makes sense.
I'm not sure this is the best way to define the syntax, though. CMD_FRAC NUMBER does more than a fraction should do, and that's why you have to deal with \frac1234 differently in convert_frac.
I think it should be something like \frac upper lower, where both upper and lower are either a single digit or a {expr}.
That way you immediately fix \frac1{2}.
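To illustrate why a rule like CMD_FRAC NUMBER needs special handling, here is a minimal Python sketch of the post-processing it forces. It assumes LaTeX's own reading, where \frac takes the next two single tokens as its arguments, so \frac1234 means (1/2) followed by 34. The helper name split_frac_number is hypothetical and this is not the code from convert_frac.

```python
import re
from fractions import Fraction


def split_frac_number(number_text):
    """Hypothetical helper: split the NUMBER token that follows a bare \\frac.

    The first digit is taken as the numerator, the second as the
    denominator, and any remaining digits as a separate trailing factor.
    """
    if not re.fullmatch(r"[0-9]{2,}", number_text):
        raise ValueError("expected at least two digits after \\frac")
    fraction = Fraction(int(number_text[0]), int(number_text[1]))
    rest = number_text[2:]
    trailing = int(rest) if rest else None
    return fraction, trailing
```

With digit-level arguments in the grammar itself, this kind of splitting after the fact would not be needed.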
Thanks for the quick response!
I totally agree in principle, but is it even possible to do this in ANTLR?
The problem here is that the lexer and the parser really have to talk to each other, because 12 should almost always be lexed as a single NUMBER token, unless it immediately follows a \frac, in which case it would be better for it to be lexed as two separate DIGIT tokens. I guess you would need a predicate that can refer to the previous token.
Does ANTLR support this sort of thing? I am new to ANTLR myself.
Edit: Thinking again, I'm not sure the lexer and parser need to communicate; being able to save some state in the lexer might be sufficient. Still, I'm not sure whether ANTLR supports this.
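The "state in the lexer" idea can be sketched outside ANTLR. The following is a minimal hand-written Python lexer, not ANTLR code and not part of this PR; the token names CMD and SYMBOL and the function lex are assumptions made for illustration. It remembers whether the previous token was \frac and, only in that case, emits the first two digits of a digit run as separate DIGIT tokens.

```python
import re

# One alternative per token kind; the group name becomes the token type.
TOKEN_RE = re.compile(
    r"(?P<CMD>\\[a-zA-Z]+)|(?P<NUMBER>[0-9]+)|(?P<SYMBOL>\S)"
)


def lex(source):
    """Tokenize source, splitting digits only directly after \\frac."""
    tokens = []
    after_frac = False  # lexer state: was the previous token \frac?
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "NUMBER" and after_frac:
            # The first two digits become the two bare arguments of \frac.
            head, rest = text[:2], text[2:]
            tokens.extend(("DIGIT", ch) for ch in head)
            if rest:
                tokens.append(("NUMBER", rest))
        else:
            tokens.append((kind, text))
        after_frac = kind == "CMD" and text == "\\frac"
    return tokens
```

Here lex(r"\frac12y") yields a CMD token followed by two DIGIT tokens and a SYMBOL, while lex("12") still yields a single NUMBER. The sketch only covers digits that immediately follow \frac; a partially braced form such as \frac{1}2 would need more state.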
I'm giving it a try myself, and it seems harder than I expected.
I'll update this PR if I succeed.
Actually, I don't know much about ANTLR either; I'm just trying it intuitively, using what I remember from the Dragon Book.
This stuff mentioned here about 'predicates' seems relevant: https://stackoverflow.com/questions/62469159/antlr-matlab-grammar-lexing-conflict
This seems to allow some sort of custom code execution during lexing, which is what we would need.
If we make DIGIT a terminal and NUMBER a non-terminal, then we can use DIGIT in frac.
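A rough way to see why this works is a small recursive-descent parser in Python. This is only a sketch under stated assumptions, not the ANTLR grammar from the PR: the lexer emits every digit as its own DIGIT token, the parser assembles number from one or more DIGITs, and each argument of frac is either a single DIGIT or a braced expression. The names tokenize, Parser and evaluate are hypothetical, only digits, braces and \frac are supported, and juxtaposition is treated as multiplication.

```python
import re
from fractions import Fraction

TOKEN_RE = re.compile(r"\\frac|[0-9]|[{}]")  # every digit is its own token


def tokenize(source):
    source = source.replace(" ", "")
    tokens = TOKEN_RE.findall(source)
    if "".join(tokens) != source:
        raise ValueError("unsupported input")
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        # expr: factor+  (juxtaposition means multiplication)
        value = self.factor()
        while self.peek() is not None and self.peek() != "}":
            value *= self.factor()
        return value

    def factor(self):
        if self.peek() == "\\frac":
            return self.frac()
        if self.peek() == "{":
            return self.braced()
        return self.number()

    def number(self):
        # number: DIGIT+  (a non-terminal built from DIGIT terminals)
        digits = ""
        while (self.peek() or "").isdigit():
            digits += self.take()
        if not digits:
            raise ValueError("expected a digit")
        return Fraction(int(digits))

    def braced(self):
        if self.take() != "{":
            raise ValueError("expected '{'")
        value = self.expr()
        if self.take() != "}":
            raise ValueError("expected '}'")
        return value

    def frac(self):
        # frac: \frac arg arg, where arg is a single DIGIT or { expr }
        self.take()
        upper = self.frac_arg()
        lower = self.frac_arg()
        return upper / lower

    def frac_arg(self):
        if self.peek() == "{":
            return self.braced()
        token = self.take()
        if not token.isdigit():
            raise ValueError("expected a digit or '{'")
        return Fraction(int(token))


def evaluate(source):
    parser = Parser(tokenize(source))
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError("unexpected trailing input")
    return value
```

Because the split into digits happens in the lexer and the regrouping happens in the parser, no lexer state or predicate is needed: evaluate(r"\frac1{2}") and evaluate(r"\frac12") both give Fraction(1, 2), and a plain 12 is still read as the number twelve.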
Very clever! I love it.
Force-pushed from 73c4120 to a01c3d2.
A bare numerator or denominator in frac accepts only a single digit, so we have to use a terminal or non-terminal to represent it.
Benchmark results from GitHub Actions
Lower numbers are good, higher numbers are bad. A ratio less than 1 means a speed-up and a ratio greater than 1 means a slowdown.
Significantly changed benchmark results (PR vs master)
Significantly changed benchmark results (master vs previous release)
       before           after         ratio
     [77f1d79c]       [f4691d30]
     <sympy-1.10.1^0>
+      96.5±0.4ms        175±0.6ms     1.81  sum.TimeSum.time_doit
Full benchmark results can be found as artifacts in GitHub Actions.
I'm going to merge this if there are no objections.
References to other Issues or PRs
Brief description of what is fixed or changed
In the LaTeX parser, add support for bare numbers without brackets, like \frac12 for the fraction 1 / 2. Also handles cases like turning \frac12y into 1 / 2 * y.
Other comments
Does not handle more complex cases where \frac is used without bracketing or with partial bracketing, such as \frac1{2}, \frac{\sin{x}}2, etc. I may handle this in another PR if this one is well received.
Release Notes