Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrades to the LaTeX grammar and Parser #19982

Merged
merged 8 commits into from Oct 5, 2020

Conversation

iammosespaulr
Copy link
Contributor

References to other Issues or PRs

Brief description of what is fixed or changed

PR for presenting changes made @ https://github.com/iammosespaulr/sympy
Lot of Improvements to the LaTeX grammar and parser

Other comments

@bhpayne @msgoff you both have access to this branch. feel free to push changes.

Release Notes

  • parsing
    • Upgraded Relational Operator support
    • Bra-Ket Notation support
    • Improved Grammar

iammosespaulr and others added 3 commits August 18, 2020 20:05
Co-authored-by: Ben <ben.is.located@gmail.com>
Co-authored-by: Michael Goff <mike@mathrules.us>
Co-authored-by: Ben <ben.is.located@gmail.com>
Co-authored-by: Michael Goff <mike@mathrules.us>
Co-authored-by: Ben <ben.is.located@gmail.com>
Co-authored-by: Michael Goff <mike@mathrules.us>
@sympy-bot
Copy link

sympy-bot commented Aug 18, 2020

Hi, I am the SymPy bot (v161). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.7.

Click here to see the pull request description that was parsed.
<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->


#### Brief description of what is fixed or changed

PR for presenting changes made @ https://github.com/iammosespaulr/sympy
Lot of Improvements to the LaTeX grammar and parser

#### Other comments

@bhpayne @msgoff you both have access to this branch. feel free to push changes.

#### Release Notes

<!-- Write the release notes for this release below. See
https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more information
on how to write release notes. The bot will check your release notes
automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* parsing
  * Upgraded Relational Operator support
  * Bra-Ket Notation support
  * Improved Grammar
<!-- END RELEASE NOTES -->

Update

The release notes on the wiki have been updated.

@iammosespaulr
Copy link
Contributor Author

@sylee957 @asmeurer @jksuom thoughts and comments ?

Co-authored-by: Ben <ben.is.located@gmail.com>
Co-authored-by: Michael Goff <mike@mathrules.us>
@codecov
Copy link

codecov bot commented Aug 18, 2020

Codecov Report

Merging #19982 into master will decrease coverage by 0.134%.
The diff coverage is 0.627%.

@@              Coverage Diff              @@
##            master    #19982       +/-   ##
=============================================
- Coverage   75.861%   75.726%   -0.135%     
=============================================
  Files          670       670               
  Lines       173674    173975      +301     
  Branches     40974     40986       +12     
=============================================
- Hits        131751    131746        -5     
- Misses       36188     36490      +302     
- Partials      5735      5739        +4     

@asmeurer
Copy link
Member

Would it be easier to get this merged first or #19825? CC @costrouc

@iammosespaulr
Copy link
Contributor Author

Would it be easier to get this merged first or #19825? CC @costrouc

This already works with the existing implementation.
As far as I see, the lark parser still has some distance to cover.
Once it has been implemented successfully I can help with porting the upgrades I've made here over to the lark parser.

@iammosespaulr iammosespaulr marked this pull request as ready for review August 28, 2020 17:15
@oscarbenjamin
Copy link
Contributor

This has merge conflicts now

@iammosespaulr
Copy link
Contributor Author

Ahaa @sylee957 merged the other PR that added \leq and stuff,

@sympy-bot
Copy link

sympy-bot commented Sep 1, 2020

🟠

Hi, I am the SymPy bot (v160). I've noticed that some of your commits add or delete files. Since this is sometimes done unintentionally, I wanted to alert you about it.

This is an experimental feature of SymPy Bot. If you have any feedback on it, please comment at sympy/sympy-bot#75.

The following commits delete files:

  • fad04fe:
    • sympy/parsing/latex/.antlr/LaTeX.interp
    • sympy/parsing/latex/.antlr/LaTeX.tokens
    • sympy/parsing/latex/.antlr/LaTeXLexer.interp
    • sympy/parsing/latex/.antlr/LaTeXLexer.py
    • sympy/parsing/latex/.antlr/LaTeXLexer.tokens
    • sympy/parsing/latex/.antlr/LaTeXParser.py

If these files were added/deleted on purpose, you can ignore this message.

@iammosespaulr
Copy link
Contributor Author

@oscarbenjamin I think this is good to go once the tests are done

@iammosespaulr
Copy link
Contributor Author

what's up with the codecov tests, they've been running for 22hrs now

("\\prod_{a = b}^{c} x", Product(x, (a, b, c))),
("\\prod_{a = b}^c x", Product(x, (a, b, c))),
("\\prod^{c}_{a = b} x", Product(x, (a, b, c))),
("\\prod^c_{a = b} x", Product(x, (a, b, c))),
("\\Pi^c_{a = b} x", Product(x, (a, b, c))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some discussion in another PR about this. I'm not sure if we want to parse \Pi as a product or \Sigma as a sum. What affect does that have on an expression that uses \Pi as a symbol?

@eric-wieser @asmeurer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the scope of the sympy latex parser, but my assumption would be that we only care about parsing best-practice latex, not playing guessing games for nonsense input.

With that frame of mind, \Pi should just be parsed as a symbol called Pi. I used that as a variable only yesterday in a latex document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using \Pi as a variable is inherently misleading.
Screen Shot 2020-09-02 at 2 27 16 PM
https://www.wikiwand.com/en/Multiplication#/Products_of_sequences

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but yeah we could remove that from this PR and add it once we resolve this issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to your credit @eric-wieser I toyed around with the idea of using using \Gamma to represent the gamma function in sympy and realised it wouldn't be worth messing with the symbolic usage.
what do you guys think about using \operatorname{\Gamma} and \operatorname{\Sigma} instead.
allowing them to be used as a variable \Gamma and a function ☝️ if needed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why anyone would write \operatorname{\Sigma} instead of \sum, while also wanting it to mean sum in sympy. The gamma case seems fine to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah \operatorname{\Sigma} is a bad example cause we already have \sum,
but the idea of using symbols as both functions and variables when used appropriately seems appealing . like the gamma case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LaTeX parser is inherently heuristic. If there is an ambiguity, it should ideally try to pick the case that is more often used. I think for \Pi and \Sigma, it should be possible to detect in the parser if it is being used to represent a product/sum or if they are symbols.

The fact is that a lot of people write bad LaTeX, and we should be able to parse it if we can. \Pi^c_{a = b} is bad because it does the wrong thing in display mode, but I guarantee people write it because they don't know about \prod.

Regarding variable names that represent functions, I think we need to support overriding the namespace. Even something like \pi could be a variable name in some contexts, but there's no way to tell the parser that. The best way would be to use something like parse_latex(..., {'pi': Symbol('pi')}).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact is that a lot of people write bad LaTeX, and we should be able to parse it if we can. \Pi^c_{a = b} is bad because it does the wrong thing in display mode, but I guarantee people write it because they don't know about \prod.

I agree!

Regarding variable names that represent functions, I think we need to support overriding the namespace. Even something like \pi could be a variable name in some contexts, but there's no way to tell the parser that. The best way would be to use something like parse_latex(..., {'pi': Symbol('pi')}).

yeah! we could have something like that, we could specify tokens and give an explicit directive to the parser to treat them differently

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the reasons I'm excited about moving to Lark is it should make it a lot easier to allow users to extend the parser themselves. I'm not sure if that's even possible with the current antlr parser, and if it is, it wouldn't be easy.

@oscarbenjamin
Copy link
Contributor

I don't know this part of the codebase so well but this looks good to me. The Pi/Sigma stuff has been removed.

Does anyone have any opinions on this or can it be merged?

@iammosespaulr
Copy link
Contributor Author

I don't know this part of the codebase so well but this looks good to me. The Pi/Sigma stuff has been removed.
Does anyone have any opinions on this or can it be merged?

well, there hasn't been any progress on the lark parser end afaik. so why not have an improved version of something that already works rather than wait for an overhaul that seems to be taking quite a while

@oscarbenjamin
Copy link
Contributor

Looks good to me

@oscarbenjamin oscarbenjamin merged commit 49ef62b into sympy:master Oct 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants