[FEATURE] modify is_sif,to_sif,sif4sci #79
Codecov Report
@@            Coverage Diff            @@
##               dev       #79   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files           46        46
  Lines         1365      1371    +6
=========================================
+ Hits          1365      1371    +6
It seems that this PR is related to issue #37?
The examples need to be updated if the code changes are approved.
EduNLP/SIF/sif.py
Outdated
    return sif_item


def sif4sci(item: str, figures: (dict, bool) = None, safe=True, symbol: str = None, tokenization=True,
When you specify some modes such as ast in tokenization_params, is it possible to set check_formula to False?
It is OK, but tokenization_params mainly takes effect during 'tokenize', whereas check_formula is used in is_sif, before 'tokenize'. It may be better to keep them separate so that the purpose of each is clear.
Or we can fold it into 'safe':
when safe = 0, don't use is_sif and don't check anything
when safe = 1, use is_sif but don't check formulas
when safe = 2, use is_sif and check formulas
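The three levels proposed in this comment could be dispatched roughly as follows. This is a minimal sketch, not the EduNLP implementation: `SafePlan` and `resolve_safe_level` are hypothetical names introduced for illustration, and only the meanings of 0, 1 and 2 come from the comment.

```python
from typing import NamedTuple


class SafePlan(NamedTuple):
    """Which checks to run for a given safe level (hypothetical helper type)."""
    run_is_sif: bool
    check_formula: bool


def resolve_safe_level(safe: int) -> SafePlan:
    """Map the proposed integer `safe` level to the checks to perform."""
    # bool is a subclass of int, so True/False are accepted as 1/0.
    level = int(safe)
    if level not in (0, 1, 2):
        raise ValueError(f"safe must be 0, 1 or 2, got {safe!r}")
    # Level 1 and above run is_sif; only level 2 also checks formulas.
    return SafePlan(run_is_sif=level >= 1, check_formula=level == 2)
```

Under this particular mapping a caller passing `safe=True` would land on level 1 (is_sif without the formula check), which is one reason an integer-valued parameter may deserve a name that does not read like a boolean.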
safe looks like a boolean variable; maybe safe_mode is a better choice?
Maybe we can also set a wise safe_mode?
EduNLP/SIF/sif.py
Outdated
- return True
- return False
+ return True, item
+ return False, item_parser.text
Why return text?
EduNLP/SIF/sif.py
Outdated
- def is_sif(item):
+ def is_sif(item, check_formula=True):
is_sif should only return True or False. If the item needs to be modified to avoid duplicate operations in later steps, an extra argument should be added for that.
EduNLP/SIF/sif.py
Outdated
    True if check the validity of formulas in item
    False if not check the validity of formulas in item, which is faster
cache: bool
cache is not intuitive; maybe return_parser is a better choice.
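One way to reconcile the review comments (a plain boolean result by default, with the parser handed back only on request) is sketched here. Everything in it is an assumption made for illustration: `ToyParser` is a stand-in for the real Parser class, its "balanced `$`" rule is a toy validity check, and `is_sif_sketch` is not the function under review.

```python
class ToyParser:
    """Hypothetical stand-in for the real Parser; not EduNLP code."""

    def __init__(self, item: str, check_formula: bool = True):
        self.text = item
        self.check_formula = check_formula
        self.error_flag = 0

    def parse(self) -> None:
        # Toy rule: formulas count as valid when the $ delimiters are balanced.
        if self.check_formula and self.text.count("$") % 2 != 0:
            self.error_flag = 1


def is_sif_sketch(item: str, check_formula: bool = True, return_parser: bool = False):
    """Return a bool by default; return (bool, parser) only when asked."""
    parser = ToyParser(item, check_formula)
    parser.parse()
    ok = parser.error_flag == 0
    if return_parser:
        return ok, parser
    return ok
```

Callers that only need the yes/no answer keep the simple call, while a caller that wants to avoid parsing twice can opt in:

```python
ok, parser = is_sif_sketch("$x$ is given", return_parser=True)
```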
EduNLP/SIF/sif.py
Outdated
figures:
    when it is a dict, it means the id-to-instance information for figures in 'FormFigureID{...}' format,
    when it is a bool, it means whether to instantiate figures in 'FormFigureBase64{...}' format
safe_mode: int
Maybe mode is better than safe_mode.
EduNLP/SIF/sif.py
Outdated
- def to_sif(item):
+ def to_sif(item, check_formula=True, cache_parser: Parser = None):
cache_parser -> parser
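The point of passing a parser into to_sif is presumably to skip a second parse when is_sif has already done the work. A minimal sketch of that idea follows, under stated assumptions: `CountingParser` is a hypothetical stand-in that merely collapses whitespace and counts parse calls, and `to_sif_sketch` is not the function in the diff.

```python
from typing import Optional


class CountingParser:
    """Hypothetical stand-in for Parser that counts how often parsing runs."""

    parse_calls = 0

    def __init__(self, item: str):
        self.text = item

    def parse(self) -> None:
        CountingParser.parse_calls += 1
        # Toy normalisation: collapse runs of whitespace into single spaces.
        self.text = " ".join(self.text.split())


def to_sif_sketch(item: str, parser: Optional[CountingParser] = None) -> str:
    """Return the normalised text, reusing `parser` when one is supplied."""
    if parser is None:
        # No parser was handed in, so parse the item here.
        parser = CountingParser(item)
        parser.parse()
    return parser.text
```

With an already-parsed instance passed as `parser`, `CountingParser.parse_calls` does not increase, which is the saving the argument is meant to provide.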
Thanks for sending a pull request!
Please make sure you click the link above to view the contribution guidelines,
then fill out the blanks below.
Description
What does this implement/fix? Explain your changes.
Pull request type
Changes
EduNLP/SIF/parser/parser.py
EduNLP/SIF/sif.py
tests/test_sif/test_sif.py
Does this close any currently open issues?
Y, #37 ([FEATURE] Add an option for checking formulas in is_sif)
Any relevant logs, error output, etc?
N
Checklist
Before you submit a pull request, please make sure you have the following:
Essentials
Comments