Fix inconsistent results of string split func on JIT mode #38772
💊 CI failures summary and remediations
As of commit a2b0b17 (more details on the Dr. CI page): ✅ None of the CI failures appear to be your fault 💚
❄️ 1 failure tentatively classified as flaky, but reruns have not yet been triggered to confirm.
Force-pushed from a5adf8b to f0c3c1e.
To handle the empty separator scenario the same way as the Python split function, the default value of the separator is changed from the
I am not sure defaulting it to an empty string is correct. In Python, an empty separator raises an error, and we should probably follow that behavior.
@ezyang Thanks for your kind reply. The current PR fixes the inconsistent behavior in JIT mode when the separator is not specified. In order to follow the same behavior as Python for empty separators, defaulting the separator parameter to

>>> test = 'a a'
>>> test.split()
['a', 'a']
>>> test.split('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: empty separator

I tried setting the default value of the separator to None via the new API.
However, the following error is thrown
Please kindly share suggestions on how to default the separator to None.
You'd have to make the string optional. I'm not sure this is supported by JIT. (@suo, I'm looking to you to find some JIT side to review this, if you won't review it yourself.)
It should work to make the argument optional. You do it by adding a
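For reference, the Python behavior being matched here can be sketched with a small wrapper whose separator is optional. The name `split_with_optional_sep` is hypothetical and used only for illustration; it is not part of PyTorch or of this PR, and it simply delegates to the built-in `str.split`.

```python
from typing import List, Optional


def split_with_optional_sep(text: str, sep: Optional[str] = None) -> List[str]:
    """Illustrative wrapper (hypothetical name) around the built-in str.split."""
    if sep is None:
        # No separator given: split on runs of whitespace.
        return text.split()
    if sep == "":
        # Mirror Python: an empty separator is an error, not a default.
        raise ValueError("empty separator")
    # Explicit separator: every occurrence splits, so empty strings can appear.
    return text.split(sep)


print(split_with_optional_sep(" a  b "))
print(split_with_optional_sep(" a  b ", " "))
```

The two calls differ only in whether a separator is passed, which is why a `None` default and an empty-string default cannot be treated as the same case.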
Force-pushed from 5623cc2 to 6a115de.
Thank you for the changes! lgtm
@suo has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@houseroad: how should we resolve the BC incompatibility errors?
This isn't BC incompatible, but it is forward incompatible, so you might run into issues there.
@RockingJavaBean to get the BC check to succeed, please add this schema to the whitelist here: https://github.com/pytorch/pytorch/blob/master/test/backward_compatibility/check_backward_compatibility.py#L19. Then we should be good to go. Thanks!
Force-pushed from ae84c84 to 0c19d3f.
@suo Thanks for sharing how to get the BC checks to succeed, and the
Signed-off-by: Xiong Wei <xiongw.fnst@cn.fujitsu.com>
This commit is to fix the following issue
```
{
  path: 'torch/csrc/jit/runtime/register_string_ops.cpp',
  start_line: 577,
  end_line: 577,
  start_column: 22,
  end_column: 22,
  annotation_level: 'failure',
  message: "[performance-unnecessary-value-param] warning: the parameter 'string' is copied for each invocation but only used as a const reference; consider making it a const reference"
}
```
Signed-off-by: Xiong Wei <xiongw.fnst@cn.fujitsu.com>

Resolve the conflict of check_backward_compatibility.py from pull request pytorch#39933
Signed-off-by: Xiong Wei <xiongw.fnst@cn.fujitsu.com>
Force-pushed from 0c19d3f to 3d30a4c.
@suo thank you so much for the help on landing this pull request.
Yep, the PR is ready to merge! Each time you update it, we need to sync it with our internal CI system and run tests. Then when they pass, we should be good to land. I'll try to land it tonight :)
Thank you so much for your help throughout this pull request.
…38772)

Summary:
Resolve pytorch#38207

Below is the description of split function according to [Python doc](https://docs.python.org/3.8/library/stdtypes.html?highlight=split#str.split).
```
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
```
The logic to handle both none and empty separators is added in register_string_ops.cpp as fix.

Signed-off-by: Xiong Wei <xiongw.fnst@cn.fujitsu.com>

Pull Request resolved: pytorch#38772

Differential Revision: D21789612

Pulled By: suo

fbshipit-source-id: 4dfd74eda71e0bfd757378daedc927a4a63ec0e4
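The whitespace rule quoted from the Python documentation can be sketched in plain Python as below. This is only an illustration of the algorithm, not the C++ code added to register_string_ops.cpp; the function name `split_on_whitespace` is made up for this example.

```python
from typing import List


def split_on_whitespace(text: str) -> List[str]:
    """Split on runs of whitespace, dropping empty strings (hypothetical helper)."""
    parts: List[str] = []
    current: List[str] = []
    for ch in text:
        if ch.isspace():
            # A whitespace character ends the current token, if there is one.
            if current:
                parts.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        # Flush the final token when the text does not end in whitespace.
        parts.append("".join(current))
    return parts


samples = ["a a", "  a   b  ", "", "   "]
for sample in samples:
    # For these ASCII inputs the sketch should agree with the built-in str.split().
    assert split_on_whitespace(sample) == sample.split()
```

Because consecutive whitespace never produces a token, leading and trailing whitespace yield no empty strings, which is the difference from splitting on an explicit `' '` separator.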