[FEA] Deprecate `ser.str.subword_tokenize` API in favor of `cudf.core.subword_tokenizer.SubwordTokenizer` #8604
Comments
This PR adds a deprecation warning to `ser.str.subword_tokenize`. Check out the related issue for details: #8604
Authors:
- Vibhu Jawa (https://github.com/VibhuJawa)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #8603
@VibhuJawa can we close this issue after #8603?
Nope, we still need to deprecate it. I just added that warning last release so that users have a two-release window to fix it. Will deprecate next release.
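For readers unfamiliar with the warn-first, remove-later pattern described in the comment above, here is a minimal sketch using only the Python standard library. The names `subword_tokenize_sketch` and `SubwordTokenizerSketch` are hypothetical stand-ins, not cuDF's actual code, and the choice of `FutureWarning` is an assumption (the thread does not say which warning category cuDF used); it is picked here because Python shows it to end users by default.

```python
import warnings


class SubwordTokenizerSketch:
    """Hypothetical stand-in for the new class-based API (not cuDF's implementation)."""

    def __call__(self, texts):
        # Plain whitespace splitting; real subword tokenization is far more involved.
        return [text.split() for text in texts]


def subword_tokenize_sketch(texts):
    """Hypothetical stand-in for the old function-style API."""
    # Warn during the transition window, then delegate to the replacement
    # so existing callers keep working until the old entry point is removed.
    warnings.warn(
        "subword_tokenize_sketch is deprecated and will be removed in a "
        "future release; use SubwordTokenizerSketch instead.",
        FutureWarning,  # assumed category, see the note above
        stacklevel=2,   # attribute the warning to the caller's line
    )
    return SubwordTokenizerSketch()(texts)


# Record warnings so the deprecation can be checked programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tokens = subword_tokenize_sketch(["hello world"])
```

Users who want to find remaining calls to a deprecated API during the window can run their test suite with `python -W error::FutureWarning`, which turns the warning into an exception.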
This issue has been labeled |
This PR resolves #8604 and #9447
Authors:
- Vibhu Jawa (https://github.com/VibhuJawa)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- David Wendt (https://github.com/davidwendt)
- Christopher Harris (https://github.com/cwharris)
URL: #9968
Is your feature request related to a problem? Please describe.
Now that we have added `cudf.core.subword_tokenizer.SubwordTokenizer` with PR, we should deprecate the `ser.str.subword_tokenize` API in favor of it in a future release.

Reasons to switch to this API are:
- […] HuggingFace, so it makes switching easier
- […] `stride` etc., which can cause difficult-to-find bugs.

Additional context
Have added a PR to provide warnings for users.
Also have added a PR to clx to use the new tokenizer (rapidsai/clx#430).