Add support for number_to_word test #352

Closed
alytarik opened this issue Apr 27, 2023 · 2 comments · Fixed by #377 or #410
Labels
⭐ Feature Indicates new feature requests 🔔 Good First Issue Good first issue for new contributors


@alytarik
Contributor

Integrate the following module in nlptest for NER and text classification: https://github.com/GEM-benchmark/NL-Augmenter/tree/main/nlaugmenter/transformations/number-to-word

It will fall under the Robustness category

Watch out for changes in span start and end indexes when swapping words: the replacement text usually has a different length than the number it replaces.
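
To illustrate the span concern, here is a minimal sketch of realigning character spans after a replacement. The function name `replace_and_shift` and the `(start, end)` span format are hypothetical and not part of nlptest; the sketch also assumes replacements do not overlap each other or straddle a span boundary.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive
Replacement = Tuple[int, int, str]  # (start, end, new_text) in original offsets


def replace_and_shift(
    text: str, replacements: List[Replacement], spans: List[Span]
) -> Tuple[str, List[Span]]:
    """Apply replacements to text and move entity spans accordingly."""
    ordered = sorted(replacements)

    # Rebuild the text piece by piece, left to right.
    pieces, cursor = [], 0
    for start, end, new in ordered:
        pieces.append(text[cursor:start])
        pieces.append(new)
        cursor = end
    pieces.append(text[cursor:])
    new_text = "".join(pieces)

    def shift(offset: int) -> int:
        # Every replacement that ends at or before this offset moves it
        # by the difference in length between new and old text.
        return offset + sum(
            len(new) - (end - start)
            for start, end, new in ordered
            if end <= offset
        )

    return new_text, [(shift(s), shift(e)) for s, e in spans]


text = "Virat Kohli hits 150 in world cup."
num_start = text.index("150")
cup_start = text.index("world cup")
spans = [(0, 11), (cup_start, cup_start + len("world cup"))]
new_text, new_spans = replace_and_shift(
    text, [(num_start, num_start + 3, "one hundred and fifty")], spans
)
# The shifted spans should still cover the same entity text.
print([new_text[s:e] for s, e in new_spans])
```

A span that contains the replaced number keeps its start and only its end moves, which is the case that matters when the number itself is part of an entity.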

@alytarik alytarik added 🔔 Good First Issue Good first issue for new contributors ⭐ Feature Indicates new feature requests labels Apr 27, 2023
@RakshitKhajuria
Contributor

The module referenced in the issue (https://github.com/GEM-benchmark/NL-Augmenter/tree/main/nlaugmenter/transformations/number-to-word) uses the inflect library, which generates English words and phrases from numerical input, but it does not work with all the test cases.

In text classification, for example, it works with these inputs:
Input : Virat Kohli hits 150 in world cup.
Expected Output : Virat Kohli hits one hundred and fifty in world cup.
Output : Virat Kohli hits one hundred and fifty in world cup.

Input : My brother is 12 years old
Expected Output : My brother is twelve years old
Output : My brother is twelve years old

It fails on these inputs:
Input : "The price of the product is $10"
Expected Output : "The price of the product is ten dollars"
Output : The price of the product is $10
(It gives the expected output when there is a space between $ and 10.)

Input : "The price of the product is $10.99"
Expected Output : "The price of the product is ten dollars and ninety nine cents"
Output : The price of the product is $10.99

This approach with the inflect library can fail in some cases, for example when the input contains decimal numbers or when there is no space between a number and the adjacent text.
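
One plausible explanation for these failures, sketched below, is that numbers are detected token by token after splitting on whitespace. This is an assumption for illustration and has not been checked against the NL-Augmenter code. The helper names are hypothetical. A regular expression that scans the raw text finds the numbers regardless of adjacent characters.

```python
import re
from typing import List, Tuple

# Integer or decimal number, found anywhere in the string.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def numeric_tokens_by_whitespace(text: str) -> List[str]:
    """Assumed detection strategy: a token counts only if it is all digits."""
    return [tok for tok in text.split() if tok.isdigit()]


def numeric_matches_by_regex(text: str) -> List[Tuple[int, int, str]]:
    """Alternative: return (start, end, number) for each number in the text."""
    return [(m.start(), m.end(), m.group()) for m in NUMBER_RE.finditer(text)]


for sentence in (
    "My brother is 12 years old",
    "The price of the product is $10",
    "The price of the product is $ 10",
    "The price of the product is $10.99",
):
    print(sentence)
    print("  whitespace tokens:", numeric_tokens_by_whitespace(sentence))
    print("  regex matches:    ", numeric_matches_by_regex(sentence))
```

The regex variant also returns character offsets, which are needed later to keep NER spans aligned.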

@luca-martial
Contributor

Thanks for the detailed comment @Ryzxxl - @alytarik can you check to see if a custom implementation of inflect is worth it (effort vs impact)? If we can get rid of that dependency and transform those edge cases (numbers with special characters) then it will be worth implementing it ourselves.
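
As a rough gauge of the effort involved, a dependency-free version could look like the sketch below. It is only an illustration under stated assumptions, not the implementation that was merged: the names `int_to_words` and `number_to_word` are hypothetical, it covers integers below one million and dollar amounts with two-digit cents, it writes compound numbers with a space ("ninety nine") to match the expected output quoted above, and it leaves decimals without a currency symbol unchanged.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Branch 1: "$10" or "$10.99". Branch 2: a standalone integer such as "150".
NUMBER_RE = re.compile(
    r"\$(\d{1,6})(?:\.(\d{2}))?(?!\d)|(?<![\d.])\b(\d{1,6})\b(?!\.\d)"
)


def int_to_words(n: int) -> str:
    """Spell out 0 <= n < 1_000_000 in the 'one hundred and fifty' style."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only integers from 0 to 999,999 are supported")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + (" " + ONES[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " and " + int_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    words = int_to_words(thousands) + " thousand"
    if rest:
        words += (" and " if rest < 100 else " ") + int_to_words(rest)
    return words


def _spell(match: "re.Match[str]") -> str:
    dollars, cents, plain = match.group(1), match.group(2), match.group(3)
    if plain is not None:
        return int_to_words(int(plain))
    words = int_to_words(int(dollars))
    words += " dollar" if int(dollars) == 1 else " dollars"
    if cents is not None and int(cents):
        words += " and " + int_to_words(int(cents))
        words += " cent" if int(cents) == 1 else " cents"
    return words


def number_to_word(text: str) -> str:
    """Replace supported numbers and dollar amounts with English words."""
    return NUMBER_RE.sub(_spell, text)


print(number_to_word("Virat Kohli hits 150 in world cup."))
print(number_to_word("The price of the product is $10.99"))
```

Because the matching is done on the raw string, the `$10` and `$10.99` cases from the previous comment are handled without requiring a space after the symbol. Other currencies, ordinals, and larger numbers would each need further rules, which is the effort side of the trade-off.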

For now, @Ryzxxl, you can ignore this issue since it does not really affect the context in which we run tests.

@luca-martial luca-martial linked a pull request May 4, 2023 that will close this issue
4 tasks
@luca-martial luca-martial linked a pull request May 12, 2023 that will close this issue
4 tasks