Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added subject_object_switch #159

Merged
merged 11 commits into from
Sep 16, 2021

Conversation

Sotwi-zz
Copy link
Contributor

Naively switches the position of the subject and object of an English sentence

},
"outputs": [
{
"sentence": "the French book has not returned Andrew to the library."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion, The first letter of the sentence should be in the capital. (the --> The). At the time of swapping we can check for the title() case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the sentence in the test. It made me think and I also added a basic check in the transformation, if the original sentence starts with a capital letter but the transformed one doesn't the first letter of the transformation is uppercased. This still leaves the problem of some initially uppercased letters ending up in the final transformed sentence, but I'm trying to keep the transformation as simple as I can. Thanks for your review.

### Sentence Pair Operation
| Transformation | paraphrase-xlm-r-multilingual-v1-MSRP | paraphrase-xlm-r-multilingual-v1-PAWS |
|:---------------------------------|:--------------------------------------|:--------------------------------------|
| PairSubjectObjectSwitch | 69.0->30.0 (-39.0) | 44.0->24.0 (-20.0) |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! This is an incredibly large difference!

@kaustubhdhole
Copy link
Collaborator

Okay! Here is my overall review: I think the PR's robustness numbers speak for themselves! I hope we have taken care of all aspects of code to not find a bug cause a 30% drop is quite impressive! :) Without looking at the actual subjects and objects, my best guess would be that the changing of subject and object might have had sometimes been an interchange of a noun and pronoun.. it would be interesting to hear from @sotwi if he has looked at some examples randomly to see why this performance gap exists!
The robustness evaluation is conducted.
(If the test cases don't work on try testing this command locally pytest -s --t=light --f=light . If they still don't, feel free to add a blank test.json and add them to the README.)

All changes look fine to me :)

@Sotwi-zz
Copy link
Contributor Author

Sotwi-zz commented Sep 16, 2021

Hello @kaustubhdhole and thank you for your review and your work.

Without looking at the actual subjects and objects, my best guess would be that the changing of subject and object might have had sometimes been an interchange of a noun and pronoun.. it would be interesting to hear from @sotwi if he has looked at some examples randomly to see why this performance gap exists!

I haven't looked at the specific data in this case, but if I have some time I will definitely give it a look. That being said, I've been working on paraphrase evaluation for the last couple of months and a problem we encountered on models for semantic similarity was that similar inputs with slight changes that affected the meaning (like this one) would obtain high semantic similarity all the same. The PAWS dataset is an attempt to solve that by presenting very similar sentences with high semantic similarity variation. Hopefully this transformation can help produce more of those samples and impact semantic similarity models in the future.

@kaustubhdhole kaustubhdhole merged commit 5d9ccb5 into GEM-benchmark:main Sep 16, 2021
@Mayukhga83
Copy link

Mayukhga83 commented Sep 19, 2021

Hi @sotwi , I have been assigned a reviewer to this PR. I would brief my findings

Correctness: merged
Interface: I think you have chosen the correct interface(s).
Applicable Tasks & Keywords: relevant
Specificity: sentence with subject and object
Novelty: new
Adding New Libraries: No
Description: The README is clear and explains what the transformation is aiming to do.
Data and code source: Data and code provided
Test Cases: well documented
Evaluating Robustness: good performance
Languages other than English: No

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants