-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added subject_object_switch #159
Added subject_object_switch #159
Conversation
}, | ||
"outputs": [ | ||
{ | ||
"sentence": "the French book has not returned Andrew to the library." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion, The first letter of the sentence should be in the capital. (the --> The). At the time of swapping we can check for the title() case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the sentence in the test. It made me think and I also added a basic check in the transformation, if the original sentence starts with a capital letter but the transformed one doesn't the first letter of the transformation is uppercased. This still leaves the problem of some initially uppercased letters ending up in the final transformed sentence, but I'm trying to keep the transformation as simple as I can. Thanks for your review.
### Sentence Pair Operation | ||
| Transformation | paraphrase-xlm-r-multilingual-v1-MSRP | paraphrase-xlm-r-multilingual-v1-PAWS | | ||
|:---------------------------------|:--------------------------------------|:--------------------------------------| | ||
| PairSubjectObjectSwitch | 69.0->30.0 (-39.0) | 44.0->24.0 (-20.0) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow! This is an incredibly large difference!
Okay! Here is my overall review: I think the PR's robustness numbers speak for themselves! I hope we have taken care of all aspects of code to not find a bug cause a 30% drop is quite impressive! :) Without looking at the actual subjects and objects, my best guess would be that the changing of subject and object might have had sometimes been an interchange of a noun and pronoun.. it would be interesting to hear from @sotwi if he has looked at some examples randomly to see why this performance gap exists! All changes look fine to me :) |
Hello @kaustubhdhole and thank you for your review and your work.
I haven't looked at the specific data in this case, but if I have some time I will definitely give it a look. That being said, I've been working on paraphrase evaluation for the last couple of months and a problem we encountered on models for semantic similarity was that similar inputs with slight changes that affected the meaning (like this one) would obtain high semantic similarity all the same. The PAWS dataset is an attempt to solve that by presenting very similar sentences with high semantic similarity variation. Hopefully this transformation can help produce more of those samples and impact semantic similarity models in the future. |
Hi @sotwi , I have been assigned a reviewer to this PR. I would brief my findings Correctness: merged |
Naively switches the position of the subject and object of an English sentence