
metrics: add BLEU #2535

Merged
merged 29 commits into Lightning-AI:master from metrics/add-bleu on Jul 22, 2020

Conversation

@ydcjeff (Contributor) commented Jul 7, 2020

What does this PR do?

Fixes #1301 (issue)

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks commented Jul 7, 2020

Hello @ydcjeff! Thanks for updating this PR.

Line 27:49: E203 whitespace before ':'

Line 22:98: E231 missing whitespace after ','

Comment last updated at 2020-07-20 15:03:11 UTC
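For context, hypothetical snippets showing what the two reported flake8 codes flag and how they are typically fixed; the actual offending lines are not shown in this thread, so the variables here are made up:

ngram_counts = [5, 3, 1, 0]         # hypothetical data
counts = ngram_counts[0 : 4]        # E203: whitespace before ':'
counts = ngram_counts[0:4]          # fixed

weights = (0.25,0.25,0.25,0.25)     # E231: missing whitespace after ','
weights = (0.25, 0.25, 0.25, 0.25)  # fixed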

@ydcjeff (Contributor, Author) commented Jul 7, 2020

Hello @justusschock,
I have opened the PR. Also, where should I put my code under the metrics package? I wrote it in a separate file.

@codecov (bot) commented Jul 7, 2020

Codecov Report

Merging #2535 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #2535   +/-   ##
======================================
  Coverage      91%     91%           
======================================
  Files          70      72    +2     
  Lines        5778    5831   +53     
======================================
+ Hits         5270    5323   +53     
  Misses        508     508           

@Borda added the feature label (Is an improvement or enhancement) Jul 7, 2020
@Borda added this to 'in Progress' in the Metrics package project via automation Jul 7, 2020
@Borda added this to the 0.8.x milestone Jul 7, 2020
@SkafteNicki (Member) commented:

Since BLEU is a metric specific to NLP, could you move your code to a file called nlp.py? I guess down the line we will get more field-specific metrics.
Additionally,

@Borda (Member) left a comment:

is there another standard implementation so we can compare our results with theirs in tests?

@justusschock (Member) left a comment:

I requested some changes. Basically, all of your math should use torch operations instead of math package ops.

Your current implementation should go under metrics/functional/sequence.py.

Once we have finished iterating on the functional interface, we also need to add a module interface.

@williamFalcon is there a way to directly calculate these on tensors? If you have to convert them back to strings first, we always have a GPU sync, which we want to avoid.
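To illustrate the kind of change requested, a minimal sketch of the BLEU combination step written with torch ops instead of the math module; the function and variable names here are hypothetical, not the PR's actual code:

import torch

def combine_bleu(precisions: torch.Tensor, c: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    # precisions: modified n-gram precisions p_1..p_N, shape (N,)
    # c: hypothesis length, r: effective reference length (0-dim float tensors)
    # Geometric mean via log/exp, computed on-device (no GPU sync).
    geo_mean = torch.exp(torch.mean(torch.log(precisions)))
    # Brevity penalty: 1 if c > r, else exp(1 - r / c).
    bp = torch.where(c > r, torch.ones_like(geo_mean), torch.exp(1.0 - r / c))
    # Note: a zero precision sends log() to -inf and the score to 0;
    # the smoothing discussed later in the thread guards against this.
    return bp * geo_mean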

return bleu


# t = "the FAST brown fox jumped over the lazy dog"
A Member commented:

Can you remove these lines?

pytorch_lightning/metrics/bleu.py: 4 review comments (outdated, resolved)
Metrics package automation moved this from in Progress to in Review Jul 7, 2020
@ydcjeff (Contributor, Author) commented Jul 7, 2020

I have refactored with torch.Tensor, added a smooth argument, and tested against nltk.

I also added nltk to test.txt for testing.
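For reference, a sketch of what an add-one smooth option commonly does; whether the PR uses exactly this scheme is not shown in the thread, and the names here are illustrative:

import torch

def modified_precision(numerator: torch.Tensor, denominator: torch.Tensor, smooth: bool = False) -> torch.Tensor:
    # numerator: clipped n-gram matches per order, shape (N,)
    # denominator: total hypothesis n-grams per order, shape (N,)
    if smooth:
        # Add-one smoothing keeps higher-order precisions non-zero, so the
        # geometric mean does not collapse to 0 on short hypotheses.
        return (numerator + 1.0) / (denominator + 1.0)
    return numerator / denominator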

requirements/test.txt: review comment (outdated, resolved)
def test_with_sentence_bleu():
nltk_output = sentence_bleu([reference1, reference2, reference3], hypothesis1, weights=(1, 0, 0, 0))
pl_output = bleu_score([hypothesis1], [[reference1, reference2, reference3]], n=1).item()
assert round(pl_output, 4) == round(nltk_output, 4)
@Borda (Member) commented Jul 7, 2020:

Rather use torch.allclose(...).
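Applied to the test above, the assertion could read as follows (a sketch reusing the names from that snippet; sentence_bleu comes from nltk.translate.bleu_score, and the tolerance mirrors the original 4-decimal rounding):

import torch
from nltk.translate.bleu_score import sentence_bleu

nltk_output = sentence_bleu([reference1, reference2, reference3], hypothesis1, weights=(1, 0, 0, 0))
pl_output = bleu_score([hypothesis1], [[reference1, reference2, reference3]], n=1)
assert torch.allclose(pl_output, torch.tensor(nltk_output, dtype=pl_output.dtype), atol=1e-4)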

@Borda (Member) left a comment:

Please see my comments, but we are on the right track...

@Borda (Member) commented Jul 16, 2020

This CircleCI test failed because only base.txt and test.txt are installed, while torchtext is in extra.txt.

I thought we were renaming this sequence module to nlp? @justusschock
You shall add this module to the ignore list for the Conda and CircleCI tests :]

@ydcjeff (Contributor, Author) commented Jul 16, 2020

"I thought we were renaming this sequence module to nlp? @justusschock"

So, nlp or sequence?

@ydcjeff (Contributor, Author) commented Jul 16, 2020

"You shall add this module to the ignore list for the Conda and CircleCI tests :]"

How shall I add it?

@williamFalcon (Contributor) commented:

This should be:

from p...metrics.nlp import Bleu

@ydcjeff (Contributor, Author) commented Jul 17, 2020

"This should be: from p...metrics.nlp import Bleu"

Okay

@ydcjeff (Contributor, Author) commented Jul 17, 2020

I guess this is ready to review/go. @williamFalcon @Borda

@williamFalcon (Contributor) commented:

Ummm, so this is a wrapper on torchtext? I think we need our own implementation.

@ydcjeff (Contributor, Author) commented Jul 17, 2020

Ahhh, I had implemented it from scratch before and referenced the torchtext implementation a little.

Then @justusschock suggested basing it on torchtext, since it is in the PyTorch ecosystem.

So I refactored it to use torchtext.
@williamFalcon

@justusschock (Member) commented:

@williamFalcon: @Borda and I agreed that we don't need to duplicate this if it is already present within torchtext, since basically everyone who will use this will also have torchtext installed, and it was already an optional dependency.

@williamFalcon (Contributor) commented Jul 17, 2020

I guess the point of metrics here is to centralize all metrics; otherwise we could have said the same about sklearn.

We want our metrics package to be the reference implementation for any metric.

So, I would say: implement it here from scratch and test against torchtext for performance? The reason is that we want to give our community the flexibility to modify it as best practices change, and I know BLEU is one of those hotly debated metrics in terms of implementation details.
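A hedged sketch of such a parity test against torchtext (its bleu_score lives in torchtext.data.metrics); the corpora below are made up, and the PR's import path is taken from the file names reviewed later in this thread:

from torchtext.data.metrics import bleu_score as tt_bleu_score
from pytorch_lightning.metrics.functional.nlp import bleu_score

# Hypothetical tokenized corpora: one hypothesis, two references.
candidate = [["the", "cat", "is", "on", "the", "mat"]]
references = [[["a", "cat", "is", "on", "the", "mat"],
               ["there", "is", "a", "cat", "on", "the", "mat"]]]

ours = bleu_score(candidate, references)       # 0-dim tensor
theirs = tt_bleu_score(candidate, references)  # Python float
assert abs(ours.item() - theirs) < 1e-4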

@justusschock (Member) commented:

With sklearn it's more about GPU performance/syncs :)

But I see your point. Then I'm sorry @ydcjeff :D But your code should still be available here :) So just copy/paste it in :D

@ydcjeff (Contributor, Author) commented Jul 17, 2020

@williamFalcon I have re-implemented it from scratch; anything you'd like to add?
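For context, a hedged usage sketch of the functional and module interfaces under discussion; the class name BLEUScore and the keyword names are assumptions inferred from the file paths below, not verbatim from the merged code:

from pytorch_lightning.metrics.functional.nlp import bleu_score
from pytorch_lightning.metrics.nlp import BLEUScore  # assumed class name

translate_corpus = ["the cat is on the mat".split()]
reference_corpus = [["there is a cat on the mat".split(),
                     "a cat is on the mat".split()]]

# Functional interface: a single call returning a scalar tensor.
score = bleu_score(translate_corpus, reference_corpus)

# Module interface: configure once, then call like a function.
metric = BLEUScore(n_gram=4, smooth=False)  # assumed signature
score = metric(translate_corpus, reference_corpus)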

pytorch_lightning/metrics/functional/nlp.py: review comment (outdated, resolved)
.circleci/config.yml: review comment (outdated, resolved)
@ydcjeff (Contributor, Author) commented Jul 20, 2020

It's ready to be reviewed

@justusschock (Member) left a comment:

LGTM.

Just some minor comments (mainly on typing)

pytorch_lightning/metrics/functional/nlp.py: 3 review comments (outdated, resolved)
pytorch_lightning/metrics/nlp.py: 1 review comment (outdated, resolved)
@williamFalcon williamFalcon merged commit 0a65826 into Lightning-AI:master Jul 22, 2020
Metrics package automation moved this from in Review to Done Jul 22, 2020
@williamFalcon (Contributor) commented:

@ydcjeff awesome!!

@ydcjeff ydcjeff deleted the metrics/add-bleu branch July 22, 2020 14:00
Labels: feature (Is an improvement or enhancement)
Projects: no open projects
Development: successfully merging this pull request may close issue [Metrics] Bleu (#1301)
7 participants