QQP test evaluation is extremely slow #209

Closed
sleepinyourhat opened this issue Jul 20, 2018 · 19 comments
Labels
jiant-v1-legacy Relevant to versions <= v1.3.2 wontfix This will not be worked on

Comments

@sleepinyourhat
Contributor

One run has been on QQP Test for 2.5 hours without signs of progress. GPU usage is non-zero but low. This seems to have changed since #121.

@Jan21 @iftenney, any guesses? Did you verify that QQP test works?

Dev also appears to be quite slow, but I don't have numbers yet.

@sleepinyourhat
Contributor Author

(This is why we couldn't do the test set evaluation this week.)

@Jan21
Contributor

Jan21 commented Jul 20, 2018

investigating

@sleepinyourhat
Contributor Author

QQP test is quite big, so that's got to be most of it. A 100% slowdown in evaluation is okay, but would be pretty conspicuous here.

See also: #145

@sleepinyourhat
Contributor Author

sleepinyourhat commented Jul 20, 2018

I can confirm that QQP test eval works, though, so it may be safe to ignore.

@sleepinyourhat sleepinyourhat added the low-priority Only if you're bored. Ask Sam/Ian/Alex before starting. label Jul 20, 2018
@Jan21
Contributor

Jan21 commented Jul 20, 2018

ok

@W4ngatang
Collaborator

W4ngatang commented Jul 23, 2018

Wow this is excruciatingly slow, 1hr+ for me on P100.

@W4ngatang W4ngatang mentioned this issue Jul 24, 2018
@sleepinyourhat
Contributor Author

This is pretty bad—it seems to be a problem even on dev, and it's almost certainly a result of #121.
@W4ngatang - do you have any bandwidth to see if there's an easy fix? We can get by without one, but Jan, Ian, and I are all booked for today/tomorrow.

@W4ngatang
Collaborator

Just occurred to me: we should use a smarter batcher / iterator during eval (and validation). IIRC we got pretty decent speed ups.
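As a rough illustration of what a "smarter batcher" could mean here, the following is a minimal sketch of length-bucketed batching in plain Python. It is not jiant's or AllenNLP's actual iterator; `bucket_batches`, `padded_tokens`, and the toy data are hypothetical names introduced only for this example.

```python
from typing import List, Sequence


def bucket_batches(examples: Sequence[Sequence[str]], batch_size: int) -> List[List[int]]:
    """Group example indices into batches of similar length (hypothetical helper)."""
    # Sort indices by token count so that neighbours have similar lengths.
    order = sorted(range(len(examples)), key=lambda i: len(examples[i]))
    # Slice the sorted order into consecutive batches of indices.
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def padded_tokens(examples: Sequence[Sequence[str]], batches: List[List[int]]) -> int:
    """Total token slots after padding each batch to its longest example."""
    return sum(len(b) * max(len(examples[i]) for i in b) for b in batches)


# Toy data (an assumption): tokenized sentences of mixed length.
toy = [["tok"] * n for n in (3, 40, 5, 38, 4, 41)]
naive = [[0, 1], [2, 3], [4, 5]]           # batches in dataset order
bucketed = bucket_batches(toy, batch_size=2)

print(padded_tokens(toy, naive), padded_tokens(toy, bucketed))
```

Because the batches hold indices rather than the examples themselves, predictions can be written back in the original dataset order, which matters when producing a test-set submission file.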

@sleepinyourhat
Contributor Author

sleepinyourhat commented Jul 25, 2018 via email

@W4ngatang
Collaborator

I think it would not be that bad; we already use it during training. Wouldn't break anything.

@W4ngatang
Collaborator

Smart batching doesn't seem to help because batch utilization was already pretty high, I guess because the QQP examples are pretty similar in length.

What we could do is jack up the batch size during evaluation only?
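To make "batch utilization" concrete, here is a small sketch that measures it as the fraction of padded token slots holding real tokens. That definition, the `batch_utilization` helper, and the toy lengths are assumptions for illustration, not taken from the codebase. When lengths are already similar, sorting by length changes the number only slightly, which matches the observation above.

```python
from typing import Sequence


def batch_utilization(lengths: Sequence[int], batch_size: int) -> float:
    """Fraction of padded slots that hold real tokens, batching in the given order."""
    real = padded = 0
    for start in range(0, len(lengths), batch_size):
        chunk = lengths[start:start + batch_size]
        real += sum(chunk)                  # tokens that carry data
        padded += len(chunk) * max(chunk)   # slots after padding to the longest example
    return real / padded


# Toy sentence lengths (an assumption) that are already close to one another.
lengths = [10, 11, 12, 10, 11, 12, 9, 10]
unsorted_util = batch_utilization(lengths, batch_size=4)
sorted_util = batch_utilization(sorted(lengths), batch_size=4)

print(f"unsorted: {unsorted_util:.3f}, sorted: {sorted_util:.3f}")
```

With little padding left to remove, the remaining lever is the one suggested above: a larger batch size at evaluation time, where no gradients or optimizer state need to fit in GPU memory.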

@sleepinyourhat
Contributor Author

sleepinyourhat commented Jul 26, 2018 via email

@W4ngatang
Collaborator

W4ngatang commented Jul 26, 2018

What GPUs were you running on previously?

For some reason I'm getting relatively fast eval times (~10m) for QQP test on 1080s...
Running an untrained encoder (2 layers, 1024d, attn, batch size 128)

@sleepinyourhat
Contributor Author

sleepinyourhat commented Jul 26, 2018 via email

@Jan21
Contributor

Jan21 commented Jul 26, 2018

Sorting is not cached... I was also waiting 2.5 hours on a P100. I'm trying to debug it on CPU.

@sleepinyourhat
Contributor Author

sleepinyourhat commented Jul 26, 2018 via email

@W4ngatang
Collaborator

W4ngatang commented Jul 26, 2018

Yes, on dev it's also very fast (~1m) and test is ~12m

Profiling now.

Maybe it's due to having an untrained encoder? But I'm running the same script on the P100 and it's much slower (though less than an hour).

@pitrack
Contributor

pitrack commented Jul 26, 2018

Is this really slow? It seems to go at a rate of ~60 batches per 30 seconds, which is the same rate as QNLI or MNLI.

QQP is just a big dataset: test is ~390K examples, while the rest of the test sets are ~20K or less; val is 40K, and the rest of the val sets are also <20K.
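A back-of-the-envelope check of those numbers, as a sketch: the batch size of 128 is borrowed from the run described earlier in the thread and is an assumption for this rate, and `eval_minutes` is a hypothetical helper.

```python
import math


def eval_minutes(num_examples: int, batch_size: int, batches_per_second: float) -> float:
    """Estimated wall-clock minutes for one pass over an evaluation set."""
    num_batches = math.ceil(num_examples / batch_size)
    return num_batches / batches_per_second / 60


rate = 60 / 30  # ~60 batches per 30 seconds, i.e. 2 batches per second

test_minutes = eval_minutes(390_000, 128, rate)  # QQP test, ~390K examples
val_minutes = eval_minutes(40_000, 128, rate)    # QQP val, 40K examples

print(f"test: ~{test_minutes:.0f} min, val: ~{val_minutes:.1f} min")
```

Under these assumptions the test set takes on the order of tens of minutes at a normal per-batch rate. A smaller batch size or a slower encoder scales that estimate up proportionally, so long runtimes can come from dataset size alone.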

The logging helps a lot.

EDIT: Not going to close because others might also wonder why QQP takes forever.

@sleepinyourhat
Contributor Author

sleepinyourhat commented Jul 26, 2018 via email

@pitrack pitrack added wontfix This will not be worked on and removed low-priority Only if you're bored. Ask Sam/Ian/Alex before starting. labels Jul 26, 2018
@jeswan jeswan added the jiant-v1-legacy Relevant to versions <= v1.3.2 label Sep 17, 2020