[Chunk teacher] Bug with distributed evaluation #2935

emilydinan · 2020-08-04T18:42:24Z

Patch description
Fixes a very subtle bug when using the chunk teacher with distributed evaluation. The counter used to place samples on different GPUs was being reset every time a chunk was loaded, which put the wrong number of samples on certain GPUs if the valid set had more than one chunk assigned to it. The system then hung forever at self.samples.get(), because num_episodes exceeded the actual number of samples on the queue.
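To illustrate the failure mode described above, here is a minimal toy sketch. It is not the ParlAI implementation: the function names (`samples_for_rank`, `expected_for_rank`), the round-robin assignment rule, and the way the expected count is computed are all assumptions made for illustration. It only shows how resetting a placement counter at each chunk boundary can leave one worker with fewer samples than it expects to read.

```python
def samples_for_rank(chunks, rank, world_size, reset_per_chunk):
    """Return the samples a given rank would enqueue (hypothetical helper).

    Samples are assigned round-robin: a sample is kept when
    counter % world_size == rank.
    """
    kept = []
    counter = 0
    for chunk in chunks:
        if reset_per_chunk:
            # The bug pattern: the position is forgotten at every chunk boundary.
            counter = 0
        for sample in chunk:
            if counter % world_size == rank:
                kept.append(sample)
            counter += 1
    return kept


def expected_for_rank(total, rank, world_size):
    """Number of samples a rank expects under a single global counter."""
    return len(range(rank, total, world_size))


# Two chunks of three samples each, split across two workers.
chunks = [[0, 1, 2], [3, 4, 5]]
world_size = 2
total = sum(len(c) for c in chunks)

buggy_rank1 = samples_for_rank(chunks, rank=1, world_size=world_size, reset_per_chunk=True)
fixed_rank1 = samples_for_rank(chunks, rank=1, world_size=world_size, reset_per_chunk=False)
expected_rank1 = expected_for_rank(total, rank=1, world_size=world_size)

print(len(buggy_rank1), len(fixed_rank1), expected_rank1)
```

In this toy setup, rank 1 expects three samples but the resetting counter gives it only two, so a blocking `get()` on its queue for the third sample would never return. With a single chunk the two variants agree, which is consistent with the bug only appearing when more than one chunk is assigned.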

💀 RIP a day of my life 💀

stephenroller

cheers

wyshi

Woohoo! It was fun debugging this one.

Emily Dinan added 2 commits August 4, 2020 11:32

attmpet1

8550906

oopsie

861c158

emilydinan requested a review from stephenroller August 4, 2020 18:42

facebook-github-bot added the CLA Signed label Aug 4, 2020

emilydinan requested a review from wyshi August 4, 2020 18:42

stephenroller approved these changes Aug 4, 2020

wyshi approved these changes Aug 4, 2020

emilydinan merged commit 77f34c9 into master Aug 4, 2020

emilydinan deleted the reallyfunbuggreattimes branch August 4, 2020 22:02
