issue with training/testing partition #2

rola93 · 2020-03-03T18:07:00Z

hey! I've been following your code, great work!

I assume you run all cells in order.

The only mistake I see is that you apply the data augmentation technique first and then split your data into train and test. This means an example can end up in the train set while its augmented version ends up in the test set, which may lead to overestimated performance on your test set. You need to make sure that the train and test sets are as independent as possible.

The easy fix is to first split into train and test, and then apply augmentation to both datasets separately.
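
To illustrate the split-then-augment order, here is a minimal sketch using only the Python standard library. It is not the notebook's code: `split_then_augment`, `toy_augment`, and the sample data are hypothetical names made up for this example, and the augmentation is a trivial stand-in for whatever technique the repo actually uses.

```python
import random


def split_then_augment(samples, augment, test_fraction=0.2, seed=0):
    """Split first, then augment each partition on its own.

    `augment` is a hypothetical callable that maps one sample to a list of
    augmented variants; it stands in for the notebook's augmentation step.
    Each returned item is (source_index, sample) so leakage can be checked.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)

    # Choose which ORIGINAL samples go to the test set before augmenting.
    n_test = max(1, int(len(samples) * test_fraction))
    test_idx = set(indices[:n_test])

    train, test = [], []
    for i, sample in enumerate(samples):
        target = test if i in test_idx else train
        # The original and all of its variants stay in the same partition.
        target.append((i, sample))
        target.extend((i, variant) for variant in augment(sample))
    return train, test


def toy_augment(text):
    # Placeholder augmentation (assumption): upper-case and reversed copies.
    return [text.upper(), text[::-1]]


samples = ["alpha", "beta", "gamma", "delta", "epsilon"]
train, test = split_then_augment(samples, toy_augment)

# No original sample contributes to both partitions.
train_sources = {i for i, _ in train}
test_sources = {i for i, _ in test}
assert train_sources.isdisjoint(test_sources)
```

Because the partition is decided on the original samples, an augmented copy can never land on the opposite side of the split from its source.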

robinreni96 · 2020-06-06T14:06:58Z

Thank you @rola93 for pointing out the error. I'll fix it and update the repo soon.

robinreni96 closed this as completed Jun 6, 2020
