Make CachedSimulatedDataset mappable Dataset #760

Merged 1 commit on May 15, 2023

zhixiangteoh commented on May 12, 2023

Partially fixes a bug where each epoch runs num_workers * n_batches + valid_n_batches iterations when it should run n_batches + valid_n_batches iterations. The fix is partial because the bug remains for SimulatedDataset.

Sanity-check results comparing the training loss, validation loss, and speed of the two approaches:

| dataset type | mappable | mappable | iterable | iterable |
| --- | --- | --- | --- | --- |
| num workers | 0 | 32 | 0 | 32 |
| training loss (after 250 iterations) | 1.82 | 1.82 | 1.92 | 1.9 |
| val loss | 1.8513 | 1.8513 | 1.90651 | - |
| speed (median it/s) | 4.98 | 1.4 | 4.92 | 3.98 |
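
The iteration counts in the description above can be illustrated with a toy model in plain Python. This is only a sketch of the general mechanism, not the BLISS code and not PyTorch itself: it assumes that an iterable-style dataset is copied into every worker and iterated in full by each copy, while a map-style dataset has its indices split across workers so each item is fetched once. All class and function names here are hypothetical.

```python
from typing import Iterator, List, Sequence


class ToyIterableDataset:
    """Hypothetical stand-in for an iterable-style dataset: it can only yield batches."""

    def __init__(self, batches: Sequence[int]) -> None:
        self.batches = list(batches)

    def __iter__(self) -> Iterator[int]:
        return iter(self.batches)


class ToyMapDataset:
    """Hypothetical stand-in for a map-style dataset: it supports len() and indexing."""

    def __init__(self, batches: Sequence[int]) -> None:
        self.batches = list(batches)

    def __len__(self) -> int:
        return len(self.batches)

    def __getitem__(self, idx: int) -> int:
        return self.batches[idx]


def iterate_iterable(dataset: ToyIterableDataset, num_workers: int) -> List[int]:
    """Assumption: every worker holds its own copy and iterates all of it."""
    n_copies = max(num_workers, 1)  # num_workers == 0 means loading in the main process
    seen: List[int] = []
    for _ in range(n_copies):
        seen.extend(dataset)
    return seen


def iterate_mappable(dataset: ToyMapDataset, num_workers: int) -> List[int]:
    """Assumption: indices are divided among workers, so no item is repeated."""
    n_workers = max(num_workers, 1)
    seen: List[int] = []
    for worker_id in range(n_workers):
        for idx in range(worker_id, len(dataset), n_workers):
            seen.append(dataset[idx])
    return seen


n_batches, num_workers = 5, 4
iterable_count = len(iterate_iterable(ToyIterableDataset(range(n_batches)), num_workers))
mappable_count = len(iterate_mappable(ToyMapDataset(range(n_batches)), num_workers))
print(iterable_count, mappable_count)  # 20 5
```

Under these assumptions the iterable-style count scales with num_workers (4 * 5 = 20), while the map-style count stays at n_batches (5), which matches the training-batch term in the two formulas in the description.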

codecov bot commented May 12, 2023

Codecov Report

Merging #760 (ff2f6f0) into master (3cd181c) will increase coverage by 0.21%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##           master     #760      +/-   ##
==========================================
+ Coverage   88.53%   88.75%   +0.21%     
==========================================
  Files          14       14              
  Lines        1265     1254      -11     
==========================================
- Hits         1120     1113       -7     
+ Misses        145      141       -4     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 88.75% <94.44%> (+0.21%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| bliss/simulator/simulated_dataset.py | 89.47% <92.85%> (-0.82%) ⬇️ |
| bliss/generate.py | 98.52% <100.00%> (+5.57%) ⬆️ |

zhixiangteoh marked this pull request as ready for review on May 15, 2023 14:59

jeff-regier left a comment

Looks good!

jeff-regier merged commit 760e909 into master on May 15, 2023
jeff-regier deleted the make-cachedsd-map-dataset branch on May 15, 2023 16:31