R/W generated dataset to/from disk #738
Conversation
Force-pushed from bbfaf2f to 4933705.
Codecov Report
```
@@            Coverage Diff             @@
##           master     #738      +/-   ##
==========================================
+ Coverage   88.33%   88.41%   +0.07%
==========================================
  Files          13       14       +1
  Lines        1132     1252     +120
==========================================
+ Hits         1000     1107     +107
- Misses        132      145      +13
```
Looks good! Please see requested changes inline.
Also, please make sure training iterations / second is at least 5.5.
Force-pushed from 8d62140 to 4a2cde0.
Currently averaging about 4 training iterations / second (each epoch starts out slower, then speeds up to > 5 it/s).
DiskDataset renders images in `disk_batch_size` chunks and writes/reads them to/from disk via pickle. The Encoder consumes this DiskDataset in `ImagePrior.batch_size` batches. It currently does not handle multiple workers; we may eventually want multiple workers writing and reading at the same time.
Also fixes formatting.
`case_studies/721_decoder_speedup/main.py mode=generate` renders images via the decoder and writes them to .pkl files on disk. Later, `.../main.py mode=train` uses the cached image data for training.
The disk-cached dataset is shuffled on each epoch via `random.shuffle` when the iterator is created in `__iter__`, as sketched below.
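A minimal sketch of what such a DiskDataset could look like. Only `disk_batch_size`, the pickle-based caching, and the per-epoch `random.shuffle` in `__iter__` come from the description above; the constructor signature, the `decoder.sample` call, the cache directory, and the chunk file naming are assumptions for illustration, not the PR's actual code.

```python
import pickle
import random
from pathlib import Path

from torch.utils.data import IterableDataset


class DiskDataset(IterableDataset):
    """Sketch: render images in `disk_batch_size` chunks, cache them as pickle
    files, and replay them from disk in a freshly shuffled order each epoch."""

    def __init__(self, decoder, cache_dir: str, n_chunks: int, disk_batch_size: int):
        self.decoder = decoder
        self.cache_dir = Path(cache_dir)
        self.n_chunks = n_chunks
        self.disk_batch_size = disk_batch_size

    def generate(self):
        """mode=generate: render each chunk with the decoder and pickle it to disk."""
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        for i in range(self.n_chunks):
            batch = self.decoder.sample(self.disk_batch_size)  # assumed decoder API
            with open(self.cache_dir / f"chunk_{i}.pkl", "wb") as f:
                pickle.dump(batch, f)

    def __iter__(self):
        """mode=train: read cached chunks back, shuffled anew on every epoch."""
        files = sorted(self.cache_dir.glob("chunk_*.pkl"))
        random.shuffle(files)  # per-epoch shuffle, as described above
        for path in files:
            with open(path, "rb") as f:
                yield pickle.load(f)
```

A single-process reader like this sidesteps the open question about multiple workers: with `num_workers=0` only one process ever touches the chunk files at a time.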
Force-pushed from 6e33e9d to dca91dc.
With the configuration as above, this gives a median of 6.1 it/s.
bliss/simulator/simulated_dataset.py (Outdated)
```python
        return DataLoader(self.valid, batch_size=None, num_workers=0)

    def test_dataloader(self):
        return DataLoader(self, batch_size=None, num_workers=0)
```
Is the test set the same as the training set? That seems like a problem. We should probably have separate files that contain the test set.
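One way this could look, as a sketch only: give each split its own cached files and its own dataset object. The class name, constructor, and per-split cache layout below are illustrative assumptions, not the PR's actual code.

```python
import pytorch_lightning as pl
from torch.utils.data import DataLoader


class CachedSimulatedDataModule(pl.LightningDataModule):
    """Sketch: each split reads its own cached pickle files, so the test set
    is never the same data as the training set."""

    def __init__(self, train_ds, valid_ds, test_ds):
        super().__init__()
        self.train_ds = train_ds  # e.g. a DiskDataset over cache/train/*.pkl
        self.valid_ds = valid_ds  # e.g. a DiskDataset over cache/val/*.pkl
        self.test_ds = test_ds    # e.g. a DiskDataset over cache/test/*.pkl

    def train_dataloader(self):
        return DataLoader(self.train_ds, batch_size=None, num_workers=0)

    def val_dataloader(self):
        return DataLoader(self.valid_ds, batch_size=None, num_workers=0)

    def test_dataloader(self):
        return DataLoader(self.test_ds, batch_size=None, num_workers=0)
```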
Done.
Excellent!
Looks great!
Two-step process:
1. `generator.py` instantiates and uses a `SimulatedDataset` object to render images in `cfg.simulator.prior.batch_size` minibatches, then processes these generated minibatches by concatenation and flattening before writing them out to `.pkl` files via `pickle`.
2. The `training.use_cached_simulator=true` config causes the training step to use `datamodule=instantiate(cfg.cached_simulator)` in PyTorch Lightning's `trainer.fit`.

Addresses #721.
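A rough sketch of how these two steps might be wired together. Only the config keys quoted above (`cfg.simulator.prior.batch_size`, `training.use_cached_simulator`, `cfg.cached_simulator`) and the `trainer.fit` call come from the description; the function names, the `get_batch` call, the `cfg.generate.n_batches` key, the batch layout, and the output path are assumptions for illustration.

```python
import pickle
from pathlib import Path

import torch
from hydra.utils import instantiate


def generate(cfg, out_path="cached_dataset/images.pkl"):
    """Step 1 (mode=generate): render minibatches with the simulated dataset,
    combine them, and write the result to a .pkl file via pickle."""
    simulated = instantiate(cfg.simulator)
    batches = [simulated.get_batch() for _ in range(cfg.generate.n_batches)]
    # "Concatenation and flattening" per the description; the exact handling
    # depends on the batch structure (assumed here to be dicts of tensors).
    images = torch.cat([b["images"] for b in batches], dim=0)
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "wb") as f:
        pickle.dump(images, f)


def train(cfg, trainer, encoder):
    """Step 2 (mode=train): with training.use_cached_simulator=true, fit
    against the cached-data datamodule instead of the live simulator."""
    if cfg.training.use_cached_simulator:
        datamodule = instantiate(cfg.cached_simulator)
    else:
        datamodule = instantiate(cfg.simulator)
    trainer.fit(encoder, datamodule=datamodule)
```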