This repository was archived by the owner on Jun 3, 2025. It is now read-only.

@rahul-tuli
Member

This pull request adds support for batched iteration in sparsezoo's Dataset class.

@rahul-tuli rahul-tuli force-pushed the feature-iter-batches branch 4 times, most recently from 93edac5 to 4b968ac on July 16, 2021 20:42
@rahul-tuli rahul-tuli force-pushed the feature-iter-batches branch 2 times, most recently from aac3f46 to dda9c6c on July 18, 2021 14:05
* iter_batches function in Dataset class returns a BatchLoader object
* BatchLoader class added
* Moved utils.py
* Renamed utils.py
* Created test_data.py
* Cleanup
* Fix Typo
@rahul-tuli rahul-tuli force-pushed the feature-iter-batches branch from dda9c6c to 57325b4 on July 18, 2021 14:11
@rahul-tuli rahul-tuli requested review from dbarbuzzi and horheynm July 19, 2021 15:19
Contributor

@bfineran bfineran left a comment

nice refactor @rahul-tuli
going to sync offline about the single input case

rahul-tuli and others added 2 commits July 19, 2021 12:19
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Address:PR review comments
@rahul-tuli rahul-tuli requested a review from bfineran July 19, 2021 17:03
Contributor

@bfineran bfineran left a comment

Looking good, just one comment for the code. @rahul-tuli, can we rename `tests/sparsezoo/utils.py` to `helpers.py`? We would then need to change every instance of `from tests.sparsezoo.utils import ...` to `from tests.sparsezoo.helpers import ...`

Fix:Unwrapping Single Input Errors
@rahul-tuli rahul-tuli force-pushed the feature-iter-batches branch from 36e6fc6 to 477191e on July 19, 2021 19:14
@rahul-tuli rahul-tuli requested a review from bfineran July 19, 2021 19:17
bfineran previously approved these changes Jul 19, 2021
Contributor

@bfineran bfineran left a comment

LGTM pending tests!

Fix:Unwrapping Single Input Errors
Update:tests_data.py
bfineran previously approved these changes Jul 19, 2021
        iterations: int,
    ):
        self._data = data
        self._single_input = type(self._data[0]) is numpy.ndarray
Member

this would still be true if someone passed in a List[numpy.ndarray]

Member Author

We want it to be True in that case. Maybe inverting the condition and renaming the variable will make more sense; check the latest push.

Caveat: this is to distinguish between [[np.ndarray]] and [np.ndarray].

Member Author

We wrap inputs of the form List[np.ndarray] in an extra outer list in the initializer, then unwrap it just before yielding a batch.
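For readers following the thread, the wrapping step described above could look roughly like the sketch below. This is not the code from the PR: `wrap_single_input` is a hypothetical helper name, and it assumes that multi-input data is a list of per-input sample lists ([[np.ndarray]]) while single-input data is a flat list of samples ([np.ndarray]).

```python
import numpy as np


def wrap_single_input(data):
    """Hypothetical helper: detect single-input data and wrap it in an outer list.

    Assumes `data` is either [np.ndarray, ...] (one model input) or
    [[np.ndarray, ...], ...] (one inner list per model input).
    """
    # Same idea as the check in the diff: the first element is an array
    # only when the outer list holds samples rather than per-input lists.
    single_input = isinstance(data[0], np.ndarray)
    if single_input:
        # Wrap so that later logic can always treat data as a list of inputs.
        data = [data]
    return data, single_input


samples = [np.zeros((3,)), np.ones((3,))]
wrapped, single = wrap_single_input(samples)
multi, multi_single = wrap_single_input([samples, samples])
```

After this step both cases share one shape (a list of inputs), and the flag records whether the outer list has to be stripped again before a batch is handed back.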

Comment on lines 123 to 124
if self._single_input:
    batch = batch[0]
Member

why does starting with a single input produce a single batch? this is ignoring the batch_size parameter

Member Author

It produces more batches; we are only unwrapping an extra outer list here.

Contributor

yup, if the dataset is of numpy arrays, we want to return batches of numpy arrays. this unwraps the array batch from a list as part of other shared logic
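To illustrate why `batch[0]` does not discard batches, here is a minimal sketch of a batch generator that still honors `batch_size` and only strips the outer list at yield time. It is an assumption-laden illustration, not the BatchLoader from this PR: `iter_batches_sketch` is a hypothetical name, and stacking samples with `numpy.stack` and wrapping around the end of the data are choices made for the example.

```python
import numpy as np


def iter_batches_sketch(data, batch_size, iterations):
    """Hypothetical generator: yield `iterations` batches of `batch_size` samples."""
    single_input = isinstance(data[0], np.ndarray)
    # Shared logic always works on a list of inputs.
    inputs = [data] if single_input else data
    for step in range(iterations):
        start = step * batch_size
        # One stacked array per model input; indices wrap around (assumption).
        batch = [
            np.stack([samples[(start + i) % len(samples)] for i in range(batch_size)])
            for samples in inputs
        ]
        # Unwrap only the extra outer list; the batch dimension is untouched.
        yield batch[0] if single_input else batch


data = [np.full((2,), i) for i in range(5)]
batches = list(iter_batches_sketch(data, batch_size=2, iterations=3))
multi_batches = list(iter_batches_sketch([data, data], batch_size=2, iterations=1))
```

With single-input data each yielded item is one array whose leading dimension is `batch_size`; with multi-input data each item is a list holding one such array per input.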

Member

@mgoin mgoin left a comment

lgtm thanks Rahul

@rahul-tuli rahul-tuli merged commit 5ca44f8 into main Jul 20, 2021
@rahul-tuli rahul-tuli deleted the feature-iter-batches branch July 20, 2021 20:06