
Support image labels in Torch #880

Merged: 5 commits into NVIDIA:master on Aug 2, 2016

Conversation

@gheinrich (Contributor)

Depends on #806 for tests.

@gheinrich (Contributor Author)

@TimZaman do you mind reviewing the last commit? I have rebased the change on top of your data augmentation changes. Thanks!

# choose random seed and add to userdata so it gets persisted
self.userdata['seed'] = random.randint(0, 1000)

random.seed(self.userdata['seed'])
@TimZaman (Contributor), Jul 26, 2016

Ah, usage of the user seed. This reminds me, however, of the way shuffling (of course a random operation) is done elsewhere in DIGITS. This is unrelated to this PR, though.
In object detection datasets: https://github.com/NVIDIA/DIGITS/blob/master/digits/extensions/data/objectDetection/data.py#L235
and actually also in folder parsing:
https://github.com/NVIDIA/DIGITS/blob/master/tools/parse_folder.py#L416
It's not clear to me whether that seed from folder parsing is random or whether it persists from some other scope. In any case, when uploading a folder it might also be nice to be able to randomize or fix the seed used for shuffling during folder parsing.

@gheinrich (Contributor Author)

Yes, I need to save the seed here because I want to generate the same random indices from two separate processes (the process that creates the training DB and the process that creates the validation DB). Indeed this is different from the (many) other seeds used in the places you pointed at. Here a new seed is generated every time we create a dataset (but only once per dataset).
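The idea described above can be sketched as follows. This is illustrative code, not the actual DIGITS implementation; the function names and the use of JSON to stand in for the persisted `userdata` are assumptions.

```python
import json
import random

# Hypothetical sketch of the approach discussed above: choose a seed once,
# persist it in userdata, then re-seed from it in each process so that the
# training-DB and validation-DB creation steps generate identical random
# index sequences. Names here are illustrative, not DIGITS code.

def create_userdata():
    """Choose a random seed once and store it so it can be persisted."""
    return {'seed': random.randint(0, 1000)}

def shuffled_indices(userdata, n):
    """Re-seed from the persisted value; the same seed yields the same order."""
    rng = random.Random(userdata['seed'])
    indices = list(range(n))
    rng.shuffle(indices)
    return indices

userdata = create_userdata()
# Simulate two separate processes reading the same persisted userdata.
process_a = shuffled_indices(json.loads(json.dumps(userdata)), 10)
process_b = shuffled_indices(json.loads(json.dumps(userdata)), 10)
assert process_a == process_b  # identical shuffles across "processes"
```

The key point is that each process constructs its own generator from the persisted seed rather than relying on the global, unseeded random state.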

@TimZaman (Contributor)

Briefly ran through it and it looks fine, but I'd have to run some different types of data through it to be sure; I reckon you have done that yourselves.
I would also vote for a small doc/example on the side for some simple upscaling. And since it can now handle floating point: stock market predictions, anyone?

@gheinrich (Contributor Author)

> I would also vote for a small doc/example on the side for some simple upscaling

Did you see the image segmentation PR where I show that we can do inference on a bigger image than we used during training?

@TimZaman (Contributor)

> Did you see the image segmentation PR where I show that we can do inference on a bigger image than we used during training?

Oh yeah, #830, I forgot about that one. You've built in quite the dependency on #806; that's where this comment would have fit better.

To be used with networks where the input and the output are images
Labels were restricted to scalars (classification) or vectors (regression).
This change extends supported label types to anything that fits in an LMDB.
Small refactoring of datum decoding.
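The commit message above notes that labels are no longer restricted to scalars or vectors: anything that can be serialized to bytes fits in an LMDB value. A minimal sketch of that idea, using only the standard library (this is not the DIGITS datum format; the header layout and function names are assumptions for illustration):

```python
import struct

# Illustrative sketch: an LMDB value is just a byte string, so a label of
# any shape (here a 2D floating-point label map, as used for image labels)
# can be stored by serializing its shape plus its values, and recovered by
# the inverse decoding step. Not DIGITS code; the format is made up.

def encode_label(values, height, width):
    """Pack shape header + float32 values into bytes for an LMDB value."""
    header = struct.pack('<II', height, width)
    body = struct.pack('<%df' % (height * width), *values)
    return header + body

def decode_label(blob):
    """Recover shape and float values from the serialized byte string."""
    height, width = struct.unpack_from('<II', blob, 0)
    values = struct.unpack_from('<%df' % (height * width), blob, 8)
    return list(values), height, width

label = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # a 2x3 label map, flattened
blob = encode_label(label, 2, 3)
decoded, h, w = decode_label(blob)
assert decoded == label and (h, w) == (2, 3)
```

Scalar (classification) and vector (regression) labels are just the height=1 special cases of the same scheme, which is why a single byte-oriented store can cover all three.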
@gheinrich (Contributor Author)

Rebased and updated according to comment #880 (comment).

@gheinrich gheinrich merged commit 67ec262 into NVIDIA:master Aug 2, 2016
@gheinrich gheinrich deleted the dev/torch-image-labels branch November 30, 2016 16:49