Implementing realtime augmentation #8

Closed
awhitesong opened this issue Jan 5, 2016 · 4 comments

Comments

@awhitesong

Hi, I was going through your code to see how to use realtime_augmentation.py.
I can see the training code in all the try* files, and I am trying to understand how a minibatch is augmented in the training function. xs_shared is used for training, but it is initialized to zero? I saw the previous issue, where you said the JPEGs are loaded on the fly. So I tried going through train_norm, but it is pretty messy, and I don't get how ra is used. How should I use the functions in ra for realtime augmentation if I have a preloaded dataset train_set_x, like this:

train_model = theano.function(
    inputs=[index],
    outputs=cost,
    updates=updates,
    givens={
        x: train_set_x[index * batch_size:(index + 1) * batch_size],
        y: train_set_y[index * batch_size:(index + 1) * batch_size],
    },
)

If that is not possible, I can load the image dataset on the fly instead, but where in the code is that done?
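For a dataset that is already in memory, one option is to augment each minibatch on the CPU right after slicing it and pass the result to the training function as a regular input. The following is only a minimal numpy sketch of that idea: `augment_minibatch` is a hypothetical helper that is not part of this repository, the random data stands in for `train_set_x`, and flips plus 90-degree rotations are used in place of the repository's affine perturbations.

```python
import numpy as np


def augment_minibatch(batch, rng):
    """Return a randomly flipped and rotated copy of an (N, H, W) minibatch.

    Hypothetical helper. Assumes square images so that 90-degree
    rotations keep the shape unchanged.
    """
    out = np.empty_like(batch)
    for i, img in enumerate(batch):
        if rng.rand() < 0.5:
            img = img[:, ::-1]      # horizontal flip
        k = rng.randint(4)          # number of 90-degree rotations (0..3)
        out[i] = np.rot90(img, k)
    return out


rng = np.random.RandomState(0)
# Stand-in for a preloaded dataset: 20 grayscale images of 16x16 pixels.
train_set_x = rng.rand(20, 16, 16).astype("float32")

batch_size, index = 5, 2
batch = train_set_x[index * batch_size:(index + 1) * batch_size]
augmented = augment_minibatch(batch, rng)
```

With this approach the augmented array would be passed to the compiled function directly, rather than being selected from a shared variable through `givens` and an index.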

@benanne
Owner

benanne commented Jan 5, 2016

This is the part of the code that loads up the JPEG images and perturbs them: https://github.com/benanne/kaggle-galaxies/blob/master/realtime_augmentation.py#L185-L194

You can simply replace the load_data.load_img() call with something else that returns a numpy array, and then you're all set.
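As a rough illustration of that substitution, the sketch below swaps the JPEG loader for a lookup into an in-memory array. The names `load_img_from_memory` and `load_and_perturb` are hypothetical and do not exist in the repository, and the random flip is only a placeholder for the real perturbation code.

```python
import numpy as np

# Hypothetical in-memory dataset: 10 RGB images of 8x8 pixels.
_data_rng = np.random.RandomState(0)
train_set_x = _data_rng.rand(10, 8, 8, 3).astype("float32")


def load_img_from_memory(img_index, data=train_set_x):
    """Hypothetical stand-in for load_data.load_img(): return one image as a numpy array."""
    return np.asarray(data[img_index], dtype="float32")


def load_and_perturb(img_index, rng):
    """Load one image and apply a placeholder perturbation (random horizontal flip)."""
    img = load_img_from_memory(img_index)
    if rng.rand() < 0.5:
        img = img[:, ::-1, :]   # flip along the width axis
    return img


img = load_and_perturb(3, np.random.RandomState(42))
```

The only contract the rest of the pipeline relies on, as described above, is that the loader returns a numpy array for a given index.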

@awhitesong
Author

  1. Can I pass my mini_batch instead of image_index to the function?

  2. load_and_process_image is called in the LoadAndProcess class.
    So if I call LoadAndProcess(img_index, ds_transforms, augmentation_params, target_sizes), it will return the augmented data. In the try* files, the augmented data is obtained as:

    augmented_data_gen = ra.realtime_augmented_data_gen(num_chunks=NUM_CHUNKS, chunk_size=CHUNK_SIZE,
                                                    augmentation_params=augmentation_params, ds_transforms=ds_transforms,
                                                    target_sizes=input_sizes, processor_class=ra.LoadAndProcessPysexCenteringRescaling)
    

    But how is this augmented data fed as input to the CNN during training in the try* files?

  3. Also, load_and_process_image calls random_perturbation_transform through perturb_and_dscrop, which applies the transformations. The augmentation functions are also implemented in load_data.py. So do I need to include load_data.py in my code as well?

I am just trying to understand which files I need to include in my code, and how I should augment the data and feed it into my training loop. How should I use augmented_data_gen (point 2) in my Theano training function?
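One common way to connect a chunk generator to a training loop is sketched below in plain numpy. A buffer that starts out as zeros is overwritten with each augmented chunk, and minibatches are then sliced from it by index. This is an assumption about the general pattern, not a description of what the try* scripts do. `augmented_chunk_gen` and `train_on_batch` are hypothetical placeholders, and with Theano the buffer would presumably be a shared variable updated through `set_value`.

```python
import numpy as np


def augmented_chunk_gen(data, labels, num_chunks, chunk_size, rng):
    """Hypothetical stand-in for a realtime augmentation generator.

    Yields (images, labels) chunks with random horizontal flips applied.
    """
    for _ in range(num_chunks):
        idx = rng.randint(0, len(data), size=chunk_size)
        chunk = data[idx].copy()
        flip = rng.rand(chunk_size) < 0.5
        chunk[flip] = chunk[flip][:, :, ::-1]   # flip selected images along width
        yield chunk, labels[idx]


def train_on_batch(x, y):
    """Placeholder for a compiled training function; returns a dummy cost."""
    return float(np.mean(x))


rng = np.random.RandomState(0)
data = rng.rand(50, 8, 8).astype("float32")     # hypothetical image data
labels = rng.randint(0, 3, size=50)             # hypothetical class labels

chunk_size, batch_size = 20, 5
# Buffers start as zeros and only hold real data once a chunk is copied in.
x_buffer = np.zeros((chunk_size, 8, 8), dtype="float32")
y_buffer = np.zeros(chunk_size, dtype=labels.dtype)

costs = []
for x_chunk, y_chunk in augmented_chunk_gen(data, labels, 3, chunk_size, rng):
    x_buffer[...] = x_chunk     # with Theano: x_shared.set_value(x_chunk)
    y_buffer[...] = y_chunk
    for b in range(chunk_size // batch_size):
        sl = slice(b * batch_size, (b + 1) * batch_size)
        costs.append(train_on_batch(x_buffer[sl], y_buffer[sl]))
```

Under this pattern the zero initialization of the buffer is harmless, because the buffer is always overwritten with a fresh chunk before any minibatch is read from it.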

@benanne
Owner

benanne commented Jan 5, 2016

  1. sure, just modify it to suit your needs.
  2. I don't understand the question.
  3. I suppose so.

I should clarify that I am unable to give extensive support for this code. It's provided as-is. If you want to use it or parts of it for anything else, you will have to figure things out on your own. Thanks for your understanding.

@awhitesong
Author

Actually, I think I understand the codebase now. I figured it out somehow. Thanks a lot anyway for the help and the great code. :)
