Hyperparameter Optimization for Keras model with large dataset #67

Closed
NTNguyen13 opened this issue Aug 21, 2018 · 7 comments
Labels
user support nothing is wrong with Talos

Comments

@NTNguyen13

NTNguyen13 commented Aug 21, 2018

Condition Check:

  • [x ] I'm up-to-date with the latest release:

    pip install -U talos
    
  • [x ] I've confirmed that my Keras model works outside of Talos.

If you still have an error, please submit complete trace and a code with:

  • output of shape for x and y e.g. (212,12)
  • Talos params dictionary
  • The Keras model wired for Talos
  • Description of extra variables in the model

You can provide the code in pastebin / gist or any other format you like.


I want to perform hyperparameter optimization on my Keras model. The problem is that the dataset is quite big. Normally in training I use fit_generator to load the data in batches from disk, but Talos only supports the fit method.

I tried to load the whole data to memory, by using this:

train_generator = train_datagen.flow_from_directory(
    original_dir,
    target_size=(img_height, img_width),
    batch_size=train_nb,
    class_mode='categorical')
X_train,y_train = train_generator.next()

But when performing talos.Scan(), the OS kills the process because of high memory usage. I also tried undersampling my dataset to only 10%, but it's still too big.

I saw that issue #11 is being worked on, but I wonder whether there is any workaround for performing hyperparameter optimization on a large dataset in this case?

@matthewcarbone
Collaborator

@NTNguyen13 Indeed it is something being worked on. Just curious: how large is your dataset exactly?

@NTNguyen13
Author

Hi @x94carbone, my dataset has 11,000 images, each 30-40 KB.
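
For context on why loading everything at once can exhaust memory: at 30-40 KB each, 11,000 files take only about 330-440 MB on disk, but a compressed file's size says little about its decoded size. Below is a rough sketch of the arithmetic. The 224x224 RGB shape and float32 storage are assumptions for illustration, since the thread does not state img_height or img_width.

```python
def decoded_size_gib(n_images, height, width, channels=3, bytes_per_value=4):
    """Approximate RAM needed to hold decoded images as one dense array.

    bytes_per_value=4 assumes float32 pixels; height, width and channels
    must be supplied by the caller because the thread does not give them.
    """
    total_bytes = n_images * height * width * channels * bytes_per_value
    return total_bytes / 2**30


# 11,000 images comes from the thread; 224x224 RGB is an assumed shape.
estimate = decoded_size_gib(11_000, 224, 224)
print(f"about {estimate:.1f} GiB for the image array alone")
```

Under these assumptions the image array alone needs roughly 6 GiB, before counting labels or any copies made while splitting the data.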

@matthewcarbone
Collaborator

Hmm ok. Does your model work with Keras' fit on its own? Without using fit_generator?

@NTNguyen13
Author

I tried it with small random data and it works. It didn't work in the real case because of the large dataset.

@mikkokotila mikkokotila added the investigation gathering information label Aug 28, 2018
@matthewcarbone
Collaborator

So this likely means that there's nothing wrong with Talos at the moment, we've just gotta implement a feature to get fit_generator to work. Feel free to let me know if I'm missing something here.

@mikkokotila mikkokotila added user support nothing is wrong with Talos and removed investigation gathering information labels Aug 30, 2018
@mikkokotila
Contributor

Looks like all clear so closing here.

@johndavidmiller

bump.

I'm fairly new to DL and Keras (but not new to other AI and ML) and I gotta say that it is amazing that most tools, examples, and academic papers are all built around these teeny tiny "toy" datasets like MNIST and CIFAR. Who uses 32x32 pixel images in a real application?

Please add my vote for adding Keras Sequence support in Talos so it can be used in real applications.
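
To illustrate what Sequence-style loading means here, the sketch below follows the `__len__` / `__getitem__` protocol that a Keras Sequence uses, reading one batch at a time instead of holding the whole dataset in memory. The class name `DiskBatchSequence` and the `load_fn` argument are hypothetical, and the class is written in plain Python so it runs without Keras. In a real model you would subclass `keras.utils.Sequence` and return NumPy arrays. Nothing here is part of Talos.

```python
import math


class DiskBatchSequence:
    """Batch-at-a-time loader (hypothetical helper, not a Talos or Keras class)."""

    def __init__(self, paths, labels, batch_size, load_fn):
        if len(paths) != len(labels):
            raise ValueError("paths and labels must have the same length")
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.paths = list(paths)
        self.labels = list(labels)
        self.batch_size = batch_size
        self.load_fn = load_fn  # assumed: maps one file path to one decoded sample

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.paths) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        stop = start + self.batch_size
        # Only the samples of this batch are loaded, so memory use is
        # bounded by batch_size rather than by the dataset size.
        x = [self.load_fn(path) for path in self.paths[start:stop]]
        y = self.labels[start:stop]
        return x, y


# Usage with a stand-in loader; a real load_fn would read and decode an image.
seq = DiskBatchSequence(["a", "b", "c", "d", "e"], [0, 1, 0, 1, 0],
                        batch_size=2, load_fn=str.upper)
first_x, first_y = seq[0]
```

An object with this shape is what Keras' fit_generator consumes, which is why supporting it would let a scan run without loading all images up front.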
