
GQN trained on CLEVR dataset #27

Open
loganbruns opened this issue Jun 16, 2019 · 10 comments

Comments

@loganbruns

Thanks for the GQN implementation. I thought you might enjoy seeing some pictures of how it does when trained on a different dataset. (Albeit with a limited amount of training time; I plan to train for longer.)

[Screenshot: Screen Shot 2019-06-11 at 6 04 24 AM]

Even on the test set it works pretty well with a relatively small amount of training. It seems to generalize better than on the flat-shaded DeepMind dataset.

[image]

I'm curious what kind of changes you might be interested in via pull request. I have some changes to the training parameters, and I've also found self-attention to improve the speed of generalization and of training in general, although that wasn't in the original paper.
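
To give a rough idea of what I mean by self-attention: something along these lines, a SAGAN-style attention block over the conv feature maps (written against tf 1.x; this is an illustrative sketch, not the exact code in my branch):

```python
# Illustrative SAGAN-style self-attention block over conv feature maps (tf 1.x).
# Names and the channel-reduction factor are placeholders, not the actual code.
import tensorflow as tf

def self_attention(x, scope="self_attn"):
    """x: [B, H, W, C] feature map -> same shape, with attended features added."""
    with tf.variable_scope(scope):
        _, h, w, c = x.get_shape().as_list()
        f = tf.layers.conv2d(x, c // 8, 1, name="f")   # queries
        g = tf.layers.conv2d(x, c // 8, 1, name="g")   # keys
        v = tf.layers.conv2d(x, c, 1, name="v")        # values
        # flatten spatial dims so every position attends over all H*W positions
        f = tf.reshape(f, [-1, h * w, c // 8])
        g = tf.reshape(g, [-1, h * w, c // 8])
        v = tf.reshape(v, [-1, h * w, c])
        attn = tf.nn.softmax(tf.matmul(f, g, transpose_b=True))  # [B, HW, HW]
        o = tf.reshape(tf.matmul(attn, v), [-1, h, w, c])
        gamma = tf.get_variable("gamma", [], initializer=tf.zeros_initializer())
        return x + gamma * o  # residual, gated by a learned scalar
```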

Thanks,
logan

@waiyc

waiyc commented Jun 19, 2019

Hi logan,
Your results look great.
May I know what the dataset size is and how long you trained the model?

Chan

@loganbruns
Author

@waiyc, approximately 100k iterations on ~15k training examples. Not as long, nor with as much data, as I'd have liked. I'm thinking of generating more data and retraining, maybe at the size of the original CLEVR dataset, which is significantly larger. (Waiting on some more disks.)

@ogroth
Owner

ogroth commented Jun 19, 2019

@loganbruns That looks great, thank you for sharing these results! :) I'd be very happy to include a data loader for the CLEVR dataset (either from raw files or from pre-processed tfrecords). I'm currently in the middle of updating the data loader to a more stable, tf 1.12.1-compatible version; the update should be online within the week. So feel free to send a pull request for a CLEVR data loader. It should live under data_provider/clevr_provider.py and be modelled after the updated gqn_provider.py.
I'm also very interested in (self-)attention mechanisms for the model, since they were used in follow-up papers like the localization and mapping one. I'm happy to discuss this on a separate issue thread.
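
For orientation, I'd expect something roughly along these lines (an untested sketch using tf.data; the feature layout assumes the DeepMind GQN record format, and the exact interface of the updated gqn_provider.py may differ):

```python
# Untested sketch of a possible data_provider/clevr_provider.py input_fn.
# Assumes each scene is one tf.train.Example with JPEG 'frames' and flat
# 'cameras' (5 floats per view), as in the DeepMind GQN datasets.
import tensorflow as tf

_NUM_VIEWS = 10      # assumption: frames rendered per CLEVR scene
_CAMERA_PARAMS = 5   # x, y, z, yaw, pitch
_IMG_SIZE = 64       # assumption: rendered image resolution

def _parse_scene(serialized):
    features = tf.parse_single_example(serialized, {
        'frames': tf.FixedLenFeature([_NUM_VIEWS], tf.string),
        'cameras': tf.FixedLenFeature([_NUM_VIEWS * _CAMERA_PARAMS], tf.float32),
    })
    frames = tf.map_fn(
        lambda img: tf.image.convert_image_dtype(
            tf.image.decode_jpeg(img, channels=3), tf.float32),
        features['frames'], dtype=tf.float32)
    frames.set_shape([_NUM_VIEWS, _IMG_SIZE, _IMG_SIZE, 3])
    cameras = tf.reshape(features['cameras'], [_NUM_VIEWS, _CAMERA_PARAMS])
    return frames, cameras

def clevr_input_fn(tfrecord_pattern, batch_size, num_epochs=None):
    files = tf.data.Dataset.list_files(tfrecord_pattern)
    dataset = files.interleave(tf.data.TFRecordDataset, cycle_length=4)
    dataset = dataset.map(_parse_scene, num_parallel_calls=4)
    dataset = dataset.shuffle(1024).repeat(num_epochs).batch(batch_size)
    return dataset.prefetch(1)
```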

@loganbruns
Author

@ogroth, thanks for the reference, I'll read it. I also created a separate issue to discuss merging some of the changes. Regarding CLEVR: since I had to modify the dataset generation anyway, I also added code to those generation changes to convert the output into the DeepMind dataset format. I was thinking of asking them whether they'd take some of the changes, so that others could use their generator to produce data for GQNs. That is what I was thinking, at least.

@phongnhhn92

@loganbruns would you mind sharing the conversion code you used to convert the CLEVR dataset to the GQN tfrecord format? I am also creating my own dataset and am still struggling to understand the GQN dataset format well enough to make it work with this implementation.

@ogroth
Owner

ogroth commented Jun 21, 2019

Hi @loganbruns , the new input pipeline is now in master. Would you mind modelling your input_fn for CLEVR after this one? Also, you can include data generation and conversion code for CLEVR under data_provider. I'm happy to review your pull request. :)

@loganbruns
Author

@phongnhhn92, here is the source:

https://github.com/loganbruns/clevr-dataset-gen/blob/clevr_gqn/image_generation/convert_gqn.py

Just let me know if you have any questions.
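
The format itself boils down to one tf.train.Example per scene, with JPEG-encoded frames and a flat list of camera parameters. Roughly this (a simplified sketch, not the script itself; the linked convert_gqn.py is the real thing):

```python
# Simplified sketch of serializing one scene into the DeepMind GQN format:
# one tf.train.Example per scene, JPEG bytes under 'frames' and a flat list
# of camera parameters (x, y, z, yaw, pitch per view) under 'cameras'.
import tensorflow as tf

def scene_to_example(jpeg_frames, camera_poses):
    """jpeg_frames: list of N JPEG byte strings.
    camera_poses: list of N (x, y, z, yaw, pitch) tuples."""
    flat_cameras = [v for pose in camera_poses for v in pose]
    return tf.train.Example(features=tf.train.Features(feature={
        'frames': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=jpeg_frames)),
        'cameras': tf.train.Feature(
            float_list=tf.train.FloatList(value=flat_cameras)),
    }))

# All scenes of a split then go into a single .tfrecord file, e.g.:
# with tf.python_io.TFRecordWriter('clevr_train.tfrecord') as writer:
#     for frames, poses in scenes:
#         writer.write(scene_to_example(frames, poses).SerializeToString())
```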

@loganbruns
Author

@ogroth , thanks. I'll take a look.

@waiyc

waiyc commented Jun 26, 2019

@loganbruns From your convert_gqn.py I can see that you save each scene, with its N frames, as one TFRecord example. Since you mentioned you trained the model with 15k training examples, that means you generated 15k scenes in the training .tfrecord file.

Is my understanding correct?

@loganbruns
Author

@waiyc, yes, 15k scenes, each with N frames. I generated one file each for the train, val, and test splits; the train tfrecord file has 15k scenes. For the DeepMind dataset, each tfrecord file has 5k scenes.
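
If you want to double-check the scene count in a file, iterating over the records is enough (a quick tf 1.x sketch; the path is just an example):

```python
# Quick sanity check: count the scene records in a .tfrecord file (tf 1.x).
import tensorflow as tf

def count_scenes(path):
    return sum(1 for _ in tf.python_io.tf_record_iterator(path))

print(count_scenes('clevr_train.tfrecord'))  # example path; the train split should give 15000
```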
