Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ImageGPT #14240

Merged
merged 33 commits into from
Nov 18, 2021
Merged

Add ImageGPT #14240

merged 33 commits into from
Nov 18, 2021

Conversation

NielsRogge
Copy link
Contributor

@NielsRogge NielsRogge commented Nov 2, 2021

What does this PR do?

This PR adds ImageGPT, "Generative Pre-training from Pixels", by OpenAI. ImageGPT is to GPT2 what ViT is to BERT.

OpenAI released 3 variants (small, medium and large) more than a year ago. Models are on the hub: https://huggingface.co/models?other=imagegpt

It directly fits into the existing GPT-2 model (with some minor changes: "quick gelu" activation function, different layernorm, no tied embeddings). The cool thing is that you can just use the generate() method to generate pixel values.

Here's a Colab notebook for both conditional and unconditional image generation.
Update: new notebook, with ImageGPTFeatureExtractor.

Big thanks go to openai/image-gpt#7, who made it very easy for me to understand and contribute the model.

To do:

  • Write tests for ImageGPTFeatureExtractor.

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the model, but shouldn't there be a tokenizer/feature extractor to prepare the input for the model from a raw image? I feel that's the main the classification model can't be added to the auto-mapping.

README.md Outdated Show resolved Hide resolved
docs/source/model_doc/imagegpt.rst Outdated Show resolved Hide resolved
src/transformers/__init__.py Outdated Show resolved Hide resolved
src/transformers/__init__.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/__init__.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
@vitormm44
Copy link

I'm trying to use the notebook you linked, but I'm receiving an import error:

ImportError: cannot import name 'ImageGPTForCausalLM' from 'transformers' (/usr/local/lib/python3.7/dist-packages/transformers/init.py)

@apeguero1
Copy link

Hi @NielsRogge! I just realized the original Image GPT architecture uses a root mean square instead of a standard deviation in its layer normalization so I believe it should look like this. Looks like there's a paper on it too haha :D

Interestingly, the image generation produces only subtle visible differences when using the same random seed but better to stick with the original tensorflow implementation I guess?

@NielsRogge
Copy link
Contributor Author

Thanks, I've updated it.

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, @NielsRogge, this is in great shape.

  • It would be great to add some more explicit code examples; then I'm happy to play more with it regarding the API.
  • I've added a comment regarding the feature extractor. Pinging @sgugger as he might have a different opinion regarding how it should best be done.
  • Ideally, the image in the docs would be stored in a dataset rather than in the repository

docs/source/model_doc/imagegpt.rst Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
src/transformers/models/imagegpt/modeling_imagegpt.py Outdated Show resolved Hide resolved
@NielsRogge NielsRogge merged commit da36c55 into huggingface:master Nov 18, 2021
Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The image is still in the docs/source/imgs. It is unfortunate, as now that it has been merged to the repository it will always weigh down the repository.

I don't have a problem with the rest of the API, thanks for completing the examples and adapting the feature extractor.

@@ -249,6 +249,7 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
1. **[GPT Neo](https://huggingface.co/transformers/model_doc/gpt_neo.html)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (from Facebook) released with the paper [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (from Berkeley) released with the paper [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

@@ -247,6 +247,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[GPT-J](https://huggingface.co/transformers/model_doc/gptj.html)** (from EleutherAI) released in the repository [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) by Ben Wang and Aran Komatsuzaki.
1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (from Facebook) released with the paper [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (from Berkeley) released with the paper [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

@@ -271,6 +271,7 @@ conda install -c huggingface transformers
1. **[GPT-J](https://huggingface.co/transformers/model_doc/gptj.html)** (来自 EleutherAI) 伴随论文 [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) 由 Ben Wang and Aran Komatsuzaki 发布。
1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (来自 Facebook) 伴随论文 [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) 由 Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed 发布。
1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (来自 Berkeley) 伴随论文 [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) 由 Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer 发布。
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (来自 OpenAI) 伴随论文 [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) 由 Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever 发布。
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (来自 OpenAI) 伴随论文 [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) 由 Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever 发布。
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (来自 OpenAI) 伴随论文 [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) 由 Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever 发布。

@@ -283,6 +283,7 @@ conda install -c huggingface transformers
1. **[GPT-J](https://huggingface.co/transformers/model_doc/gptj.html)** (from EleutherAI) released with the paper [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) by Ben Wang and Aran Komatsuzaki.
1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (from Facebook) released with the paper [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (from Berkeley) released with the paper [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants