Add ImageGPT #14240

NielsRogge · 2021-11-02T10:10:33Z

What does this PR do?

This PR adds ImageGPT, "Generative Pre-training from Pixels", by OpenAI. ImageGPT is to GPT-2 what ViT is to BERT.

OpenAI released 3 variants (small, medium and large) more than a year ago. Models are on the hub: https://huggingface.co/models?other=imagegpt

It directly fits into the existing GPT-2 model (with some minor changes: "quick gelu" activation function, different layernorm, no tied embeddings). The cool thing is that you can just use the generate() method to generate pixel values.
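For readers unfamiliar with the "quick gelu" activation mentioned above, here is a minimal sketch in plain Python. It assumes the usual definition `x * sigmoid(1.702 * x)` and compares it with the exact GELU. The function names are illustrative only and are not taken from this PR's code.

```python
import math


def quick_gelu(x: float) -> float:
    """Sigmoid-based approximation of GELU: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


def exact_gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # Print both activations side by side for a few sample inputs.
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(value, quick_gelu(value), exact_gelu(value))
```

The two curves stay close over typical activation ranges, so the change from GPT-2 is small in code even though it matters for reproducing the original weights.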

Here's a Colab notebook for both conditional and unconditional image generation.
Update: new notebook, with ImageGPTFeatureExtractor.

Big thanks go to openai/image-gpt#7, who made it very easy for me to understand and contribute the model.

To do:

Write tests for ImageGPTFeatureExtractor.

sgugger

Thanks for adding the model, but shouldn't there be a tokenizer/feature extractor to prepare the input for the model from a raw image? I feel that's the main reason the classification model can't be added to the auto-mapping.
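To illustrate what such a preparation step could look like, here is a rough sketch of nearest-colour quantization: each RGB pixel is scaled to [-1, 1] and mapped to the index of the closest palette entry, which then serves as a token id. This is an assumption about the general approach, not the `ImageGPTFeatureExtractor` code from this PR; `PALETTE` is a made-up four-colour example, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]

# Hypothetical palette in normalized RGB; not the clusters shipped with the checkpoints.
PALETTE: List[Color] = [
    (-1.0, -1.0, -1.0),  # black
    (1.0, 1.0, 1.0),     # white
    (1.0, -1.0, -1.0),   # red
    (-1.0, -1.0, 1.0),   # blue
]


def normalize(rgb: Tuple[int, int, int]) -> Color:
    """Scale 8-bit channel values from [0, 255] to [-1, 1]."""
    r, g, b = rgb
    return (r / 127.5 - 1.0, g / 127.5 - 1.0, b / 127.5 - 1.0)


def nearest_cluster(pixel: Color, clusters: Sequence[Color]) -> int:
    """Return the index of the cluster with the smallest squared distance."""
    def squared_distance(a: Color, b: Color) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(clusters)), key=lambda i: squared_distance(pixel, clusters[i]))


def quantize(pixels: Sequence[Tuple[int, int, int]], clusters: Sequence[Color]) -> List[int]:
    """Map a flat sequence of 8-bit RGB pixels to palette indices (token ids)."""
    return [nearest_cluster(normalize(p), clusters) for p in pixels]


if __name__ == "__main__":
    print(quantize([(0, 0, 0), (255, 255, 255), (250, 10, 5)], PALETTE))
```

With a step like this, the model only ever sees a sequence of integer ids, which is why it can reuse the GPT-2 style `generate()` loop.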

README.md

docs/source/model_doc/imagegpt.rst

src/transformers/__init__.py

src/transformers/models/imagegpt/__init__.py

src/transformers/models/imagegpt/modeling_imagegpt.py

vitormm44 · 2021-11-11T11:40:49Z

I'm trying to use the notebook you linked, but I'm receiving an import error:

ImportError: cannot import name 'ImageGPTForCausalLM' from 'transformers' (/usr/local/lib/python3.7/dist-packages/transformers/__init__.py)

apeguero1 · 2021-11-15T22:04:19Z

Hi @NielsRogge! I just realized the original Image GPT architecture uses a root mean square instead of a standard deviation in its layer normalization so I believe it should look like this. Looks like there's a paper on it too haha :D

Interestingly, image generation shows only subtle visible differences when using the same random seed, but it's probably better to stick with the original TensorFlow implementation, I guess?
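The difference described in this comment can be sketched in plain Python. The code below is a simplified illustration, assuming no learnable weight or bias, and the function names are hypothetical rather than taken from the PR. Standard layer normalization subtracts the mean and divides by the standard deviation, while the RMS variant only divides by the root mean square.

```python
import math
from typing import List


def layer_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """Standard layer norm: subtract the mean, divide by the standard deviation."""
    mean = sum(x) / len(x)
    variance = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(variance + eps) for v in x]


def rms_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """RMS norm: divide by the root mean square, without centering."""
    mean_square = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_square + eps) for v in x]


if __name__ == "__main__":
    hidden = [1.0, 2.0, 3.0, 4.0]
    print(layer_norm(hidden))  # centered around zero
    print(rms_norm(hidden))    # rescaled, but not centered
```

When the input already has zero mean, the two functions return the same values, which may be part of why the generated images differ only subtly.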

NielsRogge · 2021-11-16T15:26:20Z

Thanks, I've updated it.

LysandreJik

Thanks for working on this, @NielsRogge, this is in great shape.

It would be great to add some more explicit code examples; then I'm happy to play more with it regarding the API.
I've added a comment regarding the feature extractor. Pinging @sgugger as he might have a different opinion regarding how it should best be done.
Ideally, the image in the docs would be stored in a dataset rather than in the repository.

docs/source/model_doc/imagegpt.rst

src/transformers/models/imagegpt/feature_extraction_imagegpt.py

src/transformers/models/imagegpt/modeling_imagegpt.py

LysandreJik

The image is still in the docs/source/imgs. It is unfortunate, as now that it has been merged to the repository it will always weigh down the repository.

I don't have a problem with the rest of the API, thanks for completing the examples and adapting the feature extractor.

LysandreJik · 2021-11-19T08:36:12Z

README.md

@@ -249,6 +249,7 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
 1. **[GPT Neo](https://huggingface.co/transformers/model_doc/gpt_neo.html)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
 1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (from Facebook) released with the paper [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
 1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (from Berkeley) released with the paper [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
+1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.


Suggested change

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

LysandreJik · 2021-11-19T08:36:30Z

README_ko.md

@@ -247,6 +247,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
 1. **[GPT-J](https://huggingface.co/transformers/model_doc/gptj.html)** (from EleutherAI) released in the repository [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) by Ben Wang and Aran Komatsuzaki.
 1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (from Facebook) released with the paper [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
 1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (from Berkeley) released with the paper [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
+1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.


Suggested change

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

LysandreJik · 2021-11-19T08:36:56Z

README_zh-hans.md

@@ -271,6 +271,7 @@ conda install -c huggingface transformers
 1. **[GPT-J](https://huggingface.co/transformers/model_doc/gptj.html)** (来自 EleutherAI) 伴随论文 [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) 由 Ben Wang and Aran Komatsuzaki 发布。
 1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (来自 Facebook) 伴随论文 [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) 由 Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed 发布。
 1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (来自 Berkeley) 伴随论文 [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) 由 Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer 发布。
+1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (来自 OpenAI) 伴随论文 [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) 由 Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever 发布。


Suggested change

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (来自 OpenAI) 伴随论文 [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) 由 Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever 发布。

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (来自 OpenAI) 伴随论文 [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) 由 Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever 发布。

LysandreJik · 2021-11-19T08:37:14Z

README_zh-hant.md

@@ -283,6 +283,7 @@ conda install -c huggingface transformers
 1. **[GPT-J](https://huggingface.co/transformers/model_doc/gptj.html)** (from EleutherAI) released with the paper [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/) by Ben Wang and Aran Komatsuzaki.
 1. **[Hubert](https://huggingface.co/transformers/model_doc/hubert.html)** (from Facebook) released with the paper [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447) by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
 1. **[I-BERT](https://huggingface.co/transformers/model_doc/ibert.html)** (from Berkeley) released with the paper [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321) by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
+1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.


Suggested change

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixes](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

1. **[ImageGPT](https://huggingface.co/transformers/master/model_doc/imagegpt.html)** (from OpenAI) released with the paper [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.

NielsRogge requested review from sgugger and patil-suraj November 2, 2021 14:12

NielsRogge force-pushed the add_image_gpt branch from a506d95 to afa5b0c Compare November 2, 2021 17:15

sgugger reviewed Nov 2, 2021

NielsRogge force-pushed the add_image_gpt branch from beb25de to 419096e Compare November 15, 2021 12:02

LysandreJik reviewed Nov 17, 2021

NielsRogge force-pushed the add_image_gpt branch from 613f001 to fd9470d Compare November 18, 2021 13:18

NielsRogge added 20 commits November 18, 2021 14:58

First draft

71efcde

More improvements

7de8c12

Improve conversion script

8c006c5

Fix init weights for layer norm

2225fef

Fix correct model for conversion script

82f2c32

Don't tie input and output embeddings

467b4ba

Add print statements for debugging

cd7b470

Add print statements for debugging

ee2fbf6

Fix vocab size of model

4e9488e

Improve documentation, remove fast tokenizer

168ebca

Add ImageGPTForImageClassification, improve docs

cc3aca6

Fix docs issue

6f9f9b8

Set verbosity level back to info

9559425

Improve tests

eec09bb

Fix tests and add figure

e14ec0b

Delete tokenizer file

654bdd1

Remove ImageGPTTokenizer from init files

43188da

Remove ImageGPTLayer from init files

5343a38

Remove ImageGPT tokenizer from docs

4bc67f5

First draft of ImageGPTFeatureExtractor

2386968

NielsRogge added 13 commits November 18, 2021 14:58

Fix typo

c184fd6

Fix bug

d5485e8

More improvements

1ac93e9

Apply suggestions from code review, add tests for feature extractor

1d134b8

Fix layernorm

91bde3e

Update save_pretrained method

12600ec

Fix issue

f64cb91

Make all tests of ImageGPTFeatureExtractor pass

dfb97ec

Update code examples

faf7e21

Rename model inputs to pixel_values

3783c52

Improve code examples

faabdfe

Update init_weights to post_init

1235e2a

Fix post_init

a67120d

NielsRogge force-pushed the add_image_gpt branch from fd9470d to a67120d Compare November 18, 2021 14:08

NielsRogge merged commit da36c55 into huggingface:master Nov 18, 2021

LysandreJik reviewed Nov 19, 2021

NielsRogge mentioned this pull request Nov 19, 2021

[ImageGPT] Small fixes #14460

Merged
