
Mini DALL-E

Mini DALL-E for text-to-image generation, based on the models provided by the CompVis library and the lucidrains repo.

| Framework | Domain | Model | Datasets | Tasks | Training | Inference | Reference |
|-----------|--------|-------|----------|-------|----------|-----------|-----------|
| PyTorch | Multimodal | Mini DALL-E | COCO 2017 | Text-to-image generation | Min. 4 IPUs (POD4) required | Min. 4 IPUs (POD4) required | Zero-Shot Text-to-Image Generation |

Instructions summary

  1. Install and enable the Poplar SDK (see Poplar SDK setup)

  2. Install the system and Python requirements (see Environment setup)

  3. Download the COCO 2017 dataset (see Dataset setup)

Poplar SDK setup

To check if your Poplar SDK has already been enabled, run:

 echo $POPLAR_SDK_ENABLED

If no path is printed, follow these steps:

  1. Navigate to your Poplar SDK root directory

  2. Enable the Poplar SDK with:

cd poplar-<OS version>-<SDK version>-<hash>
. enable.sh
  3. Additionally, enable PopART with:
cd popart-<OS version>-<SDK version>-<hash>
. enable.sh
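
As an illustration only, the full sequence from the SDK root directory might look like the following. The directory names below are hypothetical; use the ones from your own SDK download:

# Hypothetical SDK directory names, for illustration only
cd poplar-ubuntu_20_04-3.3.0+1234-abc123
. enable.sh
# Go back up to the SDK root before entering the PopART directory
cd ../popart-ubuntu_20_04-3.3.0+1234-abc123
. enable.sh
# This should now print a path
echo $POPLAR_SDK_ENABLED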

More detailed instructions on setting up your Poplar environment are available in the Poplar quick start guide.

Environment setup

To prepare your environment, follow these steps:

  1. Create and activate a Python3 virtual environment:
python3 -m venv <venv name>
source <venv path>/bin/activate
  2. Navigate to the Poplar SDK root directory

  3. Install the PopTorch (PyTorch) wheel:

cd <poplar sdk root dir>
pip3 install poptorch...x86_64.whl
  4. Navigate to this example's root directory

  5. Install the Python requirements:

pip3 install -r requirements.txt
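
To quickly confirm that the PopTorch wheel installed correctly into the virtual environment, you can print its version. This is only a sanity check and assumes the wheel installed without errors:

# Should print a PopTorch version matching your Poplar SDK
python3 -c "import poptorch; print(poptorch.__version__)"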


More detailed instructions on setting up your PyTorch environment are available in the PyTorch quick start guide.

Dataset setup

COCO 2017

Download the COCO 2017 dataset from the official source, via Kaggle, or with the script we provide:

bash utils/download_coco_dataset.sh

Also download and unzip the labels:

curl -L https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip -o coco2017labels.zip && unzip -q coco2017labels.zip -d '<dataset path>' && rm coco2017labels.zip

Disk space required: 26 GB

.
├── LICENSE
├── README.txt
├── annotations
├── images
├── labels
├── test-dev2017.txt
├── train2017.cache
├── train2017.txt
├── val2017.cache
└── val2017.txt

3 directories, 7 files
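
As a quick sanity check of the download, you can count the images in each split. This assumes the usual images/train2017 and images/val2017 layout, which is not shown explicitly in the listing above; the expected counts are the standard COCO 2017 split sizes:

# Count images in the training and validation splits
find <dataset path>/images/train2017 -type f | wc -l   # expect 118287
find <dataset path>/images/val2017 -type f | wc -l     # expect 5000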

The annotations in this dataset, along with this website, belong to the COCO Consortium and are [licensed under a Creative Commons Attribution 4.0 License](https://creativecommons.org/licenses/by/4.0/legalcode). The COCO Consortium does not own the copyright of the images. Use of the images must abide by the [Flickr Terms of Use](https://www.flickr.com/creativecommons/). The users of the images accept full responsibility for the use of the dataset, including but not limited to the use of any copies of copyrighted images that they may create from the dataset. Full terms and conditions and more information are available on the [Terms of Use](https://cocodataset.org/#termsofuse) page.

Then some preprocessing is required:

python process_captions.py

The caption files are generated in ./data/COCO/train2017_captions.
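
A quick way to confirm that the preprocessing step produced output (assuming the script writes one caption file per training image, which is not stated explicitly above):

# List a few generated caption files and count them
ls ./data/COCO/train2017_captions | head -n 5
ls ./data/COCO/train2017_captions | wc -l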

Running and benchmarking

To run a tested and optimised configuration and to reproduce the performance shown on our performance results page, use the examples_utils module (installed automatically as part of the environment setup) to run one or more benchmarks. The benchmarks are provided in the benchmarks.yml file in this example's root directory.

For example:

python3 -m examples_utils benchmark --spec <path to benchmarks.yml file>

Or to run a specific benchmark in the benchmarks.yml file provided:

python3 -m examples_utils benchmark --spec <path to benchmarks.yml file> --benchmark <name of benchmark>

For more information on using the examples-utils benchmarking module, please refer to the README.
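
The names accepted by --benchmark are the entries defined in benchmarks.yml. Assuming the benchmark names are top-level keys in that file (an assumption about the spec layout, not something stated above), you can list them with:

# Assumes benchmark names are top-level keys in benchmarks.yml
grep -E '^[A-Za-z0-9_.-]+:' benchmarks.yml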

Other features

Text-to-image generation

To run text-to-image generation after training with a checkpoint file (for example, dalle_799.pt):

python generate.py --dalle_path ./output/ckpt/dalle_799.pt --text "A plate of food has potatoes and fruit." --outputs_dir ./output --bpe_path models/bpe/bpe_yttm_vocab.txt

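To generate images for several prompts in one go, a simple shell loop over the same command works. The prompts below are arbitrary examples; only the flags shown above are used:

# Generate images for each prompt using the same checkpoint
for prompt in "A plate of food has potatoes and fruit." "A red double decker bus on a city street."; do
    python generate.py --dalle_path ./output/ckpt/dalle_799.pt \
        --text "$prompt" \
        --outputs_dir ./output \
        --bpe_path models/bpe/bpe_yttm_vocab.txt
done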

Licensing

This application is licensed under the MIT license. Please see the LICENSE file in this directory for full details of the license conditions.

The following files are created by Graphcore and are licensed under the MIT License:

  • configs.yaml
  • README.md
  • requirements.txt
  • run_train.sh
  • log.py
  • tests/cpu_ipu_test.py
  • data/process_trainset.py

The following files include code derived from this repo, which is licensed under the MIT license:

  • args.py
  • train.py
  • generate.py
  • models/__init__.py
  • models/attention.py
  • models/dalle.py
  • models/loader.py
  • models/tokenizer.py
  • models/transformer.py
  • models/vae.py
  • bpe/bpe_simple_vocab_16e6.txt.gz
  • bpe/bpe_yttm_vocab.txt

External packages:

  • taming-transformers, youtokentome, pyyaml, wandb, pytest and pytest-pythonpath are licensed under the MIT License
  • torchvision is licensed under the BSD 3-Clause License