<p align="center">
<a href="https://github.com/jina-ai/dalle"><img src="https://res.cloudinary.com/startup-grind/image/upload/c_fill,dpr_2.0,f_auto,g_xy_center,h_650,q_auto:good,w_1440,x_w_mul_0.5,y_h_mul_0.0/v1/gcs/platform-data-dsc/event_banners/banner_8XSoAdr.png?md" alt="DALL·E Flow: A Human-in-the-Loop workflow for creating HD images from text" width="60%"></a>
<br>
</p>


<b>A Human-in-the-Loop<sup><a href="https://en.wikipedia.org/wiki/Human-in-the-loop">?</a></sup> workflow for creating HD images from text</b>

**🎓 TUM Workshop version**

[![GitHub Repo stars](https://img.shields.io/github/stars/jina-ai/dalle-flow?style=social)](https://github.com/jina-ai/dalle-flow) [![Slack](https://img.shields.io/badge/Slack-2.8k-blueviolet?logo=slack&amp;logoColor=white&style=flat-square)](https://slack.jina.ai) [![GitHub last commit (branch)](https://img.shields.io/github/last-commit/jina-ai/dalle-flow/main)](https://colab.research.google.com/github/jina-ai/dalle-flow/blob/main/client.ipynb)

Using the client is super easy. The following steps are best run in a Jupyter notebook or [Google Colab](https://colab.research.google.com/github/jina-ai/dalle-flow/blob/tum-workshop/client.ipynb).

The only dependencies you need are [DocArray](https://github.com/jina-ai/docarray) and [Jina](https://github.com/jina-ai/jina). Because DocArray is already included in Jina, you only need to install `jina`.

> 💁‍♂️ On Google Colab, you will need to restart the kernel after the install.

In [None]:
!pip install jina

We have provided a demo server for you to play with:

In [None]:
server_url = 'grpc://tum-workshop.jina.ai:51005'

### Step 1: Generate via DALL·E Mega + GLIDE 3

Now let's define the prompt:

In [None]:
prompt = 'an oil painting of a humanoid robot playing chess in the style of Matisse'

**🎓 TUM Workshop**: Need some hints and suggestions for the prompt? Check out these tips:
 - [A Guide to Writing Prompts for Text-to-image AI](https://docs.google.com/document/d/17VPu3U2qXthOpt2zWczFvf-AH6z37hxUbvEe1rJTsEc/edit?usp=sharing)
 - [CLIP Templates](https://docs.google.com/document/d/1j2IAumYz4iZopOTAAOcCUKbFXP0jHK8mRgD4NLFKkaw/edit?usp=sharing)

Let's submit it to the server and visualize the results:

In [None]:
%%time

from docarray import Document

da = Document(text=prompt).post(server_url, parameters={'num_images': 8}).matches

da.plot_image_sprites(fig_size=(10,10), show_index=True)

Here we generate 16 candidates: 8 from DALL·E Mega (upper rows) and 8 from GLIDE3 XL (lower rows), as set by `num_images`. This takes about 2 minutes; use a smaller value if that is too long for you.
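
To make the index layout concrete, here is a minimal sketch that maps a candidate index to the model that likely produced it. It assumes the matches come back in the same order as they appear in the sprite (DALL·E Mega first, GLIDE3 XL second), which the text above implies but does not state outright. `candidate_source` is a hypothetical helper for illustration, not part of DocArray or Jina.

```python
def candidate_source(index: int, num_images: int) -> str:
    """Return the model assumed to have produced the candidate at `index`.

    Assumption: the first `num_images` matches come from DALL·E Mega and
    the next `num_images` come from GLIDE3 XL.
    """
    total = 2 * num_images  # each model contributes `num_images` candidates
    if not 0 <= index < total:
        raise IndexError(f'index {index} is outside 0..{total - 1}')
    return 'DALL·E Mega' if index < num_images else 'GLIDE3 XL'


# With num_images=8 there are 16 candidates, indexed 0..15.
print(candidate_source(3, num_images=8))
print(candidate_source(12, num_images=8))
```

If the ordering assumption holds, halving `num_images` to 4 would give 8 candidates in total, with indices 0 to 3 from DALL·E Mega.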

### Step 2: Select and refine via GLIDE 3

Notice the number in the top-left corner of each image? Select the one you like the most and get a better view. The cell below uses index 3 as an example; of course, you may think differently:

In [None]:
fav_id = 3

fav = da[fav_id]

fav.display()

Now let's submit the selected candidate to the server for diffusion.

In [None]:
%%time

diffused = fav.post(server_url, parameters={'skip_rate': 0.6, 'num_images': 9}, target_executor='diffusion').matches

diffused.plot_image_sprites(fig_size=(10,10), show_index=True)

This will give 9 images based on the given image. You may allow the model to improvise more by giving `skip_rate` a near-zero value, or force it to stay close to the given image with a near-one value. The whole procedure takes about 1 minute.
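
The trade-off can be sketched as two parameter sets. The helper below is hypothetical (it is not part of DocArray or Jina) and assumes `skip_rate` is meant to lie between 0 and 1, which the "near-zero" and "near-one" wording suggests but does not guarantee.

```python
def diffusion_parameters(skip_rate: float, num_images: int = 9) -> dict:
    """Build the `parameters` dict for the diffusion step.

    Assumption: valid `skip_rate` values lie in [0, 1]; the server may
    enforce different limits.
    """
    if not 0.0 <= skip_rate <= 1.0:
        raise ValueError(f'skip_rate must be in [0, 1], got {skip_rate}')
    if num_images < 1:
        raise ValueError(f'num_images must be at least 1, got {num_images}')
    return {'skip_rate': skip_rate, 'num_images': num_images}


# Near zero: the model improvises more.
creative = diffusion_parameters(0.1)
# Near one: the result stays close to the selected image.
faithful = diffusion_parameters(0.9)

print(creative)
print(faithful)
```

Either dict can be passed as the `parameters=` argument of the `fav.post(...)` call shown earlier in this step.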


### Step 3: Select and upscale via SwinIR

Select the image you like the most, and give it a closer look:


In [None]:
dfav_id = 2

fav = diffused[dfav_id]

fav.display()

Finally, submit to the server for the last step: upscaling to 1024 x 1024px.

> 💁‍♂️ This step will take ~30s.

In [None]:
%%time

fav = fav.post(f'{server_url}/upscale')
fav.display()

> 💁‍♂️ On Google Colab, this image may render at exactly the same size as before, but it is already 1024 x 1024. Right-click the image and copy or save it to see for yourself.
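
If you would rather check the size programmatically, the sketch below reads the width and height from the header of a saved file using only the standard library. It assumes the image was saved as a PNG, which may not match what your browser produces. `png_size` is a hypothetical helper, and the demo uses a hand-built header rather than a real upscaled image.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def png_size(data: bytes) -> tuple:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type 'IHDR',
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b'IHDR':
        raise ValueError('not a PNG file')
    width, height = struct.unpack('>II', data[16:24])
    return width, height


# Demo: a minimal hand-built PNG header that declares a 1024 x 1024 image.
header = PNG_SIGNATURE + struct.pack('>I', 13) + b'IHDR' + struct.pack('>II', 1024, 1024)
print(png_size(header))  # prints (1024, 1024)
```

For a real file, something like `png_size(open('upscaled.png', 'rb').read())` should work, where `upscaled.png` is a placeholder for wherever you saved the image.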

That's it! It is _the one_. If you are not satisfied, repeat the procedure. By the way, DocArray is a powerful and easy-to-use data structure for unstructured data. It is super productive for data scientists who work in cross-/multi-modal domains. To learn more about DocArray, [please check out the docs](https://docarray.jina.ai).

**🎓 TUM Workshop**: After this step, your artwork is automatically uploaded to our contest board! Look up at the big screen, or find all artworks here: http://tum-workshop.jina.ai:3001