gen-text-and-image

Use the Cohere text generation API and the Replicate image generation API together to generate text-and-image pairs

(c) Andrew Wren 2022. MIT Licence, but note that to run the code you need API keys and must agree to comply with Cohere's and Replicate's terms.

A lake

Portrait of the actress Marlene Dietrich

Mouse in the snow

Description

This project uses a transformer text generator, primed with an appropriate prompt, to generate a very short piece of text. This text is then passed to a Stable Diffusion image generator, which generates a corresponding image.

PROMPT

-gen->

'Portrait of the actress Marlene Dietrich'

-gen->

In my experience, about two-thirds of the images are 'good' in terms of both image quality and fit to the text. This is, of course, a subjective judgement! See the 30 examples in the /examples/ directory, which I have split into Good and Bad sub-directories.
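The two-stage flow described above can be sketched as follows. This is a minimal sketch, not the code in main.py: `generate_text` and `generate_image` are hypothetical callables standing in for the Cohere and Replicate API calls, and the `fake_text` and `fake_image` stubs exist only so that the example runs without API keys.

```python
from typing import Callable, List, Tuple


def generate_pair(
    prompt: str,
    generate_text: Callable[[str], str],
    generate_image: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Run the two stages: prompt -> short text -> image."""
    # Stage 1: the text generator turns the prompt into a very short text.
    text = generate_text(prompt).strip()
    # Stage 2: only the generated text (not the prompt) is sent to the image generator.
    image = generate_image(text)
    return text, image


def generate_pairs(
    prompt: str,
    generate_text: Callable[[str], str],
    generate_image: Callable[[str], bytes],
    n: int = 10,
) -> List[Tuple[str, bytes]]:
    """Generate n text-and-image pairs from the same prompt."""
    return [generate_pair(prompt, generate_text, generate_image) for _ in range(n)]


# Hypothetical stubs standing in for the real API calls (assumptions, for illustration only).
def fake_text(prompt: str) -> str:
    return " Mouse in the snow\n"


def fake_image(text: str) -> bytes:
    return ("image for: " + text).encode("utf-8")


if __name__ == "__main__":
    for text, image in generate_pairs("PROMPT", fake_text, fake_image, n=2):
        print(text, len(image))
```

Passing the generators in as arguments keeps the flow independent of any particular client library version, so the stubs can be swapped for real API wrappers once keys are available.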

How to use

Start by looking at the text-and-image pairs in the /examples/ directory, which I have divided, subjectively, into Good and Bad sub-directories.
These pairs were generated using main.py.

To generate more yourself:

(1) After cloning the repo, create a conda environment by running conda env create -f environment.yml in the repo root directory.

(2) Get API keys from Cohere and Replicate. Enter them in /lib/api_key_template.py as indicated, and then rename the file so that it no longer has the template name (presumably to /lib/api_key.py; check the imports in the code to confirm). DO NOT add the renamed file to git or otherwise share it, as it now contains your API keys.
NOTE that, depending on how many generations you run, you may need to pay for usage of these APIs.

(3) Run the Python program main.py to generate ten text-and-image pairs, which can then be found in the /generate/ directory. To change the number generated, pass -n <number> as a command-line argument.
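For illustration, a `-n` option with a default of ten could be parsed as below. This is a sketch using the standard-library `argparse` module; it is an assumption about how such a flag might be handled, not a copy of what main.py does, and `parse_args` is a hypothetical helper name.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command line; argv=None means 'use sys.argv'."""
    parser = argparse.ArgumentParser(description="Generate text-and-image pairs.")
    parser.add_argument(
        "-n",
        type=int,
        default=10,  # ten pairs when -n is not given, as described in step (3)
        help="number of text-and-image pairs to generate",
    )
    return parser.parse_args(argv)


# Explicit argument lists are used here so the example does not depend on sys.argv.
print(parse_args([]).n)           # prints 10
print(parse_args(["-n", "3"]).n)  # prints 3
```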

Things to try

The prompt used (PROMPT in /lib/settings.py) is important for the quality of the pairs generated. A very simple prompt like 'Draw an artistic picture of' generates more Bad pairs than Good pairs. The prompt currently used has a rate of about 1 Bad to every 2 Good. "Mileage may vary"; your views on Good and Bad may differ from mine! Can you find further improved prompts?
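When comparing prompts, it can help to turn the Good/Bad split into a single number. The sketch below counts files in Good and Bad sub-directories and reports the Good fraction; a ratio of 1 Bad to every 2 Good corresponds to 2/3. The function names are hypothetical, and the sketch assumes that every pair contributes the same number of files to its sub-directory, which may not match how the repository stores pairs.

```python
from pathlib import Path


def good_fraction(n_good: int, n_bad: int) -> float:
    """Fraction of pairs judged Good, e.g. 2 Good and 1 Bad gives 2/3."""
    total = n_good + n_bad
    if total == 0:
        raise ValueError("no examples to score")
    return n_good / total


def count_files(directory: Path) -> int:
    """Count regular files directly inside a directory."""
    return sum(1 for path in directory.iterdir() if path.is_file())


def score_examples(root: Path, good_name: str = "Good", bad_name: str = "Bad") -> float:
    """Good fraction for a directory laid out as root/Good and root/Bad."""
    return good_fraction(count_files(root / good_name), count_files(root / bad_name))


if __name__ == "__main__":
    print(round(good_fraction(2, 1), 3))
```

Sorting the output of each candidate prompt into its own Good and Bad sub-directories and calling `score_examples` on each gives a like-for-like comparison, subject to the same subjective judgement as before.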

Try also using more Cohere tools. I found that training a classifier on Good and Bad examples, and then using this classification as a filter, helped poor prompts, but getting a good prompt was more effective. Visualisation of embeddings associated with the text does not suggest a particularly close relationship between these (mean over token) embeddings but maybe that will change if you improve the prompt further. See /resources/ for classifier.py and embedder.py which may help.
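The two ideas in this paragraph can be sketched without any API calls. `filter_good` keeps only the texts that a classifier labels Good, and `mean_over_tokens` averages per-token embedding vectors into one vector per text. Both are minimal sketches under stated assumptions: `classify` is a hypothetical callable standing in for a trained classifier, `toy_classify` is a deliberately crude stub, and neither function is taken from classifier.py or embedder.py.

```python
from typing import Callable, Iterable, List, Sequence


def filter_good(
    texts: Iterable[str],
    classify: Callable[[str], str],
    keep_label: str = "Good",
) -> List[str]:
    """Keep only the texts whose predicted label equals keep_label."""
    return [text for text in texts if classify(text) == keep_label]


def mean_over_tokens(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of equal-length token vectors into a single vector."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    dim = len(token_embeddings[0])
    if any(len(vector) != dim for vector in token_embeddings):
        raise ValueError("all token embeddings must have the same length")
    count = len(token_embeddings)
    return [sum(vector[i] for vector in token_embeddings) / count for i in range(dim)]


# Hypothetical stand-in classifier: labels texts of three or more words as Good.
def toy_classify(text: str) -> str:
    return "Good" if len(text.split()) >= 3 else "Bad"


if __name__ == "__main__":
    print(filter_good(["A lake", "Mouse in the snow"], toy_classify))
    print(mean_over_tokens([[1.0, 2.0], [3.0, 4.0]]))
```

Filtering happens on the text before any image is generated, so rejected texts cost no image-generation calls.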


AndrewWren/gen-text-and-image
