Documentation request: Text prompt primer #222

@MrDrMcCoy

One of the great difficulties for new users trying to coax AI image generators into producing what they imagine is constructing the text prompt. Users are often told that they can simply describe what they want to see and the generator will produce it. In my experience, many of the phrases I put into prompts are either ignored or misunderstood. I suspect this is partly my own fault, and that a bit of documentation would improve the situation.

What I'm looking for is a document that details the following:

  1. What phrases are understood for artistic styling? For example, would it understand things like pixel art, line drawing, comic book, pulp art, cad model, salvador dali, or solarpunk?
  2. What phrases are understood for characters and objects? For example, would it understand things like garden gnome, maelstrom, mineral vein, power armor, coat of arms, or soldering iron?
  3. What phrases are understood for verbs and modifiers? For example, would it understand things like opening, fallow, holding, jaundiced, ugly, angry, vibrating, dutch angle, or defenestrating?
  4. What phrases are understood for image output settings? For example, would it understand things like 16:9, UHD, or 5-bit color?
  5. Is there any significance to grammar or ordering of phrases?
  6. What are the practical limits of how many and how specific one's phrases might be?
  7. Are there any hidden modifier phrases that the processing engine watches for?
  8. What happens when you repeat phrases? For example, woman shining a flashlight in an alley, but the flashlight shines darkness instead of light.
  9. What grammar or phrases will be ignored by the processing engine?
  10. Are there any grammatical patterns that tend to lead to better results?
  11. Other tips and tricks for how to talk to the machine.

As with all current iterations of natural language processing, the engine's ability to interpret what we write is far more limited than a human's. Humans therefore need to know the boundaries of what the system can interpret, so that we can talk to the machine in terms it will understand. Hopefully a document that details these things will improve the usability, quality, and utility of tools like this.
