Styles: Fooocus V2 & Substyles
Like the other styles, the Fooocus V2 style adds words to the prompt to help refine it. Each of the other styles is a fairly short word list, and every word in the list is added to the prompt. So the standard styles add to the prompt in a fixed, predictable way.
In contrast, Fooocus V2 uses GPT-2 artificial intelligence (AI) to expand the prompt by adding hopefully appropriate words from a dictionary of at least 642 entries. In addition, randomness is introduced using the same seed used by the generative model. When Fooocus V2 is used, the generative process actually draws on two AI sources: the base model itself and the GPT-2 prompt expander.
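The idea of a seeded expansion that is restricted to a word list can be sketched in a few lines of Python. This is a minimal illustration, not the Fooocus implementation: the function expand_prompt, the TOY_SCORES table and TOY_WORD_LIST are all hypothetical, and the made-up scores stand in for what GPT-2 would actually produce. It only shows that words outside the list can never be chosen and that the same seed reproduces the same expansion.

```python
import math
import random

def expand_prompt(prompt, scores, allowed_words, seed, n_words=3):
    """Append n_words sampled from allowed_words, weighted by toy scores."""
    rng = random.Random(seed)  # the same seed gives the same expansion
    # Keep only candidates found in the substyle word list; every other
    # word is masked out, as if its score were minus infinity.
    candidates = {w: s for w, s in scores.items() if w in allowed_words}
    added = []
    for _ in range(min(n_words, len(candidates))):
        words = list(candidates)
        weights = [math.exp(candidates[w]) for w in words]  # softmax numerators
        choice = rng.choices(words, weights=weights, k=1)[0]
        added.append(choice)
        del candidates[choice]  # do not repeat a word
    return ", ".join([prompt] + added)

# Hypothetical scores standing in for GPT-2 output; not real model values.
TOY_SCORES = {"dramatic": 2.0, "elegant": 1.5, "epic": 1.2,
              "blurry": 3.0, "cute": 0.5, "sharp": 1.0}
# Hypothetical substyle word list; "blurry" is deliberately not in it.
TOY_WORD_LIST = {"dramatic", "elegant", "epic", "cute", "sharp"}

a = expand_prompt("Doukhobor adult man", TOY_SCORES, TOY_WORD_LIST, seed=1234)
b = expand_prompt("Doukhobor adult man", TOY_SCORES, TOY_WORD_LIST, seed=1234)
print(a)
print(a == b)  # True: the same seed reproduces the same expansion
```

Even though "blurry" has the highest toy score, it is never added because it is not in the word list. Changing the seed may change which listed words are picked.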
The images shown in this article all use the HyperFlux5 preset with the same seed, using the prompt "Doukhobor adult man", derived from the wildcards "ethnic grownup". At the top of the page is the control image that does not use Fooocus V2 - or any other styles.
Fooocus V2 can use alternative word lists called substyles, which make its prompt expansion more suitable for certain genres. Substyle selection is straightforward. If Fooocus V2 has been selected from the Styles tab, a Fooocus V2 Substyle dropdown menu overlays the bottom of the styles list to enable substyle selection. A few presets are also associated with specific substyles. For example, the FaeTastic and Magick presets default to using Spellcasting Druid.
Fooocus V2 can be effective in building a powerful image, but it does have some drawbacks. Its purpose is to strengthen the main prompt, yet there are times when it takes the prompt in an undesirable direction. It can also reduce the effectiveness of other styles, although Fooocus V2 is always added to the prompt after any other styles so that its influence is somewhat reduced.
The original or Default substyle was designed to be generic, and often that is exactly what an image needs.
Default uses 642 words, which represent 834 tokens. A token is a unique numeric code used in the GPT-2 master dictionary. For example, a word such as "towering" will cause the AI to search for all available variations on this word and supply them as tokens for processing. In the case of "towering" there are four variations in the master dictionary: "tower", "Tower", "towers" and "Towers". Including the actual word in the substyle, that makes a total of five possible tokens. Most likely the listed word, "towering", will be chosen, but the other possibilities enrich the lexical resources available to the AI. This brings more diversity and flexibility to the prompt building process.
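The word-to-token relationship described above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the numeric codes are invented rather than real GPT-2 token IDs, the variant table is filled in by hand with the "towering" example from the text, and "epic" is a hypothetical word with no variations. How the real expander finds variations is not shown here.

```python
# Hypothetical excerpt of a master dictionary: token text -> numeric code.
# The codes are made up and are not real GPT-2 token IDs.
TOY_MASTER_DICTIONARY = {
    "towering": 101, "tower": 102, "Tower": 103, "towers": 104, "Towers": 105,
    "epic": 201,
}

# Hand-written variant table: listed word -> variations in the dictionary.
# "towering" follows the example in the text; "epic" is assumed to have none.
TOY_VARIANTS = {
    "towering": ["tower", "Tower", "towers", "Towers"],
    "epic": [],
}

def tokens_for(word):
    """Return the listed word plus its variations, mapped to numeric codes."""
    candidates = [word] + TOY_VARIANTS.get(word, [])
    return {w: TOY_MASTER_DICTIONARY[w]
            for w in candidates if w in TOY_MASTER_DICTIONARY}

# Hypothetical two-word substyle list.
TOY_SUBSTYLE_WORDS = ["towering", "epic"]

total = sum(len(tokens_for(w)) for w in TOY_SUBSTYLE_WORDS)
print(len(tokens_for("towering")))  # 5: the listed word plus four variations
print(total)                        # 6: two words yield six possible tokens
```

This is why the token count of a substyle is larger than its word count: the 642 words of Default yield 834 tokens, an average of about 1.3 tokens per word.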
As found in the image log, this is the Fooocus V2 Expansion for the image above:
"Doukhobor adult man, dramatic light, gorgeous, amazing, delicate, elegant, highly detailed, complex, artistic, sharp focus, fine detail, professional, winning, singular, best, creative, cool, unique, awesome, epic, stunning, cute, perfect, colorful, illuminated, pretty, attractive, pure, smart, focused, positive, loving, excellent"
The words in the Default substyle that are associated with lighting would be incompatible with the IC-Light feature. The Unlit substyle was created to be compatible with IC-Light, and it is the only substyle that IC-Light can use.
Unlit was created from Default by removing all lighting references, as well as other words that are unlikely to have an influence on generation, such as "trustworthy" - how would generative AI depict "trustworthy"?
In their place, many words with more influence on generation were added, bringing the total to 700 words, representing 951 tokens. Unlit provides a richer and potentially more interesting generic V2 substyle and, as shown in the image above, it does not actually exclude lighting objects.
"Doukhobor adult man, urban exotic city, gorgeous atmosphere, perfect, cinematic, dynamic, wide, sharp focus, highly detailed, fine detail, intricate, elegant, complex, epic color, ambient, dramatic sky, professional, composed, vivid, beautiful, emotional, shiny, marvelous, awesome, creative, pure, wonderful, fancy, breathtaking, symmetry, amazing"
Lovecraftian Damnation includes words from the vocabulary of author H. P. Lovecraft, and to a lesser extent Edgar Allan Poe and Ambrose Bierce. This substyle sets the stage for Lovecraft-inspired insanity and horror, with even Arkham, Cerberus and Cthulhu putting in an appearance. And on Poe's behalf, the raven may say a word or two!
Like all the custom substyles, Lovecraftian Damnation contains 700 words, in this case representing 984 tokens. This is the expansion associated with the Lovecraftian image above:
"Doukhobor adult man, Lovecraftian Damnation, cosmic horror, monster, eerie, evil, mist, magic, dark, gloomy, crystal, golden, detailed, very complex, extremely advanced fog, wizard, witch, cinematic light, incredible colors, painted, fine detail, polished, intricate, powerful, stone, artifact, glowing, strong, epic, forest"
The Sci-Fi substyle emphasizes space science fiction and an optimistic future. Star Trek is represented by words such as "borg", "Klingon", "shuttle", "Starfleet", "starship", "transporter" and "Vulcan" (and most of the real planets in the solar system). Star Wars is represented by "blaster", "droid", "Jedi", "lightsaber", "master", "pods" and "Sith".
However, Sci-Fi avoids specific character references like Kirk or Skywalker. It is likely better to put character names in the prompt if you want them, rather than have them pop up at random.
Sci-Fi contains 984 tokens, and the image above was guided by this GPT-2 expansion:
"Doukhobor adult man, Sci-Fi, surreal, majestic, dramatic, intricate, elegant, sharp, digital engine, steam, epic composition, android, detailed, cinematic, dynamic, full moon, New fascinating city, star, planet, metal, shiny, colorful, light, dark, determined, creative, amazing, beautiful, attractive, delicate, cool, charming"
In many ways the opposite of Lovecraftian Damnation, Spellcasting Druid emphasizes the light side of natural magic, wild mystery and supernatural beauty - all words in the list. This is the richest of all the substyles, with its 700 words expressed in 1008 tokens. The image above represents this prompt expansion:
"Doukhobor adult man, Spellcasting Druid, Gold Ring, decorative magic crystal, field background, beautiful dynamic dramatic bright rainy atmosphere, cinematic lightning, storm, night, spring, realistic, golden, detailed, clay, focused, extremely, highly, majestic, artistic, nature, scenic, wool, rough, painted, pattern, light, legend, lore, wide open"
While Sci-Fi represents the imagined future of technology, Victorian Steampunk depicts the technology of a reimagined past. This is not only the age of innovation and discovery: the dystopian London of Dickens, with its poverty and disease, is included as well. From its collection of 985 tokens, this is the Victorian Steampunk prompt expansion for the image above:
"Doukhobor adult man, Victorian Steampunk, oil engineering intricate, metal, brass, leather, steam, gears, cart, glowing, dramatic, lens, detailed, excellent composition, cinematic dystopian vibrant colors, dynamic serious atmosphere, artistic, very inspirational, stunning, inspiring, creative, realistic, gorgeous, color, highly detail, incredible quality, inspired, rich vivid colorful"
And to round out this picture album, I could not omit a Lovecraftian image of a monster, this one with a different seed: