# Character Consistency: Face-Preserving Image Generation

> Generate images that maintain a consistent character identity across multiple outputs using reference images and face embeddings.

---

## Overview

Character consistency lets you anchor generated images to a reference face or character, ensuring the same person appears across portraits, expressions, full-body shots, and scene illustrations. AgentOS supports three levels of consistency via the `consistencyMode` parameter:

| Mode | Strength | Use Case |
|------|----------|----------|
| `'strict'` | 0.85–0.9 | Avatar expression sheets, emotion variants. Face must match exactly. |
| `'balanced'` | 0.6 | Full-body shots, different angles. Recognizable but allows natural variation. |
| `'loose'` | 0.3 | "Inspired by" generations. Style/mood carries over, face may drift. |

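
One way to read the strength column is as a lookup from mode to a reference-strength range. The sketch below is illustrative only: `ConsistencyMode`, `MODE_STRENGTH`, and `strengthFor` are hypothetical names, and taking the midpoint of the `'strict'` range is an assumption rather than documented AgentOS behavior.

```typescript
// Hypothetical names; not confirmed parts of the AgentOS API.
type ConsistencyMode = 'strict' | 'balanced' | 'loose';

// Strength ranges copied from the mode table. Only 'strict' is documented
// as a range; the other two modes have a single value.
const MODE_STRENGTH: Record<ConsistencyMode, { min: number; max: number }> = {
  strict: { min: 0.85, max: 0.9 },
  balanced: { min: 0.6, max: 0.6 },
  loose: { min: 0.3, max: 0.3 },
};

// Returns the midpoint of the documented range. This choice is an
// assumption; a real provider adapter may pick any value in the range.
function strengthFor(mode: ConsistencyMode): number {
  const { min, max } = MODE_STRENGTH[mode];
  return (min + max) / 2;
}
```

Keeping the range rather than a single number makes it possible for a provider adapter to pick its own value inside the documented bounds.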
## Provider Support

| Provider | Mechanism | Models |
|----------|-----------|--------|
| **Replicate** | PuLID (strict), Flux image input (balanced/loose) | `zsxkib/pulid`, `black-forest-labs/flux-dev` |
| **Fal** | IP-Adapter | `fal-ai/flux/dev` |
| **SD-Local** | ControlNet + IP-Adapter extension | Any SD 1.5 / SDXL checkpoint |
| OpenAI | Not supported (consistency fields are ignored without error) | n/a |
| Stability | Not supported (consistency fields are ignored without error) | n/a |

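
The "not supported" rows describe providers that accept the request and ignore the consistency fields. A minimal sketch of that behavior, assuming a simplified request shape, might look like the following. The provider id strings other than `'replicate'`, the `ImageRequest` interface, and `stripUnsupportedConsistency` are hypothetical and only serve the illustration.

```typescript
// Hypothetical provider ids; only 'replicate' appears verbatim in this document.
type ProviderId = 'replicate' | 'fal' | 'sd-local' | 'openai' | 'stability';

// Simplified, assumed request shape for illustration.
interface ImageRequest {
  provider: ProviderId;
  prompt: string;
  referenceImageUrl?: string;
  consistencyMode?: 'strict' | 'balanced' | 'loose';
}

// Providers listed in the table as having a consistency mechanism.
const CONSISTENCY_PROVIDERS: ReadonlySet<ProviderId> = new Set<ProviderId>([
  'replicate',
  'fal',
  'sd-local',
]);

// For unsupported providers, return a copy without the consistency fields
// instead of throwing, so the same call site works with every provider.
function stripUnsupportedConsistency(req: ImageRequest): ImageRequest {
  if (CONSISTENCY_PROVIDERS.has(req.provider)) {
    return req;
  }
  return { provider: req.provider, prompt: req.prompt };
}
```

Dropping the fields rather than raising an error keeps calling code provider-agnostic, at the cost of silently losing identity preservation on those providers.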
## Basic Usage

```typescript
import { generateImage } from '@framers/agentos';

// Generate a consistent expression variant
const result = await generateImage({
  provider: 'replicate',
  prompt: 'Portrait of the character smiling warmly, soft lighting',
  referenceImageUrl: 'https://storage.example.com/character-neutral.png',
  consistencyMode: 'strict',
});
```

When `consistencyMode` is `'strict'` and no model is explicitly set, the Replicate provider auto-selects `zsxkib/pulid` for maximum face consistency.

## Fields Reference

### `referenceImageUrl`

URL or base64 data URI of the reference character image. Each provider maps this to its native mechanism:

- **Replicate (PuLID):** `main_face_image` input
- **Replicate (standard Flux):** `image` input with `image_strength`
- **Fal:** `ip_adapter_image` body field
- **SD-Local:** ControlNet `input_image` with IP-Adapter preprocessor

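
The mapping in the list above can be sketched as a small lookup. This is an illustration under assumptions: `Mechanism`, `REFERENCE_FIELD`, and `buildReferenceInput` are hypothetical names, and the SD-Local payload is flattened here even though a real ControlNet request nests its arguments.

```typescript
// Hypothetical mechanism keys, one per bullet in the list above.
type Mechanism = 'replicate-pulid' | 'replicate-flux' | 'fal' | 'sd-local';

// Provider-native field names as given in the list above.
const REFERENCE_FIELD: Record<Mechanism, string> = {
  'replicate-pulid': 'main_face_image',
  'replicate-flux': 'image',
  fal: 'ip_adapter_image',
  'sd-local': 'input_image', // simplified: real ControlNet args are nested
};

// Builds the provider-specific input fragment for a reference image.
function buildReferenceInput(
  mechanism: Mechanism,
  referenceImageUrl: string,
  strength?: number,
): Record<string, string | number> {
  const input: Record<string, string | number> = {
    [REFERENCE_FIELD[mechanism]]: referenceImageUrl,
  };
  // Only the standard Flux path is documented as taking image_strength.
  if (mechanism === 'replicate-flux' && strength !== undefined) {
    input['image_strength'] = strength;
  }
  return input;
}
```

For example, `buildReferenceInput('replicate-flux', url, 0.6)` yields an object with `image` and `image_strength` keys, while the PuLID path yields only `main_face_image`.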
### `faceEmbedding`

Optional 512-dimensional vector from InsightFace or an equivalent model. The `AvatarPipeline` uses it for drift detection: after generating each image, the pipeline extracts the face embedding from the output and compares it to this anchor via cosine similarity. Images whose similarity falls below the threshold (default 0.6) are regenerated.

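
The drift check described above reduces to a cosine-similarity comparison against a threshold. The following is a minimal sketch, not the pipeline's actual code: `cosineSimilarity` and `hasDrifted` are hypothetical helper names, and embeddings are assumed to be plain number arrays of equal length.

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error('Embeddings must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// True when the candidate face is too far from the anchor and the image
// should be regenerated. The 0.6 default mirrors the documented threshold.
function hasDrifted(
  anchor: readonly number[],
  candidate: readonly number[],
  threshold = 0.6,
): boolean {
  return cosineSimilarity(anchor, candidate) < threshold;
}
```

A caller would pass the stored `faceEmbedding` as `anchor` and the embedding extracted from each generated image as `candidate`, then regenerate whenever `hasDrifted` returns `true`.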
### `consistencyMode`

Controls how aggressively the provider preserves the reference identity:

```typescript
import { generateImage } from '@framers/agentos';

// Reference image shared by all three calls (same URL as in Basic Usage)
const neutralPortrait = 'https://storage.example.com/character-neutral.png';

// Strict: for expression sheets where faces must match
await generateImage({
  prompt: 'Character looking angry, dramatic lighting',
  referenceImageUrl: neutralPortrait,
  consistencyMode: 'strict', // PuLID is auto-selected when the provider is Replicate
});

// Balanced: for full-body shots
await generateImage({
  prompt: 'Full body shot of the character walking through a market',
  referenceImageUrl: neutralPortrait,
  consistencyMode: 'balanced',
});

// Loose: for "inspired by" mood pieces
await generateImage({
  prompt: 'Abstract portrait in the style of the character',
  referenceImageUrl: neutralPortrait,
  consistencyMode: 'loose',
});
```

## AvatarPipeline Integration

The `AvatarPipeline` uses consistency modes per stage:

| Stage | Mode | Rationale |
|-------|------|-----------|
| `neutral_portrait` | none | This *is* the anchor; no reference exists yet |
| `face_embedding` | none | Extraction, not generation |
| `expression_sheet` | `'strict'` | Facial identity must match across all emotions |
| `animated_emotes` | `'strict'` | Same character in motion |
| `full_body` | `'balanced'` | Body proportions can vary; face should be recognizable |
| `additional_angles` | `'balanced'` | 3/4 and profile views naturally differ from frontal |

```typescript
import { AvatarPipeline } from '@framers/agentos/media/avatar';

// `faceService` and `imageGenerator` are assumed to be constructed elsewhere.
const pipeline = new AvatarPipeline(faceService, imageGenerator);
const result = await pipeline.generate({
  characterId: 'hero_001',
  identity: {
    displayName: 'Kael Stormwind',
    ageBand: 'young_adult',
    faceDescriptor: 'sharp jawline, green eyes, short dark hair, small scar above left eyebrow',
  },
  generationConfig: {
    baseModel: 'black-forest-labs/flux-dev',
    provider: 'replicate',
  },
  // Stages to run, in order; each stage uses the mode from the table above
  stages: ['neutral_portrait', 'face_embedding', 'expression_sheet', 'full_body'],
});
```

## Choosing the Right Mode

- **Avatars and expression sheets:** Always `'strict'`. The face is the product.
- **Scene illustrations with known characters:** `'balanced'`. The character should be recognizable, but scene composition matters more.
- **Style exploration and mood boards:** `'loose'`. The reference influences the vibe, not the pixels.
- **No reference at all:** Omit `referenceImageUrl` entirely. `referenceImageUrl`, `faceEmbedding`, and `consistencyMode` are all optional.

## Related

- [Image Generation](./IMAGE_GENERATION.md): Provider-agnostic generation API
- [Style Transfer](./STYLE_TRANSFER.md): Transfer visual aesthetics between images
- [Image Editing](./IMAGE_EDITING.md): Img2img, inpainting, upscaling