# SD-Turbo: Finding a reasonable CLAMP range for exploring the prompt embedding space
This notebook uses the diverse Parti Prompts (P2) set (about 1.6k entries) to measure the range of values that prompt embeddings take.

In [1]:
import torch
from evolutionary_prompt_embedding.image_creation import SDPromptEmbeddingImageCreator
from evolutionary_prompt_embedding.utils import clamp_range_from_parti, clamp_range_per_entry

In [2]:
creator = SDPromptEmbeddingImageCreator(batch_size=1, inference_steps=1)


Loaded StableDiffusionPipeline {
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.25.0",
  "_name_or_path": "stabilityai/sd-turbo",
  "feature_extractor": [
    null,
    null
  ],
  "image_encoder": [
    null,
    null
  ],
  "requires_safety_checker": false,
  "safety_checker": [
    null,
    null
  ],
  "scheduler": [
    "diffusers",
    "EulerDiscreteScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}


## Simple min-max range for prompt_embeds

In [3]:
test1 = clamp_range_from_parti(creator, lambda_accessor=lambda x: x.prompt_embeds)
print("Range for prompt_embeds: ", test1)

Token indices sequence length is longer than the specified maximum sequence length for this model (84 > 77). Running this sequence through the model will result in indexing errors


The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['as a beacon over rolling blue hills']
Range for prompt_embeds:  (-10.234375, 15.6484375)


For the SD-Turbo model, the values of prompt_embeds across the Parti prompts fall in roughly (-10.2, 15.6).
Values outside this range can still be used, but clamping to it restricts the search space to the region the text encoder actually produced for these prompts.
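
As a minimal sketch of how the scalar range could be applied during a search, the snippet below clips a candidate embedding with `torch.clamp`. The helper name `clamp_embedding`, the random stand-in tensor, and the shape `(1, 77, 1024)` are assumptions for illustration; they are not part of `evolutionary_prompt_embedding`.

```python
import torch

# Global bounds copied from the output printed above.
CLAMP_MIN, CLAMP_MAX = -10.234375, 15.6484375


def clamp_embedding(embeds: torch.Tensor) -> torch.Tensor:
    """Clip every entry of an embedding tensor to the global range."""
    return torch.clamp(embeds, min=CLAMP_MIN, max=CLAMP_MAX)


# Stand-in for a mutated prompt embedding (hypothetical shape and values);
# the scale of 20 deliberately pushes many entries outside the range.
candidate = torch.randn(1, 77, 1024) * 20.0
clamped = clamp_embedding(candidate)

print("before:", candidate.min().item(), candidate.max().item())
print("after: ", clamped.min().item(), clamped.max().item())
```

A single pair of bounds is cheap to apply, but it is loose for most entries, which is the motivation for the per-entry bounds in the next section.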

## More detailed CLAMP range for each entry in the tensor

In [4]:
min_tensor, max_tensor = clamp_range_per_entry(creator, lambda_accessor=lambda x: x.prompt_embeds)
print("prompt_embeds:")
print("Min tensor: ", min_tensor)
print("Max tensor: ", max_tensor)
torch.save(min_tensor.to('cpu'), 'sd_turbo_min_tensor.pt')
torch.save(max_tensor.to('cpu'), 'sd_turbo_max_tensor.pt')

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['as a beacon over rolling blue hills']
prompt_embeds:
Min tensor:  tensor([[[-0.3132, -0.4475, -0.0082,  ...,  0.2544, -0.0325, -0.2959],
         [-2.4746, -2.2949, -3.8438,  ..., -1.5371, -3.7773, -2.3926],
         [-3.6895, -3.8496, -3.6426,  ..., -2.2070, -3.2363, -4.2031],
         ...,
         [-0.9478, -2.6660, -1.1367,  ..., -1.1592, -1.2041, -1.0410],
         [-1.0312, -2.6875, -1.0918,  ..., -1.3984, -1.3584, -0.6152],
         [-1.1230, -3.0918, -3.0703,  ..., -1.6553, -1.5303, -0.3486]]],
       device='mps:0', dtype=torch.float16)
Max tensor:  tensor([[[-0.3132, -0.4475, -0.0082,  ...,  0.2544, -0.0325, -0.2959],
         [ 3.3125,  1.7324,  1.9639,  ...,  2.4590,  2.7031,  2.3008],
         [ 2.9746,  2.4570,  2.4492,  ...,  3.1328,  3.8105,  2.9688],
         ...,
         [ 2.3242,  0.4470,  0.9487,  ...,  0.5933,  1.2305,  1.3486],
         [ 2.2305,  0.9614,  1.0

In [5]:
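# Per-entry width of the range (max - min); cast to float32 so the sum does not overflow float16.
diff_tensor = (max_tensor - min_tensor).to(dtype=torch.float32)
# The printed value is the sum of all per-entry widths, i.e. the total size of the clamped search space.
print("Value range: ", diff_tensor.sum())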

Value range:  tensor(332369., device='mps:0')
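
The per-entry bounds can be used the same way as the scalar range, because `torch.clamp` also accepts tensors for `min` and `max`. The sketch below first builds elementwise bounds from a few random stand-in embeddings and then clamps a candidate against them. It illustrates the idea only and is not the implementation of `clamp_range_per_entry`; `update_bounds`, the small shape `(1, 4, 8)`, and the sample data are assumptions.

```python
import torch


def update_bounds(bounds, embeds):
    """Fold one embedding into running elementwise (min, max) bounds."""
    if bounds is None:
        return embeds.clone(), embeds.clone()
    lo, hi = bounds
    return torch.minimum(lo, embeds), torch.maximum(hi, embeds)


torch.manual_seed(0)

# Stand-ins for the embeddings of individual prompts (hypothetical data).
samples = [torch.randn(1, 4, 8) for _ in range(16)]

bounds = None
for sample in samples:
    bounds = update_bounds(bounds, sample)
min_t, max_t = bounds

# With the files saved above, the bounds could instead be loaded via
# torch.load('sd_turbo_min_tensor.pt') and torch.load('sd_turbo_max_tensor.pt').
candidate = torch.randn(1, 4, 8) * 5.0
clamped = torch.clamp(candidate, min=min_t, max=max_t)

# Mean width per entry is easier to compare across models than the sum.
width = (max_t - min_t).to(torch.float32)
print("mean per-entry width:", width.mean().item())
```

In the output printed above, the first row of the min and max tensors is identical, so per-entry clamping pins those entries to a single value. If prompt_embeds has shape (1, 77, 1024), which is an assumption not shown in this notebook, the sum of 332369 corresponds to a mean width of roughly 4.2 per entry, far narrower than the global range of about 25.9.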
