# What's In A Name? Investigating Bias And Identity In Text-To-Image Models

## Introduction

This work investigates the relationships between names and biases in text-to-image models. Some narrower topics of interest include:

* constructing measures for characterizing the degree to which a model has learned the relationship between a name and a specific entity
* measuring biases associated with names that are not tightly bound to identities
* characterizing how biases of these kinds can be composed into new identities
* characterizing the "stickiness" of specific attribute biases to identities/names
* investigating the potential for leveraging these measures to monitor training progress for personalization fine-tuning and general-purpose pre-training


* biases in names
* known identities ("strong entities")
* overfitted identities (memorized images)
  * Mona Lisa?

* Just investigating biases, not making normative statements

* associated hypotheses
* out of scope hypotheses (future work)

## Hypotheses

1. The more weakly a name carries learned biases, the more closely the distribution of attributes observed when prompting with that name approximates the model's global prior over those attributes. Consequence: low-bias names can be used to probe model bias (assuming such names can be identified).

2. "Pseudo-identities" can be composed in text space through prompt engineering, similar to how Celeb Basis composes identities in token space.

3. "Identity strength" can be quantified with respect to:
  * the rigidity of the distribution of attributes generated by a given (fixed) prompt
  * the flexibility with which the identity can be prompted into new scenarios (e.g. "sks shaking hands with albert einstein")

4. Identity evolves following a particular pattern over the course of training, and monitoring this pattern can be used to evaluate model fit/grokking (e.g. for LoRA).
  
5. Identity/bias-preserving representations can be crystallized from pre-trained representations, and leveraging these should permit measuring identity strength/rigidity more efficiently than with "un-crystallized" pre-trained representations.
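The "rigidity" notion in hypothesis 3 could be made concrete in several ways. The sketch below shows one naive option, assuming each generated image has already been mapped to an embedding vector: the mean cosine similarity over all distinct pairs of images generated from the same prompt. The function name and the synthetic data are illustrative assumptions, not the measure used in this work.

```python
import numpy as np


def mean_pairwise_cosine_similarity(embeddings: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows.

    Hypothetical rigidity score: values near 1 mean the images generated
    for a prompt are nearly identical in embedding space.
    """
    x = np.asarray(embeddings, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two embeddings")
    # Normalize rows so that dot products are cosine similarities.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = x @ x.T
    # Strict upper triangle: each unordered pair once, no self-pairs.
    upper = np.triu_indices(x.shape[0], k=1)
    return float(sims[upper].mean())


if __name__ == "__main__":
    # Synthetic stand-ins for image embeddings (assumption, not real data).
    rng = np.random.default_rng(0)
    center = rng.normal(size=64)
    rigid = center + 0.1 * rng.normal(size=(128, 64))  # tightly clustered
    loose = center + 2.0 * rng.normal(size=(128, 64))  # widely spread
    print("rigid:", mean_pairwise_cosine_similarity(rigid))
    print("loose:", mean_pairwise_cosine_similarity(loose))
```

Under this reading, a "strong identity" would score higher than a name whose generations vary widely. The score only captures the first bullet of hypothesis 3; flexibility under new scenarios would need a separate measure.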

## Methodology [wip]

* choosing names
* generating images

## Experiments

* calibrating minimum # images
* investigating which embedding representation to use
* experimenting with summary statistics

## Insights

* presumptive phases of bias formation
* correlation between age bias and popularity of names by birth year
* strength of ethnic bias reflects undersampled data class
  * strong biases hint at low data cardinality for the given class


## Discussion


## Future Work


## References

* CLIP
* Stable Diffusion
* Stable Biases
* Celeb Basis

## Appendix

* Calibration plots
* DINO-vits stuff (assuming DINO-vitg shows in main report)


# Generating Images

## Collecting Names

A dataset of images was generated to interrogate the biases associated with the prompts. For an initial small dataset, we used names identified as useful in the Celeb Basis experiment (`celebs.txt`), so as to start from a set of names known to carry representational biases of varying strengths. Rather than using the whole name, we take only the first name to get a distribution over bias strengths. This procedure produced a list of 107 first names, which was used to prototype our experimental methods.
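The first-name extraction step can be sketched as follows. This is a minimal illustration that assumes `celebs.txt` holds one full name per line, that the first whitespace-separated token is the first name, and that names are lowercased and de-duplicated; none of these details are confirmed by the text above, and `first_names` is a hypothetical helper.

```python
def first_names(full_names):
    """Return lowercased, de-duplicated first names in first-seen order.

    Assumes each entry is a full name whose first whitespace-separated
    token is the first name (an assumption about the file format).
    """
    seen = {}  # dicts preserve insertion order, so this de-duplicates stably
    for line in full_names:
        tokens = line.strip().split()
        if not tokens:
            continue  # skip blank lines
        seen.setdefault(tokens[0].lower(), None)
    return list(seen)


if __name__ == "__main__":
    # With the real file, one might instead pass:
    #   pathlib.Path("celebs.txt").read_text(encoding="utf-8").splitlines()
    sample = ["Albert Einstein", "Oprah Winfrey", "Bruce Lee", "Bruce Willis", ""]
    print(first_names(sample))
```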

Images were generated using Stable Diffusion v1.4 via Hugging Face's Diffusers library. For each name, 128 images were generated using the prompt: `"a photo of {name}, portrait photography, full color, face full frame"`. The number 128 was arrived at by spot-checking the convergence of naive "identity strength" measures and determining that 128 images provided a reasonable baseline for low-variance measurement.
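A generation loop along these lines might look like the sketch below. The prompt template is taken from the text; the checkpoint id `CompVis/stable-diffusion-v1-4`, half precision, the batch size, the seed, and the helper names are assumptions made for illustration, not the settings used to produce the dataset.

```python
PROMPT_TEMPLATE = "a photo of {name}, portrait photography, full color, face full frame"


def build_prompt(name: str) -> str:
    """Fill the prompt template from the text with a first name."""
    return PROMPT_TEMPLATE.format(name=name)


def load_pipeline(model_id="CompVis/stable-diffusion-v1-4", device="cuda"):
    """Load Stable Diffusion v1.4 (model id and fp16 are assumptions)."""
    # Heavy imports are deferred so the prompt helper works without a GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    return pipe.to(device)


def generate_images(pipe, name, n_images=128, batch_size=8, seed=0, device="cuda"):
    """Generate n_images PIL images for one name, in batches."""
    import torch

    generator = torch.Generator(device=device).manual_seed(seed)
    prompt = build_prompt(name)
    images = []
    while len(images) < n_images:
        n = min(batch_size, n_images - len(images))
        out = pipe(prompt, num_images_per_prompt=n, generator=generator)
        images.extend(out.images)
    return images


if __name__ == "__main__":
    print(build_prompt("oprah"))
    # Requires a GPU plus the torch and diffusers packages:
    #   pipe = load_pipeline()
    #   images = generate_images(pipe, "oprah")
```

Loading the pipeline once and reusing it across all names avoids repeating the checkpoint load for every name.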

Future work: In the experimentation that motivated this work, it was observed that 4-8 images are generally sufficient for an expert prompter to distinguish "strong identities" from "pseudo-identities". 128 images should therefore be interpreted as an extremely conservative upper bound, and we look forward to future work that identifies or constructs measures able to distinguish identity strength and bias rigidity in prompts using fewer images.
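One way to check how many images a given measure needs is to recompute it on random subsets of the images and watch how its spread shrinks as the subset grows. The sketch below does this for a naive mean pairwise cosine similarity over image embeddings. Both the measure and the subsampling scheme are assumptions chosen for illustration; they are not necessarily the spot check that led to 128.

```python
import numpy as np


def mean_pairwise_cosine_similarity(embeddings):
    """Naive identity-strength stand-in: mean cosine similarity over pairs."""
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    upper = np.triu_indices(x.shape[0], k=1)
    return float((x @ x.T)[upper].mean())


def subsample_spread(embeddings, sizes=(4, 8, 16, 32, 64, 128), n_trials=50, seed=0):
    """Map each subset size k to (mean, std) of the measure over random subsets."""
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    results = {}
    for k in sizes:
        if k > n:
            continue  # cannot draw k distinct images from fewer than k
        scores = [
            mean_pairwise_cosine_similarity(
                embeddings[rng.choice(n, size=k, replace=False)]
            )
            for _ in range(n_trials)
        ]
        results[k] = (float(np.mean(scores)), float(np.std(scores)))
    return results


if __name__ == "__main__":
    # Synthetic embeddings standing in for 128 images of one name.
    rng = np.random.default_rng(1)
    fake = rng.normal(size=64) + rng.normal(size=(128, 64))
    for k, (mean, std) in subsample_spread(fake).items():
        print(f"k={k:4d}  mean={mean:.4f}  std={std:.4f}")
```

The smallest subset size at which the standard deviation falls below a chosen tolerance gives a rough lower bound on the number of images needed for that measure.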

# Measuring Strength of Identities in Prompts

... tried DINO

... determined CLIP was better because...



![CLIP UMAP Identities](./clip-ftw.png)

```python
import pandas as pd

df_names = pd.read_csv('2023-09-25_names_ranked.csv')
df_names.head()
```

| Unnamed: 0 | name | similarity@128 |
|---|---|---|
| 0 | rihanna | 0.922737 |
| 1 | kanye | 0.921163 |
| 2 | beyoncé | 0.905713 |
| 3 | oprah | 0.904635 |
| 4 | madonna | 0.901482 |

```python
df_names.tail()
```

| Unnamed: 0 | name | similarity@128 |
|---|---|---|
| 102 | lionel | 0.700712 |
| 103 | robin | 0.700462 |
| 104 | dante | 0.696849 |
| 105 | bruce | 0.69661 |
| 106 | roger | 0.683294 |
