# Brand Diversity & Persona Bias Analysis

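This notebook analyzes:
1. The overall brand distribution (mean score per brand)
2. Persona-wise brand exposure (mean score per persona and brand)
3. Brand concentration per persona, used as a measure of brand bias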

Data source:
- `persona_brand_tone_part_final.csv`


In [ ]:
import pandas as pd
import numpy as np

DATA_PATH = '../data/persona_brand_tone_part_final.csv'
df = pd.read_csv(DATA_PATH)

print('Rows:', len(df))
df.head()

## 1. Overall Brand Distribution

In [ ]:
brand_dist = (
    df.groupby('brand')['score']
      .mean()
      .sort_values(ascending=False)
)

brand_dist
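
The mean score per brand ranks brands but does not summarize diversity in a single number. As a minimal sketch, assuming scores are non-negative and that summed score is a reasonable proxy for exposure, Shannon entropy over brand shares gives one such summary. The `toy` frame and the names `brand_share`, `entropy`, and `effective_brands` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the notebook's df; column names follow the cells above.
toy = pd.DataFrame({
    "brand": ["A", "A", "B", "C"],
    "score": [0.5, 0.5, 0.5, 0.5],
})

# Share of total score attributed to each brand (sums to 1).
brand_share = toy.groupby("brand")["score"].sum()
brand_share = brand_share / brand_share.sum()

# Shannon entropy of the shares; zero-share brands are dropped to avoid log(0).
nonzero = brand_share[brand_share > 0]
entropy = float(-(nonzero * np.log(nonzero)).sum())

# Effective number of equally weighted brands implied by the entropy.
effective_brands = float(np.exp(entropy))
print(brand_share)
print("effective brands:", round(effective_brands, 3))
```

An effective brand count close to the actual number of brands suggests even exposure; a much smaller value suggests a few brands dominate.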

## 2. Persona-wise Brand Exposure

In [ ]:
persona_brand = (
    df.groupby(['persona_id', 'brand'])['score']
      .mean()
      .reset_index()
)

persona_brand.head()

## 3. Brand Concentration per Persona

In [ ]:
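def concentration_index(scores):
    """Herfindahl-Hirschman Index (HHI) of one persona's brand scores.

    Assumes non-negative scores; returns NaN when they sum to zero.
    """
    total = scores.sum()
    if total == 0:
        return np.nan  # shares are undefined; avoids a misleading index of 0
    p = scores / total  # each brand's share of the persona's total score
    return (p ** 2).sum()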

persona_concentration = (
    persona_brand.groupby('persona_id')['score']
    .apply(concentration_index)
    .sort_values(ascending=False)
)

persona_concentration
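
The raw index has a lower bound of 1/n for a persona with n brands, so personas that cover different numbers of brands are not directly comparable. The following is a hedged sketch of a normalized variant that rescales the index to the range 0 to 1. `normalized_hhi` and the `toy` data are hypothetical names introduced here, and scores are assumed to be non-negative.

```python
import pandas as pd

def normalized_hhi(scores: pd.Series) -> float:
    """Rescale HHI to [0, 1] so personas with different brand counts compare."""
    n = len(scores)
    total = scores.sum()
    if n < 2 or total == 0:
        return float("nan")  # undefined for a single brand or all-zero scores
    p = scores / total
    hhi = float((p ** 2).sum())
    return (hhi - 1 / n) / (1 - 1 / n)

# Toy data: p1 and p2 are perfectly even, p3 leans toward brand A.
toy = pd.DataFrame({
    "persona_id": ["p1", "p1", "p2", "p2", "p2", "p2", "p3", "p3"],
    "brand": ["A", "B", "A", "B", "C", "D", "A", "B"],
    "score": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 1.0],
})

toy_norm = toy.groupby("persona_id")["score"].apply(normalized_hhi)
print(toy_norm)
```

In this toy data the raw index would be 0.5 for `p1` and 0.25 for `p2` even though both are perfectly even; the normalized value is 0 for both, and only `p3` shows concentration.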

## 4. Interpretation Guide

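- The concentration index ranges from 1/n (score spread evenly across a persona's n brands) to 1 (all score on a single brand); a higher value indicates stronger brand bias
- Use the index to decide where to: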
  - apply brand caps
  - adjust persona-brand weights
  - tune resampling temperature
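
The notebook does not show how caps or temperature are applied, so the following is only a sketch under stated assumptions: sampling weights are taken to be proportional to `score ** (1 / temperature)`, and a cap is enforced by clipping weights and redistributing the excess proportionally. `temper_weights`, `cap_weights`, and `hhi` are hypothetical helpers, not functions from the original pipeline.

```python
import numpy as np

def temper_weights(scores, temperature):
    """Weights proportional to score ** (1 / temperature); assumes scores >= 0."""
    scores = np.asarray(scores, dtype=float)
    w = scores ** (1.0 / temperature)
    return w / w.sum()

def cap_weights(weights, cap):
    """Clip each weight at `cap` and hand the excess to uncapped brands."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    if cap * len(w) < 1:
        raise ValueError("cap is too low for the weights to sum to 1")
    capped = np.zeros(len(w), dtype=bool)
    while True:
        over = w > cap + 1e-12
        if not over.any():
            return w
        capped |= over
        w[capped] = cap
        free = ~capped
        remaining = 1.0 - cap * capped.sum()
        free_total = w[free].sum()
        if free_total > 0:
            w[free] *= remaining / free_total  # keep relative proportions
        elif free.any():
            w[free] = remaining / free.sum()   # all-zero brands: split evenly

def hhi(w):
    """Herfindahl-Hirschman Index of a weight vector that sums to 1."""
    return float((np.asarray(w) ** 2).sum())

scores = [8.0, 1.0, 1.0]                # one dominant brand
base = temper_weights(scores, 1.0)      # temperature 1 leaves shares unchanged
flat = temper_weights(scores, 2.0)      # temperature > 1 flattens the shares
capped = cap_weights(base, 0.5)         # no brand may exceed 50%

print("base  ", base, hhi(base))
print("flat  ", flat, hhi(flat))
print("capped", capped, hhi(capped))
```

Both adjustments lower the concentration index for this persona. Temperature reshapes all brands smoothly, while a cap only touches brands above the threshold, so the choice depends on whether the ranking among smaller brands should be preserved.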
