# Evaluation of Character Quality

**Purpose**: 
The purpose of this notebook is to evaluate the quality of a character using a standardized methodology and reproducible metrics.

*Note: This evaluation is required for all new characters prior to approval for the DCS-SE core character set and must be repeated for each character following any DCS-SE model update or game addition.*

## Quality Evaluation

Characterizing role-playing models is essential for understanding which models can effectively embody complex characters and whether our representations are sufficiently robust for different applications and experimental contexts.

To evaluate character quality, we use a structured approach designed to ensure consistency and robustness across researchers and over time as new characters are added and assessed.

### Methodology

Open-ended engagement is conducted first using the Explore game. This provides an initial, unstructured assessment of the character’s behavior, capabilities, and overall feel, and allows the researcher to become familiar with the character and update the character sheet with salient observations.

Next, general trends and observations are documented, with reference to prior evaluations where applicable.

Then the character is evaluated along specific dimensions for each of the core games, including counterfactual coherence, long-horizon coherence, contextual generalization, adversarial robustness, and theory-of-mind performance.

Finally, a summary assessment of the character’s representational quality is added to the character sheet and submitted for review.

### Step 0: Setup

In [None]:
# Make sure dcs is installed and check its version
!dcs --version
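If the notebook should fail early when the CLI is missing, a small Python check can run alongside the version command. This is only a sketch: it assumes the executable is named `dcs`, as in the cell above, and the helper name `cli_available` is hypothetical rather than part of DCS-SE.

```python
import shutil


def cli_available(name: str = "dcs") -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


# Warn before any later cell tries to call the CLI.
if not cli_available():
    print("dcs was not found on PATH; install it before continuing.")
```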

### Step 1: Open-Ended Engagement

Use the code below to launch the Explore game mode with the character. Update the character sheet as needed to address any obvious issues.

Record observations about what the character role-plays effectively and where it falls short. Make fixes where possible, and note any elements that may require guardrails. Capture any additional observations relevant to the quality and fidelity of the character’s representation.

In [None]:
# Update the characters db with your new character
!dcs create character --file my_new_character.json

In [None]:
# Run the Explore game and play/engage with your character
!dcs run --game Explore

### Step 2: Note General Trends

Across all character evaluations, the following trends have been observed:

TODO: update these trends/observations with specific measurements (how long is too long? how much divergence is too much? for what types of models? etc.)

1. Longer character sheets require larger models to be role-played effectively.
2. Characters with higher "divergence" from normative human behavior require larger models and/or longer character sheets to be role-played effectively.


When engaging with the character, note any deviations from these trends.
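One possible starting point for attaching a measurement to the first trend is to record a word count for each character sheet. The sketch below makes no assumption about the sheet's schema: it walks any JSON-like structure and counts the words in every string value. The function name `count_words` and the example sheet contents are hypothetical.

```python
import json


def count_words(value) -> int:
    """Count whitespace-separated words in every string value of a JSON-like object."""
    if isinstance(value, str):
        return len(value.split())
    if isinstance(value, dict):
        # Only values are counted; keys are treated as field names.
        return sum(count_words(v) for v in value.values())
    if isinstance(value, list):
        return sum(count_words(v) for v in value)
    return 0  # numbers, booleans, and null carry no words


# Hypothetical sheet contents; a real sheet would be loaded with json.load().
example_sheet = json.loads(
    '{"name": "Character X", "traits": ["easily startled", "loyal"]}'
)
print(count_words(example_sheet))
```

Recording this number next to the model size used in each evaluation would make it possible to say later how long is too long for a given model.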

### Step 3: Evaluate Counterfactual Coherence

Counterfactual coherence is a key dimension of character quality, reflecting how consistently the character behaves in line with its defined personality, values, and backstory.

In this step, we use counterfactual scenarios to probe the character’s coherence.

*Note: Coherence (consistency in novel situations) is important because transformers tend to ....*

TODO: create this test/evaluation (values & boundaries, moral lines, codes, non-negotiables, recall (factual accuracy), ...)

How would character X respond to a novel scenario? For example:

“What would this character do if they found a lost child in a busy train station?”

Even though this never occurs in the script, the answer should align with the character’s values, personality, and prior behavior. If it does not align (for example, the character suddenly acts in a way that contradicts everything we know about them), the response is incoherent.
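Until the formal test exists, counterfactual probes could be logged in a simple structure and summarized as a rate. The following is a minimal sketch, not the DCS-SE evaluation: the `ScenarioResult` class, the `coherence_rate` function, and the ratings in the example are all hypothetical placeholders, and the `consistent` flag is assumed to come from a human rater.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    scenario: str     # the novel situation presented to the character
    consistent: bool  # rater's judgment: did the response fit the character sheet?
    notes: str = ""   # optional free-text observation


def coherence_rate(results: list[ScenarioResult]) -> float:
    """Fraction of scenarios in which the response was judged consistent."""
    if not results:
        raise ValueError("no scenario results to score")
    return sum(r.consistent for r in results) / len(results)


# Placeholder ratings for illustration only; they are not real measurements.
results = [
    ScenarioResult("Finds a lost child in a busy train station", True),
    ScenarioResult("Placeholder scenario B", False, "contradicted a stated value"),
    ScenarioResult("Placeholder scenario C", True),
    ScenarioResult("Placeholder scenario D", True),
]
print(f"Coherence rate: {coherence_rate(results):.2f}")
```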

### Step 4: Long-Horizon Coherence

Long-horizon coherence assesses how well the character maintains its defined personality and values over extended interactions. This is important because we need to know when to put guardrails in place to prevent character drift.

TODO: create this test/evaluation (extended conversations, multi-turn dialogues, etc.)
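As a placeholder for that test, drift could be located from per-turn ratings of an extended conversation. The sketch below assumes each turn has already been marked in character or not (by a rater or another process) and reports the turn where a sustained out-of-character run begins. The function name `first_drift_turn` and the `patience` parameter are hypothetical, and the ratings are illustrative.

```python
from typing import Optional, Sequence


def first_drift_turn(in_character: Sequence[bool], patience: int = 2) -> Optional[int]:
    """Return the 1-based turn where a run of `patience` consecutive
    out-of-character turns begins, or None if no such run occurs."""
    if patience < 1:
        raise ValueError("patience must be at least 1")
    run = 0  # length of the current out-of-character streak
    for turn, ok in enumerate(in_character, start=1):
        run = 0 if ok else run + 1
        if run == patience:
            return turn - patience + 1
    return None


# Illustrative ratings: a single slip at turn 3, sustained drift from turn 6.
ratings = [True, True, False, True, True, False, False, True]
print(first_drift_turn(ratings))  # prints 6
```

Requiring more than one consecutive miss keeps an isolated slip from being counted as drift; the right value for `patience` is a judgment call.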

### Step 5: Contextual Generalization

TODO: create this step

### Step 6: Adversarial Testing

TODO: create this step

### Step 7: Theory of Mind Tests

TODO: create this step

## Character Evaluation Summary

The following table should summarize the character's strengths and limitations relative to games and models along the dimensions evaluated above.

1. Counterfactual Coherence
    - Present hypotheticals far from the script but consistent with the character’s world, and check whether the character stays consistent in unseen scenarios.
    - Come up with unseen scenarios that are plausible and consistent with the character’s world but not explicitly in the character sheet.
        + Example: Character X tends to get overstimulated easily. Open with a scene at a construction or concert site, or continue a script in which someone bangs or talks loudly. Mark whether the model stays in character and reacts as expected.


2. Long-Horizon Coherence
    - Measure consistency over time in a long conversation or sequence. After how long does the character drift out of character?

3. Contextual Generalization
    - Present nearby but new situations: same relationship, new stakes. Test whether the model adapts knowledge across slightly different contexts.

4. Adversarial Testing
    - Use inputs designed to break the model or reveal inconsistencies. Offer tempting wrong choices or extreme circumstances, such as alternative instructions or a wildly out-of-character shortcut. See whether the actor can justify staying in role without collapsing into caricature.

5. Theory of Mind Tests
    - Check whether models infer others’ beliefs or emotions correctly. Give a scene where another character hides information, and ask how the evaluated character interprets it. This shows whether the actor tracks what its character would or wouldn’t know.


These results tell us which models can role-play the character sufficiently well and for how long (turns, words, etc.), so that we can set stop criteria for interactive applications, and which adversarial inputs to look out for.
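One conservative way to turn such measurements into a stop criterion is to cap a session just before the earliest drift seen across evaluation runs. This is a sketch under stated assumptions, not an established DCS-SE rule: the function name `suggest_turn_limit`, the `margin` parameter, and the example drift turns are all hypothetical.

```python
from typing import Optional, Sequence


def suggest_turn_limit(drift_turns: Sequence[Optional[int]], margin: int = 1) -> Optional[int]:
    """Suggest a maximum turn count: the earliest observed drift turn minus a margin.

    `drift_turns` holds, per evaluation run, the turn at which drift began,
    or None if that run never drifted. Returns None when no run drifted.
    """
    observed = [t for t in drift_turns if t is not None]
    if not observed:
        return None
    # Never suggest a negative limit; 0 means the pairing is not usable as-is.
    return max(0, min(observed) - margin)


# Illustrative drift turns from four hypothetical runs; one run never drifted.
print(suggest_turn_limit([12, None, 9, 15]))  # prints 8
```

Using the minimum is deliberately cautious; a percentile would be less sensitive to a single bad run but needs more runs to be meaningful.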