# D&D Encounter Difficulty Calibration
## Generating Data
**T. J. Johnson**

---

The goal of this notebook is to demonstrate how to use the encounter_calibration scripts to simulate the outcomes of many encounters of a given difficulty and save the results as CSV files.  For full details of the scripts (e.g., the reasoning behind the approach and the assumptions made), see the repo README.

### Generating configuration files

First, we'll generate YAML configuration files for each difficulty category.  The ```write_configuration``` utility function takes the name of the output configuration file as a required input, followed by several other inputs that set things like the number of player characters (PCs) and the average to-hit bonus for both the PCs and the enemies (see the docstring for more details).

For our purposes, we're only going to provide the target difficulty, because the default values are what we want for an initial exploration (five first-level PCs, with average values for PCs and enemies determined as detailed in the README).

In [1]:
#assumption is that the scripts are in the current directory or added to your PYTHONPATH

from encounter_utils import write_configuration

difficulties=['easy','medium','hard','deadly']

for difficulty in difficulties:
    write_configuration(config_file=f'{difficulty}_battle.yml',
                        difficulty=difficulty)

Let's take a look at one of the configurations to see what is inside.

In [2]:
import yaml

with open('easy_battle.yml','r') as cfile:
    config=yaml.safe_load(cfile)

config

{'difficulty': 'easy',
 'num_pcs': 5,
 'extras': 5,
 'pcs_levels': 1,
 'pcs_AC': 13,
 'pcs_ATK': 5,
 'pcs_HP': 8.5,
 'num_enemies': 0,
 'enemies_AC': 3,
 'enemies_ATK': 13,
 'enemies_HP': 0,
 'CRs': 'None',
 'initiative': 'None'}

### Running Many Simulations

Now that we've generated our configuration files, we can run many simulations for battles of each difficulty rating in parallel and save the output in CSV files (one per difficulty rating).  This is done via the ```generate_encounter_results``` function in the _run\_encounters_ script.  This function uses custom classes from the _encounter_ and _battle\_group_ scripts, along with the _pandas_ and _multiprocessing_ modules and a supplied YAML configuration file, to take care of everything.

An important question is "how many simulations?"  Let's first put the goal in context and then answer that question.

The great thing about tabletop roleplaying games is the player creativity in overcoming encounters, be they combat or social.  Trying to capture this creativity would require a far more sophisticated simulation.  However, some creative ideas succeed and some fail, so these outcomes will be 'in the tails.'  Therefore, we use a simple simulation to assess outcomes on average for each difficulty rating, with lofty ideas for future enhancements, and use our own, and crowd-sourced, experience to evaluate how the assumptions and simplifications hold up.

We plan to assess the difficulty ratings via descriptive statistics and plots, such as "How many party resources is such an encounter expected to require?"  A given difficulty rating encompasses a range of scenarios, so these questions will result in a range of values.  We will want enough simulations to have a good feeling for the core of that range of values, but also to explore what the tails look like, even though we expect that these simple simulations will not accurately capture all of the crazy possibilities real game play can sometimes result in.

With these considerations in mind, and some experience generating simulations for more controlled situations, we will opt for 10,000 simulations to start with.  Once we start digging into the simulated data, we can evaluate if this (somewhat arbitrary) choice needs to be revisited.
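
To put a rough number on what 10,000 simulations buys us, here is a minimal sketch of the binomial standard error of an estimated win rate.  The function name and the win rates are hypothetical, chosen only for illustration; they are not taken from the scripts or from any simulation results.

```python
import math


def win_rate_standard_error(win_rate: float, num_sims: int) -> float:
    """Binomial standard error of a win rate estimated from num_sims battles."""
    return math.sqrt(win_rate * (1.0 - win_rate) / num_sims)


# hypothetical true win rates, used only to illustrate the scaling
for win_rate in (0.5, 0.9, 0.99):
    for num_sims in (1_000, 10_000, 100_000):
        se = win_rate_standard_error(win_rate, num_sims)
        print(f'win rate {win_rate:.2f}, {num_sims:>7} sims: +/- {se:.4f}')
```

Even in the worst case (a win rate of 0.5), 10,000 simulations pin the win rate down to about half a percentage point.  The tails are a different matter: an outcome that occurs in 1% of encounters shows up only about 100 times in 10,000 simulations.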

In [3]:
from run_encounters import generate_encounter_results

num_sims=10000

#the simulations run fast and don't take up much memory
#so we'll run with 6 processes, note that this value
#should be set with the specs of your own machine
#kept in mind
num_jobs=6

#we'll use the index value from enumerate as the random seed
#for the rng used in each set of results; this gives us
#reproducibility while still varying the starting point
#from one difficulty rating to the next
for idx,difficulty in enumerate(difficulties):
    #we'll use a couple of print statements as very basic ways
    #to monitor things
    print(f'Beginning simulations for {difficulty = }...')
    
    generate_encounter_results(encounter_config=f'{difficulty}_battle.yml',
                               output_csv=f'Simulated_{difficulty}_{num_sims}battles.csv',
                               num_sims=num_sims,
                               num_jobs=num_jobs,
                               SEED=idx)
    
    print(f'Done with simulations for {difficulty = }!')

Beginning simulations for difficulty = 'easy'...
Done with simulations for difficulty = 'easy'!
Beginning simulations for difficulty = 'medium'...
Done with simulations for difficulty = 'medium'!
Beginning simulations for difficulty = 'hard'...
Done with simulations for difficulty = 'hard'!
Beginning simulations for difficulty = 'deadly'...
Done with simulations for difficulty = 'deadly'!
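
The seeding pattern used in the loop above (one fixed seed per difficulty rating) can be illustrated in isolation.  This is only a sketch using the standard library ```random``` module and a hypothetical ```simulate_rolls``` helper; how the scripts actually use the seed internally, including across parallel processes, is not shown here.

```python
import random


def simulate_rolls(seed: int, num_rolls: int = 50) -> list[int]:
    """Roll a d20 num_rolls times using a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator, global state untouched
    return [rng.randint(1, 20) for _ in range(num_rolls)]


difficulties = ['easy', 'medium', 'hard', 'deadly']
rolls = {difficulty: simulate_rolls(seed=idx)
         for idx, difficulty in enumerate(difficulties)}

# rerunning with the same seed reproduces the same sequence
print(simulate_rolls(seed=0) == rolls['easy'])

# different seeds start the generator from different points
print(rolls['easy'] == rolls['medium'])
```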


### A Quick Look at the Output CSV Files

We will save a detailed investigation of the simulation results for a separate notebook, but it doesn't hurt to take a quick look and make sure that things look as expected (proper columns, proper values, etc.).  For this, we'll use _pandas_ to read in the CSV file corresponding to the 'easy' encounters as a dataframe and take a quick peek.

The column names are in the first line and there is no expected index column.

In [4]:
import pandas as pd

#some of the 'fraction' values can get messy with many digits after
#the decimal point, so we'll trim the display to 4 decimal places
pd.set_option('display.precision',4)

easy_df=pd.read_csv('Simulated_easy_10000battles.csv')

easy_df.head(10)

,party_hp,party_extras,frac_party_hp,frac_party_extras,num_party_down,frac_party_down,success,enemies_hp,num_enemies_down,num_enemies,frac_enemies_down,CRs,totalXP,num_rounds,num_turns
0,9,0,0.2143,0.0,0.0,0.0,1,-6.0,1,1,1.0,1,200.0,4,18
1,42,5,1.0,1.0,0.0,0.0,1,-3.5,1,1,1.0,0,10.0,1,1
2,34,5,0.8095,1.0,0.0,0.0,1,-6.5,1,1,1.0,1/4,50.0,2,9
3,42,5,1.0,1.0,0.0,0.0,1,0.0,1,1,1.0,1/8,25.0,1,2
4,30,5,0.7143,1.0,0.0,0.0,1,-6.5,2,2,1.0,0_1/2,165.0,2,13
5,30,5,0.7143,1.0,0.0,0.0,1,-3.0,3,3,1.0,0_1/2_0,240.0,2,15
6,23,1,0.5476,0.2,0.0,0.0,1,-6.0,1,1,1.0,1,200.0,2,11
7,23,3,0.5476,0.6,0.0,0.0,1,-2.5,2,2,1.0,1/2_1/4,225.0,3,17
8,38,4,0.9048,0.8,0.0,0.0,1,-6.5,1,1,1.0,1/4,50.0,2,8
9,38,5,0.9048,1.0,0.0,0.0,1,-6.5,1,1,1.0,1/4,50.0,2,7
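
The 'CRs' and 'totalXP' values in the rows above look consistent with underscore-separated challenge ratings whose XP values are summed and then scaled by the standard fifth edition encounter multiplier for the number of enemies.  The sketch below reproduces that arithmetic; it is an inference from the displayed rows, not a description of what the scripts do, and the names ```XP_BY_CR```, ```encounter_multiplier```, and ```adjusted_xp``` are hypothetical.

```python
# XP per challenge rating (standard fifth edition values); only the CRs
# that appear in the rows above are listed
XP_BY_CR = {'0': 10, '1/8': 25, '1/4': 50, '1/2': 100, '1': 200}


def encounter_multiplier(num_enemies: int) -> float:
    """Standard encounter multiplier for a given number of enemies."""
    if num_enemies == 1:
        return 1.0
    if num_enemies == 2:
        return 1.5
    if num_enemies <= 6:
        return 2.0
    if num_enemies <= 10:
        return 2.5
    if num_enemies <= 14:
        return 3.0
    return 4.0


def adjusted_xp(crs: str) -> float:
    """Adjusted XP for a 'CRs' string such as '0_1/2_0' (assumed format)."""
    cr_list = crs.split('_')
    base_xp = sum(XP_BY_CR[cr] for cr in cr_list)
    return base_xp * encounter_multiplier(len(cr_list))


for crs in ('1', '0_1/2', '0_1/2_0', '1/2_1/4'):
    print(crs, adjusted_xp(crs))
```

For the four 'CRs' strings in the loop, this gives 200.0, 165.0, 240.0, and 225.0, matching the 'totalXP' values in rows 0, 4, 5, and 7.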


In [5]:
easy_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 15 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   party_hp           10000 non-null  int64  
 1   party_extras       10000 non-null  int64  
 2   frac_party_hp      10000 non-null  float64
 3   frac_party_extras  10000 non-null  float64
 4   num_party_down     10000 non-null  float64
 5   frac_party_down    10000 non-null  float64
 6   success            10000 non-null  int64  
 7   enemies_hp         10000 non-null  float64
 8   num_enemies_down   10000 non-null  int64  
 9   num_enemies        10000 non-null  int64  
 10  frac_enemies_down  10000 non-null  float64
 11  CRs                10000 non-null  object 
 12  totalXP            10000 non-null  float64
 13  num_rounds         10000 non-null  int64  
 14  num_turns          10000 non-null  int64  
dtypes: float64(7), int64(7), object(1)
memory usage: 1.1+ MB


In [6]:
easy_df.describe()

,party_hp,party_extras,frac_party_hp,frac_party_extras,num_party_down,frac_party_down,success,enemies_hp,num_enemies_down,num_enemies,frac_enemies_down,totalXP,num_rounds,num_turns
count,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0
mean,31.7528,3.8446,0.756,0.7689,0.0085,0.0017,0.9922,-4.8707,1.5851,1.6198,0.982,125.1697,2.1196,10.5569
std,8.8051,1.1859,0.2096,0.2372,0.0918,0.0184,0.088,4.0262,0.9144,0.9294,0.1099,79.3996,0.8571,5.9987
min,-13.0,0.0,-0.3095,0.0,0.0,0.0,0.0,-27.5,0.0,1.0,0.0,10.0,1.0,1.0
25%,26.0,3.0,0.619,0.6,0.0,0.0,1.0,-6.5,1.0,1.0,1.0,50.0,1.0,6.0
50%,34.0,4.0,0.8095,0.8,0.0,0.0,1.0,-3.5,1.0,1.0,1.0,112.5,2.0,11.0
75%,40.0,5.0,0.9524,1.0,0.0,0.0,1.0,-3.0,2.0,2.0,1.0,200.0,3.0,15.0
max,42.0,5.0,1.0,1.0,1.0,0.2,1.0,57.0,6.0,8.0,1.0,240.0,5.0,36.0


For an 'easy' encounter, we would generally expect that the party would almost always win.  That seems to be the case, with only 78 encounters out of 10,000 where the party failed (a mean 'success' of 0.9922).  Interestingly, in those failed encounters the number of PCs unconscious at the end of the encounter (denoted by the 'num\_party\_down' column) is always 1, and the challenge rating is always 1.  This seems to indicate that PCs were knocked down, brought back up, maybe knocked down again, and then the final blow managed to put the total hit points of the party below zero.  This could be viewed as imitating a series of unlucky rolls for the PCs, lucky rolls for the enemies, and one final area of effect attack wiping out the party.

Area of effect attacks are not explicitly put into the simulation, so this may be a bit of a stretch.  While we have dug into the code to explore why this is happening, we will carefully examine the results, including for other difficulty ratings, in a different notebook to assure ourselves it isn't a bug in the code.
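
The failure count can be cross-checked against the summary statistics with a little arithmetic.  A minimal sketch, assuming the 'success' column contains only 0s and 1s and that the reported standard deviation is the sample standard deviation (the _pandas_ default):

```python
import math

num_sims = 10_000
num_failures = 78

# mean of a 0/1 column is just the fraction of successes
success_rate = 1 - num_failures / num_sims

# sample standard deviation (ddof=1) of a 0/1 column with that mean
success_std = math.sqrt(success_rate * (1 - success_rate)
                        * num_sims / (num_sims - 1))

print(round(success_rate, 4), round(success_std, 4))
```

Both values agree with the 'mean' and 'std' rows for 'success' in the output of ```describe``` (0.9922 and 0.088).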

In [7]:
easy_df.query("success==0")

,party_hp,party_extras,frac_party_hp,frac_party_extras,num_party_down,frac_party_down,success,enemies_hp,num_enemies_down,num_enemies,frac_enemies_down,CRs,totalXP,num_rounds,num_turns
220,-1,4,-0.0238,0.8,1.0,0.2,0,22.0,0,1,0.0,1,200.0,3,13
506,-1,2,-0.0238,0.4,1.0,0.2,0,1.0,0,1,0.0,1,200.0,3,14
600,-1,4,-0.0238,0.8,1.0,0.2,0,15.0,0,1,0.0,1,200.0,3,13
697,-1,4,-0.0238,0.8,1.0,0.2,0,15.0,0,1,0.0,1,200.0,3,13
715,-1,4,-0.0238,0.8,1.0,0.2,0,15.0,0,1,0.0,1,200.0,3,13
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9181,-1,3,-0.0238,0.6,1.0,0.2,0,8.0,0,1,0.0,1,200.0,3,12
9358,-1,4,-0.0238,0.8,1.0,0.2,0,1.0,0,1,0.0,1,200.0,3,15
9371,-1,3,-0.0238,0.6,1.0,0.2,0,1.0,0,1,0.0,1,200.0,3,15
9385,-8,2,-0.1905,0.4,1.0,0.2,0,1.0,0,1,0.0,1,200.0,4,18
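
The fraction columns in these failed encounters can be reproduced by hand.  The sketch below assumes a maximum party hit point total of 42 (inferred from the rows where a 'party\_hp' of 42 pairs with a 'frac\_party\_hp' of 1.0) and five PCs (the 'num\_pcs' value in the configuration file); the helper names are hypothetical and the scripts may compute these columns differently.

```python
MAX_PARTY_HP = 42  # inferred from the output, not read from the scripts
NUM_PCS = 5        # 'num_pcs' in the configuration file


def frac_party_hp(party_hp: int) -> float:
    """Remaining party hit points as a fraction of the assumed maximum."""
    return round(party_hp / MAX_PARTY_HP, 4)


def frac_party_down(num_party_down: int) -> float:
    """Fraction of the party unconscious at the end of the encounter."""
    return round(num_party_down / NUM_PCS, 4)


for party_hp in (-1, -8):
    print(party_hp, frac_party_hp(party_hp))

print(frac_party_down(1))
```

This gives -0.0238 and -0.1905 for the two 'party\_hp' values and 0.2 for a single downed PC, matching the table.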


In [8]:
easy_df.query("success==0").CRs.describe()

count     78
unique     1
top        1
freq      78
Name: CRs, dtype: object