# D&D Race and Class Breakdown
- Here is an updated count of the various Race/Class combinations on DandDBeyond.com.
- We will do a brief bit of data exploration, then answer some questions.


## Data Exploration
In this section we will:
- Take a look at the data
- Make a crosstab to better see the data
- Make some basic visualizations
- Lay down the framework for our questions

### Import, Load and Look

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np 
%config IPCompleter.greedy=True
%matplotlib inline
import seaborn as sns
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import cufflinks as cf

In [2]:
init_notebook_mode(connected=True)

In [3]:
cf.go_offline()

In [4]:
pwd

'C:\\Users\\Hakuj\\Documents\\DnD Project'

In [5]:
df = pd.read_csv('DnDBeyond.csv')

In [6]:
df.head()

Unnamed: 0,RACE,Class,n_characters
0,Human,Wizard,21665
1,Half-Elf,Warlock,19173
2,Human,Fighter,18920
3,Half-Orc,Fighter,13922
4,Half-Elf,Bard,12903


In [7]:
df.shape

(555, 3)

In [8]:
df = df.rename(columns={'n_characters':'Total', 'RACE':'Race'})
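
The crosstabs in this notebook aggregate with `mean`, which only returns the raw counts if every Race/Class pair appears exactly once. A minimal sketch of how that could be checked, using a small invented frame (`toy` and its numbers are illustrative stand-ins, not the real data):

```python
import pandas as pd

# Toy stand-in for the renamed frame; the counts are invented for illustration.
toy = pd.DataFrame({
    'Race':  ['Human', 'Human', 'Elf', 'Elf'],
    'Class': ['Wizard', 'Fighter', 'Wizard', 'Wizard'],
    'Total': [10, 7, 4, 1],
})

# True for every row whose (Race, Class) pair appears more than once.
dupes = toy.duplicated(subset=['Race', 'Class'], keep=False)
print(toy[dupes])

# If repeats exist, one option is to collapse them by summing their counts.
deduped = toy.groupby(['Race', 'Class'], as_index=False)['Total'].sum()
print(deduped)
```

If `dupes.any()` is `False` on the real frame, `mean` and `sum` give the same cell values.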

### Break up the data
- Let's make a crosstab to get a better view of the 'Total' column
- We'll also make a list of the Races and Classes

In [25]:
RaceClass = df.pivot_table(index='Race', columns='Class', values='Total', margins=True, fill_value=0)

In [26]:
RaceClass.values.dtype
# Fix this: pivot_table defaults to aggfunc='mean', so the values come back as floats

dtype('float64')

In [11]:
RaceClass
# The 'All' margins are floats because they are means (the default aggfunc), not sums

Race,Artificer,Barbarian,Bard,Blood Hunter,Cleric,Druid,Fighter,Monk,Paladin,Ranger,Rogue,Sorcerer,Warlock,Wizard,All
Aarakocra,610,9169,1834,869,6175,1698,10603,5272,3764,3822,3006,1810,1807,2901,3810.0
Aasimar,105,113,1637,120,963,247,1079,308,964,336,1280,1886,2159,1502,907.071429
Bugbear,76,3215,303,266,361,325,3086,1080,585,843,2344,188,363,215,946.428571
Centaur,43,973,136,187,858,771,994,454,415,702,199,110,159,114,436.785714
Centaur-UA,0,363,74,109,244,231,514,190,189,320,65,46,68,39,188.615385
Changeling,393,243,4806,370,1030,782,1231,935,698,658,6374,3004,4786,1671,1927.214286
Dragonborn,247,5198,3479,1274,4445,2024,9187,2973,11973,3276,5393,9072,7216,3342,4935.642857
Dwarf,29,233,255,162,1671,1563,347,809,133,315,405,585,311,237,503.928571
Elf,228,2459,687,449,11401,2437,2571,1941,1101,1104,916,1009,676,860,1988.5
Feral Tiefling,76,166,93,212,126,89,663,603,70,370,1108,115,287,458,316.857143
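
The float `All` column above is a row mean, because `pivot_table` aggregates with `mean` unless told otherwise. One possible fix, sketched on an invented frame (`toy` and `counts` are hypothetical names), is to pass `aggfunc='sum'` so the margins become true totals:

```python
import pandas as pd

# Toy stand-in for df; the counts are invented for illustration.
toy = pd.DataFrame({
    'Race':  ['Human', 'Human', 'Elf'],
    'Class': ['Wizard', 'Fighter', 'Wizard'],
    'Total': [10, 7, 4],
})

# aggfunc='sum' makes each cell and the 'All' margins hold counts, not means.
counts = toy.pivot_table(index='Race', columns='Class', values='Total',
                         aggfunc='sum', margins=True, fill_value=0)

# Cast explicitly in case a missing pair left any column as float.
counts = counts.astype('int64')
print(counts)
```

With sums in place, the `All` row and column can be read directly as per-class and per-race totals.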


In [12]:
races = list(df['Race'].unique())

In [13]:
classes = list(df['Class'].unique())

In [260]:
# Select 'Total' explicitly so the string column 'Class' is not summed as well
grouped_race = df.groupby('Race')[['Total']].sum().sort_values(by='Total').reset_index()

In [259]:
# Select 'Total' explicitly so the string column 'Race' is not summed as well
grouped_class = df.groupby('Class')[['Total']].sum().sort_values(by='Total').reset_index()

In [261]:
grouped_class

Unnamed: 0,Class,Total
0,Artificer,7843
1,Blood Hunter,16979
2,Druid,52138
3,Ranger,52678
4,Bard,54427
5,Monk,62305
6,Paladin,64533
7,Wizard,64688
8,Sorcerer,64934
9,Barbarian,74541


In [262]:
grouped_race

Unnamed: 0,Race,Total
0,Viashino-UA,577
1,Simic Hybrid-UA,588
2,Vedalken-UA,787
3,Loxodon-UA,1390
4,Verdan,1414
5,Centaur-UA,2452
6,Minotaur-UA,2590
7,Feral Tiefling,4436
8,Simic Hybrid,5221
9,Vedalken,5934


In [263]:
df[df['Race']=='Aarakocra']

Unnamed: 0,Race,Class,Total
13,Aarakocra,Fighter,10603
17,Aarakocra,Barbarian,9169
35,Aarakocra,Cleric,6175
41,Aarakocra,Monk,5272
56,Aarakocra,Ranger,3822
57,Aarakocra,Paladin,3764
77,Aarakocra,Rogue,3006
81,Aarakocra,Wizard,2901
123,Aarakocra,Bard,1834
125,Aarakocra,Sorcerer,1810


### Quick Visuals
- Let's just get a peek at some simple visuals to show basic relationships

In [265]:
grouped_race.iplot(kind='bar', x='Race')

In [267]:
grouped_class.iplot(kind='bar', x='Class')

In [21]:
RaceClass.iplot(kind='box')

## Answer the Questions!
What questions, you ask? That's a great question!

Here, we will answer some basic questions, such as:
- What's the most stereotypical class for each race?
- What are the least stereotypical combinations?
- What percentage of the population does each combination make up?

### Percentages:


#### Ugly:

In [56]:
total_percent = pd.crosstab(df['Race'], df['Class'], values=df['Total'],
                            aggfunc='mean', normalize=True)

In [51]:
total_percent

Race,Artificer,Barbarian,Bard,Blood Hunter,Cleric,Druid,Fighter,Monk,Paladin,Ranger,Rogue,Sorcerer,Warlock,Wizard,All
Aarakocra,0.000699,0.010502,0.002101,0.000995,0.007073,0.001945,0.012144,0.006038,0.004311,0.004378,0.003443,0.002073,0.00207,0.003323,0.061052
Aasimar,0.00012,0.000129,0.001875,0.000137,0.001103,0.000283,0.001236,0.000353,0.001104,0.000385,0.001466,0.00216,0.002473,0.00172,0.014535
Bugbear,8.7e-05,0.003682,0.000347,0.000305,0.000413,0.000372,0.003535,0.001237,0.00067,0.000966,0.002685,0.000215,0.000416,0.000246,0.015166
Centaur,4.9e-05,0.001114,0.000156,0.000214,0.000983,0.000883,0.001138,0.00052,0.000475,0.000804,0.000228,0.000126,0.000182,0.000131,0.006999
Centaur-UA,0.0,0.000416,8.5e-05,0.000125,0.000279,0.000265,0.000589,0.000218,0.000216,0.000367,7.4e-05,5.3e-05,7.8e-05,4.5e-05,0.003022
Changeling,0.00045,0.000278,0.005505,0.000424,0.00118,0.000896,0.00141,0.001071,0.000799,0.000754,0.007301,0.003441,0.005482,0.001914,0.030882
Dragonborn,0.000283,0.005954,0.003985,0.001459,0.005091,0.002318,0.010522,0.003405,0.013713,0.003752,0.006177,0.010391,0.008265,0.003828,0.079089
Dwarf,3.3e-05,0.000267,0.000292,0.000186,0.001914,0.00179,0.000397,0.000927,0.000152,0.000361,0.000464,0.00067,0.000356,0.000271,0.008075
Elf,0.000261,0.002816,0.000787,0.000514,0.013058,0.002791,0.002945,0.002223,0.001261,0.001264,0.001049,0.001156,0.000774,0.000985,0.031864
Feral Tiefling,8.7e-05,0.00019,0.000107,0.000243,0.000144,0.000102,0.000759,0.000691,8e-05,0.000424,0.001269,0.000132,0.000329,0.000525,0.005077


In [203]:
percent_of_race = pd.crosstab(df['Race'], df['Class'], values=df['Total'],
                              aggfunc='mean', margins=True, normalize='index')

In [204]:
percent_of_race

Race,Artificer,Barbarian,Bard,Blood Hunter,Cleric,Druid,Fighter,Monk,Paladin,Ranger,Rogue,Sorcerer,Warlock,Wizard
Aarakocra,0.011436,0.171897,0.034383,0.016292,0.115767,0.031834,0.198781,0.098838,0.070566,0.071654,0.056355,0.033933,0.033877,0.054387
Aasimar,0.008268,0.008898,0.128908,0.00945,0.075833,0.01945,0.084967,0.024254,0.075911,0.026459,0.100795,0.148516,0.170013,0.118277
Bugbear,0.005736,0.242642,0.022868,0.020075,0.027245,0.024528,0.232906,0.081509,0.044151,0.063623,0.176906,0.014189,0.027396,0.016226
Centaur,0.007032,0.159117,0.02224,0.030581,0.140311,0.126083,0.162551,0.074244,0.067866,0.1148,0.032543,0.017989,0.026002,0.018643
Centaur-UA,0.0,0.148042,0.030179,0.044454,0.099511,0.094209,0.209625,0.077488,0.07708,0.130506,0.026509,0.01876,0.027732,0.015905
Changeling,0.014566,0.009006,0.178125,0.013713,0.038175,0.028983,0.045625,0.034654,0.02587,0.024388,0.23624,0.111338,0.177384,0.061932
Dragonborn,0.003575,0.075225,0.050348,0.018437,0.064328,0.029291,0.132954,0.043025,0.173273,0.04741,0.078047,0.13129,0.10443,0.048365
Dwarf,0.004111,0.033026,0.036145,0.022962,0.236853,0.221545,0.049185,0.11467,0.018852,0.044649,0.057406,0.08292,0.044082,0.033593
Elf,0.00819,0.088329,0.024678,0.016128,0.409533,0.087539,0.092352,0.069722,0.039549,0.039657,0.032903,0.036244,0.024282,0.030892
Feral Tiefling,0.017133,0.037421,0.020965,0.047791,0.028404,0.020063,0.149459,0.135933,0.01578,0.083408,0.249775,0.025924,0.064698,0.103246
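
The two tables above differ only in the `normalize` argument. A small sketch of what each setting divides by, on invented numbers (`toy` and the `share` helper are hypothetical, and `sum` is used in place of `mean` for clarity):

```python
import pandas as pd

# Toy counts, invented so the fractions are easy to check by hand.
toy = pd.DataFrame({
    'Race':  ['Human', 'Human', 'Elf', 'Elf'],
    'Class': ['Wizard', 'Fighter', 'Wizard', 'Fighter'],
    'Total': [30, 10, 20, 40],
})

def share(normalize):
    """Crosstab of counts, normalized the requested way."""
    return pd.crosstab(toy['Race'], toy['Class'], values=toy['Total'],
                       aggfunc='sum', normalize=normalize)

of_pop = share(True)         # each cell / grand total
of_race = share('index')     # each cell / its row (race) total
of_class = share('columns')  # each cell / its column (class) total

print(of_race.sum(axis=1))   # every race row sums to 1
print(of_class.sum(axis=0))  # every class column sums to 1
```

Checking that rows or columns sum to 1 is a quick way to confirm which direction a table was normalized in.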


In [205]:
test = percent_of_race.reset_index().melt(['Race'])

In [207]:
merged = pd.merge(df,test)

In [227]:
merged = merged.rename(columns={'value': 'Percent_of_Race'})

In [228]:
merged

Unnamed: 0,Race,Class,Total,Percent_of_Race
0,Human,Wizard,21665,0.196699
1,Half-Elf,Warlock,19173,0.206077
2,Human,Fighter,18920,0.171777
3,Half-Orc,Fighter,13922,0.170780
4,Half-Elf,Bard,12903,0.138685
5,Half-Orc,Barbarian,12377,0.151828
6,Dragonborn,Paladin,11973,0.173273
7,Goliath,Barbarian,11938,0.287989
8,Half-Elf,Sorcerer,11706,0.125820
9,Half-Orc,Ranger,11509,0.141180


In [219]:
temp = total_percent.reset_index().melt('Race')

In [229]:
temp

Unnamed: 0,Race,Class,value
0,Aarakocra,Artificer,0.000699
1,Aasimar,Artificer,0.000120
2,Bugbear,Artificer,0.000087
3,Centaur,Artificer,0.000049
4,Centaur-UA,Artificer,0.000000
5,Changeling,Artificer,0.000450
6,Dragonborn,Artificer,0.000283
7,Dwarf,Artificer,0.000033
8,Elf,Artificer,0.000261
9,Feral Tiefling,Artificer,0.000087


In [249]:
almost_complete = pd.merge(merged, temp)

In [250]:
almost_complete = almost_complete.rename(columns={'value':'Percent_of_Pop'})

In [251]:
almost_complete

Unnamed: 0,Race,Class,Total,Percent_of_Race,Percent_of_Pop
0,Human,Wizard,21665,0.196699,0.024814
1,Half-Elf,Warlock,19173,0.206077,0.021960
2,Human,Fighter,18920,0.171777,0.021670
3,Half-Orc,Fighter,13922,0.170780,0.015946
4,Half-Elf,Bard,12903,0.138685,0.014779
5,Half-Orc,Barbarian,12377,0.151828,0.014176
6,Dragonborn,Paladin,11973,0.173273,0.013713
7,Goliath,Barbarian,11938,0.287989,0.013673
8,Half-Elf,Sorcerer,11706,0.125820,0.013408
9,Half-Orc,Ranger,11509,0.141180,0.013182


In [257]:
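# Share of each class made up by each race (columns sum to 1)
percent_of_class = pd.crosstab(df['Race'], df['Class'], values=df['Total'],
                               aggfunc='mean', normalize='columns')
completed = pd.merge(
    almost_complete,
    percent_of_class.reset_index().melt('Race'),
).rename(columns={'value': 'Percent_of_Class'})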

In [258]:
completed

Unnamed: 0,Race,Class,Total,Percent_of_Race,Percent_of_Pop,Percent_of_Class
0,Human,Wizard,21665,0.196699,0.024814,0.334915
1,Half-Elf,Warlock,19173,0.206077,0.021960,0.229153
2,Human,Fighter,18920,0.171777,0.021670,0.176560
3,Half-Orc,Fighter,13922,0.170780,0.015946,0.129919
4,Half-Elf,Bard,12903,0.138685,0.014779,0.237070
5,Half-Orc,Barbarian,12377,0.151828,0.014176,0.166043
6,Dragonborn,Paladin,11973,0.173273,0.013713,0.185533
7,Goliath,Barbarian,11938,0.287989,0.013673,0.160153
8,Half-Elf,Sorcerer,11706,0.125820,0.013408,0.180275
9,Half-Orc,Ranger,11509,0.141180,0.013182,0.218478
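
The three percentage columns can also be built without the crosstab, melt and merge round trip. A minimal sketch using `groupby(...).transform('sum')`, assuming one row per Race/Class pair (the `toy` frame and its counts are invented):

```python
import pandas as pd

# Toy stand-in for df; one row per (Race, Class) pair is assumed.
toy = pd.DataFrame({
    'Race':  ['Human', 'Human', 'Elf', 'Elf'],
    'Class': ['Wizard', 'Fighter', 'Wizard', 'Fighter'],
    'Total': [30, 10, 20, 40],
})

shares = toy.copy()

# transform('sum') returns the group total aligned to every original row.
race_total = shares.groupby('Race')['Total'].transform('sum')
class_total = shares.groupby('Class')['Total'].transform('sum')

shares['Percent_of_Race'] = shares['Total'] / race_total
shares['Percent_of_Pop'] = shares['Total'] / shares['Total'].sum()
shares['Percent_of_Class'] = shares['Total'] / class_total
print(shares)
```

This keeps the original row order and avoids any join keys, at the cost of not producing the wide crosstab views along the way.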


In [280]:
# Share of the whole population held by the least common class (row 0 after sorting)
grouped_class['Total'].iloc[0] / grouped_class['Total'].sum()

0.008983109280560954
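
The single ratio above covers only the least common class. A sketch of the same calculation for every class at once, on invented counts (`toy` and `class_share` are hypothetical names):

```python
import pandas as pd

# Toy stand-in for df; counts are invented and total 100 for easy reading.
toy = pd.DataFrame({
    'Race':  ['Human', 'Human', 'Elf', 'Elf'],
    'Class': ['Wizard', 'Artificer', 'Wizard', 'Fighter'],
    'Total': [30, 5, 25, 40],
})

# Per-class totals divided by the grand total, smallest share first.
class_share = (toy.groupby('Class')['Total'].sum() / toy['Total'].sum()).sort_values()
print(class_share)
```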

#### Pretty:

#### Wordy

### Most Stereotypical:

#### Ugly:

In [143]:
percent_of_race.loc['Human'].sort_values().reset_index()

Unnamed: 0,Class,Human
0,Artificer,0.010813
1,Blood Hunter,0.017741
2,Druid,0.038859
3,Barbarian,0.041346
4,Bard,0.048791
5,Ranger,0.052014
6,Sorcerer,0.056309
7,Monk,0.058351
8,Paladin,0.058696
9,Warlock,0.062582


In [128]:
# Make a dictionary of the most popular class for each race: {race: [class, share]}
pop_x_race = {i: [
    percent_of_race.loc[i].idxmax(),
    percent_of_race.loc[i].max().round(4)
] for i in races}

In [145]:
pop_x_race

{'Human': ['Wizard', 0.1967],
 'Half-Elf': ['Warlock', 0.2061],
 'Half-Orc': ['Fighter', 0.1708],
 'Dragonborn': ['Paladin', 0.1733],
 'Goliath': ['Barbarian', 0.288],
 'Elf': ['Cleric', 0.4095],
 'Aarakocra': ['Fighter', 0.1988],
 'Tabaxi': ['Rogue', 0.2409],
 'Tiefling': ['Warlock', 0.3089],
 'Firbolg': ['Druid', 0.2531],
 'Changeling': ['Rogue', 0.2362],
 'Kenku': ['Rogue', 0.2709],
 'Gnome': ['Cleric', 0.2026],
 'Tortle': ['Monk', 0.1973],
 'Yuan-ti Pureblood': ['Warlock', 0.2453],
 'Minotaur': ['Barbarian', 0.3737],
 'Goblin': ['Rogue', 0.2025],
 'Lizardfolk': ['Druid', 0.14],
 'Bugbear': ['Barbarian', 0.2426],
 'Kobold': ['Rogue', 0.2223],
 'Orc': ['Barbarian', 0.3512],
 'Vedalken': ['Wizard', 0.4611],
 'Halfling': ['Wizard', 0.3893],
 'Hobgoblin': ['Wizard', 0.2946],
 'Aasimar': ['Warlock', 0.17],
 'Triton': ['Paladin', 0.2176],
 'Loxodon': ['Cleric', 0.2761],
 'Dwarf': ['Cleric', 0.2369],
 'Kalashtar': ['Warlock', 0.1375],
 'Genasi': ['Rogue', 0.1339],
 'Feral Tiefling': ['Rogu

In [171]:
pop_text = {i: f'The most popular class for {i} is {pop_x_race[i][0]}. '
            f'Which accounts for {(pop_x_race[i][1]*100).round(2)}% of the {i} race' for i in pop_x_race}

In [172]:
pop_text['Human']

'The most popular class for Human is Wizard. Which accounts for 19.67% of the Human race'
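
The per-race lookup can also be done for all races in one pass with `idxmax(axis=1)`. A minimal sketch on an invented row-normalized table (`toy_percent` stands in for `percent_of_race`; the shares are made up):

```python
import pandas as pd

# Toy stand-in for percent_of_race: each row sums to 1.
toy_percent = pd.DataFrame(
    {'Fighter': [0.25, 0.10], 'Rogue': [0.15, 0.60], 'Wizard': [0.60, 0.30]},
    index=pd.Index(['Human', 'Kenku'], name='Race'),
)

# idxmax(axis=1) gives the column label of each row's largest value.
top = pd.DataFrame({
    'Class': toy_percent.idxmax(axis=1),
    'Share': toy_percent.max(axis=1).round(4),
})

# One short summary string per race.
sentences = [f'{r.Index}: {r.Class} ({r.Share:.2%})' for r in top.itertuples()]
print(sentences)
```

A table like `top` is also easier to sort or plot than a dictionary of lists.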

#### Pretty:

In [173]:
pop_text['Human']

'The most popular class for Human is Wizard. Which accounts for 19.67% of the Human race'

Unnamed: 0,Race,Class,Total
0,Human,Wizard,21665
1,Half-Elf,Warlock,19173
2,Human,Fighter,18920
3,Half-Orc,Fighter,13922
4,Half-Elf,Bard,12903
5,Half-Orc,Barbarian,12377
6,Dragonborn,Paladin,11973
7,Goliath,Barbarian,11938
8,Half-Elf,Sorcerer,11706
9,Half-Orc,Ranger,11509


Human                0.1967
Half-Elf             0.2061
Half-Orc             0.1708
Dragonborn           0.1733
Goliath               0.288
Elf                  0.4095
Aarakocra            0.1988
Tabaxi               0.2409
Tiefling             0.3089
Firbolg              0.2531
Changeling           0.2362
Kenku                0.2709
Gnome                0.2026
Tortle               0.1973
Yuan-ti Pureblood    0.2453
Minotaur             0.3737
Goblin               0.2025
Lizardfolk             0.14
Bugbear              0.2426
Kobold               0.2223
Orc                  0.3512
Vedalken             0.4611
Halfling             0.3893
Hobgoblin            0.2946
Aasimar                0.17
Triton               0.2176
Loxodon              0.2761
Dwarf                0.2369
Kalashtar            0.1375
Genasi               0.1339
Feral Tiefling       0.2498
Centaur              0.1626
Minotaur-UA          0.3757
Simic Hybrid         0.1205
Centaur-UA           0.2096
Vedalken-UA         

#### Wordy

### Least Stereotypical:

#### Ugly:

In [138]:
percent_of_race.loc['Human'].idxmin()

'Artificer'

In [141]:
# Dictionary of the least popular class for each race: {race: [class, share]}
least_pop_x_race = {i: [
    percent_of_race.loc[i].idxmin(),
    percent_of_race.loc[i].min().round(4)
] for i in races}

In [142]:
least_pop_x_race

{'Human': ['Artificer', 0.0108],
 'Half-Elf': ['Artificer', 0.0038],
 'Half-Orc': ['Artificer', 0.0017],
 'Dragonborn': ['Artificer', 0.0036],
 'Goliath': ['Artificer', 0.0031],
 'Elf': ['Artificer', 0.0082],
 'Aarakocra': ['Artificer', 0.0114],
 'Tabaxi': ['Artificer', 0.0047],
 'Tiefling': ['Artificer', 0.0037],
 'Firbolg': ['Artificer', 0.0038],
 'Changeling': ['Barbarian', 0.009],
 'Kenku': ['Artificer', 0.0115],
 'Gnome': ['Artificer', 0.002],
 'Tortle': ['Artificer', 0.0068],
 'Yuan-ti Pureblood': ['Artificer', 0.0083],
 'Minotaur': ['Artificer', 0.0051],
 'Goblin': ['Blood Hunter', 0.0158],
 'Lizardfolk': ['Artificer', 0.0041],
 'Bugbear': ['Artificer', 0.0057],
 'Kobold': ['Blood Hunter', 0.0102],
 'Orc': ['Artificer', 0.0028],
 'Vedalken': ['Blood Hunter', 0.0059],
 'Halfling': ['Blood Hunter', 0.008],
 'Hobgoblin': ['Druid', 0.0118],
 'Aasimar': ['Artificer', 0.0083],
 'Triton': ['Artificer', 0.0041],
 'Loxodon': ['Artificer', 0.0134],
 'Dwarf': ['Artificer', 0.0041],
 'Kalas
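
Since one class turns up as the least popular pick for most races, it can help to count how often each class lands in last place. A sketch with `idxmin(axis=1)` and `value_counts()` on invented shares (`toy_percent` stands in for `percent_of_race`):

```python
import pandas as pd

# Toy stand-in for percent_of_race: each row sums to 1.
toy_percent = pd.DataFrame(
    {'Artificer': [0.01, 0.10, 0.02],
     'Barbarian': [0.39, 0.05, 0.28],
     'Wizard':    [0.60, 0.85, 0.70]},
    index=pd.Index(['Human', 'Changeling', 'Gnome'], name='Race'),
)

# Column label of the smallest share in each row.
least = toy_percent.idxmin(axis=1)

# How many races have each class as their least popular choice.
last_place_counts = least.value_counts()
print(least)
print(last_place_counts)
```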

#### Pretty:

#### Wordy