# DnD Monsters: Dice and Data
As a Dungeon Master, it is very important to understand the strength of the monsters you pit against your players. Too weak and the players are bored; too strong and they die or, worse, they don't have fun. The current method, known as Challenge Rating (CR), is a numerical system used to determine how difficult an enemy is for a party of 4 players. Challenge Ratings range from 0 to 30. Unfortunately, this method is very basic and often does not hold true for every encounter.

One thing that isn't accounted for is action economy. This is the biggest destroyer of players, the strongest weapon in your arsenal. If your players are facing 100 monsters, that's 100 turns. Even if the players manage to kill a good chunk of them, the majority will still get their attacks in, some of them with critical hits. This is a much more difficult encounter than a single monster worth the same XP with, say, 2 attacks.

Wizards of the Coast not only provides a guideline for how much XP you should have per level per day, but also shows how much a party of 4 at a given level can stomach during one encounter. They also provide an XP multiplier that takes multiple monsters into consideration. For example, 10 monsters get a x2.5 XP multiplier, causing their total XP rating for the encounter to jump, potentially making it deadly. Action economy rules all.
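To make the multiplier concrete, here is a minimal sketch of the adjusted-XP arithmetic. Only the x2.5 bracket for 10 monsters comes from the text above; the other brackets are taken from the Dungeon Master's Guide encounter table and should be checked against your copy. The function names `encounter_multiplier` and `adjusted_xp` are made up for this illustration.

```python
def encounter_multiplier(monster_count):
    """Return the XP multiplier for a given number of monsters."""
    if monster_count < 1:
        raise ValueError("an encounter needs at least one monster")
    # (largest monster count in the bracket, multiplier); assumed from the DMG table
    brackets = [(1, 1.0), (2, 1.5), (6, 2.0), (10, 2.5), (14, 3.0)]
    for upper, multiplier in brackets:
        if monster_count <= upper:
            return multiplier
    return 4.0  # 15 or more monsters


def adjusted_xp(monster_xp_values):
    """Total the monsters' XP and scale it by the group-size multiplier."""
    total = sum(monster_xp_values)
    return total * encounter_multiplier(len(monster_xp_values))


# Ten monsters worth 50 XP each: 500 raw XP becomes 1250 adjusted XP
print(adjusted_xp([50] * 10))
```

The adjusted value is only used to judge difficulty; the party still earns the raw XP total.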

CR is unfortunately not a great method for measuring a monster's strength. It uses AC, HP, attack bonus, damage per round (DPR), and save DC as a general guideline. It doesn't take into account legendary actions, at-will spells, special abilities that cause status ailments, or any other boosting abilities.

There are two component CRs, Defensive and Offensive, used to calculate the total CR of a monster. Using the chart provided, you find the Defensive CR as the average of the CRs indicated by the HP and AC. The Offensive CR is found the same way but uses DPR and attack bonus. Averaging the two CRs then gives the final monster Challenge Rating. As you can see, this doesn't take into account any of the strong abilities a monster may have. Similarly, a physically weak monster that relies on spells may end up with a vastly lower CR than it should have.
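The averaging described above can be sketched in a few lines. The per-stat CR values passed in are hypothetical stand-ins for what you would look up in the chart; the chart itself is not reproduced here, and `monster_cr` is an illustrative name, not an official formula.

```python
def average_cr(first, second):
    """Average two CR values."""
    return (first + second) / 2


def monster_cr(hp_cr, ac_cr, dpr_cr, attack_bonus_cr):
    """Combine the four chart lookups into a single Challenge Rating."""
    defensive = average_cr(hp_cr, ac_cr)             # Defensive CR from HP and AC
    offensive = average_cr(dpr_cr, attack_bonus_cr)  # Offensive CR from DPR and attack bonus
    return average_cr(defensive, offensive)


# Hypothetical lookups: a sturdy monster that hits hard
print(monster_cr(hp_cr=4, ac_cr=6, dpr_cr=8, attack_bonus_cr=6))
```

Notice that the function has no input for spells, legendary actions, or status effects, which is exactly the gap this investigation is about.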

WoTC has augmented this system by applying multipliers or increases based on other features, traits, or abilities the monster may have.

www.dndbeyond.com/monsters has many pages of monster listings. Each listing has a dropdown that has a monster table associated with it. This contains stats, abilities, and other important details. 

Unfortunately, dndbeyond uses automation detection software that blocks scraping. I don't intend to break the ToS, so I will use the SRD from the dandwiki.com page instead.

The goal of this investigation is to learn more about monsters' abilities in relation to the CR system: to understand whether there are correlations among any of the stats, abilities, environments, sizes, etc.; to see if we can classify monsters based on any of these traits; to create a dashboard that pits monsters against each other for comparison; and finally, to see if there is a way to improve the CR system by using abilities, traits, features, and spells in a more cohesive manner.


## Libraries for Parsing
First we need to gain access to our monster data. As stated above, dndbeyond.com has a great repository of monster data. This will need to be scraped from their site. Unfortunately, each of the monster pages is hidden behind an accordion dropdown and will need to be extracted. This is something I have not yet done, so I am excited to try. We will start out using Requests and BeautifulSoup since I am most comfortable with these.

In [59]:
#Import Libraries for scraping
from bs4 import BeautifulSoup as bs
import requests as rq
import pandas as pd

## Get Request for Monster Names

In [60]:
#Fetching HTML
url = "https://www.dandwiki.com/wiki/5e_SRD:Monsters"
Request = rq.get(url).text

soup = bs(Request, 'html.parser')

# Collect Names of All Monsters in a List 
Unfortunately, dandwiki is not well crafted, which meant I needed to get creative. There weren't any distinguishing classes, names, or ids. The styles between tables were a bit different, so I used that to gather the information needed.

In [61]:
#Find the main content tables and extract them for processing
#This involves finding the list items that are only housed within the parent table that has a width of 100%.
tables = soup.findAll('table',{'style':"width: 100%;"})
monster_names=[]

for table in tables:
    li_table = table.findAll('li')
    for name in li_table:
        monster_names.append(name.text)

# Clean up data
We need to remove duplicates and non-monsters from the list 

In [71]:
#Remove duplicate monsters if there are any (sorted so the order is the same on every run)
monster_names = sorted(set(monster_names))

#Remove the non-monster data: drop list items that are purely numeric
#monster_names itself is filtered so it stays aligned with monster_list
monster_names = [name for name in monster_names if not name.strip().isdigit()]

#Replace spaces with dashes to format the names for urls
monster_list = [name.replace(' ','-') for name in monster_names]



230


# Dictionary of URLs to parse
We will iterate through the monster names, knowing that dandwiki has a uniform URL pattern for all monster pages: www.dandwiki.com/wiki/5e_SRD:'MonsterName'.

In [70]:
monster_url=[]
for name in monster_list:
    monster_url.append('https://www.dndbeyond.com/monsters/'+name)
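Replacing spaces alone leaves capital letters and punctuation in the URL. Here is a slightly more defensive sketch of the same idea. It assumes page URLs use lowercase, dash-separated names, which is inferred only from the `mummy-lord` URL used later in this notebook; `to_slug` and `build_monster_url` are hypothetical helper names.

```python
import re


def to_slug(name):
    """Lowercase a monster name and join its words with dashes."""
    # Collapse every run of non-alphanumeric characters into a single dash
    slug = re.sub(r'[^a-z0-9]+', '-', name.strip().lower())
    return slug.strip('-')


def build_monster_url(base_url, name):
    """Join a base URL and a monster name into one page URL."""
    return base_url.rstrip('/') + '/' + to_slug(name)


print(build_monster_url('https://www.dndbeyond.com/monsters', 'Mummy Lord'))
```

Whether a given site actually follows this pattern for every monster has to be verified against the site itself.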


## Iterate through the websites to parse all the data
There are still some things on here that are not monsters (they summon monsters), for example the Deck of Many Things (DoMT). These will break any analysis or modeling we try to do, so we need to remove them. We can look at all the things monsters have in common that these other objects do not. Unfortunately, the DoMT and the figures of power also contain niche "monster" stats for their monsters, so we will include these in our table. Zombies and Dinosaurs, however, will not be included, since they are just categories of many monsters, all of which are already in the list.

In [10]:
from collections import defaultdict

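#function to make sure each get request is functioning properly and to parse the url
def Run_Soup_If_Status_Ok(url):
    request = rq.get(url)
    #Only parse the page if the request succeeded; otherwise return None
    if not request.ok:
        return None
    soup = bs(request.text, 'html.parser')
    return soup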


monster_dict=defaultdict(list)

#append dictionary with monster name and the soupy information
for name,url in zip(monster_names,monster_url):
    monster_dict[name].append(Run_Soup_If_Status_Ok(url))


# Helper functions for the full Parse

# Create a data frame by parsing the Monster HTML tables
I am going to create an empty dictionary with keys from the extracted column names. This dictionary will be converted into a pandas DataFrame.

In [None]:
monster_dict = dict.fromkeys(column_names)

#Initialize the monster_dict with each value for all keys to be an empty list
for column in column_names:
    monster_dict[column] = []


# Fill in the monster_dict with records extracted from our HTML
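The parsing code for this step is not written yet. As a rough sketch of the shape it could take, the helper below appends one monster's values to every column list and pads missing columns with `None`. The name `add_record`, the sample table, and the sample record are all made up for illustration.

```python
def add_record(table, record):
    """Append one monster's values to every column list in `table`."""
    for column in table:
        # Columns the monster lacks get None so all lists stay the same length
        table[column].append(record.get(column))


# A tiny stand-in for monster_dict with three of its columns
table = {'Monster Name': [], 'Armor Class': [], 'Legendary Actions': []}
add_record(table, {'Monster Name': 'Example Monster', 'Armor Class': '15'})
print(table)
```

Padding with `None` matters because many monsters have no legendary actions or saving throws, and the DataFrame conversion needs every column to have one entry per monster.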


# Create DataFrame from filled monster_dict

In [None]:
#ensure list lengths are the same
list_length = []

for col in monster_dict:
    list_length.append(len(monster_dict[col]))
print(list_length)

monster_df = pd.DataFrame(monster_dict)

monster_df
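Printing the lengths works, but with this many columns a mismatch is easy to miss, and `pd.DataFrame` raises a `ValueError` when the lists differ in length. A small sketch of a check that names the offending columns might look like this; `mismatched_columns` is a hypothetical helper, and it treats the most common length as the expected one.

```python
from collections import Counter


def mismatched_columns(table):
    """Return {column: length} for columns whose length differs from the most common one."""
    lengths = {column: len(values) for column, values in table.items()}
    if not lengths:
        return {}
    # Treat the most frequent length as the expected number of rows
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {column: n for column, n in lengths.items() if n != expected}


sample = {'Monster Name': ['a', 'b'], 'Size': ['Medium', 'Large'], 'Skills': ['Stealth']}
print(mismatched_columns(sample))
```

An empty result means the dictionary is safe to hand to `pd.DataFrame`.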

In [2]:
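from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

url = 'https://www.dndbeyond.com/monsters/mummy-lord'

#Selenium 4 takes the driver path through a Service object; executable_path is deprecated
driver = webdriver.Chrome(service=Service('../env/chromedriver.exe'))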

driver.get(url)

driver.implicitly_wait(5)

soup = BeautifulSoup(driver.page_source, 'lxml')

stat_block = soup.find('div',{'class':'mon-stat-block'})
Environment = soup.find('footer')






In [6]:

column_names = ['Monster Name','Size','Type', 'Alignment']
#First set of column names from 'label span'
for headers in stat_block.findAll('span',{'class': lambda e: e.endswith('label') if e else False}):    
    column_names.append(headers.text)
    
for headers in stat_block.findAll('div',{'class': lambda e: e.endswith('heading') if e else False}):    
    column_names.append(headers.text)

for headers in Environment.findAll('p',{'class': lambda e: e.startswith('tags') if e else False}):    
    column_names.append(headers.contents[0].strip())

column_names.append('Traits')

# Create Empty Dictionary with Keys from the Extracted Column Names

In [7]:
monster_dict = dict.fromkeys(column_names)

#Initialize the monster_dict with each value for all keys to be an empty list
for column in column_names:
    monster_dict[column] = []

monster_dict

{'Monster Name': [],
 'Size': [],
 'Type': [],
 'Alignment': [],
 'Armor Class': [],
 'Hit Points': [],
 'Speed': [],
 'Saving Throws': [],
 'Skills': [],
 'Damage Vulnerabilities': [],
 'Damage Immunities': [],
 'Condition Immunities': [],
 'Senses': [],
 'Languages': [],
 'Challenge': [],
 'Proficiency Bonus': [],
 'STR': [],
 'DEX': [],
 'CON': [],
 'INT': [],
 'WIS': [],
 'CHA': [],
 'Actions': [],
 'Legendary Actions': [],
 'Environment:': [],
 'Traits': []}