# Scrape game data for Dragon Quest on the NES (1986)
---

Dragon Quest (released in North America as Dragon Warrior) is the first game in the Dragon Quest JRPG series created by Enix (now Square Enix).

This notebook contains code to scrape game data (monsters, items, spells, etc.) from websites with accurate information about the game. This information is hard-coded into the game and therefore applies to any playthrough of the original 1986 Dragon Quest on NES.

---

In [1]:
import numpy as np
import pandas as pd
import requests
import bs4
from bs4 import BeautifulSoup

In [2]:
!python --version

Python 3.7.6


In [3]:
print("Numpy version: ", np.__version__)
print("Pandas version: ", pd.__version__)
print("Requests version: ", requests.__version__)
print("Beautiful Soup version: ", bs4.__version__)

Numpy version:  1.18.1
Pandas version:  1.0.1
Requests version:  2.22.0
Beautiful Soup version:  4.8.2


## Monster Stats

Most of the monster data is taken from this great bestiary post on GameFAQs by user x_loto: http://gamefaqs.gamespot.com/nes/563408-dragon-warrior/faqs/69121.


Additional data for the minimum HP and gold values of each monster comes from this spectacularly comprehensive GameFAQs post by user Ryan8bit: https://gamefaqs.gamespot.com/nes/563408-dragon-warrior/faqs/61640. Monster stats were copied from section II.B of the FAQ into their own text file (`monster_stats_faq.txt`), and that file was used to create my monster data file (`monster_data.csv`).

---

In [4]:
session = requests.Session()
session.headers

{'User-Agent': 'python-requests/2.22.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

In [5]:
url = 'http://gamefaqs.gamespot.com/nes/563408-dragon-warrior/faqs/69121'

In [6]:
# Send a browser-like User-Agent instead of the python-requests default shown above.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'
    }

In [7]:
req = requests.get(url, headers=headers)
req.status_code

200

In [8]:
soup = BeautifulSoup(req.content, 'html.parser')
#print(soup.prettify())

In [9]:
monster_table = soup.find('table', class_='ffaq')

In [10]:
table_rows = monster_table.find_all('tr')

In [11]:
len(table_rows)

42

In [12]:
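# The first row holds the column headers; keep it to build the column names.
row_headers = table_rows.pop(0)
# The table repeats its header row partway down; drop that repeat (index 20 once the first row is gone).
table_rows.pop(20)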

<tr><th class="ffaq">Name</th><th class="ffaq">STR</th><th class="ffaq">AGI</th><th class="ffaq">MAX.HP</th><th class="ffaq">EXP</th><th class="ffaq">Max.GOLD</th><th class="ffaq">Special 1</th><th class="ffaq">Prob.1</th><th class="ffaq">Special 2</th><th class="ffaq">Prob.2</th><th class="ffaq">Sleep Res.</th><th class="ffaq">Stopspell Res.</th><th class="ffaq">Hurt Res.</th><th class="ffaq">Evasion</th></tr>

In [13]:
len(table_rows)

40

In [14]:
df_rows = []
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    df_rows.append(row)

df_rows[-1]

['Dragonlord',
 '140',
 '200',
 '130',
 '0',
 '0',
 '',
 '',
 'Breathe fire 2',
 '2/4',
 '15/16',
 '15/16',
 '15/16',
 '0/64']

In [15]:
col_names = [i.text for i in row_headers.find_all('th')]
col_names

['Name',
 'STR',
 'AGI',
 'MAX.HP',
 'EXP',
 'Max.GOLD',
 'Special 1',
 'Prob.1',
 'Special 2',
 'Prob.2',
 'Sleep Res.',
 'Stopspell Res.',
 'Hurt Res.',
 'Evasion']

In [16]:
df = pd.DataFrame(df_rows, columns=col_names)
df.head()

Unnamed: 0,Name,STR,AGI,MAX.HP,EXP,Max.GOLD,Special 1,Prob.1,Special 2,Prob.2,Sleep Res.,Stopspell Res.,Hurt Res.,Evasion
0,Slime,5,3,3,1,1,,,,,0/16,15/16,0/16,1/64
1,Red Slime,7,3,4,1,2,,,,,0/16,15/16,0/16,1/64
2,Drakee,9,6,6,2,2,,,,,0/16,15/16,0/16,1/64
3,Ghost,11,8,7,3,4,,,,,0/16,15/16,0/16,4/64
4,Magician,11,12,13,4,11,,,Hurt,2/4,0/16,0/16,0/16,1/64


In [17]:
# Load the FAQ excerpt that lists each monster's HP and GP ranges.
with open('./data/game_data/monster_stats_faq.txt') as f:
    faq_lines = f.readlines()

In [18]:
hp_lines = [line for line in faq_lines if 'HP:' in line]
gp_lines = [line for line in faq_lines if 'GP:' in line]

In [19]:
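# Remove all whitespace, skip the first 3 characters (the 'HP:' or 'GP:' label),
# and keep the value before the hyphen, which is the minimum.
hp_min = [''.join(line.split())[3:].split('-')[0] for line in hp_lines]
gp_min = [''.join(line.split())[3:].split('-')[0] for line in gp_lines]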
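
The slicing in the cell above relies on each matching line starting with a three-character label. A regular expression is one way to make that assumption explicit. The sketch below is not the code used to build `monster_data.csv`; the line layout (`HP: 10 - 13`, or a single value when minimum and maximum are equal) is an assumption inferred from the slicing, and `parse_range` and `sample_lines` are hypothetical names introduced for illustration.

```python
import re

# Assumed layout: a label, a colon, then either "min - max" or a single value.
RANGE_RE = re.compile(r"\b(HP|GP):\s*(\d+)(?:\s*-\s*(\d+))?")


def parse_range(line):
    """Return (label, minimum, maximum), or None if the line has no HP/GP range."""
    match = RANGE_RE.search(line)
    if match is None:
        return None
    label, low, high = match.groups()
    low = int(low)
    # A line with a single value means the minimum and maximum are the same.
    high = int(high) if high is not None else low
    return label, low, high


# Made-up lines in the assumed layout, not copied from the FAQ.
sample_lines = ["  HP: 10 - 13\n", "  GP: 9 - 11\n", "  HP: 3\n", "Name: Slime\n"]
parsed = [parse_range(line) for line in sample_lines]
```

Unlike the list comprehensions above, this version returns integers rather than strings and skips lines that do not match instead of slicing them blindly.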

In [20]:
col_names.insert(3, 'min_hp')
col_names.insert(6, 'min_gold')

In [21]:
#df_rows.insert(3, hp_min)
#df_rows.insert(6, gp_min)

In [22]:
#df_rows
for idx, row in enumerate(df_rows):
    row.insert(3, hp_min[idx])

for idx, row in enumerate(df_rows):
    row.insert(6, gp_min[idx])
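
The two loops above depend on insertion order: once the minimum HP is inserted at index 3, every later value shifts right by one, so index 6 is where the maximum gold now sits and the minimum gold lands just before it. A small sketch using the Magician row from the scraped table shows the effect. `check_lengths` is a hypothetical helper, not part of the notebook, that guards against the FAQ lists and the scraped rows getting out of step.

```python
def check_lengths(rows, hp_values, gp_values):
    """Raise if the scraped rows and the FAQ-derived lists differ in length."""
    if not (len(rows) == len(hp_values) == len(gp_values)):
        raise ValueError(
            f"length mismatch: {len(rows)} rows, "
            f"{len(hp_values)} HP values, {len(gp_values)} GP values"
        )


# Magician row as scraped: Name, STR, AGI, MAX.HP, EXP, Max.GOLD, ...
row = ['Magician', '11', '12', '13', '4', '11', '', '',
       'Hurt', '2/4', '0/16', '0/16', '0/16', '1/64']

check_lengths([row], ['10'], ['9'])

row.insert(3, '10')  # minimum HP goes before MAX.HP
row.insert(6, '9')   # after the shift, index 6 is just before Max.GOLD
```

After both inserts the row reads name, STR, AGI, minimum HP, maximum HP, EXP, minimum gold, maximum gold, followed by the remaining columns.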

In [23]:
col_names = ['monster', 'str', 'agi', 'hp_min', 'hp_max', 'exp', 'gp_min', 'gp_max', \
             'sk1', 'p1', 's2', 'p2', 'res_sleep', 'res stopspell', 'res_hurt', 'evade']

In [24]:
df = pd.DataFrame(df_rows, columns=col_names)

In [25]:
df.iloc[-1, 0] = 'Dragonlord (true form)'
df

Unnamed: 0,monster,str,agi,hp_min,hp_max,exp,gp_min,gp_max,sk1,p1,s2,p2,res_sleep,res stopspell,res_hurt,evade
0,Slime,5,3,3,3,1,1,1,,,,,0/16,15/16,0/16,1/64
1,Red Slime,7,3,4,4,1,2,2,,,,,0/16,15/16,0/16,1/64
2,Drakee,9,6,5,6,2,2,2,,,,,0/16,15/16,0/16,1/64
3,Ghost,11,8,6,7,3,3,4,,,,,0/16,15/16,0/16,4/64
4,Magician,11,12,10,13,4,9,11,,,Hurt,2/4,0/16,0/16,0/16,1/64
5,Magidrakee,14,14,12,15,5,9,11,,,Hurt,2/4,0/16,0/16,0/16,1/64
6,Scorpion,18,16,16,20,6,12,15,,,,,0/16,15/16,0/16,1/64
7,Druin,20,18,17,22,7,12,15,,,,,0/16,15/16,0/16,2/64
8,Poltergeist,18,20,18,23,8,13,17,,,Hurt,3/4,0/16,0/16,0/16,6/64
9,Droll,24,24,19,25,10,18,24,,,,,0/16,14/16,0/16,2/64


In [26]:
filepath = 'data/game_data/monster_data.csv'
df.to_csv(filepath, index=False)
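
The probability, resistance, and evasion columns are saved as fraction strings such as `15/16`. Anyone who needs them as numbers later could convert them with the standard library. This is a minimal sketch, not part of the original notebook; `to_probability` is a hypothetical helper, and the sample values are the Droll row from the table above.

```python
from fractions import Fraction


def to_probability(text):
    """Convert a string such as '15/16' to a float; blank cells become None."""
    text = text.strip()
    if not text:
        return None
    return float(Fraction(text))


# Droll's values from the table above; 'p2' is blank because it has no second skill.
droll = {
    "res_sleep": "0/16",
    "res stopspell": "14/16",
    "res_hurt": "0/16",
    "evade": "2/64",
    "p2": "",
}
probs = {name: to_probability(value) for name, value in droll.items()}
```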

## Equipment, Items, and Spells

A portion of the data for equipment, items, and spells also comes from the GameFAQs post by user Ryan8bit: https://gamefaqs.gamespot.com/nes/563408-dragon-warrior/faqs/61640.

Most of the data, however, comes from my personal observations while playing the game because many FAQs that include this information are incomplete or incorrect. This data can be found in the files `equipment_data.csv`, `item_data.csv`, and `spell_data.csv`.

---