# Doing it All Without pandas

As it turns out, the API that I selected returns data from requests in a somewhat convenient way. That is, instead of giving back a long string formatted as something like CSV or JSON, it gives back a list of dictionaries.

This happened to work out well when I was able to use `pandas`, but without it, the easiest approach would probably be to index into the dictionaries and extract the data manually into lists.
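
As a minimal sketch of what "indexing into the dictionaries" can look like, the snippet below pulls each field into its own list. The three entries are hand-copied from this notebook's response; the list names (`names`, `scores`, `esrb_codes`) are only illustrative, and I'm assuming that `esrb`, when present, is always a dictionary.

```python
# A few entries hand-copied from the response; note that some keys are missing.
sample = [
    {'id': 20744, 'name': 'Ōkami HD', 'total_rating': 86.62726547927755, 'esrb': {'rating': 5}},
    {'id': 23636, 'name': 'theHunter', 'total_rating': 40.0},
    {'id': 100562, 'name': '198X'},
]

# Every entry has a name, so plain indexing is safe here.
names = [entry['name'] for entry in sample]

# dict.get returns None instead of raising KeyError when a key is absent.
scores = [entry.get('total_rating') for entry in sample]

# Fall back to an empty dict so the nested lookup also yields None.
esrb_codes = [entry.get('esrb', {}).get('rating') for entry in sample]

print(names)
print(scores)
print(esrb_codes)
```

Using `dict.get` here is just one option; the `try`/`except KeyError` pattern works equally well for the missing fields.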

In [1]:
# So far, all of this is 1:1 with what I did the first time
from igdb_api_python.igdb import igdb
import requests

# igdb represents a requests object, as created by the IGDB API wrapper
api_key = "750c9d13c29e3ee77695e1cfebae2c62"
igdb = igdb(api_key)

result = igdb.games({
    'filters':{
        "[platforms][eq]":48,
        "[category][eq]":0
    },
    'fields': ['name','esrb.rating','total_rating'],
    'scroll': 1,
    'limit': 50,
    'order': 'name:desc'
})

games/?fields=name,esrb.rating,total_rating&filter[platforms][eq]=48&filter[category][eq]=0&order=name:desc&limit=50&scroll=1


In [12]:
print(result.headers['X-Count']) #2635 entries

2635


In [13]:
res_body = result.body

In [16]:
#res_body becomes the result.body of all 2635 entries
for i in range(int(2635/50) + 1):
    res_body += igdb.scroll(result).body
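
The loop above makes `int(2635/50) + 1 = 53` scroll calls on top of the first page that is already in `res_body`. Assuming each scroll call returns one page of up to 50 entries, the number of follow-up calls can be worked out as below. `scrolls_needed` is a hypothetical helper of mine, not part of the IGDB wrapper.

```python
import math


def scrolls_needed(total_count, page_size):
    """Number of follow-up requests once the first page has been fetched."""
    total_pages = math.ceil(total_count / page_size)
    # The initial request already covered one page.
    return max(total_pages - 1, 0)


print(scrolls_needed(2635, 50))  # prints 52
```

Under that assumption, 52 follow-up calls would be enough, so the extra call may be what leaves the trailing errors that get cleaned up in the next cells.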

In [25]:
print(res_body[:50])
print(res_body[-50:])

[{'id': 81958, 'name': '永遠消失的幻想鄉 ～ The Disappearing of Gensokyo', 'total_rating': 80.0}, {'id': 20744, 'name': 'Ōkami HD', 'total_rating': 86.62726547927755, 'esrb': {'rating': 5}}, {'id': 23636, 'name': 'theHunter', 'total_rating': 40.0}, {'id': 6465, 'name': 'iO', 'total_rating': 66.5}, {'id': 27277, 'name': 'forma.8', 'total_rating': 77.94368919630995, 'esrb': {'rating': 4}}, {'id': 1353, 'name': 'flOw', 'total_rating': 74.6961664457953, 'esrb': {'rating': 3}}, {'id': 19008, 'name': 'ecotone', 'total_rating': 72.5}, {'id': 95399, 'name': 'duplicate Zanki Zero: Last Beginning'}, {'id': 52737, 'name': 'duplicate Rocksmith 2014 Edition', 'total_rating': 90.0}, {'id': 55162, 'name': 'duplicate Pillars of the Earth', 'total_rating': 72.375}, {'id': 42938, 'name': "duplicate Peanuts Movie: Snoopy's Grand Adventure", 'total_rating': 75.0}, {'id': 42928, 'name': "duplicate JoJo's Bizarre Adventure: Eyes of Heaven - Duplicate"}, {'id': 102112, 'name': 'duplicate Farming Simulator 2019'}, {'i

In [26]:
# Let's just get rid of those trailing errors real quick.

while res_body and res_body[-1] == 'Err':
    res_body.pop()

In [30]:
res_body[-15:] #Much better.

[{'id': 90654,
  'name': '303 Squadron: Battle of Britain',
  'total_rating': 50.0,
  'esrb': {'rating': 4}},
 {'id': 86657, 'name': '3 Minutes to Midnight', 'esrb': {'rating': 4}},
 {'id': 26847,
  'name': '2Dark',
  'total_rating': 53.6256830601093,
  'esrb': {'rating': 6}},
 {'id': 17828,
  'name': '20XX',
  'total_rating': 77.66666666666666,
  'esrb': {'rating': 4}},
 {'id': 100562, 'name': '198X'},
 {'id': 14360,
  'name': '1979 Revolution: Black Friday',
  'total_rating': 62.16666666666665,
  'esrb': {'rating': 6}},
 {'id': 9497, 'name': '140', 'total_rating': 71.1780377154587},
 {'id': 28337, 'name': '13 Sentinels: Aegis Rim'},
 {'id': 103249, 'name': '11-11: MEMORIES RETOLD'},
 {'id': 16745, 'name': '101 Ways to Die', 'total_rating': 65.0},
 {'id': 23168,
  'name': '100ft Robot Golf',
  'total_rating': 59.0,
  'esrb': {'rating': 4}},
 {'id': 8617,
  'name': '1001 Spikes',
  'total_rating': 83.5,
  'esrb': {'rating': 5}},
 {'id': 18686,
  'name': '10 Second Ninja X',
  'total_ra

In [32]:
len(res_body) #Entry count now matches what X-Count stated above

2635

# Cleaning Data

My process for cleaning the data is essentially to iterate over the list of dicts in `res_body` and copy each entry into a new list that holds only the entries with a given ESRB rating. After that, I sum the `'total_rating'` field over each list and divide by the number of entries that actually had such a field to get the mean.
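
For comparison, here is a sketch of the same idea done in a single pass, keeping a running sum and count per ESRB code instead of one list per rating. This is not what the cells below do; `mean_rating_by_esrb` and `ESRB_LABELS` are names I made up, the code-to-label mapping simply follows the list names used in this notebook, and the sample entries are copied from the output above.

```python
from collections import defaultdict

# Mapping inferred from the list names used in this notebook (rp, ec, e, e10, t, m, ao).
ESRB_LABELS = {1: 'RP', 2: 'EC', 3: 'E', 4: 'E10+', 5: 'T', 6: 'M', 7: 'AO'}


def mean_rating_by_esrb(entries):
    """Return {esrb_code: (entries_with_score, mean_score)}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for entry in entries:
        code = entry.get('esrb', {}).get('rating')
        score = entry.get('total_rating')
        if code is None or score is None:
            continue  # skip entries missing either field
        totals[code] += score
        counts[code] += 1
    return {code: (counts[code], totals[code] / counts[code]) for code in counts}


sample = [
    {'id': 90654, 'name': '303 Squadron: Battle of Britain', 'total_rating': 50.0, 'esrb': {'rating': 4}},
    {'id': 86657, 'name': '3 Minutes to Midnight', 'esrb': {'rating': 4}},
    {'id': 23168, 'name': '100ft Robot Golf', 'total_rating': 59.0, 'esrb': {'rating': 4}},
    {'id': 8617, 'name': '1001 Spikes', 'total_rating': 83.5, 'esrb': {'rating': 5}},
    {'id': 16745, 'name': '101 Ways to Die', 'total_rating': 65.0},
]

means = mean_rating_by_esrb(sample)
for code, (count, mean) in sorted(means.items()):
    print(ESRB_LABELS[code], count, mean)
```

The per-rating lists used below have the advantage that the filtered entries stay around for inspection; the single pass only keeps the totals.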

In [51]:
rp_list = []

for entry in res_body:
    try:
        if entry['esrb']['rating'] == 1:
            rp_list.append(entry.copy())
    except KeyError:
        continue

In [65]:
len(rp_list)

36

In [68]:
ec_list = []

for entry in res_body:
    try:
        if entry['esrb']['rating'] == 2:
            ec_list.append(entry.copy())
    except KeyError:
        continue

In [69]:
len(ec_list)

0

In [86]:
e_list = []

for i, entry in enumerate(res_body):
    try:
        if entry['esrb']['rating'] == 3:
            e_list.append(entry.copy())
    except KeyError:
        continue

In [63]:
e10_list = []

for entry in res_body:
    try:
        if entry['esrb']['rating'] == 4:
            e10_list.append(entry.copy())
    except KeyError:
        continue

In [64]:
len(e10_list)

308

In [66]:
t_list = []

for entry in res_body:
    try:
        if entry['esrb']['rating'] == 5:
            t_list.append(entry.copy())
    except KeyError:
        continue

In [67]:
len(t_list)

461

In [60]:
m_list = []

for entry in res_body:
    try:
        if entry['esrb']['rating'] == 6:
            m_list.append(entry.copy())
    except KeyError:
        continue

In [61]:
len(m_list)

304

In [70]:
ao_list = []

for entry in res_body:
    try:
        if entry['esrb']['rating'] == 7:
            ao_list.append(entry.copy())
    except KeyError:
        continue

In [71]:
len(ao_list)

1

In [116]:
rp_sum = 0
rp_valid = 0
for entry in rp_list:
    try:
        rp_sum += entry['total_rating']
        rp_valid += 1
    except KeyError:
        continue

In [117]:
rp_mean = rp_sum/rp_valid
print(rp_valid)
print(rp_mean)

21
70.0681156481416


In [94]:
# no need to calculate for EC, since we know there are none

In [101]:
e_sum = 0
e_valid = 0
for entry in e_list:
    try:
        e_sum += entry['total_rating']
        e10_valid += 1  # typo for e_valid: e_valid stays 0, hence the ZeroDivisionError in the next cell
    except:
        continue

In [102]:
e_mean = e_sum/e_valid

ZeroDivisionError: float division by zero
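
The `ZeroDivisionError` above follows from the summing cell: it increments `e10_valid` instead of `e_valid`, so `e_valid` is still 0 at the division (and the bare `except:` would also hide a `NameError` if `e10_valid` didn't exist yet). Below is a minimal sketch of the corrected loop. The `e_list` here is a hypothetical stand-in, since the real list's contents aren't shown in this notebook: the first entry is copied from the response above, the second is made up to exercise the missing-score case.

```python
# Hypothetical stand-in for e_list (the second entry is invented for illustration).
e_list = [
    {'id': 1353, 'name': 'flOw', 'total_rating': 74.6961664457953, 'esrb': {'rating': 3}},
    {'name': 'hypothetical E-rated game with no score', 'esrb': {'rating': 3}},
]

e_sum = 0
e_valid = 0
for entry in e_list:
    try:
        e_sum += entry['total_rating']
        e_valid += 1  # count E entries here, not e10_valid
    except KeyError:
        continue  # entry has no total_rating

# Guard against an empty or score-less list instead of dividing by zero.
e_mean = e_sum / e_valid if e_valid else None
print(e_valid)
print(e_mean)
```

I haven't rerun this against the real data, so the actual E mean is still missing from the results here.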

In [103]:
e10_sum = 0
e10_valid = 0
for entry in e10_list:
    try:
        e10_sum += entry['total_rating']
        e10_valid += 1
    except KeyError:
        continue

In [114]:
e10_mean = e10_sum/e10_valid
print(e10_valid)
print(e10_mean)

277
70.98240567552998


In [106]:
t_sum = 0
t_valid = 0
for entry in t_list:
    try:
        t_sum += entry['total_rating']
        t_valid += 1
    except KeyError:
        continue

In [118]:
t_mean = t_sum/t_valid
print(t_valid)
print(t_mean)

431
71.39028828818454


In [109]:
m_sum = 0
m_valid = 0
for entry in m_list:
    try:
        m_sum += entry['total_rating']
        m_valid += 1
    except KeyError:
        continue

In [119]:
m_mean = m_sum/m_valid
print(m_valid)
print(m_mean)

288
74.14310489859653


In [111]:
ao_sum = 0
ao_valid = 0
for entry in ao_list:
    try:
        ao_sum += entry['total_rating']
        ao_valid += 1
    except KeyError:
        continue

In [120]:
ao_mean = ao_sum/ao_valid
print(ao_valid)
print(ao_mean)

1
84.3486983110335


## Analysis

As we can see, the means obtained in this fashion were mostly the same as those obtained using `pandas` in `homework02a.ipynb`. The total number of entries varied slightly, but this didn't end up affecting the means.

Also, for some bizarre reason, when I attempted to populate the list for the E-rated games, the code cell seemed to stall. I attempted to debug this in several ways, but was unable to come up with a solution. Still, given that all the other ratings returned the same results, it should be safe to conclude that this method is generally applicable.

In this analysis, I also didn't calculate a standard deviation. Admittedly, this is partly because my knowledge of statistics isn't particularly strong, but the results from analyzing standard deviation also weren't particularly interesting, and computing it in this fashion would just be tedious.
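
For what it's worth, the standard library's `statistics` module would take most of the tedium out of it. The sketch below is only an illustration on a few entries copied from the output above, not a result for the full data set; `rating_spread` is a hypothetical helper name, and I'm using the sample standard deviation (`statistics.stdev`) as an assumption about which variant is wanted.

```python
import statistics


def rating_spread(entries):
    """Sample standard deviation of total_rating, or None if fewer than two scores."""
    scores = [entry['total_rating'] for entry in entries if 'total_rating' in entry]
    if len(scores) < 2:
        return None  # stdev needs at least two data points
    return statistics.stdev(scores)


# A few E10+ entries copied from the output above; one has no score.
sample = [
    {'id': 90654, 'name': '303 Squadron: Battle of Britain', 'total_rating': 50.0, 'esrb': {'rating': 4}},
    {'id': 86657, 'name': '3 Minutes to Midnight', 'esrb': {'rating': 4}},
    {'id': 17828, 'name': '20XX', 'total_rating': 77.66666666666666, 'esrb': {'rating': 4}},
    {'id': 23168, 'name': '100ft Robot Golf', 'total_rating': 59.0, 'esrb': {'rating': 4}},
]

print(rating_spread(sample))
```

Calling it as `rating_spread(e10_list)`, `rating_spread(t_list)`, and so on would give one value per rating.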