# Data Cleaning Playground Notebook

This notebook collects code snippets, notes, and ideas for cleaning my Splatoon 3 battle data.

## Package Imports and Set Up

In [19]:
import pandas as pd
import json
import requests

## Removing Duplicate Battles

This can be done by reading the battle IDs from the index of the exported CSV and then fetching each battle individually from the stat.ink API.

> DO NOT RUN THE CODE SEGMENTS IN THIS SECTION

In [21]:
tmp_df = pd.read_csv('./../data/statink-super64guy.csv', index_col='id')
print(len(tmp_df.index))
battle_df = pd.DataFrame()

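# Fetch each battle by ID from the stat.ink API and append it as one row.
for i in tmp_df.index:
    url = 'https://stat.ink/api/v3/battle/' + i
    r = requests.get(url=url)
    # Skip any battle whose request does not succeed.
    if r.status_code == 200:
        json_obj = json.loads(r.text)
        # One battle becomes one row, indexed by its battle ID.
        df = pd.DataFrame.from_dict(json_obj, orient="index").T
        df.set_index('id', inplace=True)
        battle_df = pd.concat([battle_df, df])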

print(len(battle_df.index))

372




272


In [22]:
battle_df.to_csv('./../data/statink-super64guy.csv')
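
The loop above grows `battle_df` with `pd.concat` on every iteration. A minimal sketch of an alternative follows: it collects one record per unique battle ID and builds the frame once. It assumes that repeated IDs in the index are the duplicates to drop. The names `build_battle_frame` and `fetch` are hypothetical, and `fetch` stands in for the `requests.get` call so the sketch runs without touching the network.

```python
import pandas as pd


def build_battle_frame(battle_ids, fetch):
    """Collect one record per unique battle ID and build the frame in one step.

    `fetch` is a hypothetical callable: it takes a battle ID and returns the
    decoded JSON dict for that battle, or None if the request failed.
    """
    records = []
    # dict.fromkeys drops repeated IDs while keeping their original order.
    for battle_id in dict.fromkeys(battle_ids):
        record = fetch(battle_id)
        if record is not None:
            records.append(record)
    if not records:
        return pd.DataFrame()
    # Build the frame once instead of calling pd.concat inside the loop.
    return pd.DataFrame(records).set_index('id')


# Made-up records standing in for API responses.
fake_api = {
    'a': {'id': 'a', 'lobby': {'key': 'regular'}},
    'b': {'id': 'b', 'lobby': {'key': 'bankara'}},
}
demo_df = build_battle_frame(['a', 'b', 'a', 'missing'], fake_api.get)
```

In the notebook, `fetch` would wrap the request from the cell above and return `None` for any status code other than 200.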

## Clean JSON Data

### Cleaning Base JSON Data

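The nested columns were written to the CSV as Python `repr` strings (single quotes, `True`/`False`/`None`) rather than valid JSON. The `clean_json` function defined below will be used frequently to parse these incorrectly formatted objects, which are littered throughout the data.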

In [23]:
battle_df = pd.read_csv('./../data/statink-super64guy.csv', index_col='id')
len(battle_df.index)

272

In [17]:
def clean_json(json_str):
    # Convert Python repr syntax (single quotes, True/False/None) to JSON, then parse.
    json_str = str(json_str)
    return json.loads(json_str.replace("'","\"").replace("True","true").replace("False","false").replace("None","null"))

To see if this works, we can then try parsing the `our_team_members` column:

In [18]:
clean_json(battle_df.loc['496d4e23-b606-403e-bd9e-557fddb0a4ef', 'our_team_members'])

[{'me': False,
  'rank_in_team': 1,
  'name': None,
  'number': None,
  'splashtag_title': None,
  'weapon': {'key': 'maneuver',
   'aliases': ['splat_dualies', '5010'],
   'type': {'key': 'maneuver',
    'aliases': [],
    'name': {'en_US': 'Dualies', 'ja_JP': 'マニューバー'}},
   'name': {'en_US': 'Splat Dualies', 'ja_JP': 'スプラマニューバー'},
   'main': 'maneuver',
   'sub': {'key': 'kyubanbomb',
    'aliases': [],
    'name': {'en_US': 'Suction Bomb', 'ja_JP': 'キューバンボム'}},
   'special': {'key': 'kanitank',
    'aliases': [],
    'name': {'en_US': 'Crab Tank', 'ja_JP': 'カニタンク'}},
   'reskin_of': 'maneuver'},
  'kill': 0,
  'assist': 0,
  'kill_or_assist': 0,
  'death': 0,
  'special': 0,
  'signal': None,
  'inked': 114,
  'disconnected': False,
  'crown': False,
  'gears': {'headgear': {'primary_ability': {'key': 'swim_speed_up',
     'name': {'en_US': 'Swim Speed Up', 'ja_JP': 'イカダッシュ速度アップ'},
     'primary_only': False},
    'secondary_abilities': [{'key': 'ink_recovery_up',
      'name': {'en
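
One caveat with the string replacements in `clean_json`: they also rewrite apostrophes and the words `True`, `False`, and `None` when these appear inside string values, so a player name or title containing an apostrophe would fail to parse. A possible alternative, sketched here under the assumption that every cell holds the `str()` of a Python list or dict, is `ast.literal_eval` from the standard library. The name `parse_repr` and the sample value are made up for illustration.

```python
import ast


def parse_repr(value):
    """Parse a cell holding the str() of a Python list or dict.

    ast.literal_eval only evaluates literals (no function calls) and already
    understands single quotes, True, False, and None.
    """
    return ast.literal_eval(str(value))


# Made-up sample: the title contains an apostrophe, which would break a
# blanket replacement of single quotes with double quotes.
sample = "[{'me': False, 'name': None, 'splashtag_title': \"Splat's Fan\"}]"
parsed = parse_repr(sample)
```

Missing values would still need separate handling, since a cell read back as `NaN` is not a Python literal.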

### Cleaning Single-Level JSON Objects

The following function should be able to clean JSON from a single object:

In [None]:
def pop_json(json_str)