# Popular Fandom Exploration

Using the data we scraped from the most popular fandoms on AO3 (Archive of Our Own), we will explore the data and run an initial analysis to see whether popular fandoms share any common trends.

The popular fandoms we are looking at are within the following categories:
- Anime & Manga
- Books & Literature
- Cartoons, Comics, & Graphic Novels
- Celebrities & Real People
- Movies
- Music & Bands
- Other Media
- Theatre
- TV Shows
- Video Games
- Uncategorized Fandoms 
   - These fandoms were not large enough for us to scrape effectively.

In [4]:
#import statements
import json
import os
import pandas as pd

In [2]:
#load in data
with open("../ao3bot/pop_fandoms_stats.json", "r") as f:
    temp_list = json.load(f)

Let's take a look at what the first set of data looks like:

In [3]:
temp_list[0]

{'fandom': '僕のヒーローアカデミア | Boku no Hero Academia | My Hero Academia',
 'total_works': 204713,
 'ratings': {'Teen And Up Audiences ': '72523',
  'General Audiences ': '46180',
  'Explicit ': '36226',
  'Mature ': '27304',
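  'Not Rated ': '22480'},
 'warnings': {'No Archive Warnings Apply ': '100687',
  'Creator Chose Not To Use Archive Warnings ': ...,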
  'Graphic Depictions Of Violence ': '24928',
  'Major Character Death ': '13598',
  'Underage ': '8812',
  'Rape/Non-Con ': '7490'},
 'categories': {'M/M ': '110168',
  'F/M ': '50222',
  'Gen ': '40606',
  'Multi ': '17137',
  'F/F ': '14465',
  'Other ': '9397'},
 'fandoms': {'僕のヒーローアカデミア | Boku no Hero Academia | My Hero Academia ': '204713',
  'Naruto ': '803',
  'Haikyuu!! ': '580',
  'Harry Potter - J. K. Rowling ': '451',
  'Shingeki no Kyojin | Attack on Titan ': '380',
  'Original Work ': '314',
  'Pocket Monsters | Pokemon - All Media Types ': '293',
  'Katekyou Hitman Reborn! ': '291',
  'One Piece ': '277',
  'Marvel Cinematic Universe ': '270'},
 'characters': {'Midoriya Izuku ': '104983',
  'Bakugou Katsuki ': '98802',
  'Todorok

We can see here that the first fandom we scraped was My Hero Academia (Boku no Hero Academia). The record also shows the fields we have available for analysis:

| Tag | Description|
|:----|:-----------|
|total_works|The total number of fics within the fandom. Note: some fics are visible only to logged-in users. Because we scraped without logging in, those fics are missing from our counts, so our sample is skewed.|
|ratings|The rating of each fic: Teen And Up Audiences, General Audiences, Explicit, Mature, or Not Rated.|
|warnings|Warnings are applied to fics that contain the mentioned material. The warnings are No Archive Warnings Apply, Creator Chose Not To Use Archive Warnings, Graphic Depictions Of Violence, Major Character Death, Underage, and Rape/Non-Con.|
|categories|Categories describe the relationships within a fic. These are M/M, F/M, F/F, Multi, Other, and Gen.|
|fandoms|The most common fandoms tagged on these fics. This is generally the main fandom followed by fandoms that are popular in crossovers.|
|characters|The most popular characters across the fics in the fandom.|
|relationships|The most popular relationships across the fics in the fandom.|
|freeforms|The most popular and common author-written (free-form) tags.|
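Since every later step relies on these fields being present, a quick structural check can be useful. The sketch below is only an illustration: `EXPECTED_KEYS` and `missing_keys` are hypothetical names that are not part of the original notebook, and the sample record is a trimmed stand-in for one entry of `temp_list`.

```python
# Top-level fields described in the table above.
EXPECTED_KEYS = {
    "fandom", "total_works", "ratings", "warnings", "categories",
    "fandoms", "characters", "relationships", "freeforms",
}


def missing_keys(record):
    """Return the expected top-level fields that a record lacks."""
    return EXPECTED_KEYS - set(record)


# Trimmed stand-in for one scraped record (illustration only).
partial_record = {"fandom": "Naruto", "total_works": 81411, "ratings": {}}
missing = missing_keys(partial_record)
print(sorted(missing))
```

Applied to the real data, something like `[r["fandom"] for r in temp_list if missing_keys(r)]` would list any fandoms whose scrape came back incomplete.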

## Data Cleaning
Before we can start analyzing, we need to clean the data. Specifically, some of the tag names (the dictionary keys, such as `'Teen And Up Audiences '`) end with a trailing space that we want to strip, and the counts are stored as strings rather than numbers.
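The cleanup described above can be sketched as follows. This is a minimal sketch rather than the notebook's actual cleaning code: `clean_counts` and `clean_record` are hypothetical helper names, and `sample` is a small excerpt of the first record shown earlier. It strips whitespace from the tag names and, as an extra assumption about what is wanted, converts the string counts to integers.

```python
def clean_counts(counts):
    """Strip whitespace from tag names and convert string counts to int."""
    return {tag.strip(): int(n) for tag, n in counts.items()}


def clean_record(record):
    """Return a cleaned copy of one fandom record; scalar fields are kept as-is."""
    return {
        key: clean_counts(value) if isinstance(value, dict) else value
        for key, value in record.items()
    }


# Excerpt of the first scraped record, for illustration only.
sample = {
    "fandom": "僕のヒーローアカデミア | Boku no Hero Academia | My Hero Academia",
    "total_works": 204713,
    "ratings": {"Teen And Up Audiences ": "72523", "General Audiences ": "46180"},
}

cleaned = clean_record(sample)
print(cleaned["ratings"])
```

The helpers return new dictionaries instead of modifying the input, so the raw scraped data stays available for comparison. The whole dataset could then be cleaned with `[clean_record(r) for r in temp_list]`.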

To make it easier to analyze, I am going to put the data into a `pandas` dataframe.

In [9]:
# Build one row per fandom. max_level=0 keeps each nested dictionary
# (ratings, warnings, categories, ...) together in a single column.
df = pd.json_normalize(temp_list, max_level=0)
df.head()

| |fandom|total_works|ratings|warnings|categories|fandoms|characters|relationships|freeforms|
|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|
|0|僕のヒーローアカデミア \| Boku no Hero Academia \| My Hero ...|204713|{'Teen And Up Audiences ': '72523', 'General A...|{'No Archive Warnings Apply ': '100687', 'Crea...|{'M/M ': '110168', 'F/M ': '50222', 'Gen ': '4...|{'僕のヒーローアカデミア \| Boku no Hero Academia \| My Her...|{'Midoriya Izuku ': '104983', 'Bakugou Katsuki...|{'Bakugou Katsuki/Midoriya Izuku ': '27849', '...|{'Fluff ': '37306', 'Angst ': '26180', 'Bakugo...|
|1|Haikyuu!!|125662|{'General Audiences ': '42203', 'Teen And Up A...|{'No Archive Warnings Apply ': '75892', 'Creat...|{'M/M ': '100786', 'F/M ': '14844', 'Gen ': '1...|{'Haikyuu!! ': '125662', '僕のヒーローアカデミア \| Boku n...|{'Hinata Shouyou ': '34064', 'Oikawa Tooru ': ...|{'Iwaizumi Hajime/Oikawa Tooru ': '17245', 'Hi...|{'Fluff ': '34102', 'Angst ': '15991', 'Establ...|
|2|Naruto|81411|{'Teen And Up Audiences ': '26301', 'General A...|{'No Archive Warnings Apply ': '41337', 'Creat...|{'M/M ': '37981', 'F/M ': '27234', 'Gen ': '17...|{'Naruto ': '80780', 'Boruto: Naruto Next Gene...|{'Uzumaki Naruto ': '30425', 'Hatake Kakashi '...|{'Uchiha Sasuke/Uzumaki Naruto ': '10246', 'Ha...|{'Fluff ': '8634', 'Alternate Universe - Canon...|
|3|Shingeki no Kyojin \| Attack on Titan|63309|{'Teen And Up Audiences ': '18030', 'Explicit ...|{'Creator Chose Not To Use Archive Warnings ':...|{'M/M ': '37137', 'F/M ': '19382', 'F/F ': '69...|{'Shingeki no Kyojin \| Attack on Titan ': '631...|{'Levi Ackerman ': '35869', 'Eren Yeager ': '3...|{'Levi Ackerman/Eren Yeager ': '15771', 'Levi ...|{'Alternate Universe - Modern Setting ': '1100...|
|4|Miraculous Ladybug|48438|{'General Audiences ': '21554', 'Teen And Up A...|{'No Archive Warnings Apply ': '30243', 'Creat...|{'F/M ': '33352', 'Gen ': '8430', 'F/F ': '412...|{'Miraculous Ladybug ': '48438', 'Batman - All...|{'Marinette Dupain-Cheng \| Ladybug ': '40029',...|{'Adrien Agreste \| Chat Noir/Marinette Dupain-...|{'Fluff ': '10465', 'Identity Reveal ': '6383'...|
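Each of the nested columns still holds a whole dictionary per row, which is awkward to compare across fandoms. One possible next step, sketched here under assumptions, is to expand a single column into one column per tag. The names `records`, `df_demo`, `categories`, and `shares` are illustrative only, and the two records are a hand-copied excerpt of the output above that keeps just two categories each.

```python
import pandas as pd

# Hand-copied excerpt of two scraped records (two categories each).
records = [
    {"fandom": "Haikyuu!!", "total_works": 125662,
     "categories": {"M/M ": "100786", "F/M ": "14844"}},
    {"fandom": "Naruto", "total_works": 81411,
     "categories": {"M/M ": "37981", "F/M ": "27234"}},
]
df_demo = pd.DataFrame(records)

# Expand the dictionary column into one column per category tag.
categories = pd.DataFrame(df_demo["categories"].tolist(), index=df_demo.index)

# Strip the trailing spaces from the tag names and make the counts numeric.
categories.columns = categories.columns.str.strip()
categories = categories.astype(int)

# Fraction of each fandom's works that carry each category tag.
shares = categories.div(df_demo["total_works"], axis=0)
print(shares.round(3))
```

Note that these shares need not sum to 1 within a fandom: in the first record shown earlier, the category counts add up to more than `total_works`, which suggests that a single fic can be counted under several categories.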
