This notebook shows how to scrape information from MyAnimeList.net using jikanpy.

### Data source:
- MyAnimeList.net (MAL) maintains a list of upcoming anime.
- MAL's official API is documented at https://myanimelist.net/apiconfig/references/api/v2
- There is also an unofficial API, Jikan, at https://jikan.moe/. It has a Python wrapper, jikanpy; details are at https://github.com/abhinavk99/jikanpy

### Data exploration / pre-processing:

Jikan is an API designed to collect data from MAL. Here's some example code:

In [1]:
import pandas as pd
from pprint import pprint
from jikanpy import Jikan

In [2]:
# Initialize a Jikan client
jikan = Jikan()

In [3]:
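# Search for a particular anime with .search() and print the first 10 titles.
# The matches live in search_result['results']; len(search_result) would only
# count the top-level keys of the response dictionary.
search_result = jikan.search('anime', 'Naruto', page=1)
pprint([result['title'] for result in search_result['results'][:10]])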

['Naruto: Shippuuden',
 'The Last: Naruto the Movie',
 'Boruto: Naruto the Movie',
 'Naruto: Shippuuden Movie 4 - The Lost Tower',
 'Naruto: Shippuuden Movie 2 - Kizuna',
 'Naruto: Shippuuden Movie 5 - Blood Prison',
 'Naruto: Shippuuden Movie 6 - Road to Ninja',
 'Boruto: Naruto the Movie - Naruto ga Hokage ni Natta Hi',
 'Naruto: Shippuuden Movie 1',
 'Naruto Movie 1: Dai Katsugeki!! Yuki Hime Shinobu Houjou Dattebayo!']
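
A search response is a dictionary whose `'results'` key holds the list of matches, so the length of the dictionary is the number of top-level keys, not the number of matches. The sketch below illustrates the difference on a hand-made stand-in for a response. `sample_response` and the `extract_titles` helper are illustrative names and not part of jikanpy.

```python
# Hand-made stand-in for a search response; only the shape matters here.
sample_response = {
    "request_cached": True,
    "last_page": 20,
    "results": [
        {"title": "Naruto: Shippuuden"},
        {"title": "The Last: Naruto the Movie"},
        {"title": "Boruto: Naruto the Movie"},
    ],
}


def extract_titles(response, limit=None):
    """Return the titles in response['results'], optionally capped at `limit`."""
    results = response.get("results", [])
    if limit is not None:
        results = results[:limit]
    return [entry["title"] for entry in results]


titles = extract_titles(sample_response)

# The two lengths differ: one counts dictionary keys, the other counts matches.
print(len(sample_response), len(sample_response["results"]))
print(titles)
```

Iterating over the list directly avoids indexing by position and works for any number of matches.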


In [4]:
# Fetch the full record for the first search result using its MAL ID
naruto_id = search_result['results'][0]['mal_id']
naruto = jikan.anime(naruto_id)
pprint(naruto)

{'API_DEPRECATION': True,
 'API_DEPRECATION_DATE': '2022-02-10T16:02:50+00:00',
 'API_DEPRECATION_INFO': 'https://bit.ly/jikan-v3-deprecation',
 'aired': {'from': '2007-02-15T00:00:00+00:00',
           'prop': {'from': {'day': 15, 'month': 2, 'year': 2007},
                    'to': {'day': 23, 'month': 3, 'year': 2017}},
           'string': 'Feb 15, 2007 to Mar 23, 2017',
           'to': '2017-03-23T00:00:00+00:00'},
 'airing': False,
 'background': None,
 'broadcast': 'Thursdays at 19:30 (JST)',
 'demographics': [{'mal_id': 27,
                   'name': 'Shounen',
                   'type': 'anime',
                   'url': 'https://myanimelist.net/anime/genre/27/Shounen'}],
 'duration': '23 min per ep',
 'ending_themes': ['1:\xa0"Nagare Boshi ~Shooting Star~ (流れ星〜Shooting Star〜)" '
                   'by HOME MADE Kazoku\xa0(eps 1-18)',
                   '2:\xa0"Michi ~to you all (道 〜to you all)" by aluto\xa0(eps '
                   '19-30)',
                   '3:\xa0"KIMI M
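
The `'aired'` field in the record above carries ISO 8601 timestamps, which can be turned into `datetime` objects for further analysis. The following is a minimal sketch using the values copied from that output. The `parse_aired` helper is a hypothetical name, and the handling of a missing `'to'` value (for a show that has not finished airing) is an assumption about the response rather than something shown here.

```python
from datetime import datetime

# Values copied from the 'aired' field of the record printed above.
aired = {
    "from": "2007-02-15T00:00:00+00:00",
    "to": "2017-03-23T00:00:00+00:00",
}


def parse_aired(aired):
    """Return (start, end) datetimes; end is None when 'to' is missing."""
    start = datetime.fromisoformat(aired["from"])
    # Assumption: an unfinished show may carry None instead of an end date.
    end = datetime.fromisoformat(aired["to"]) if aired.get("to") else None
    return start, end


start, end = parse_aired(aired)
run_days = (end - start).days
print(start.date(), end.date(), run_days)
```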

In [5]:
# List the years and seasons for which MAL has data
archive = jikan.season_archive()
pprint(archive['archive'])

[{'seasons': ['Winter'], 'year': 2023},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2022},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2021},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2020},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2019},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2018},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2017},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2016},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2015},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2014},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2013},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2012},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2011},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2010},
 {'seasons': ['Winter', 'Spring', 'Summer', 'Fall'], 'year': 2009},
 {'seaso
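
The archive is a list of dictionaries, each pairing a year with its available seasons. One way to use it is to flatten it into `(year, season)` pairs that can later drive per-season queries. This is a sketch on two entries copied from the output above. `season_pairs` is an illustrative helper, and lowercasing the season names is a choice made here to match the lowercase season strings used elsewhere in this notebook.

```python
# Two entries copied from the archive output above.
archive_sample = [
    {"seasons": ["Winter"], "year": 2023},
    {"seasons": ["Winter", "Spring", "Summer", "Fall"], "year": 2022},
]


def season_pairs(archive):
    """Flatten archive entries into (year, lowercase season) tuples."""
    return [
        (entry["year"], season.lower())
        for entry in archive
        for season in entry["seasons"]
    ]


pairs = season_pairs(archive_sample)
print(pairs)
```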

In [6]:
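# Query all anime from a particular year and season and print the first 11 titles.
# The shows live in winter_2018_anime['anime']; len(winter_2018_anime) would only
# count the top-level keys of the response dictionary.
winter_2018_anime = jikan.season(year=2018, season='winter')
pprint([show['title'] for show in winter_2018_anime['anime'][:11]])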

['Violet Evergarden',
 'Darling in the FranXX',
 'Nanatsu no Taizai: Imashime no Fukkatsu',
 'Overlord II',
 'Saiki Kusuo no Ψ-nan 2',
 'Karakai Jouzu no Takagi-san',
 'Citrus',
 'Sora yori mo Tooi Basho',
 'Death March kara Hajimaru Isekai Kyousoukyoku',
 'Yuru Camp△',
 'Gakuen Babysitters']


In [7]:
years = list(range(2022, 2023))  # the end of range() is exclusive, so this covers 2022 only
seasons = ['winter', 'spring', 'summer', 'fall']

myanimelist = []

In [8]:
# Retrieve anime data through Jikan
for year in years:
    for season in seasons:
        myanimelist.append(jikan.season(year = year, season = season))
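
When the loop covers many years, it sends one request per season in quick succession, and the public Jikan service is rate limited. A possible precaution is to pause between calls. The sketch below takes the fetching function as a parameter so it can be tried without network access. `fetch_seasons` and `fake_fetch` are hypothetical names, and the one-second default delay is an assumption rather than a documented value.

```python
import time


def fetch_seasons(fetch, years, seasons, delay=1.0):
    """Call fetch(year=..., season=...) for every pair, pausing between calls."""
    responses = []
    for year in years:
        for season in seasons:
            responses.append(fetch(year=year, season=season))
            time.sleep(delay)  # crude throttle; tune to the service's limits
    return responses


def fake_fetch(year, season):
    """Offline stand-in that mimics the shape of a season response."""
    return {"season_year": year, "season_name": season.capitalize(), "anime": []}


responses = fetch_seasons(fake_fetch, [2022], ["winter", "spring"], delay=0)
print([(r["season_year"], r["season_name"]) for r in responses])
```

With a real client, `jikan.season` could be passed as `fetch`, since it is called with the same `year` and `season` keyword arguments in this notebook.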
        

In [9]:
# Collect all necessary attributes: Title, Score, Members, Genre, Producers, Year, Season and Synopsis
animedata = []
for animeseason in myanimelist:
    for show in animeseason['anime']:
        animedata.append([show['title'], show['score'], show['members'], ', '.join(genre['name'] for genre in show['genres']), 
                        ', '.join(producer['name'] for producer in show['producers']), animeseason["season_year"],
                        animeseason["season_name"], show['synopsis']])
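
Indexing each show with `show['key']` raises a `KeyError` if a field is absent. If that turns out to be a problem, a more defensive variant of the row-building step might look like the sketch below. `show_to_row` and `sample_show` are illustrative, and whether any field is ever missing from a real response is an assumption.

```python
def show_to_row(show, year, season):
    """Build one row: title, score, members, genres, producers, year, season, synopsis."""
    return [
        show.get("title"),
        show.get("score"),
        show.get("members"),
        # `or []` also covers a key that is present but set to None.
        ", ".join(genre["name"] for genre in show.get("genres") or []),
        ", ".join(producer["name"] for producer in show.get("producers") or []),
        year,
        season,
        show.get("synopsis"),
    ]


# Made-up record with a missing synopsis and an empty producer list.
sample_show = {
    "title": "Example Show",
    "score": None,
    "members": 1200,
    "genres": [{"name": "Action"}, {"name": "Fantasy"}],
    "producers": [],
}

row = show_to_row(sample_show, 2022, "Winter")
print(row)
```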
        

In [10]:
# Create a dataframe to store Anime data and remove duplicate entries
anime_df = pd.DataFrame(animedata, columns = ["Title", "Score", "Members", "Genre", "Producers", "Year", "Season", "Synopsis"])
anime_df.drop_duplicates(subset= "Title", keep = 'first', inplace = True)
anime_df.index.name = "ID"
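
A show can be returned for more than one season, which is why duplicates are dropped by title. Note that `drop_duplicates` keeps the original index labels, so the `ID` column can have gaps afterwards. The sketch below shows this on made-up rows and one optional way to renumber with `reset_index(drop=True)`. The sample data is illustrative only.

```python
import pandas as pd

# Made-up rows: "Show A" appears in two seasons.
rows = [
    ["Show A", 8.1, "Winter"],
    ["Show A", 8.1, "Spring"],
    ["Show B", 7.4, "Spring"],
]
df = pd.DataFrame(rows, columns=["Title", "Score", "Season"])

# Keep the first occurrence of each title; index labels are preserved.
with_gaps = df.drop_duplicates(subset="Title", keep="first")

# Optionally renumber so the IDs run 0..n-1 without gaps.
deduped = with_gaps.reset_index(drop=True)
deduped.index.name = "ID"

print(list(with_gaps.index), list(deduped.index))
```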

In [11]:
anime_df.head()

ID,Title,Score,Members,Genre,Producers,Year,Season,Synopsis
0,Kimetsu no Yaiba: Yuukaku-hen,8.86,932000,"Action, Fantasy",ufotable,2022,Winter,The devastation of the Mugen Train incident st...
1,Shingeki no Kyojin: The Final Season Part 2,8.82,900000,"Action, Drama",MAPPA,2022,Winter,Turning against his former allies and enemies ...
2,Sono Bisque Doll wa Koi wo Suru,8.34,717000,"Romance, Slice of Life",CloverWorks,2022,Winter,High school student Wakana Gojou spends his da...
3,Arifureta Shokugyou de Sekai Saikyou 2nd Season,7.21,276000,"Action, Adventure, Fantasy","asread., studio MOTHER",2022,Winter,Transported to another world and left behind b...
4,Tensai Ouji no Akaji Kokka Saisei Jutsu,7.42,195000,"Comedy, Fantasy",Yokohama Animation Lab,2022,Winter,"The king of Natra has fallen ill, leaving the ..."


In [12]:
# Save the DataFrame as a .csv file, including the ID index
anime_df.to_csv('../data/anime_data.csv', index=True)

In [13]:
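# Reload the saved data; the ID index comes back as an ordinary column
# unless index_col='ID' is passed to read_csv
anime_df = pd.read_csv('../data/anime_data.csv')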
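
Because the file is written with `index=True`, the `ID` index is stored as the first CSV column. The sketch below compares a plain `read_csv` with one that passes `index_col="ID"`. It uses an in-memory buffer and a made-up two-row frame instead of the real file.

```python
import io

import pandas as pd

# Made-up frame with a named index, mirroring anime_df's layout.
df = pd.DataFrame({"Title": ["Show A", "Show B"], "Score": [8.1, 7.4]})
df.index.name = "ID"

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=True)

buffer.seek(0)
plain = pd.read_csv(buffer)  # ID becomes an ordinary column

buffer.seek(0)
indexed = pd.read_csv(buffer, index_col="ID")  # ID is restored as the index

print(list(plain.columns))
print(list(indexed.columns), indexed.index.name)
```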