# Data Exploration of the Anime Dataset

In this notebook, we explore the [Anime Dataset 2023](https://www.kaggle.com/datasets/dbdmobile/myanimelist-dataset?select=users-score-2023.csv) to gain insights and prepare for training a recommendation model. The key objectives are:

- Understand the structure and content of the dataset.
- Experiment with different analyses to identify patterns or trends.
- Assess whether other kinds of models could also be applied to this dataset.

The insights from this exploration feed into training a recommendation model, which is done in the [Model Notebook](./model.ipynb).

In [2]:
# Download the dataset through the Kaggle API
!python3 download.py

Downloading myanimelist-dataset.zip to ./data
100%|██████████████████████████████████████| 1.80G/1.80G [02:25<00:00, 13.6MB/s]
100%|██████████████████████████████████████| 1.80G/1.80G [02:25<00:00, 13.3MB/s]
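
The contents of `download.py` are not shown in this notebook. As a rough sketch only, a script like it could use the official `kaggle` Python package, assuming the package is installed and API credentials are configured. The helper name `download_dataset` and its parameters are hypothetical; the dataset slug is taken from the Kaggle URL linked in the introduction.

```python
from pathlib import Path

# Dataset slug taken from the Kaggle URL linked in the introduction.
DATASET_SLUG = "dbdmobile/myanimelist-dataset"


def download_dataset(dest: str = "./data", slug: str = DATASET_SLUG) -> Path:
    """Download and unzip a Kaggle dataset into `dest` (hypothetical helper)."""
    # Imported lazily so this module can be loaded even when the kaggle
    # package or its credentials are not available.
    from kaggle.api.kaggle_api_extended import KaggleApi

    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)

    api = KaggleApi()
    api.authenticate()  # expects Kaggle API credentials to be configured
    api.dataset_download_files(slug, path=str(target), unzip=True)
    return target
```

Calling `download_dataset()` would place the extracted CSV files under `./data`, which is where the later cells read them from. The actual script may differ, for example by keeping the zip archive or showing a progress bar.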


In [3]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

## Initial Data Exploration for Recommender System Development
### Objective
This phase builds an initial understanding of the dataset that will be used to develop the recommender system. The aim is to identify the features likely to drive the recommendation logic and to check that the data is suitable, and of sufficient quality, for such a model.

In [10]:
# Load the anime metadata table and preview the first three rows
df_data = pd.read_csv("./data/anime-dataset-2023.csv")
df_data.head(3)

|  | anime_id | Name | English name | Other name | Score | Genres | Synopsis | Type | Episodes | Aired | ... | Studios | Source | Duration | Rating | Rank | Popularity | Favorites | Scored By | Members | Image URL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Cowboy Bebop | Cowboy Bebop | カウボーイビバップ | 8.75 | Action, Award Winning, Sci-Fi | Crime is timeless. By the year 2071, humanity ... | TV | 26.0 | Apr 3, 1998 to Apr 24, 1999 | ... | Sunrise | Original | 24 min per ep | R - 17+ (violence & profanity) | 41.0 | 43 | 78525 | 914193.0 | 1771505 | https://cdn.myanimelist.net/images/anime/4/196... |
| 1 | 5 | Cowboy Bebop: Tengoku no Tobira | Cowboy Bebop: The Movie | カウボーイビバップ 天国の扉 | 8.38 | Action, Sci-Fi | Another day, another bounty—such is the life o... | Movie | 1.0 | Sep 1, 2001 | ... | Bones | Original | 1 hr 55 min | R - 17+ (violence & profanity) | 189.0 | 602 | 1448 | 206248.0 | 360978 | https://cdn.myanimelist.net/images/anime/1439/... |
| 2 | 6 | Trigun | Trigun | トライガン | 8.22 | Action, Adventure, Sci-Fi | Vash the Stampede is the man with a $$60,000,0... | TV | 26.0 | Apr 1, 1998 to Sep 30, 1998 | ... | Madhouse | Manga | 24 min per ep | PG-13 - Teens 13 or older | 328.0 | 246 | 15035 | 356739.0 | 727252 | https://cdn.myanimelist.net/images/anime/7/203... |
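
To follow up on the objective of checking the data's structure and quality, a per-column summary is one possible next step. The sketch below is not part of the original notebook: `summarize_columns` is a hypothetical helper, and the idea that missing values may appear as the string `"UNKNOWN"` is an assumption that should be verified against the actual file. It runs on a tiny stand-in frame with illustrative values so that it is self-contained.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame, placeholder: str = "UNKNOWN") -> pd.DataFrame:
    """Return one row per column: dtype, NaN count, placeholder count, unique count."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "n_missing": df.isna().sum(),
            # Cells equal to the placeholder string (assumed missing-value marker).
            "n_placeholder": df.eq(placeholder).sum(),
            # nunique() ignores NaN by default.
            "n_unique": df.nunique(),
        }
    )


# Tiny stand-in frame; the values are illustrative, not real rows of the dataset.
sample = pd.DataFrame(
    {
        "anime_id": [1, 5, 6],
        "Score": ["8.75", "UNKNOWN", "8.22"],
        "Episodes": [26.0, None, 26.0],
    }
)
summary = summarize_columns(sample)
print(summary)
```

On the real data this would be called as `summarize_columns(df_data)`. Columns that come back with an `object` dtype but look numeric in the preview, or that show a non-zero placeholder count, are candidates for cleaning before any model is trained.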
