Project: Anime Popularity and Rating Analysis
Welcome to this data analysis project on anime shows. The goal is to explore and understand trends in anime ratings, popularity, genres, and other aspects using data visualisation.

Features and What They Represent:
Column Name	Description
anime_id	Unique identifier assigned to each anime
name	Title of the anime
genre	Comma-separated list of genres (e.g., Action, Adventure, Comedy)
type	Format of the anime (TV, Movie, OVA, Special, Music, etc.)
episodes	Total number of episodes
rating	Average user rating (out of 10)
members	Number of members who added this anime to their list (proxy for popularity)
Unnamed: 0	Index column from previous processing (can be dropped)

Total entries: about 12,294 anime records

Missing values: Present in some fields, such as rating and genre; these need cleaning

Genre field: Contains multiple genres per anime, separated by commas

Episode field: Some entries contain 'Unknown', which must be handled before any numeric analysis

Popularity indicator: The members field counts how many users added an anime to their list, so it serves as a proxy for popularity.
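
The cleaning steps listed above could be sketched roughly as follows. The three-row sample frame is made up for illustration, and the choice to drop unrated rows and label missing genres as "Unknown" is an assumption, not something this notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical sample using the column names described above.
sample = pd.DataFrame({
    "name": ["A", "B", "C"],
    "genre": ["Action, Comedy", None, "Drama"],
    "episodes": ["12", "Unknown", "1"],
    "rating": [8.1, np.nan, 7.4],
})

# 'Unknown' cannot be parsed as a number, so coerce it to a missing value.
sample["episodes"] = pd.to_numeric(sample["episodes"], errors="coerce")

# Assumed policy: drop rows without a rating, label missing genres.
cleaned = sample.dropna(subset=["rating"]).copy()
cleaned["genre"] = cleaned["genre"].fillna("Unknown")

# Split the comma-separated genres into one row per (anime, genre) pair.
genres = cleaned.assign(genre=cleaned["genre"].str.split(", ")).explode("genre")
```

The exploded frame makes per-genre counts or average ratings a plain groupby on the genre column.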

In [40]:
import numpy as np
import pandas as pd

In [41]:
df = pd.read_csv(r"C:\Users\Soumyadeep\Downloads\anime.csv")

In [42]:
df.head()

Unnamed: 0,Rank,Title,Score
0,1,Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...,9.1
1,2,"Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473...",9.07
2,3,Bleach: Sennen Kessen-henTV (13 eps)Oct 2022 -...,9.06
3,4,"Gintama°TV (51 eps)Apr 2015 - Mar 2016605,113 ...",9.06
4,5,Shingeki no Kyojin Season 3 Part 2TV (10 eps)A...,9.05


In [43]:
df.loc[0]

Rank                                                     1
Title    Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...
Score                                                  9.1
Name: 0, dtype: object

In [44]:
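def extract_epi(txt):
    # Return the text inside the first pair of parentheses, e.g. "64 eps".
    check = False  # True once the opening "(" has been seen
    data = ""

    for ch in txt:
        if ch == ")":
            return data
        if check:
            data += ch
        if ch == "(":
            check = True

    return None  # no closing parenthesis found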

In [45]:
df["Episodes"] = df["Title"].apply(extract_epi)

In [46]:
df['Episodes'] = df["Episodes"].str.replace(" eps","")

In [47]:
df

Unnamed: 0,Rank,Title,Score,Episodes
0,1,Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...,9.1,64
1,2,"Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473...",9.07,24
2,3,Bleach: Sennen Kessen-henTV (13 eps)Oct 2022 -...,9.06,13
3,4,"Gintama°TV (51 eps)Apr 2015 - Mar 2016605,113 ...",9.06,51
4,5,Shingeki no Kyojin Season 3 Part 2TV (10 eps)A...,9.05,10
5,6,"Gintama'TV (51 eps)Apr 2011 - Mar 2012534,105 ...",9.04,51
6,7,Gintama: The FinalMovie (1 eps)Jan 2021 - Jan ...,9.04,1
7,8,Hunter x Hunter TV (148 eps)Oct 2011 - Sep 201...,9.04,148
8,9,Kaguya-sama wa Kokurasetai: Ultra RomanticTV (...,9.04,13
9,10,Gintama': EnchousenTV (13 eps)Oct 2012 - Mar 2...,9.03,13


In [48]:
df['Episodes'] = df['Episodes'].astype(int)
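
astype(int) raises an error as soon as one value is not a plain integer. The rows shown here all convert cleanly, but if a scraped title carried a non-numeric count, a more tolerant conversion might look like the sketch below. The second title and its "(? eps)" marker are made up for illustration.

```python
import pandas as pd

# Titles in the same shape as the Title column; the second one is hypothetical.
titles = pd.Series([
    "Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473,707 members",
    "Example ShowTV (? eps)Apr 2024 - 10,000 members",
])

# Capture the digits that directly precede " eps)"; non-matching rows become NaN.
episodes_text = titles.str.extract(r"\((\d+) eps\)", expand=False)

# The nullable Int64 dtype keeps integers while allowing missing values.
episodes = pd.to_numeric(episodes_text, errors="coerce").astype("Int64")
```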

In [49]:
df

Unnamed: 0,Rank,Title,Score,Episodes
0,1,Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...,9.1,64
1,2,"Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473...",9.07,24
2,3,Bleach: Sennen Kessen-henTV (13 eps)Oct 2022 -...,9.06,13
3,4,"Gintama°TV (51 eps)Apr 2015 - Mar 2016605,113 ...",9.06,51
4,5,Shingeki no Kyojin Season 3 Part 2TV (10 eps)A...,9.05,10
5,6,"Gintama'TV (51 eps)Apr 2011 - Mar 2012534,105 ...",9.04,51
6,7,Gintama: The FinalMovie (1 eps)Jan 2021 - Jan ...,9.04,1
7,8,Hunter x Hunter TV (148 eps)Oct 2011 - Sep 201...,9.04,148
8,9,Kaguya-sama wa Kokurasetai: Ultra RomanticTV (...,9.04,13
9,10,Gintama': EnchousenTV (13 eps)Oct 2012 - Mar 2...,9.03,13


In [52]:
df.loc[0, "Title"]

'Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr 2009 - Jul 20103,218,472 membersManga StoreVolume 1€4.58Preview'

In [55]:
df.loc[1, "Title"]

'Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473,707 members'
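
The two strings above suggest that every Title value packs the name, format, episode count, air dates and member count into one field. As a sketch, a single regular expression could split them apart in one pass. The list of formats in the pattern (TV, Movie, OVA, ONA, Special, Music) is an assumption, since only TV and Movie appear in the rows shown, and the names pattern and parts are illustrative.

```python
import pandas as pd

# The two full Title strings printed above.
titles = pd.Series([
    "Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr 2009 - Jul 20103,218,472 members"
    "Manga StoreVolume 1€4.58Preview",
    "Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473,707 members",
])

# Assumed layout: <name><type> (<n> eps)<Mon YYYY> - <Mon YYYY><members> members
pattern = (
    r"^(?P<name>.*?)"
    r"(?P<type>TV|Movie|OVA|ONA|Special|Music) "
    r"\((?P<episodes>\d+) eps\)"
    r"(?P<start>[A-Z][a-z]{2} \d{4}) - (?P<end>[A-Z][a-z]{2} \d{4})"
    r"(?P<members>[\d,]+) members"
)

# One column per named group; rows that do not match would be all NaN.
parts = titles.str.extract(pattern)

# Remove thousands separators before converting the member count.
parts["members"] = parts["members"].str.replace(",", "").astype(int)
```

Matching exactly four digits for each year is what separates "Sep 2011" from the member count that follows it without a space.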

In [65]:
def time_stamp(txt):
    # The air-date range (e.g. "Apr 2009 - Jul 2010") is the 19 characters
    # that follow the first ")". Slicing avoids an IndexError on short strings.
    pos = txt.find(")")
    if pos == -1:
        return None
    return txt[pos + 1 : pos + 20]

In [66]:
df["Time Stamp"] = df["Title"].apply(time_stamp)

In [67]:
df

Unnamed: 0,Rank,Title,Score,Episodes,Time Stamp
0,1,Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...,9.1,64,Apr 2009 - Jul 2010
1,2,"Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473...",9.07,24,Apr 2011 - Sep 2011
2,3,Bleach: Sennen Kessen-henTV (13 eps)Oct 2022 -...,9.06,13,Oct 2022 - Dec 2022
3,4,"Gintama°TV (51 eps)Apr 2015 - Mar 2016605,113 ...",9.06,51,Apr 2015 - Mar 2016
4,5,Shingeki no Kyojin Season 3 Part 2TV (10 eps)A...,9.05,10,Apr 2019 - Jul 2019
5,6,"Gintama'TV (51 eps)Apr 2011 - Mar 2012534,105 ...",9.04,51,Apr 2011 - Mar 2012
6,7,Gintama: The FinalMovie (1 eps)Jan 2021 - Jan ...,9.04,1,Jan 2021 - Jan 2021
7,8,Hunter x Hunter TV (148 eps)Oct 2011 - Sep 201...,9.04,148,Oct 2011 - Sep 2014
8,9,Kaguya-sama wa Kokurasetai: Ultra RomanticTV (...,9.04,13,Apr 2022 - Jun 2022
9,10,Gintama': EnchousenTV (13 eps)Oct 2012 - Mar 2...,9.03,13,Oct 2012 - Mar 2013


In [71]:
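def get_months(duration):
    # Number of calendar months covered by a range such as "Apr 2009 - Jul 2010",
    # counting both the start and the end month.
    try:
        start_str, end_str = duration.split(" - ")
        start = pd.to_datetime(start_str, format="%b %Y")
        end = pd.to_datetime(end_str, format="%b %Y")
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    except (ValueError, AttributeError, TypeError):
        # Malformed or missing range
        return None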

df['total_months'] = df['Time Stamp'].apply(get_months)
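
The same month count can also be computed column-wise instead of row by row. The following is a sketch under the assumption that every valid range has the form "Mon YYYY - Mon YYYY"; the third sample value is deliberately malformed to show how bad rows are handled.

```python
import pandas as pd

stamps = pd.Series(["Apr 2009 - Jul 2010", "Jan 2021 - Jan 2021", "not a range"])

# Split once on " - "; rows without the separator get a missing second part.
bounds = stamps.str.split(" - ", n=1, expand=True)

# errors="coerce" turns unparseable values into NaT instead of raising.
start = pd.to_datetime(bounds[0], format="%b %Y", errors="coerce")
end = pd.to_datetime(bounds[1], format="%b %Y", errors="coerce")

# Inclusive month count; rows with NaT end up as <NA>.
total_months = (
    (end.dt.year - start.dt.year) * 12 + (end.dt.month - start.dt.month) + 1
).astype("Int64")
```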


In [72]:
df

Unnamed: 0,Rank,Title,Score,Episodes,Time Stamp,total_months
0,1,Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...,9.1,64,Apr 2009 - Jul 2010,16
1,2,"Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473...",9.07,24,Apr 2011 - Sep 2011,6
2,3,Bleach: Sennen Kessen-henTV (13 eps)Oct 2022 -...,9.06,13,Oct 2022 - Dec 2022,3
3,4,"Gintama°TV (51 eps)Apr 2015 - Mar 2016605,113 ...",9.06,51,Apr 2015 - Mar 2016,12
4,5,Shingeki no Kyojin Season 3 Part 2TV (10 eps)A...,9.05,10,Apr 2019 - Jul 2019,4
5,6,"Gintama'TV (51 eps)Apr 2011 - Mar 2012534,105 ...",9.04,51,Apr 2011 - Mar 2012,12
6,7,Gintama: The FinalMovie (1 eps)Jan 2021 - Jan ...,9.04,1,Jan 2021 - Jan 2021,1
7,8,Hunter x Hunter TV (148 eps)Oct 2011 - Sep 201...,9.04,148,Oct 2011 - Sep 2014,36
8,9,Kaguya-sama wa Kokurasetai: Ultra RomanticTV (...,9.04,13,Apr 2022 - Jun 2022,3
9,10,Gintama': EnchousenTV (13 eps)Oct 2012 - Mar 2...,9.03,13,Oct 2012 - Mar 2013,6


In [77]:
df.loc[df['Score'] == df['Score'].max(), 'Title']

0    Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...
Name: Title, dtype: object

In [78]:
df['Title'].head()

0    Fullmetal Alchemist: BrotherhoodTV (64 eps)Apr...
1    Steins;GateTV (24 eps)Apr 2011 - Sep 20112,473...
2    Bleach: Sennen Kessen-henTV (13 eps)Oct 2022 -...
3    Gintama°TV (51 eps)Apr 2015 - Mar 2016605,113 ...
4    Shingeki no Kyojin Season 3 Part 2TV (10 eps)A...
Name: Title, dtype: object

In [79]:
df[df['Episodes'] == df['Episodes'].max()]

Unnamed: 0,Rank,Title,Score,Episodes,Time Stamp,total_months
15,16,"GintamaTV (201 eps)Apr 2006 - Mar 20101,034,41...",8.94,201,Apr 2006 - Mar 2010,48


In [80]:
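# Caution: "Time Stamp" holds strings, so max() picks the alphabetically last value, not the latest date.
df[df["Time Stamp"] == df["Time Stamp"].max()]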

Unnamed: 0,Rank,Title,Score,Episodes,Time Stamp,total_months
21,22,Violet Evergarden MovieMovie (1 eps)Sep 2020 -...,8.89,1,Sep 2020 - Sep 2020,1
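
The result above comes from comparing strings: "Sep ..." sorts after "Oct ..." and "Apr ..." alphabetically, which is why a 2020 movie is returned even though the table contains 2022 entries. If the goal is the most recently finished anime (an assumption about the intent), the end of each range can be parsed as a date first. A minimal sketch with three Time Stamp values from the tables above; df_demo is an illustrative name:

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Time Stamp": ["Oct 2022 - Dec 2022", "Sep 2020 - Sep 2020", "Apr 2006 - Mar 2010"],
})

# String comparison: alphabetical order, so the "Sep ..." row wins.
lexicographic_max = df_demo["Time Stamp"].max()

# Date comparison: the last 8 characters hold the end month, e.g. "Dec 2022".
end_dates = pd.to_datetime(df_demo["Time Stamp"].str[-8:], format="%b %Y")
latest = df_demo.loc[end_dates.idxmax(), "Time Stamp"]
```

For the longest-running anime, the numeric total_months column can be compared directly instead.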
