In [1]:
print("Show recommender - Data Collection")

Show recommender - Data Collection


In [2]:
%pip install requests

Note: you may need to restart the kernel to use updated packages.


In [3]:
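import requests

# Search the Jikan API (an unofficial MyAnimeList API) for up to 3 matches.
url = "https://api.jikan.moe/v4/anime"
response = requests.get(url, params={"q": "naruto", "limit": 3}, timeout=10)
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
data = response.json()

# Inspect the first result.
first = data["data"][0]
print(first["title"])
print(first["genres"])
print(first["synopsis"])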

Naruto
[{'mal_id': 1, 'type': 'anime', 'name': 'Action', 'url': 'https://myanimelist.net/anime/genre/1/Action'}, {'mal_id': 2, 'type': 'anime', 'name': 'Adventure', 'url': 'https://myanimelist.net/anime/genre/2/Adventure'}, {'mal_id': 10, 'type': 'anime', 'name': 'Fantasy', 'url': 'https://myanimelist.net/anime/genre/10/Fantasy'}]
Twelve years ago, a colossal demon fox terrorized the world. During the monster's attack on the Hidden Leaf Village, the Hokage—the village's leader and most powerful ninja—sacrifices himself to seal the beast inside a newborn, relieving civilization from destruction while dooming the baby to a lonely life.

Now, after years of being shunned and bullied, Naruto Uzumaki pesters the village with elaborate pranks and vandalism. Despite these antics, he works hard to achieve his dream: to become the Hokage and earn the acknowledgement of those who have mistreated him for his entire life. Naruto joins Team 7, a ninja squad made up of two of his peers—prodigy Sasuk
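
The response printed above nests each genre as a dict, and only a few fields matter for the recommender. A minimal sketch of a helper that flattens one result follows. The name `flatten_entry` and the `sample` payload are hypothetical (the payload is only shaped like the output above), and the handling of a missing synopsis is an assumption about entries that may lack one.

```python
from typing import Any


def flatten_entry(entry: dict[str, Any]) -> dict[str, Any]:
    """Reduce one API result to the three fields used for recommendations."""
    return {
        "title": entry.get("title", ""),
        # Each genre is a dict; keep only its human-readable name.
        "genres": [g["name"] for g in entry.get("genres") or []],
        # Assumption: a synopsis may be missing; store "" instead of None.
        "synopsis": entry.get("synopsis") or "",
    }


# Hypothetical payload shaped like the response printed above.
sample = {
    "title": "Naruto",
    "genres": [
        {"mal_id": 1, "type": "anime", "name": "Action"},
        {"mal_id": 2, "type": "anime", "name": "Adventure"},
    ],
    "synopsis": None,
}
flat = flatten_entry(sample)
print(flat)
```

Keeping this logic in one function means the collection loop and any later re-fetch can share the same field handling.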

In [4]:
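import time

import pandas as pd

animes = []
for query in ["naruto", "one piece", "attack on titan", "death note"]:
    # Let requests URL-encode the query (some contain spaces).
    response = requests.get(
        "https://api.jikan.moe/v4/anime",
        params={"q": query, "limit": 1},
        timeout=10,
    )
    response.raise_for_status()
    results = response.json()["data"]
    if not results:  # skip queries with no match
        continue
    anime = results[0]
    animes.append({
        "title": anime["title"],
        "genres": [g["name"] for g in anime["genres"]],
        "synopsis": anime["synopsis"] or "",
    })
    time.sleep(1)  # pause between calls to stay under the API's rate limit

df = pd.DataFrame(animes)
df.to_csv("anime_sample.csv", index=False)
print("Saved anime_sample.csv")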

Saved anime_sample.csv
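
One thing to keep in mind when reloading the saved file: a CSV cell can only hold text, so the `genres` lists are written as their string representation. The sketch below illustrates this with the standard-library `csv` module and an in-memory buffer rather than the real `anime_sample.csv`. The single row is illustrative data taken from the titles above, and `ast.literal_eval` is one possible way to recover the list.

```python
import ast
import csv
import io

rows = [{"title": "Death Note", "genres": ["Supernatural", "Suspense"]}]

# Write to an in-memory buffer instead of a file so the sketch has no side effects.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "genres"])
writer.writeheader()
writer.writerows(rows)  # the list is stringified with str() on write

# Read the same text back, as a later session would.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
raw_genres = loaded[0]["genres"]              # a str, not a list
parsed_genres = ast.literal_eval(raw_genres)  # safely parse it back into a list

print(type(raw_genres).__name__, raw_genres)
print(type(parsed_genres).__name__, parsed_genres)
```

With pandas, the same conversion can be applied at load time through the `converters` argument of `pd.read_csv`.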


In [5]:
df.head()

Unnamed: 0,title,genres,synopsis
0,Naruto,"[Action, Adventure, Fantasy]","Twelve years ago, a colossal demon fox terrori..."
1,One Piece Movie 01,"[Action, Adventure, Fantasy]","Many years ago, Woonan, a legendary pirate, pl..."
2,Shingeki no Kyojin,"[Action, Award Winning, Drama, Suspense]","Centuries ago, mankind was slaughtered to near..."
3,Death Note,"[Supernatural, Suspense]","Brutal murders, petty thefts, and senseless vi..."


In [6]:
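# Merge genres and synopsis into one text field for vectorization.
# fillna("") guards against a missing synopsis, which would otherwise produce NaN.
df["combined_features"] = df["genres"].astype(str) + " " + df["synopsis"].fillna("")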
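
Converting a list cell with `astype(str)` keeps the brackets and quotes of the list's repr in the text. A small sketch of the difference between that repr and a joined string follows, using plain Python on one illustrative genre list. The underscore variant is an optional idea, not something the notebook does: it keeps a multi-word genre such as "Award Winning" together as a single word.

```python
genres = ["Action", "Award Winning", "Drama"]

as_repr = str(genres)       # the list's repr, brackets and quotes included
as_text = " ".join(genres)  # plain space-separated words
# Optional: join multi-word genres with "_" so each genre stays one word.
as_tokens = " ".join(g.replace(" ", "_") for g in genres)

print(as_repr)    # ['Action', 'Award Winning', 'Drama']
print(as_text)    # Action Award Winning Drama
print(as_tokens)  # Action Award_Winning Drama
```

Because the default tokenizer in `TfidfVectorizer` drops punctuation, the first two forms should yield the same word tokens; the joined form mainly makes `df.head()` easier to read.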

In [7]:
df.head()

Unnamed: 0,title,genres,synopsis,combined_features
0,Naruto,"[Action, Adventure, Fantasy]","Twelve years ago, a colossal demon fox terrori...","['Action', 'Adventure', 'Fantasy'] Twelve year..."
1,One Piece Movie 01,"[Action, Adventure, Fantasy]","Many years ago, Woonan, a legendary pirate, pl...","['Action', 'Adventure', 'Fantasy'] Many years ..."
2,Shingeki no Kyojin,"[Action, Award Winning, Drama, Suspense]","Centuries ago, mankind was slaughtered to near...","['Action', 'Award Winning', 'Drama', 'Suspense..."
3,Death Note,"[Supernatural, Suspense]","Brutal murders, petty thefts, and senseless vi...","['Supernatural', 'Suspense'] Brutal murders, p..."


In [8]:
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(stop_words = "english")
tfidf_matrix = vectorizer.fit_transform(df["combined_features"])

In [9]:
tfidf_matrix

<Compressed Sparse Row sparse matrix of dtype 'float64'
	with 329 stored elements and shape (4, 299)>
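
The shape (4, 299) means 4 documents and 299 distinct terms, and 329 stored values across 4 rows suggests that only a small number of terms appear in more than one synopsis. To make the weighting less opaque, here is a minimal pure-Python sketch of TF-IDF on three made-up documents. It assumes whitespace tokenization and the smoothed formula idf = ln((1 + n) / (1 + df)) + 1 followed by L2 normalization, which matches scikit-learn's documented defaults as far as I know; it is not a replacement for `TfidfVectorizer`.

```python
import math
from collections import Counter

# Made-up miniature corpus, for illustration only.
docs = [
    "action adventure ninja village",
    "action adventure pirate treasure",
    "suspense notebook detective",
]

tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: the number of documents each term appears in.
doc_freq = Counter(term for tokens in tokenized for term in set(tokens))


def tfidf_vector(tokens):
    """Return {term: weight} for one document, L2-normalized."""
    counts = Counter(tokens)
    weights = {
        term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
        for term, count in counts.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}


vectors = [tfidf_vector(tokens) for tokens in tokenized]
for vec in vectors:
    print({term: round(weight, 3) for term, weight in vec.items()})
```

Terms that occur in a single document ("ninja") end up with a higher weight than terms shared across documents ("action"), which is the property the recommender relies on.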

In [13]:
from sklearn.metrics.pairwise import cosine_similarity

cosine_sim_matrix = cosine_similarity(tfidf_matrix, tfidf_matrix)

In [14]:
cosine_sim_matrix

array([[1.        , 0.04544838, 0.04062236, 0.03343191],
       [0.04544838, 1.        , 0.02995926, 0.03210429],
       [0.04062236, 0.02995926, 1.        , 0.04420731],
       [0.03343191, 0.03210429, 0.04420731, 1.        ]])
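
Each entry of this matrix is the cosine of the angle between two TF-IDF rows: 1.0 on the diagonal (a show compared with itself) and roughly 0.03 to 0.05 elsewhere, which indicates that these four synopses share very little vocabulary. A minimal sketch of the calculation on small hand-made vectors follows; the function name `cosine` and the vectors are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


u = [1.0, 2.0, 0.0]
v = [2.0, 4.0, 0.0]  # same direction as u, different length
w = [0.0, 0.0, 3.0]  # shares no non-zero component with u

print(cosine(u, v))  # close to 1: same direction
print(cosine(u, w))  # 0.0: nothing in common
```

Since the measure ignores vector length, a long synopsis and a short one can still score as similar when they emphasize the same terms.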

In [29]:
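def recommend(title, top_n=3):
    """Return the titles of the top_n shows most similar to `title`."""
    matches = df.index[df["title"].str.lower() == title.strip().lower()]
    if len(matches) == 0:
        raise ValueError(f"Title not found in dataset: {title!r}")
    idx = matches[0]
    # Pair each row position with its similarity score, skipping the query itself.
    sim_scores = [(i, s) for i, s in enumerate(cosine_sim_matrix[idx]) if i != idx]
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)[:top_n]
    anime_indices = [i for i, _ in sim_scores]
    return df["title"].iloc[anime_indices]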


In [34]:
anime_name = input("Give me the anime name and I will recommend top 3 similar ones to it:")
recommend(anime_name)

Give me the anime name and I will recommend top 3 similar ones to it: Naruto


1    One Piece Movie 01
2    Shingeki no Kyojin
3            Death Note
Name: title, dtype: object
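
The lookup in `recommend` needs the title exactly as stored (ignoring case), so a small typo leaves nothing to match. One possible way to soften this is to resolve the user's input against the known titles first, sketched here with the standard-library `difflib`. The helper name `resolve_title` and the 0.6 cutoff are assumptions for illustration, and the title list is copied from the DataFrame shown earlier.

```python
import difflib

# Titles copied from the DataFrame shown earlier.
titles = ["Naruto", "One Piece Movie 01", "Shingeki no Kyojin", "Death Note"]


def resolve_title(query, known_titles, cutoff=0.6):
    """Return the stored title closest to `query`, or None if nothing is close."""
    lowered = {t.lower(): t for t in known_titles}
    matches = difflib.get_close_matches(
        query.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


print(resolve_title("naruto ", titles))    # exact match apart from case and spacing
print(resolve_title("Deth Note", titles))  # small typo
print(resolve_title("Bleach", titles))     # not in the dataset
```

This only helps with spelling variations. A query such as "attack on titan" would still not resolve to "Shingeki no Kyojin", because the stored title is the romanized Japanese one.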