# Manga Recommender with Collaborative Filtering
![reco](https://whalescans.com/twitter-image.png)


## Collaborative Filtering

Collaborative filtering is a common technique in recommendation systems: it predicts a user's preferences from the preferences of other users. It rests on the assumption that users who agreed in the past will tend to agree in the future. `Singular Value Decomposition (SVD)` is a `matrix factorization` technique often used to make such predictions. The manga data collected below contains no per-user ratings to factorize, so this notebook takes a simpler route: it builds a feature vector for each manga with NumPy and uses LanceDB to find the nearest vectors.
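
To make the SVD idea concrete, here is a minimal sketch on a made-up user × manga rating matrix. The ratings, the choice of `k = 2`, and the reading of `0` as "not rated" are assumptions for illustration only; they are not part of the dataset used in this notebook.

```python
import numpy as np

# Toy rating matrix (rows: users, columns: manga titles); 0 means "not rated".
# All values are invented for illustration.
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

# Factorize R = U @ diag(s) @ Vt and keep only the k strongest latent factors.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The low-rank reconstruction assigns a score to every cell, rated or not.
print(np.round(approx, 2))
```

In this toy matrix the first two users share one taste and the last two share another, so the rank-2 reconstruction scores a user's own cluster of titles high and the other cluster low.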

## Let's start by installing and importing the libraries

In [1]:
%pip install lancedb pandas numpy httpx

Note: you may need to restart the kernel to use updated packages.


In [5]:
import httpx
import csv
import asyncio

In [4]:
# Your FastAPI application's base URL
BASE_URL = 'https://svelte-manga-api.valiantlynx.com'
GENRES = ['Action',
 'Adventure',
 'Comedy',
 'Cooking',
 'Doujinshi',
 'Drama',
 'Erotica',
 'Fantasy',
 'Gender bender',
 'Harem',
 'Historical',
 'Horror',
 'Isekai',
 'Josei',
 'Manhua',
 'Manhwa',
 'Martial arts',
 'Mature',
 'Mecha',
 'Medical',
 'Mystery',
 'One shot',
 'Pornographic',
 'Psychological',
 'Romance',
 'School life',
 'Sci fi',
 'Seinen',
 'Shoujo',
 'Shoujo ai',
 'Shounen',
 'Shounen ai',
 'Slice of life',
 'Smut',
 'Sports',
 'Supernatural',
 'Tragedy',
 'Webtoons',
 'Yaoi',
 'Yuri']  # Update as needed

In [6]:
async def fetch_manga(server='MANGANELO', genre="", page=1):
    async with httpx.AsyncClient() as client:
        response = await client.get(f'{BASE_URL}/api/manga', params={'server': server, 'genre': genre, 'page': page})
        return response.json()['mangas']

In [8]:
# Function to fetch manga details
async def fetch_manga_details(manga_id, server='MANGANELO'):
    async with httpx.AsyncClient() as client:
        response = await client.get(f'{BASE_URL}/api/manga/{manga_id}', params={'server': server})
        if response.status_code == 200:
            try:
                return response.json()
            except ValueError:
                print(f"Error decoding JSON from response for manga ID {manga_id}")
                return None  # Or an appropriate default value/structure
        else:
            print(f"Failed to fetch details for manga ID {manga_id}. Status code: {response.status_code}")
            return None  # Or an appropriate default value/structure

In [9]:
async def main():
    for genre in GENRES:
        for page in range(1, 50):  # Adjust page range as needed
            try:
                current_mangas = await fetch_manga(genre=genre, page=page)
                if not current_mangas:
                    break
                for manga in current_mangas:
                    manga_details = await fetch_manga_details(manga['id'])
                    if manga_details:
                        manga['authors'] = '|'.join(str(x) for x in manga_details.get('authors', []))
                        manga['genres'] = '|'.join(str(x) for x in manga_details.get('genres', []))
                        # Use .get so a missing field does not abort the whole crawl
                        manga['lastUpdated'] = manga_details.get('lastUpdated', '')
                        manga['views'] = manga_details.get('views')
                        mangas.append(manga)
                    else:
                        # manga_details is None, handle the error
                        print(f"Details for manga ID {manga['id']} could not be fetched or processed.")

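            except (ValueError, KeyError, httpx.HTTPError) as exc:
                # Skip pages that fail to download or parse, and keep crawling
                print(f"Skipping genre {genre!r}, page {page}: {exc}")
                continue

    print('mangas[] has been created.')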
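
The loop above awaits each detail request one at a time. A possible way to speed it up is to run a bounded number of requests concurrently. The sketch below is only an illustration: `gather_limited` and `fake_fetch_details` are hypothetical helpers (the latter stands in for `fetch_manga_details` so the example runs without network access), and the limit of 5 is an arbitrary assumption.

```python
import asyncio

async def gather_limited(items, worker, limit=5):
    """Run `worker(item)` for every item, with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(item):
        async with semaphore:
            return await worker(item)

    # gather() returns results in the same order as the input items.
    return await asyncio.gather(*(run_one(item) for item in items))


async def fake_fetch_details(manga_id):
    """Stand-in for fetch_manga_details; returns a dummy record after a short pause."""
    await asyncio.sleep(0.01)
    return {"id": manga_id, "views": 0}

# In a notebook: details = await gather_limited(ids, fetch_manga_details)
# In a script:   details = asyncio.run(gather_limited(ids, fetch_manga_details))
```

Keeping the limit small is a courtesy to the API server; the right value depends on what the server tolerates.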

In [10]:
import numpy as np
import pandas as pd
import lancedb



In [11]:
mangas = []
await main()

CancelledError: 

In [12]:
mangas[:2]

[{'title': 'Rebirth Of The Immortal Venerable',
  'img': '/mangaimage/manga-nr990726.jpg',
  'latestChapter': 'Chapter 209',
  'rating': '3.83',
  'src': '/manga/manga-nr990726',
  'id': 'manga-nr990726',
  'titleId': 'Rebirth Of The Immortal Venerable',
  'description': "Rebirth of the Immortal Venerable summary is updating. Come visit MangaNato.com sometime to read the latest chapter of Rebirth of the Immortal Venerable. If you have any question about this manga, Please don't hesitate to contact us or translate team. Hope you enjoy it.",
  'authors': 'Daxiedao Anime',
  'genres': 'Action|Adventure|Martial arts|Mature|Supernatural|Manhua',
  'lastUpdated': '2023-07-10 23:12',
  'views': 9000000.0},
 {'title': 'Target 1 Billion Points! Open The Ultimate Game Of Second Life!',
  'img': '/mangaimage/manga-hy985133.jpg',
  'latestChapter': 'Chapter 74',
  'rating': '3.89',
  'src': '/manga/manga-hy985133',
  'id': 'manga-hy985133',
  'titleId': 'Target 1 Billion Points! Open The Ultimate 

In [13]:
# Convert the manga list to a DataFrame
mangas_df = pd.DataFrame(mangas)
mangas_df.head()


Unnamed: 0,title,img,latestChapter,rating,src,id,titleId,description,authors,genres,lastUpdated,views
0,Rebirth Of The Immortal Venerable,/mangaimage/manga-nr990726.jpg,Chapter 209,3.83,/manga/manga-nr990726,manga-nr990726,Rebirth Of The Immortal Venerable,Rebirth of the Immortal Venerable summary is u...,Daxiedao Anime,Action|Adventure|Martial arts|Mature|Supernatu...,2023-07-10 23:12,9000000.0
1,Target 1 Billion Points! Open The Ultimate Gam...,/mangaimage/manga-hy985133.jpg,Chapter 74,3.89,/manga/manga-hy985133,manga-hy985133,Target 1 Billion Points! Open The Ultimate Gam...,Xie Yu kept himself closed because of the deat...,炽翊漫画,Action,2022-08-21 20:55,9000000.0
2,Lan Ke Qi Yuan,/mangaimage/manga-ml989594.jpg,Chapter 203,4.66,/manga/manga-ml989594,manga-ml989594,Lan Ke Qi Yuan,Lan Ke Qi Yuan is a Manga/Manhwa/Manhua in (En...,阅文漫画,Action|Adventure|Fantasy|Martial arts,2023-04-18 21:56,9000000.0
3,Sankarea,/mangaimage/manga-be956713.jpg,Vol.11 Chapter 56.5: Extra: Sankarea If,4.74,/manga/manga-be956713,manga-be956713,Sankarea,Chihiro Furuya is a male high-school student h...,Hattori Mitsuru,Action|Comedy|Cooking|Drama|Horror|Romance|Sho...,2019-04-14 18:25,9000000.0
4,Martial Streamer,/mangaimage/manga-qc993611.jpg,Chapter 41,4.68,/manga/manga-qc993611,manga-qc993611,Martial Streamer,"TaeMin was a student taking a gap year, when h...",Buksam,Go Ha Som|Working Brain Please|Action|Drama|Fa...,2024-03-30 13:43,9000000.0


In [15]:
# For this example, let's simulate manga embeddings
# Normally, you would use a more sophisticated method to generate embeddings
np.random.seed(42)  # For reproducibility
mangas_df.head()



In [16]:
# Save to CSV if needed
mangas_df.to_csv('./files/mangas_embedd.csv', index=False)


In [35]:
import lancedb
import numpy as np
import pandas as pd
from hashlib import md5


In [36]:
# Load manga data
# on_bad_lines='skip' drops malformed rows; without it this file raised
# "ParserError: Error tokenizing data. C error: Expected 12 fields in line 1143, saw 18"
mangas = pd.read_csv('./files/test.csv', on_bad_lines='skip')
mangas.drop_duplicates(subset=['title'], inplace=True)
mangas.fillna('', inplace=True)  # Handle missing values
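
The `ParserError` mentioned above means that one row of the CSV has more fields than the header. The sketch below reproduces the situation on a tiny made-up CSV and shows one way around it, assuming pandas 1.3 or newer (where `read_csv` accepts `on_bad_lines`). Skipping rows silently loses data, so this is a workaround, not a fix for whatever wrote the malformed file.

```python
import io
import pandas as pd

# Made-up CSV: the header has 3 fields, but the second data row has 4
# because its title contains an unquoted comma.
raw = "id,title,rating\n1,Sankarea,4.74\n2,Hello, World,3.9\n3,Unparalleled,4.1\n"

# on_bad_lines='skip' drops rows with too many fields instead of raising ParserError.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df["id"].tolist())  # [1, 3]
```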


In [24]:
# Encoding functions for different attributes
def encode_text(text):
    """Hash text into a fixed-size vector of 16 byte values (not a semantic embedding)."""
    hash_digest = md5(str(text).encode('utf-8')).hexdigest()
    return np.array([int(hash_digest[i:i+2], 16) for i in range(0, len(hash_digest), 2)])

def encode_numeric(value, max_value):
    """Normalize a numeric value by max_value; missing or non-numeric values become 0."""
    number = pd.to_numeric(value, errors='coerce')
    return np.array([0.0 if pd.isna(number) else float(number) / max_value])

def encode_date(date_str):
    """Convert a date string into a Unix timestamp; unparseable dates become 0."""
    try:
        return np.array([pd.to_datetime(date_str).timestamp()])
    except Exception:
        return np.array([0.0])
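
Hashing text gives a vector of fixed size, but two similar strings get unrelated hashes, so the distance between them carries no meaning. As a hedged alternative for the pipe-separated `genres` field, the sketch below builds a multi-hot vector instead. `md5_vector`, `genre_vector`, and the shortened `VOCAB` list are hypothetical names introduced here, not part of the notebook.

```python
from hashlib import md5
import numpy as np

def md5_vector(text):
    """The 16 bytes of an MD5 digest as a float vector (same idea as encode_text)."""
    digest = md5(str(text).encode("utf-8")).digest()
    return np.frombuffer(digest, dtype=np.uint8).astype(float)

def genre_vector(genres, vocabulary):
    """Multi-hot alternative: 1.0 for each genre present in an 'A|B|C' string."""
    present = set(genres.split("|"))
    return np.array([1.0 if g in present else 0.0 for g in vocabulary])

VOCAB = ["Action", "Adventure", "Fantasy", "Romance"]  # shortened, illustrative list

print(md5_vector("Action|Adventure").shape)  # (16,)

# With multi-hot vectors, sharing genres means a smaller distance.
base = genre_vector("Action|Adventure", VOCAB)
d_close = np.linalg.norm(base - genre_vector("Action|Adventure|Fantasy", VOCAB))
d_far = np.linalg.norm(base - genre_vector("Romance", VOCAB))
print(d_close < d_far)  # True
```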

In [25]:
# Vector generation for each manga based on specified attributes
def generate_vector(manga, attributes):
    vector_parts = []
    if 'title' in attributes:
        vector_parts.append(encode_text(manga['title']))
    if 'description' in attributes:
        vector_parts.append(encode_text(manga['description']))
    if 'authors' in attributes:
        vector_parts.append(encode_text(manga['authors']))
    if 'genres' in attributes:
        vector_parts.append(encode_text(manga['genres']))
    if 'rating' in attributes:
        vector_parts.append(encode_numeric(manga['rating'], 5))  # Assuming rating is out of 5
    if 'views' in attributes:
        vector_parts.append(encode_numeric(manga['views'], 1e9))  # Assuming max views is 1 billion for normalization
    if 'latestChapter' in attributes:
        vector_parts.append(encode_text(manga['latestChapter']))
    if 'lastUpdated' in attributes:
        vector_parts.append(encode_date(manga['lastUpdated']))
    return np.concatenate(vector_parts)
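
One thing to watch in `generate_vector`: the parts it concatenates live on very different scales. Hash bytes range from 0 to 255, normalized rating and views lie roughly between 0 and 1, and the timestamp is on the order of 10^9. With a Euclidean distance, the largest-scale feature tends to dominate. The sketch below illustrates the effect on made-up rows and applies min-max scaling as one possible remedy; the values and the `dist` helper are assumptions for illustration.

```python
import numpy as np

# Hypothetical rows: [rating / 5, Unix timestamp of lastUpdated]. Values are made up.
t0 = 1_689_030_720.0
X = np.array([
    [0.95, t0],                  # a: the query manga
    [0.20, t0 + 60],             # b: very different rating, updated a minute later
    [0.95, t0 + 86_400],         # c: same rating, updated a day later
    [0.50, t0 + 365 * 86_400],   # d: updated a year later
])

def dist(M, i, j):
    """Euclidean distance between rows i and j of M."""
    return float(np.linalg.norm(M[i] - M[j]))

# Raw distances: the timestamp column dominates, so b looks closer to a than c does.
raw_ab, raw_ac = dist(X, 0, 1), dist(X, 0, 2)

# Min-max scale every column to [0, 1] so each feature contributes comparably.
span = X.max(axis=0) - X.min(axis=0)
X_scaled = (X - X.min(axis=0)) / np.where(span == 0, 1.0, span)
scaled_ab, scaled_ac = dist(X_scaled, 0, 1), dist(X_scaled, 0, 2)

print(raw_ab < raw_ac)        # True
print(scaled_ac < scaled_ab)  # True
```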

In [26]:
# Prepare data for LanceDB with dynamic attribute selection
attributes = ['title', 'description', 'authors', 'genres', 'rating', 'views', 'latestChapter', 'lastUpdated']
mangas['vector'] = mangas.apply(lambda row: generate_vector(row, attributes), axis=1)

data = [{
    "id": row['id'],
    "title": row['title'],
    "vector": row['vector'].tolist(),
} for index, row in mangas.iterrows()]

In [29]:
# Connect to LanceDB
db = lancedb.connect("./data/manga-db")
try:
    db.drop_table("manga_set")
except Exception as e:
    print("Dropping table failed:", e)

table = db.create_table("manga_set", data=data)


In [30]:
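def get_recommendations(query_title, limit=10):
    matches = mangas.loc[mangas['title'] == query_title, 'vector']
    if matches.empty:
        raise ValueError(f"Title not found: {query_title!r}")
    query_vector = matches.iloc[0]
    # Request one extra row: the query manga is returned as its own nearest neighbour
    result = table.search(query_vector).limit(limit + 1).to_pandas()
    return result[result['title'] != query_title][['title']].head(limit)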
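
Conceptually, a vector search ranks the stored vectors by their distance to the query vector. The brute-force NumPy sketch below shows that idea, assuming Euclidean (L2) distance; it is not how LanceDB is implemented, and LanceDB's indexing and metric settings may differ. `top_k_similar` and the toy data are hypothetical.

```python
import numpy as np

def top_k_similar(query, vectors, titles, k=3):
    """Return the k (title, distance) pairs nearest to `query` by L2 distance."""
    distances = np.linalg.norm(vectors - query, axis=1)
    order = np.argsort(distances)[:k]
    return [(titles[i], float(distances[i])) for i in order]

titles = ["A", "B", "C", "D"]  # made-up titles
vectors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])

print(top_k_similar(vectors[0], vectors, titles, k=3))
# [('A', 0.0), ('B', 1.0), ('C', 3.0)]
```

The query itself comes back first with distance 0, which is why `get_recommendations` requests `limit + 1` rows and then filters out the query title.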


## Get the Recommendations
Finally, `get_recommendations` takes a manga title and returns the most similar manga (10 by default). It looks up the vector stored for that title, searches the LanceDB table for its nearest neighbours, and returns their titles as a DataFrame. The result could be extended to display each manga's link as well.

In [34]:
# Example usage
print(get_recommendations("Rebirth Of The Immortal Venerable", limit=20))


                                                title
1                                     Akuma No Yuusha
2                                   I Am Space☆Dandy!
3                                      Lan Ke Qi Yuan
4                                           Ssam Bbak
5                                        Unparalleled
6                              Please Love Me Gentle.
7                   I Have A Post-Apocalyptic Dungeon
8                                             Hiniiru
9                            Introduction To Survival
10                                      Video Girl Ai
11                                City: Crime Stories
12  Houkago Wa Kenka Saikyou No Gyaru Ni Tsurekoma...
13         That Is Needed For A Villainous Aristocrat
14                                            Coppers
15                                    Phantom Busters
16                                           Chong Zi
17                                   Burial Sword Art
18                          

## Tada!! Your first manga recommendation system is live

Of course, this won't be completely accurate: the text attributes are hashed rather than embedded, so vector similarity only loosely reflects how similar two manga are. There are other ways to improve the accuracy, such as `reducing the dimensions` of the original data or filtering out users/mangas with few ratings. But this is a good start to building a manga recommender system.
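
As a sketch of what reducing the dimensions could look like, the example below projects feature vectors onto their top singular directions with NumPy (a PCA-style truncated SVD). The random matrix is a stand-in for real manga vectors, and `k = 10` is an arbitrary assumption. The 83 columns match what the eight attributes above would produce: 5 text fields × 16 hash bytes, plus rating, views, and date.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 83))  # stand-in for 100 manga vectors of 83 features

# Center the columns, then project onto the top-k right singular vectors.
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
k = 10
X_reduced = X_centered @ Vt[:k].T  # shape (100, 10)

# Fraction of the total variance kept by the first k components.
explained = float((s[:k] ** 2).sum() / (s ** 2).sum())
print(X_reduced.shape, round(explained, 3))
```

On real data, plotting `explained` against `k` is a common way to choose how many components to keep.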