# Item Based Recommendation

![image.png](attachment:320ed19f-70dd-4f2b-8de1-4ce179a2eaf8.png)

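Item-based recommendation is a form of collaborative filtering in which the similarity between items is calculated from users' ratings of those items. Items that the same users tend to rate alike are treated as similar, and items similar to one a user liked are recommended.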
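To make the idea concrete, here is a minimal sketch on an invented rating matrix (the users, movie names, and ratings are made up for illustration and are not taken from MovieLens). It computes the Pearson correlation between every pair of movie columns, which is one common way to measure item similarity.

```python
import pandas as pd

# Toy user-by-movie rating matrix; all values are invented for illustration.
ratings = pd.DataFrame(
    {
        "Movie A": [5.0, 4.0, 1.0, 2.0],
        "Movie B": [4.5, 4.0, 1.5, 2.0],
        "Movie C": [1.0, 2.0, 5.0, 4.0],
    },
    index=["user1", "user2", "user3", "user4"],
)

# DataFrame.corr() compares columns, so the result is an item-by-item matrix.
item_similarity = ratings.corr(method="pearson")
print(item_similarity.round(3))
```

In this toy data, Movie A and Movie B are rated alike by the same users and so correlate strongly, while Movie C is rated in the opposite direction.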

# <span style="color:#2D7680"> Table of Contents </span>
 [<span style="color:#2D7680">  1. Examine the Dataset </span>](#Exa)
 
 [<span style="color:#2D7680"> 2. Data Preparation </span>](#Dat)
 
 [<span style="color:#2D7680"> 3. Item Based Selection </span>](#Ite)
 
 [<span style="color:#2D7680">  4. Conclusion </span>](#Con)
 
***


##  <span style="color:#2D7680"> 1. Examine the Dataset </span> <a class="anchor" id="Exa"></a>

The dataset describes ratings and free-text tagging activity from MovieLens, a movie recommendation service. It contains 20,000,263 ratings and 465,564 tag applications across 27,278 movies. These data were created by 138,493 users between January 9, 1995 and March 31, 2015, and the dataset was generated on October 17, 2016.
Users were selected at random for inclusion. All selected users had rated at least 20 movies.

**Short explanation of the columns in each dataset:**

 **movie.csv**

*         movieId – unique movie number
*         title – movie name
*         genres – genres of the movie



 **rating.csv**
    
*         userId – unique user number
*         movieId – unique movie number
*         rating – the rating given to the movie by the user
*         timestamp – the time at which the rating was given
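Because `timestamp` records when a rating was given, it can be used to find a user's most recent rating. The sketch below uses invented rows and assumes the timestamps are stored as date-time strings; if a file stores Unix seconds instead, `pd.to_datetime(..., unit="s")` would be the analogous call.

```python
import pandas as pd

# Invented sample rows with the same column names as rating.csv.
toy_rating = pd.DataFrame(
    {
        "userId": [1, 1, 1],
        "movieId": [10, 20, 30],
        "rating": [5.0, 3.5, 5.0],
        "timestamp": [
            "2005-04-02 23:53:47",
            "2010-01-15 08:00:00",
            "2007-07-07 12:30:00",
        ],
    }
)

# Parse the strings into real datetimes so that sorting is chronological.
toy_rating["timestamp"] = pd.to_datetime(toy_rating["timestamp"])

# The first row after a descending sort is the most recent rating.
latest = toy_rating.sort_values("timestamp", ascending=False).iloc[0]
print(latest["movieId"])
```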

##  <span style="color:#2D7680"> 2. Data Preparation </span> <a class="anchor" id="Dat"></a>

Step 1: Read movie, rating datasets.

Step 2: Merge the two datasets on movieId so that each rating carries the movie's title and genres.

Step 3: Collect the titles of movies with 1000 or fewer ratings in a list and remove those movies from the dataset.

Step 4: Create a pivot table for the dataframe with the userIDs in the index, the movie names in the columns and the ratings as values.
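The four steps above can be sketched on a tiny invented dataset before running them on the full files. The frames `toy_movie` and `toy_rating` and the threshold `MIN_RATINGS` are hypothetical stand-ins; the threshold is lowered to 1 only so that the filter has a visible effect on six rows.

```python
import pandas as pd

# Step 1 stand-in: small invented frames instead of reading the CSV files.
toy_movie = pd.DataFrame(
    {
        "movieId": [1, 2, 3],
        "title": ["Movie A", "Movie B", "Movie C"],
        "genres": ["Drama", "Comedy", "Action"],
    }
)
toy_rating = pd.DataFrame(
    {
        "userId": [1, 1, 2, 2, 3, 3],
        "movieId": [1, 2, 1, 2, 1, 3],
        "rating": [5.0, 4.0, 4.0, 3.5, 2.0, 1.0],
    }
)
MIN_RATINGS = 1  # hypothetical threshold for the toy data (the notebook uses 1000)

# Step 2: attach title and genres to every rating.
toy_df = toy_movie.merge(toy_rating, how="left", on="movieId")

# Step 3: find rarely rated titles and drop them.
counts = toy_df["title"].value_counts()
rare = counts[counts <= MIN_RATINGS].index
toy_common = toy_df[~toy_df["title"].isin(rare)]

# Step 4: users in rows, movie titles in columns, ratings as values.
toy_user_movie = toy_common.pivot_table(index="userId", columns="title", values="rating")
print(toy_user_movie)
```

Movie C has a single rating, so it is filtered out, and user 3 has a missing value for Movie B because that user never rated it.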

In [None]:
# Import libraries

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 500)
pd.set_option('display.expand_frame_repr', False)

In [None]:
# Load datasets

movie = pd.read_csv("/kaggle/input/movielens-20m-dataset/movie.csv")
rating = pd.read_csv("/kaggle/input/movielens-20m-dataset/rating.csv")

In [None]:
# Combine two datasets

df = movie.merge(rating, how="left", on="movieId")
df.head(5)

In [None]:
# Count the distinct movie titles and show the most frequently rated ones

print(df["title"].nunique())
df["title"].value_counts().head()

In [None]:
# Exclude movies with 1000 or fewer ratings from the dataset

# Index the counts Series directly; the column name produced by wrapping
# value_counts() in a DataFrame differs between pandas versions
comment_counts = df["title"].value_counts()
rare_movies = comment_counts[comment_counts <= 1000].index
common_movies = df[~df["title"].isin(rare_movies)]

In [None]:
# Create a pivot table for the dataframe with the userIDs in the index, the movie names in the columns, and the ratings as values

user_movie_df = common_movies.pivot_table(index=["userId"], columns=["title"], values="rating")
user_movie_df.head(5)

##  <span style="color:#2D7680"> 3. Item Based Selection </span> <a class="anchor" id="Ite"></a>

Step 1: Choose a random user ID.

Step 2: From the movies that the selected user rated 5, take the one with the most recent rating.

Step 3: Select that movie's column from the user_movie_df dataframe created in the Data Preparation section.

Step 4: Compute the correlation between the selected movie and all other movies.

Step 5: Label the results with the movie names and select the top 5 movies to recommend.
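Steps 3 to 5 can be wrapped in a small helper. The function name `recommend_similar` and the toy matrix are hypothetical and only illustrate the flow; this is a sketch of the approach, not the notebook's exact code.

```python
import pandas as pd

def recommend_similar(user_movie, movie_name, top_n=5):
    """Return the top_n movies most correlated with movie_name (hypothetical helper)."""
    target = user_movie[movie_name]             # Step 3: the selected movie's column
    corr = user_movie.corrwith(target)          # Step 4: correlation with every movie
    corr = corr.drop(labels=movie_name).dropna()  # exclude the movie itself
    return corr.sort_values(ascending=False).head(top_n)  # Step 5: best matches first

# Invented user-by-movie matrix for illustration.
toy = pd.DataFrame(
    {
        "Movie A": [5.0, 4.0, 1.0, 2.0],
        "Movie B": [4.5, 4.0, 1.5, 2.0],
        "Movie C": [1.0, 2.0, 5.0, 4.0],
        "Movie D": [4.0, 5.0, 2.0, 1.0],
    },
    index=[1, 2, 3, 4],
)

top = recommend_similar(toy, "Movie A", top_n=2)
print(top)
```

Dropping the selected movie by label avoids relying on it being the first row after sorting.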

In [None]:
# Choose a random user ID (fixed seed so the result is reproducible)

random_user = int(pd.Series(user_movie_df.index).sample(1, random_state=45).iloc[0])
random_user

In [None]:
# Movies the random user rated 5, with the most recently rated first

random_user_movie = df[(df["userId"] == random_user) & (df["rating"] == 5)].sort_values(by="timestamp", ascending=False).reset_index()
random_user_movie.head(5)

In [None]:
# Title of the most recently rated 5-point movie

first_movie = random_user_movie["title"].iloc[0]
first_movie

In [None]:
# Finding correlation of selected movie and other movies using filtered dataframe

movie_item = user_movie_df[first_movie]
recommended = user_movie_df.corrwith(movie_item).sort_values(ascending=False)
recommended.head(5)
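One caveat with correlations on a sparse rating matrix: a pair of movies shared by only a handful of users can show a very high correlation by chance. A possible refinement, sketched below on invented data, is to require a minimum number of common raters using the `min_periods` argument of `Series.corr`. The threshold `MIN_OVERLAP` is a hypothetical value, not something the notebook specifies.

```python
import numpy as np
import pandas as pd

# Invented matrix: Movie C shares only two raters with Movie A.
toy = pd.DataFrame(
    {
        "Movie A": [5.0, 4.0, 1.0, 2.0, 3.0],
        "Movie B": [4.0, 5.0, 2.0, 1.0, np.nan],
        "Movie C": [np.nan, np.nan, np.nan, 2.0, 3.0],
    }
)
MIN_OVERLAP = 3  # hypothetical minimum number of users who rated both movies

target = toy["Movie A"]

# Series.corr returns NaN when fewer than min_periods paired ratings exist.
corr = toy.apply(lambda col: col.corr(target, min_periods=MIN_OVERLAP))
print(corr.dropna().sort_values(ascending=False))
```

Here Movie C is left out of the ranking because only two users rated both it and Movie A.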

In [None]:
# Suggest the top 5 movies other than the selected movie itself

recommended = recommended.reset_index()
recommended.columns = ["movie_name", "corr"]

# Drop the selected movie by name rather than by position, then rank by correlation
recommended = recommended[recommended["movie_name"] != first_movie]
recommended = recommended.sort_values(by="corr", ascending=False)

recommended.head(5)

##  <span style="color:#2D7680"> 4. Conclusion  </span> <a class="anchor" id="Con"></a>

Among the movies that the randomly selected user gave the highest score, the most recently rated one is selected.

The correlation between the selected movie and the other movies is calculated.

The five movies with the highest correlation are recommended.

 ##  <span style="color:#2D7680"> Keep in Touch!  </span> 

You can follow my other social media accounts to see more work like this!

* [GitHub](https://github.com/Vedatgul)
* [LinkedIn](http://www.linkedin.com/in/vedat-gül)
* [Medium](http://medium.com/@veribilimi35)