# **Movie Recommendation System with Python**
In this project, we'll develop a basic recommender system with Python and pandas.

Movies are suggested based on their similarity to other movies. This is not a robust recommendation system, but it is a good starting point.
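
To make "similarity to other movies" concrete, here is a minimal sketch on made-up data. It assumes similarity is measured as the correlation between two movies' ratings across the same users, which is one common choice and not necessarily the exact approach this project ends up using. The titles, ratings, and variable names (`toy`, `matrix`, `similar`) are hypothetical.

```python
import pandas as pd

# Hypothetical ratings: each user rates three movies on a 1-5 scale.
titles = ["Star Wars", "Empire Strikes Back", "Titanic"]
rows = {1: [5, 5, 1], 2: [4, 5, 2], 3: [2, 1, 5], 4: [1, 2, 4], 5: [3, 3, 3]}
toy = pd.DataFrame(
    [(user, title, rating)
     for user, user_ratings in rows.items()
     for title, rating in zip(titles, user_ratings)],
    columns=["user_id", "title", "rating"],
)

# One row per user, one column per movie.
matrix = toy.pivot_table(index="user_id", columns="title", values="rating")

# Correlate every movie's ratings with those of the target movie,
# then drop the target itself and rank the rest.
target = matrix["Star Wars"]
similar = matrix.corrwith(target).drop("Star Wars").sort_values(ascending=False)
print(similar)
```

In this toy example, users who like one film tend to like its sequel and dislike the third title, so the sequel ranks first and the third title gets a negative score.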

In [None]:
import numpy as np
import pandas as pd

# **Data**
We have two datasets:

*  A dataset of movie ratings.
*  A dataset of all movie titles and their IDs.

In [None]:
# Read the ratings dataset (tab-separated, so column names are supplied explicitly).
column_names = ['user_id', 'item_id', 'rating', 'timestamp']
df = pd.read_csv('data/u.data', sep='\t', names=column_names)

In [None]:
df.head()


Next, read the movie titles:

In [None]:
movie_titles = pd.read_csv("data/Movie_Id_Titles")
movie_titles.head()

We can merge them together:

In [None]:
df = pd.merge(df, movie_titles, on='item_id')
df.head()
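
By default, `pd.merge` performs an inner join, so any rating whose `item_id` is missing from the titles table would be dropped without warning. The sketch below uses small made-up frames (`toy_ratings` and `toy_titles` are hypothetical) to show how a left join keeps those rows and makes the gap visible.

```python
import pandas as pd

# Hypothetical data: item 3 has a rating but no title entry.
toy_ratings = pd.DataFrame({
    "user_id": [10, 11, 12, 13],
    "item_id": [1, 2, 2, 3],
    "rating": [5, 3, 4, 2],
})
toy_titles = pd.DataFrame({
    "item_id": [1, 2],
    "title": ["Movie A", "Movie B"],
})

# Default inner join: the rating for item 3 disappears.
inner = pd.merge(toy_ratings, toy_titles, on="item_id")

# Left join keeps every rating; validate checks that each item_id
# appears at most once in the titles table.
left = pd.merge(toy_ratings, toy_titles, on="item_id",
                how="left", validate="many_to_one")

print(len(inner), len(left))
print(left["title"].isna().sum(), "rating(s) without a title")
```

Comparing the row count before and after the merge is a quick way to confirm that nothing was lost.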

# **Exploratory Analysis**
Let's explore the data a bit and look at some of the best-rated movies.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('white')
%matplotlib inline


Let's look at the titles with the highest average rating and the most ratings, then build a `ratings` DataFrame that holds both values:

In [None]:
df.groupby('title')['rating'].mean().sort_values(ascending=False).head()

In [None]:
df.groupby('title')['rating'].count().sort_values(ascending=False).head()

In [None]:
ratings = pd.DataFrame(df.groupby('title')['rating'].mean())
ratings.head()


Adding a column with the number of ratings per title:

In [None]:
ratings['num of ratings'] = df.groupby('title')['rating'].count()
ratings.head()
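
As an alternative to the two steps above, both columns can be computed in a single pass with `agg`. This is a sketch on made-up data; `toy` and `summary` are hypothetical names, and the column names are chosen to match the ones used in this notebook.

```python
import pandas as pd

# Hypothetical merged ratings table.
toy = pd.DataFrame({
    "title": ["Movie A", "Movie A", "Movie B"],
    "rating": [5, 3, 4],
})

# Compute the mean and the count together, then rename the columns.
summary = (
    toy.groupby("title")["rating"]
    .agg(["mean", "count"])
    .rename(columns={"mean": "rating", "count": "num of ratings"})
)
print(summary)
```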

Visualizing the number of ratings:

In [None]:
plt.figure(figsize=(10,4))
ratings['num of ratings'].hist(bins=40)

In [None]:
plt.figure(figsize=(10,4))
ratings['rating'].hist(bins=40)


It makes intuitive sense for most movies' average ratings to cluster around the 3.0 mark.
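
An average built from only one or two ratings can sit at an extreme such as 1.0 or 5.0, so a list sorted by mean rating alone may be led by rarely rated titles. The sketch below shows one way to guard against that by requiring a minimum number of ratings. The data, the `summary` frame, and the `min_ratings` threshold of 100 are all hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical per-title summary in the same shape as `ratings`.
summary = pd.DataFrame(
    {
        "rating": [5.0, 4.4, 3.1, 1.0],
        "num of ratings": [1, 250, 120, 2],
    },
    index=pd.Index(
        ["Obscure Film", "Popular Hit", "Average Movie", "Rare Flop"],
        name="title",
    ),
)

# Keep only titles with enough ratings for the mean to be meaningful.
min_ratings = 100  # assumed threshold; tune for the real dataset
reliable = (
    summary[summary["num of ratings"] >= min_ratings]
    .sort_values("rating", ascending=False)
)
print(reliable)
```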