**Task 04: Build a Movie Recommendation System**

*Step 1: Data*

*First, we load the movie data and import pandas for data processing. For this example, we'll use the MovieLens dataset, a popular dataset for building recommendation systems. It comes as two files: movies.csv with the movie titles and ratings.csv with the individual user ratings.*

In [1]:
import pandas as pd

# Load the MovieLens dataset (adjust these paths to wherever movies.csv and ratings.csv are stored locally)
movies_df = pd.read_csv('C:\\Users\\shame\\OneDrive\\Desktop\\TXON TASK\\TXON_04\\movies.csv')
ratings_df = pd.read_csv('C:\\Users\\shame\\OneDrive\\Desktop\\TXON TASK\\TXON_04\\ratings.csv')
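
*The later steps rely on a few columns from these files. Below is a minimal sketch of a sanity check, assuming the standard MovieLens layout (`movieId`, `title`, `genres` in movies.csv and `userId`, `movieId`, `rating` in ratings.csv). The helper `check_columns` and the tiny stand-in frames are hypothetical and only there so the example runs on its own.*

```python
import pandas as pd

# Columns the rest of the notebook is assumed to need (standard MovieLens layout)
MOVIES_COLUMNS = {'movieId', 'title', 'genres'}
RATINGS_COLUMNS = {'userId', 'movieId', 'rating'}

def check_columns(df, required):
    """Return the sorted list of required columns that are missing from df."""
    return sorted(required - set(df.columns))

# Tiny stand-in frames; in the notebook, pass movies_df and ratings_df instead
sample_movies = pd.DataFrame({
    'movieId': [1],
    'title': ['Toy Story (1995)'],
    'genres': ['Adventure|Animation|Children|Comedy|Fantasy'],
})
sample_ratings = pd.DataFrame({'userId': [1], 'movieId': [1], 'rating': [4.0]})

missing_in_movies = check_columns(sample_movies, MOVIES_COLUMNS)
# Drop a column on purpose to show what a failed check looks like
missing_in_ratings = check_columns(sample_ratings.drop(columns='rating'), RATINGS_COLUMNS)
print(missing_in_movies, missing_in_ratings)
```

*An empty list means the frame has everything the later cells expect. Any name in the list points to a column that must be renamed or reloaded before continuing.*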

*Step 2: Analysis*

*We will create generic recommendations of top-rated movies from the existing dataset. This step provides initial, non-personalized recommendations based on average ratings. We only consider movies with a minimum number of ratings, so that a movie rated by a handful of users cannot reach the top on the strength of a few high scores.*

In [2]:
# Calculate average rating and number of ratings for each movie
movie_ratings = ratings_df.groupby('movieId')['rating'].agg(['mean', 'count']).reset_index()

# Set a minimum number of ratings required to be considered for recommendations
min_ratings_count = 100

# Filter movies with minimum ratings count
popular_movies = movie_ratings[movie_ratings['count'] >= min_ratings_count]

# Sort movies based on average rating in descending order
top_rated_movies = popular_movies.sort_values(by='mean', ascending=False)

# Get generic movie recommendations (top 10 movies)
generic_recommendations = top_rated_movies.merge(movies_df, on='movieId').head(10)
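
*To see why the minimum count matters, here is a small sketch on made-up data. The toy frame and the threshold of 3 (scaled down from the 100 used above) are assumptions chosen for illustration only.*

```python
import pandas as pd

# Hypothetical toy ratings: movie 1 has three ratings, movie 2 has a single perfect score
toy_ratings = pd.DataFrame({
    'movieId': [1, 1, 1, 2],
    'rating': [4.0, 4.5, 5.0, 5.0],
})

stats = toy_ratings.groupby('movieId')['rating'].agg(['mean', 'count']).reset_index()

# Without a threshold, the single 5.0 puts movie 2 on top
unfiltered_top = stats.sort_values(by='mean', ascending=False)['movieId'].iloc[0]

# With a threshold of 3 ratings, movie 2 is excluded and movie 1 leads
toy_min_count = 3
filtered = stats[stats['count'] >= toy_min_count]
filtered_top = filtered.sort_values(by='mean', ascending=False)['movieId'].iloc[0]

print(unfiltered_top, filtered_top)
```

*The right threshold depends on the size of the dataset. A value that is too high can also hide well-liked niche titles, so it is worth trying a few settings.*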

*Step 3: Personalization*

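*To personalize the recommendations, we need the user's ratings for some films they have watched. Let's assume the user provides these as a dictionary, where the keys are movie titles and the values are ratings on the same scale as the dataset (0.5 to 5 in half-star steps in recent MovieLens releases). The titles must match the dataset's title strings exactly, including the release year, for example 'Fight Club (1999)'. The placeholder titles below do not match any movie.*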

In [3]:
# User-provided movie ratings (example)
user_ratings = {
    'Movie A': 4.5,
    'Movie B': 3.0,
    'Movie C': 5.0,
    # Add more movie ratings as per user's preferences
}
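
*Because the lookup in the next step is by exact title, it helps to check which user-supplied titles actually exist in the dataset. This is a minimal sketch with stand-in data. The frame `sample_movies` and the dictionary `sample_user_ratings` are hypothetical.*

```python
import pandas as pd

# Stand-in for movies_df, using the MovieLens "Title (Year)" convention
sample_movies = pd.DataFrame({
    'movieId': [1, 2],
    'title': ['Toy Story (1995)', 'Jumanji (1995)'],
})

# Hypothetical user input: the second key has no release year and will not match
sample_user_ratings = {'Toy Story (1995)': 5.0, 'Jumanji': 3.0}

user_df = pd.DataFrame(list(sample_user_ratings.items()), columns=['title', 'rating'])

# A left merge keeps every user rating; movieId is NaN where the title was not found
matched = user_df.merge(sample_movies, on='title', how='left')
unmatched_titles = matched.loc[matched['movieId'].isna(), 'title'].tolist()
print(unmatched_titles)
```

*Any title reported here is silently ignored by an exact-match filter, so it should be corrected before moving on.*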

*Step 4: Strategy*

*Implement a content-based or collaborative filtering strategy. For this example, let's use a content-based filtering approach, where we recommend movies similar to the ones the user has rated highly.*

In [4]:
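# Recommend unseen movies that share genres with the ones the user rated highly (content-based filtering)
# min_rating and top_n are illustrative defaults, not tuned values
def get_content_based_recommendations(user_ratings, movies_df, min_rating=4.0, top_n=10):
    # Genres of the movies the user rated at or above min_rating ('genres' is pipe-separated in MovieLens)
    liked_titles = [title for title, rating in user_ratings.items() if rating >= min_rating]
    liked_genres = set(movies_df[movies_df['title'].isin(liked_titles)]['genres'].str.split('|').explode())

    # Score every movie the user has not rated by the number of genres it shares with the liked set
    candidates = movies_df[~movies_df['title'].isin(list(user_ratings))].copy()
    candidates['genre_overlap'] = candidates['genres'].str.split('|').apply(lambda g: len(liked_genres.intersection(g)))
    candidates = candidates[candidates['genre_overlap'] > 0]
    return candidates.sort_values('genre_overlap', ascending=False).head(top_n).drop(columns='genre_overlap')

# Get content-based movie recommendations for the user
# The result is stored under a different name than the function, so this cell can be re-run safely
content_based_recommendations = get_content_based_recommendations(user_ratings, movies_df)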
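
*One way to make "similar" concrete is to compare genre lists. The sketch below ranks movies by Jaccard similarity of their genres to a movie the user liked. It assumes the pipe-separated `genres` column of movies.csv. The helper `genre_jaccard` and the three stand-in rows are hypothetical, and this is only one of several possible similarity measures.*

```python
import pandas as pd

# Stand-in for movies_df; MovieLens stores genres as a pipe-separated string
sample_movies = pd.DataFrame({
    'movieId': [1, 2, 6],
    'title': ['Toy Story (1995)', 'Jumanji (1995)', 'Heat (1995)'],
    'genres': [
        'Adventure|Animation|Children|Comedy|Fantasy',
        'Adventure|Children|Fantasy',
        'Action|Crime|Thriller',
    ],
})

def genre_jaccard(genres_a, genres_b):
    """Jaccard similarity between two pipe-separated genre strings."""
    set_a, set_b = set(genres_a.split('|')), set(genres_b.split('|'))
    return len(set_a & set_b) / len(set_a | set_b)

# Compare every other movie against one the user liked
liked_title = 'Toy Story (1995)'
liked_genres = sample_movies.loc[sample_movies['title'] == liked_title, 'genres'].iloc[0]
others = sample_movies[sample_movies['title'] != liked_title].copy()
others['similarity'] = others['genres'].apply(lambda g: genre_jaccard(liked_genres, g))

# Highest similarity first
ranked = others.sort_values('similarity', ascending=False)
print(ranked[['title', 'similarity']])
```

*Jaccard similarity divides the number of shared genres by the number of distinct genres across both movies, so a movie with many genres does not score highly just because it has many.*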

*Step 5: Combination*

*Finally, combine the generic recommendations from Step 2 with the personalized content-based recommendations from Step 4 to create a final list of movie recommendations for the user.*

In [5]:
# Combine generic and personalized recommendations
final_recommendations = pd.concat([generic_recommendations, content_based_recommendations])
final_recommendations = final_recommendations.drop_duplicates(subset='title', keep='first')

# Print the final movie recommendations
print(final_recommendations[['title', 'mean']])

                              title      mean
0  Shawshank Redemption, The (1994)  4.429022
1             Godfather, The (1972)  4.289062
2                 Fight Club (1999)  4.272936
3    Godfather: Part II, The (1974)  4.259690
4              Departed, The (2006)  4.252336
5                 Goodfellas (1990)  4.250000
6                 Casablanca (1942)  4.240000
7           Dark Knight, The (2008)  4.238255
8        Usual Suspects, The (1995)  4.237745
9        Princess Bride, The (1987)  4.232394
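
*When the two lists are concatenated, only the generic rows carry a `mean` column, so personalized rows show NaN there. The sketch below uses stand-in frames to show this and adds a `source` column to record where each row came from. The `source` column and the placeholder titles are assumptions for illustration and are not part of the notebook above.*

```python
import pandas as pd

# Stand-ins for the two recommendation frames; only the generic one has a 'mean' column
generic = pd.DataFrame({'movieId': [1, 2], 'title': ['Movie X', 'Movie Y'], 'mean': [4.4, 4.3]})
personalized = pd.DataFrame({'movieId': [2, 3], 'title': ['Movie Y', 'Movie Z']})

# Tag each row with its origin before concatenating
combined = pd.concat(
    [generic.assign(source='generic'), personalized.assign(source='personalized')],
    ignore_index=True,
)

# keep='first' retains the generic copy of 'Movie Y', including its average rating
combined = combined.drop_duplicates(subset='title', keep='first')
print(combined[['title', 'mean', 'source']])
```

*Because the generic frame comes first in the list, `keep='first'` gives it priority for any title that appears in both.*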
