# Introduction

Welcome to our Movie Ratings DataFrame! This notebook works with a small table of ratings collected from five movie enthusiasts. Each row represents a unique viewer, while each column corresponds to a specific movie title. Missing ratings are stored as `NaN`.

We first compute the average rating per viewer and per movie, then rescale the ratings with min-max normalization and compare the averages before and after.

So grab your popcorn and settle in as we dig into the ratings together!

In [65]:
import pandas as pd
import numpy as np

# Data 

In [66]:
data = {
    'Spaceman': [4, 3, np.nan, 5, 4],  # np.nan marks a missing rating
    'Codes': [5, 4, 3, 4, 4],
    'Codes Part II': [3, 3, 4, 4, 3],
    'Noah': [2, np.nan, 3, 3, 2],
    'Mea Culpa': [np.nan, 4, 5, 4, 5],
    'Fear': [4, 3, 2, 2, 3]
}

# Index labels for our DataFrame

In [67]:
index_labels = ['Patrick', 'Sharon', 'Elicia', 'Tony', 'Mae']

# Create DataFrame

In [68]:
ratings_df = pd.DataFrame(data, index=index_labels)

# Display the DataFrame

In [69]:
print("Original Ratings Data:")
print(ratings_df)
print()

Original Ratings Data:
         Spaceman  Codes  Codes Part II  Noah  Mea Culpa  Fear
Patrick       4.0      5              3   2.0        NaN     4
Sharon        3.0      4              3   NaN        4.0     3
Elicia        NaN      3              4   3.0        5.0     2
Tony          5.0      4              4   3.0        4.0     2
Mae           4.0      4              3   2.0        5.0     3
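
The columns with a missing rating print as `4.0`, `3.0`, and so on because `NaN` is a floating-point value, which makes pandas store the whole column as `float64`. The sketch below rebuilds two of the columns so it runs on its own; the name `spaceman_int` is illustrative and not part of the original notebook, and the conversion to the nullable `Int64` dtype is an optional alternative, not something the notebook does.

```python
import numpy as np
import pandas as pd

ratings_df = pd.DataFrame(
    {
        "Spaceman": [4, 3, np.nan, 5, 4],
        "Codes": [5, 4, 3, 4, 4],
    },
    index=["Patrick", "Sharon", "Elicia", "Tony", "Mae"],
)

# A column containing NaN is stored as float; a complete column stays integer
print(ratings_df.dtypes)

# Optional: the nullable integer dtype keeps whole numbers and shows <NA>
spaceman_int = ratings_df["Spaceman"].astype("Int64")
print(spaceman_int)
```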



# Calculate average ratings for each user and each movie

In [70]:
avg_ratings_by_user = ratings_df.mean(axis=1)  # across each row; NaN is skipped
avg_ratings_by_movie = ratings_df.mean()  # down each column (default axis=0)

# Display average ratings

In [71]:
print("Average Ratings by User:")
print(avg_ratings_by_user)
print("\nAverage Ratings by Movie:")
print(avg_ratings_by_movie)
print()

Average Ratings by User:
Patrick    3.600000
Sharon     3.400000
Elicia     3.400000
Tony       3.666667
Mae        3.500000
dtype: float64

Average Ratings by Movie:
Spaceman         4.0
Codes            4.0
Codes Part II    3.4
Noah             2.5
Mea Culpa        4.5
Fear             2.8
dtype: float64
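
These averages depend on pandas skipping missing values by default (`skipna=True`), so each mean is taken only over the ratings that exist. As a minimal sketch, the block below rebuilds the same DataFrame so it runs on its own and shows how many ratings stand behind each average. The names `ratings_per_user`, `ratings_per_movie`, and `strict_avg_by_user` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

data = {
    "Spaceman": [4, 3, np.nan, 5, 4],
    "Codes": [5, 4, 3, 4, 4],
    "Codes Part II": [3, 3, 4, 4, 3],
    "Noah": [2, np.nan, 3, 3, 2],
    "Mea Culpa": [np.nan, 4, 5, 4, 5],
    "Fear": [4, 3, 2, 2, 3],
}
ratings_df = pd.DataFrame(
    data, index=["Patrick", "Sharon", "Elicia", "Tony", "Mae"]
)

# count() ignores NaN, so it gives the number of ratings behind each mean
ratings_per_user = ratings_df.count(axis=1)
ratings_per_movie = ratings_df.count()

# With skipna=False, a user with any missing rating gets NaN as the average
strict_avg_by_user = ratings_df.mean(axis=1, skipna=False)

print(ratings_per_user)
print(ratings_per_movie)
print(strict_avg_by_user)
```

Patrick, Sharon, and Elicia are each averaged over five ratings, while Tony and Mae are averaged over six, which is worth keeping in mind when comparing users.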



# Normalize ratings

In [72]:
# Min-max scaling per movie: min() and max() are computed column by column
normalized_ratings_df = (ratings_df - ratings_df.min()) / (ratings_df.max() - ratings_df.min())

# Display normalized ratings

In [73]:
print("Normalized Ratings:")
print(normalized_ratings_df)
print()

Normalized Ratings:
         Spaceman  Codes  Codes Part II  Noah  Mea Culpa  Fear
Patrick       0.5    1.0            0.0   0.0        NaN   1.0
Sharon        0.0    0.5            0.0   NaN        0.0   0.5
Elicia        NaN    0.0            1.0   1.0        1.0   0.0
Tony          1.0    0.5            1.0   1.0        0.0   0.0
Mae           0.5    0.5            0.0   0.0        1.0   0.5
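
The scaling used here maps each rating x in a column to (x - min) / (max - min), so the lowest rating for a movie becomes 0 and the highest becomes 1. If every viewer gave a movie the same rating, the denominator would be zero. The sketch below wraps the same formula in a helper that makes that case explicit. The function `min_max_normalize` and the `Flat` column are hypothetical, added only for illustration.

```python
import pandas as pd


def min_max_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Column-wise min-max scaling to [0, 1]; constant columns become NaN."""
    col_min = df.min()
    col_range = df.max() - col_min
    # A column where every rating is equal has a range of 0. Masking it to
    # NaN makes the undefined result explicit instead of relying on 0 / 0.
    col_range = col_range.where(col_range != 0)
    return (df - col_min) / col_range


demo = pd.DataFrame(
    {
        "Codes": [5, 4, 3, 4, 4],
        "Flat": [3, 3, 3, 3, 3],  # hypothetical movie rated 3 by everyone
    },
    index=["Patrick", "Sharon", "Elicia", "Tony", "Mae"],
)

normalized_demo = min_max_normalize(demo)
print(normalized_demo)
```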



# Calculate average normalized ratings for each user and each movie

In [74]:
avg_normalized_ratings_by_user = normalized_ratings_df.mean(axis=1)
avg_normalized_ratings_by_movie = normalized_ratings_df.mean()

# Display average normalized ratings

In [75]:
print("Average Normalized Ratings by User:")
print(avg_normalized_ratings_by_user)
print("\nAverage Normalized Ratings by Movie:")
print(avg_normalized_ratings_by_movie)
print()


Average Normalized Ratings by User:
Patrick    0.500000
Sharon     0.200000
Elicia     0.600000
Tony       0.583333
Mae        0.416667
dtype: float64

Average Normalized Ratings by Movie:
Spaceman         0.5
Codes            0.5
Codes Part II    0.4
Noah             0.5
Mea Culpa        0.5
Fear             0.4
dtype: float64



# Conclusion

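Advantages of using normalized ratings:
- Puts ratings on a common 0-to-1 scale, which allows for fairer comparison when rating ranges differ.
- Can reduce the bias introduced by users who tend to rate everything high or low, provided the scaling is done per user (row-wise). The normalization above is per movie (column-wise), so it does not remove user bias.

Disadvantages of using normalized ratings:
- Loses the absolute rating level present in the original ratings. For example, Noah (raw average 2.5) and Mea Culpa (raw average 4.5) both end up with an average normalized rating of 0.5.
- Assumes that the ratings are on the same underlying scale, which might not always be true. The result also depends on the observed minimum and maximum, so a single extreme rating shifts every normalized value in that column.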
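
Because `ratings_df.min()` and `ratings_df.max()` work column by column, the normalization in this notebook rescales each movie, not each user. A per-user variant, which is what adjusting for generous or harsh raters requires, could look like the sketch below. It rebuilds the same DataFrame so it runs on its own; `centered_df` and `user_scaled_df` are illustrative names that do not appear in the original notebook, and mean-centering is shown as one common alternative to min-max scaling.

```python
import numpy as np
import pandas as pd

data = {
    "Spaceman": [4, 3, np.nan, 5, 4],
    "Codes": [5, 4, 3, 4, 4],
    "Codes Part II": [3, 3, 4, 4, 3],
    "Noah": [2, np.nan, 3, 3, 2],
    "Mea Culpa": [np.nan, 4, 5, 4, 5],
    "Fear": [4, 3, 2, 2, 3],
}
ratings_df = pd.DataFrame(
    data, index=["Patrick", "Sharon", "Elicia", "Tony", "Mae"]
)

# Mean-centering: subtract each user's own average from that user's ratings
user_means = ratings_df.mean(axis=1)
centered_df = ratings_df.sub(user_means, axis=0)

# Row-wise min-max: rescale each user's ratings to that user's own range
user_min = ratings_df.min(axis=1)
user_range = ratings_df.max(axis=1) - user_min
user_scaled_df = ratings_df.sub(user_min, axis=0).div(user_range, axis=0)

print(centered_df.round(2))
print(user_scaled_df.round(2))
```

After mean-centering, a positive value means the user liked the movie more than their own average, regardless of how high or low that average is.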