## Massey Rating Method for Movies within a Genre

This notebook applies the Massey rating method to the movies of a single genre. The data are assumed to be in the MovieLens dataset format. See:

https://grouplens.org/datasets/movielens/

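The userMovie matrix has already been created with makeUserMovieMatrix.ipynb. Each row is a user, each column is a movie in the genre, and an entry of 0 means the user did not rate that movie.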

### Load the user-movie matrix and the genre's movie list

In [1]:
import pandas as pd
import numpy as np

userMovie = np.load('userMovieMatrixAction.npy')

numberUsers, numberGenreMovies = userMovie.shape

genreFilename = 'action.csv'
genre = pd.read_csv(genreFilename)

### Create Massey ratings

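Movies compete within a single user's ratings. Every pair of movies rated by the same user (every pair of nonzero entries in one row of userMovie) counts as one game, and the difference between the two ratings is the point differential.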
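
As a small illustration of how one row turns into games, the sketch below enumerates the pairings for a single made-up user. The `ratings` values are hypothetical toy data, not taken from the MovieLens files.

```python
import numpy as np
from itertools import combinations

# Hypothetical toy row: one user, four movies; 0 means "not rated".
ratings = np.array([5.0, 0.0, 3.0, 4.0])

# Indices of the movies this user actually rated.
rated = np.flatnonzero(ratings)

# One game per pair of rated movies; the margin is the rating difference.
games = [(j, k, ratings[j] - ratings[k]) for j, k in combinations(rated, 2)]

for j, k, margin in games:
    print(f"movie {j} vs movie {k}: margin {margin:+.1f}")
```

A user who rated n movies contributes n(n-1)/2 games, so heavy raters influence the system much more than light raters.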

In [4]:
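# Start from the identity matrix and a vector of ones rather than from zeros,
# so a movie with no games still has a nonzero diagonal entry.
masseyMatrix = np.diag(np.ones(numberGenreMovies))
b = np.ones(numberGenreMovies)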

for i in range(numberUsers):
    for j in range(numberGenreMovies):
        if (userMovie[i,j] != 0): # then there are games
            for k in range(j+1,numberGenreMovies):
                team1ID = j
                team1Score = userMovie[i,j]
                if (userMovie[i,k] != 0): # then there is a game between movie j and k
                    team2ID = k
                    team2Score = userMovie[i,k]

                    masseyMatrix[team1ID, team2ID] -= 1
                    masseyMatrix[team2ID, team1ID] -= 1

                    masseyMatrix[team1ID, team1ID] += 1
                    masseyMatrix[team2ID, team2ID] += 1

                    # Signed margin: positive when movie j was rated above movie k.
                    # A tie adds 0 to both entries.
                    pointDifferential = team1Score - team2Score
                    b[team1ID] += pointDifferential
                    b[team2ID] -= pointDifferential
        
# Replace the last row with ones and set the last RHS entry to 0,
# which constrains the ratings to sum to zero
masseyMatrix[-1,:] = np.ones((1,numberGenreMovies))
b[-1] = 0
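
The triple loop above visits every pair of movies for every user, which can be slow on a large genre. The sketch below is one possible vectorized equivalent, assuming the same conventions as the cell above (0 means unrated, the system starts from the identity and a vector of ones, and the last row is replaced by the sum-to-zero constraint). The function name `massey_system` and the toy matrix are illustrative and not part of the original notebook.

```python
import numpy as np

def massey_system(user_movie):
    """Build the same linear system as the loops above, without explicit loops."""
    R = np.asarray(user_movie, dtype=float)
    B = (R != 0).astype(float)          # 1 where a user rated a movie
    n_movies = R.shape[1]

    co_rated = B.T @ B                  # co_rated[j, k] = users who rated both j and k
    np.fill_diagonal(co_rated, 0)       # a movie does not play itself
    M = np.eye(n_movies) + np.diag(co_rated.sum(axis=1)) - co_rated

    per_user_count = B.sum(axis=1)      # number of movies each user rated
    per_user_total = R.sum(axis=1)      # sum of each user's ratings
    # Net point differential of each movie over all of its games, plus the initial 1.
    rhs = 1.0 + R.T @ per_user_count - B.T @ per_user_total

    M[-1, :] = 1.0                      # same constraint row as in the loop version
    rhs[-1] = 0.0
    return M, rhs

# Hypothetical toy data: 2 users, 3 movies.
toy = np.array([[5, 0, 3],
                [4, 2, 0]])
M, rhs = massey_system(toy)
ratings = np.linalg.solve(M, rhs)
```

The right-hand side uses the identity that, for a user who rated n movies with total s, a movie rated x collects a net margin of n*x - s from that user's games.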

### Solve for the ratings, then sort and print the movie ranking

In [None]:
r = np.linalg.solve(masseyMatrix,b)
iSort = np.argsort(-r)

# Column sums of userMovie; a sum of 0 means the movie has no ratings and is skipped below
totalMovieRatings = np.sum(userMovie,0)

print('\n\n************** Massey Rating Method **************\n')
print('===========================')
print('Rank       Rating  Movie   ')
print('===========================')
rank = 1
for i in range(numberGenreMovies):
    if (totalMovieRatings[iSort[i]] != 0):  # if the movie has at least 1 rating
        print(f'{rank:4d} {r[iSort[i]]:10.3f}  {genre.at[iSort[i],"title"]}')        
        rank += 1
        
print('')   # blank line after the table

print('Total films with ratings: %d' % len(np.where(totalMovieRatings!=0)[0]))
print('Total films in genre: %d' % numberGenreMovies)
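
If the ranking is needed as data rather than printed text, the filtering and sorting above could be wrapped in a small helper. The following is a minimal sketch; `rank_rated_movies` and the toy inputs are hypothetical names introduced for illustration. It tests for nonzero entries instead of column sums, which gives the same result when ratings are nonnegative.

```python
import numpy as np

def rank_rated_movies(r, user_movie, titles):
    """Return (rank, rating, title) tuples for movies with at least one rating."""
    r = np.asarray(r, dtype=float)
    has_rating = np.count_nonzero(user_movie, axis=0) > 0
    order = np.argsort(-r, kind="stable")      # best rating first
    order = order[has_rating[order]]           # drop movies nobody rated
    return [(rank, float(r[idx]), titles[idx])
            for rank, idx in enumerate(order, start=1)]

# Hypothetical toy inputs: movie "B" has no ratings and is left out.
toy_ranking = rank_rated_movies(
    [0.5, 2.0, -1.0],
    np.array([[5, 0, 3], [4, 0, 0]]),
    ["A", "B", "C"],
)
```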