netflix-prize

This Python 2.7 software analyzes data from the Netflix Prize, a challenge to predict customer movie ratings with the lowest possible root mean square error (RMSE). For this project, my software obtained an RMSE of approximately 0.9 stars on the data provided by Netflix.
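
As a reminder of what the metric measures, RMSE is the square root of the mean of the squared differences between predicted and actual ratings. The function below is a minimal sketch of that calculation. The name `rmse` and its signature are illustrative and not taken from Netflix.py. It uses only syntax common to Python 2.7 and Python 3.

```python
import math


def rmse(predicted, actual):
    """Return the root mean square error between two sequences of ratings.

    Hypothetical helper for illustration; not the project's own function.
    """
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    if len(predicted) == 0:
        raise ValueError("sequences must be non-empty")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / float(len(squared_errors)))
```

An RMSE of about 0.9 therefore means that predictions are off by a little under one star in the root-mean-square sense.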

The movie rating files contain over 100 million ratings from 480,000 randomly chosen, anonymous Netflix customers across more than 17,000 movie titles. The data were collected between October 1998 and December 2005 and reflect the distribution of all ratings received during this period. Ratings are whole numbers of stars from 1 to 5. To protect customer privacy, each customer ID has been replaced with a randomly assigned ID. The date of each rating and the title and year of release for each movie ID are also provided. The original data can be found here. For the sake of performance and simplicity, I built caches for averages.
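
To illustrate how caches of averages can drive a prediction, here is a minimal sketch. It assumes the ratings are available as `(movie_id, customer_id, stars)` tuples and that a prediction combines the overall mean with a per-movie and a per-customer offset. Both the input layout and the prediction rule are assumptions for illustration, as are the names `build_average_caches` and `predict`. The README does not say which averages are cached or how Netflix.py uses them.

```python
from collections import defaultdict


def build_average_caches(ratings):
    """Build average-rating lookups from (movie_id, customer_id, stars) tuples.

    Returns (overall_average, movie_averages, customer_averages).
    Hypothetical helper; the real cache layout may differ.
    """
    movie_totals = defaultdict(lambda: [0.0, 0])     # movie_id -> [sum, count]
    customer_totals = defaultdict(lambda: [0.0, 0])  # customer_id -> [sum, count]
    total, count = 0.0, 0
    for movie_id, customer_id, stars in ratings:
        movie_totals[movie_id][0] += stars
        movie_totals[movie_id][1] += 1
        customer_totals[customer_id][0] += stars
        customer_totals[customer_id][1] += 1
        total += stars
        count += 1
    if count == 0:
        raise ValueError("no ratings supplied")
    movie_avg = {m: s / n for m, (s, n) in movie_totals.items()}
    customer_avg = {c: s / n for c, (s, n) in customer_totals.items()}
    return total / count, movie_avg, customer_avg


def predict(movie_id, customer_id, overall, movie_avg, customer_avg):
    """Predict a rating as overall mean plus movie and customer offsets.

    Assumed rule for illustration. Unknown IDs contribute a zero offset.
    """
    movie_offset = movie_avg.get(movie_id, overall) - overall
    customer_offset = customer_avg.get(customer_id, overall) - overall
    estimate = overall + movie_offset + customer_offset
    # Keep the estimate inside the valid 1 to 5 star range.
    return min(5.0, max(1.0, estimate))
```

Precomputing the averages once means each prediction needs only two dictionary lookups instead of a pass over the rating files.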

To run the code, use the command `python RunNetflix.py < RunNetflix.in > RunNetflix.out`. To test the code, use the command `python TestNetflix.py > TestNetflix.out`.
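
The run command redirects a file to standard input and captures standard output, so the driver only needs to read from one stream and write to another. The sketch below shows that pattern. The format of RunNetflix.in is not described here, so the sketch assumes the layout of the Netflix Prize probe file: a line ending in a colon names a movie ID, and the lines that follow list customer IDs. The function `run` and its `predict` callback are hypothetical names, and the real program's output may include more than the predictions.

```python
def run(reader, writer, predict):
    """Read movie and customer IDs from reader, write predictions to writer.

    Assumed input layout: "<movie_id>:" on its own line, followed by one
    customer ID per line. predict(movie_id, customer_id) returns a rating.
    """
    movie_id = None
    for line in reader:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith(":"):
            # A new movie section starts; echo the header to the output.
            movie_id = int(line[:-1])
            writer.write(line + "\n")
        else:
            if movie_id is None:
                raise ValueError("customer ID found before any movie ID")
            rating = predict(movie_id, int(line))
            writer.write("%.1f\n" % rating)
```

Called as `run(sys.stdin, sys.stdout, some_predictor)`, a function like this works unchanged with the shell redirection above, and passing in-memory streams instead makes it straightforward to unit test.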

scottenriquez/netflix-prize
