A movie recommendation system built using Apache Spark’s ML library
Download the Apache Spark 2.4.6 distribution pre-built for Apache Hadoop 2.7.
- unpack the archive
- set the `$SPARK_HOME` environment variable: `export SPARK_HOME=$(pwd)`
- navigate to `PyCharm → Preferences ... → Project spark-demo → Project Structure → Add Content Root` in the main menu
- select all `.zip` files from `$SPARK_HOME/python/lib`
- click apply and save changes
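
Outside PyCharm, the effect of the content-root step above can be approximated by putting the same paths on `sys.path` before importing `pyspark`. The following is a minimal sketch, not part of the project: the helper names `spark_python_paths` and `add_spark_to_sys_path` are hypothetical, and it assumes `SPARK_HOME` points at the unpacked Spark distribution.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return $SPARK_HOME/python followed by every .zip in $SPARK_HOME/python/lib."""
    python_dir = os.path.join(spark_home, "python")
    zips = sorted(glob.glob(os.path.join(python_dir, "lib", "*.zip")))
    return [python_dir] + zips


def add_spark_to_sys_path(spark_home=None):
    """Prepend the Spark Python paths to sys.path, skipping entries already present.

    Hypothetical helper: falls back to the SPARK_HOME environment variable
    when no directory is passed in.
    """
    spark_home = spark_home or os.environ["SPARK_HOME"]
    paths = spark_python_paths(spark_home)
    # Insert in reverse so the final order on sys.path matches `paths`.
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)
    return paths
```

Calling `add_spark_to_sys_path()` at the top of a script, before `import pyspark`, should make the bundled PySpark and Py4J archives importable without any IDE configuration.
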
- navigate to `Run → Edit Configurations → + → Python` in the main menu
- select `movie_recommendation.py` for `Script`
- name it `movie_recommendation`
- set the environment variables `PYSPARK_PYTHON=python3`, `PYTHONPATH=$SPARK_HOME/python`, and `PYTHONUNBUFFERED=1`
- `movies.csv` and `ratings.csv` need to be under `../data/*.csv` relative to the script path
- click `Run → Run 'movie_recommendation'` in the main menu
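
Before running, it can help to confirm that both input files are where the script expects them. The sketch below is an illustrative pre-flight check, not code from the project: `missing_data_files` and `REQUIRED_FILES` are hypothetical names, and it assumes "relative to the script path" means relative to the directory that contains `movie_recommendation.py`.

```python
import os

# File names taken from the setup notes above.
REQUIRED_FILES = ("movies.csv", "ratings.csv")


def missing_data_files(script_path, required=REQUIRED_FILES):
    """Return the required CSV paths that do not exist under ../data.

    The data directory is resolved relative to the directory containing
    script_path (an assumption about how the script locates its input).
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    data_dir = os.path.normpath(os.path.join(script_dir, "..", "data"))
    return [
        os.path.join(data_dir, name)
        for name in required
        if not os.path.isfile(os.path.join(data_dir, name))
    ]
```

For example, `missing_data_files("movie_recommendation.py")` returns an empty list when both files are in place, and otherwise lists the paths that still need to be created.
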