Building a Movie Recommender System

This is a data engineering project that produces movie suggestions from the raw MovieLens dataset. It is built with the Azure services described below.

The architecture diagram for this project is shown below.

I used Azure Data Factory (ADF) as the orchestration tool for building and executing the data pipeline. The main tasks are:

Data cleaning with ADF data flows: duplicate rows and null values are removed, and the cleaned data is ingested into Azure Data Lake Storage Gen2 in Parquet format.
Data transformation in Azure Databricks: Bayesian average ratings and the top 5 tags for each movie are calculated with Spark SQL.
Data analysis in Azure Synapse Analytics, including finding the best movies by genre or by rating.
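
The transformation step can be illustrated with a small sketch in plain Python. This is not the notebook's Spark SQL code. It assumes the common form of the Bayesian average, where each movie's mean rating is pulled toward the global mean by a prior weight. The function names, the `prior_weight` parameter and its default value, and the lowercasing of tags are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def bayesian_averages(ratings, prior_weight=5):
    """Return {movie_id: Bayesian average rating}.

    ratings: iterable of (movie_id, rating) pairs.
    prior_weight: hypothetical number of "virtual" ratings at the global
    mean that are added to every movie (an assumed tuning parameter).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for movie_id, rating in ratings:
        totals[movie_id] += rating
        counts[movie_id] += 1
    if not counts:
        return {}
    # Global mean over all individual ratings.
    global_mean = sum(totals.values()) / sum(counts.values())
    return {
        movie_id: (prior_weight * global_mean + totals[movie_id])
        / (prior_weight + counts[movie_id])
        for movie_id in counts
    }


def top_tags(tags, k=5):
    """Return {movie_id: list of the k most frequent tags}.

    tags: iterable of (movie_id, tag) pairs. Tags are stripped and
    lowercased before counting (an assumed normalization).
    """
    per_movie = defaultdict(Counter)
    for movie_id, tag in tags:
        per_movie[movie_id][tag.strip().lower()] += 1
    return {
        movie_id: [tag for tag, _ in counter.most_common(k)]
        for movie_id, counter in per_movie.items()
    }
```

With a larger `prior_weight`, movies that have only a handful of ratings stay closer to the global mean, so a single 5-star rating does not push a movie to the top of the ranking.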

I used the resources below in the Azure portal to build this movie recommender project end to end.
