This repository contains the code for three projects on the Yelp Dataset, completed for a Deep Learning course at HKUST.
The training procedure for each project is provided in Final_model.ipynb in each project folder, together with the training and validation data used.
The report for each project, Project_report.pdf, discusses the model and the data features used, and explains the implementation and the final hyperparameters.
Project 1: Sentiment Analysis. Predicts the rating from user reviews, using only the review text. The final model is the Transformer-based RoBERTa model, which achieves 70.60% validation accuracy and a Macro-F1 score of 0.6648.
Project 2: Link Prediction using GraphSAGE. Predicts whether a relationship exists between two vertices, using a DFS-like approach. The final model is evaluated with the AUC metric and achieves a score of 95.76%.
Project 3: Recommendation Prediction, based on an implementation of Neural Collaborative Filtering with some feature engineering. RMSE is used as the metric, and the final model achieves an RMSE of 1.061.
Another competitive approach for this task is Wide and Deep Learning.
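
For readers unfamiliar with the metrics quoted for the three projects (accuracy and Macro-F1 for Project 1, AUC for Project 2, RMSE for Project 3), the following is a minimal plain-Python sketch of how each one is defined. It is not the code used in the notebooks; the function names and the toy inputs are hypothetical and chosen only for illustration. In practice a library such as scikit-learn would normally be used instead.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def roc_auc(labels, scores):
    """Probability that a random positive pair (label 1) is scored higher
    than a random negative pair (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Toy data only; these are not values from the Yelp experiments.
    print(accuracy([1, 2, 3, 3], [1, 2, 3, 1]))
    print(macro_f1([1, 2, 3, 3], [1, 2, 3, 1]))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
    print(rmse([5, 3, 1], [4, 3, 3]))
```

Macro-F1 weights every rating class equally, so it penalises a model that ignores rare ratings more than plain accuracy does. The pairwise AUC shown here is quadratic in the number of samples and is meant only to make the definition concrete.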
All model training was done on Google Colab, which provides free GPU support and requires no installation.
- Download the dataset: https://www.yelp.com/dataset
- Data documentation: https://www.yelp.com/dataset/documentation/main
- Past winners and their papers: https://www.yelp.com/dataset/challenge/winners