# Personalized Movie Recommendations
This notebook shows how to build a personalized movie recommendation model with ThirdAI's Universal Deep Transformer (UDT), our all-purpose classifier for tabular datasets. In this demo, we train and evaluate the model on the MovieLens 1M dataset, but you can easily substitute your own dataset.

You can immediately run a version of this notebook in your browser on Google Colab at the following link:

https://githubtocolab.com/ThirdAILabs/Demos/blob/main/PersonalizedMovieRecommendations.ipynb

This notebook uses an activation key that will only work with this demo. If you want to try us out on your own dataset, you can obtain a free trial license at the following link: https://www.thirdai.com/try-bolt/

In [None]:
!pip3 install thirdai --upgrade

import thirdai
thirdai.licensing.activate("Y9MT-TV7T-4JTP-L4XH-PWYC-4KEF-VX93-3HV7")

# Dataset Download
We will use the demos module in the thirdai package to download the MovieLens 1M dataset. If you use your own dataset, replace this step and the next one with your own download method and a UDT initialization that matches your columns.

In [None]:
from thirdai.demos import download_movielens

train_filename, test_filename, inference_batch, _ = download_movielens()
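
If you bring your own interaction log, the timestamp column has to hold dates rather than raw numbers. Here is a minimal sketch, assuming your raw data stores Unix epoch seconds (as the original MovieLens `ratings.dat` file does) and that the `YYYY-MM-DD` format used later in this notebook is the one you want. `epoch_to_date` is a hypothetical helper, not part of the thirdai package.

```python
from datetime import datetime, timezone


def epoch_to_date(epoch_seconds):
    """Convert Unix epoch seconds to a YYYY-MM-DD string in UTC."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


# 978307200 is midnight UTC on 1 January 2001.
print(epoch_to_date(978307200))
```

Using UTC keeps the conversion independent of the machine's local time zone, which matters for events logged close to midnight.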

# UDT Initialization
We can now create a UDT model by passing in the types of each column in the dataset and the target column we want to be able to predict.

For this demo, we additionally want to use "temporal context" to make predictions. Adding temporal context requires a single `bolt.types.date()` column, which UDT uses to track the timestamp of each training sample. We also pass in a dictionary called `temporal_tracking_relationships` that tells UDT to track the movies each user has watched over time. This allows UDT to make better predictions for the target column by creating temporal features that take into account the historical relationship between users and movies.

In [None]:
from thirdai import bolt

model = bolt.UniversalDeepTransformer(
    data_types={
        "userId": bolt.types.categorical(),
        "movieTitle": bolt.types.categorical(),
        "timestamp": bolt.types.date(),
    },
    temporal_tracking_relationships={"userId": ["movieTitle"]},
    target="movieTitle",
    n_target_classes=3706,
)
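
The value `n_target_classes=3706` is specific to this dataset. For your own data, one way to obtain the number is to count the distinct values in the target column. The sketch below assumes a comma-separated file with a header row whose column names match `data_types`; `count_target_classes` is a hypothetical helper, not a thirdai function.

```python
import csv


def count_target_classes(lines, target_column):
    """Count distinct values of target_column in CSV lines that start with a header."""
    reader = csv.DictReader(lines)
    return len({row[target_column] for row in reader})


# Small in-memory example; an open file object works the same way, e.g.
#   with open(train_filename, newline="") as f:
#       n_classes = count_target_classes(f, "movieTitle")
sample = [
    "userId,movieTitle,timestamp\n",
    "1,Movie A,2000-01-01\n",
    "2,Movie B,2000-01-02\n",
    "1,Movie A,2000-01-03\n",
]
print(count_target_classes(sample, "movieTitle"))
```

Note that counting over the training file alone misses any title that appears only in the test file.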

# Training
We can now train our UDT model with just one line! Feel free to customize the number of epochs and the learning rate; we have chosen values that give good convergence.

In [None]:
model.train(train_filename, epochs=3, learning_rate=0.001, metrics=["recall@10"]);

# Evaluation
Evaluating the performance of the UDT model is also just one line!

In [None]:
model.evaluate(test_filename, metrics=["recall@1", "recall@10", "recall@100"]);
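
To make the `recall@k` metrics concrete, here is a minimal NumPy sketch of one common definition for single-label data: the fraction of test samples whose true class is among the k highest-scoring classes. This illustrates the metric only; UDT's own implementation may differ in details such as tie-breaking.

```python
import numpy as np


def recall_at_k(scores, true_labels, k):
    """Fraction of rows whose true label is among the k highest-scoring classes."""
    scores = np.asarray(scores)
    true_labels = np.asarray(true_labels)
    # Indices of the k largest scores in each row; order within the top k is irrelevant.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == true_labels[:, None]).any(axis=1)
    return float(hits.mean())


# Three samples, three classes; the true classes are 1, 1 and 0.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
for k in (1, 2, 3):
    print(f"recall@{k} = {recall_at_k(scores, labels, k):.3f}")
```

Recall can only stay the same or rise as k grows, so `recall@100` is an upper bound on `recall@10` for the same model.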

# Saving and Loading
Saving a trained UDT model to disk and loading it back is also straightforward.

In [None]:
save_location = "personalized_movie_recommendation.model"

# Saving
model.save(save_location)

# Loading
model = bolt.UniversalDeepTransformer.load(save_location)

# Making Recommendations
Let's get hands-on experience with the benefits of temporal tracking while learning about UDT's inference API. Suppose you are a new user, so the model does not have your watch history. What is the first movie that it recommends? We'll find out using the `model.predict()` method. Keep in mind that MovieLens 1M contains ratings made between 2000 and 2003, so the model will only recommend older movies.

In [None]:
import numpy as np

user_id = '20382' # Random new user id
timestamp = '2023-01-12'

# Call model.predict() with a dictionary of column names to values, excluding
# the target column. It returns an array of probabilities. Each position in the
# array corresponds to a different movie.
prediction = model.predict({'userId': user_id, 'timestamp': timestamp})
# We are interested in the top recommendation, so we look for the position
# with the highest probability and convert it back to a movie title using the
# model.class_name() method.
prediction_title = model.class_name(np.argmax(prediction))

print("Recommendation:", prediction_title)
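
A recommender usually shows more than one suggestion. The sketch below ranks the positions of a probability array and keeps the best k. It assumes, as described in the cell above, that `model.predict()` returns a one-dimensional array with one probability per movie; `top_k_class_ids` is a hypothetical helper.

```python
import numpy as np


def top_k_class_ids(probabilities, k):
    """Return the k class ids with the highest probability, best first."""
    probabilities = np.asarray(probabilities)
    return np.argsort(probabilities)[::-1][:k].tolist()


# With the model above, the ids could be turned into titles like this:
#   titles = [model.class_name(i) for i in top_k_class_ids(prediction, 5)]
print(top_k_class_ids([0.1, 0.5, 0.05, 0.35], 2))  # prints [1, 3]
```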

In our tests, the model recommends "Star Wars VI: Return of the Jedi (1983)". It's a very successful franchise, so it makes sense that the model recommends it to a user it has never seen before. Unfortunately, you are not a Star Wars fan. In fact, you're not even a fan of science fiction. You ended up watching The Godfather instead. To tell the model this, we use the `model.index()` method, which adds an entry to your watch history. The model maintains this history itself, and indexing does not retrain the model, so it's a very efficient operation.

In [None]:
model.index({
    'userId': user_id, 
    'timestamp': timestamp, 
    # This is how the title is formatted in the dataset.
    'movieTitle': 'Godfather The (1972)' 
})

Now rerun the prediction cell above (the one that calls `model.predict()`). We hope you like the next recommendation.
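
The predict, index, predict loop used in this section can also be wrapped in two small helpers. This is a sketch that relies only on the `predict()`, `index()`, and `class_name()` calls shown in this notebook; the helper names `recommend` and `watch_and_recommend` are hypothetical.

```python
import numpy as np


def recommend(model, user_id, timestamp):
    """Return the model's top movie title for a user on a given date."""
    probabilities = model.predict({"userId": user_id, "timestamp": timestamp})
    return model.class_name(np.argmax(probabilities))


def watch_and_recommend(model, user_id, timestamp, watched_title):
    """Record a watched movie with index(), then ask for a fresh recommendation."""
    model.index(
        {"userId": user_id, "timestamp": timestamp, "movieTitle": watched_title}
    )
    return recommend(model, user_id, timestamp)


# Possible usage with the objects defined earlier in this notebook:
#   print(recommend(model, user_id, timestamp))
#   print(watch_and_recommend(model, user_id, timestamp, "Godfather The (1972)"))
```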

# Other Methods

In [None]:
# Like the predict() method but for batches.
model.predict_batch([
    {
        'userId': user_id, 
        'timestamp': '2023-01-12', 
    }, {
        'userId': user_id, 
        'timestamp': '2023-01-13', 
    },
])

# Like the index() method but for batches.
model.index_batch([
    {
        'userId': user_id, 
        'timestamp': timestamp, 
        # This is how the title is formatted in the dataset.
        'movieTitle': 'Godfather The (1972)' 
    }, {
        'userId': user_id, 
        'timestamp': timestamp, 
        # This is how the title is formatted in the dataset.
        'movieTitle': 'Godfather: Part II The (1974)'
    },
])

# Resets the model's temporal trackers; in the case of this demo, it deletes
# users' watch histories.
model.reset_temporal_trackers()
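
To turn batch predictions into titles, you can take the best position in each row. The sketch below assumes that `predict_batch()` returns one row of class probabilities per input sample, which this notebook does not state explicitly; `top_titles` is a hypothetical helper.

```python
import numpy as np


def top_titles(batch_probabilities, class_name):
    """Map each row of a batch of class probabilities to its top class name."""
    batch_probabilities = np.asarray(batch_probabilities)
    return [class_name(i) for i in np.argmax(batch_probabilities, axis=1)]


# Possible usage with the model above (under the assumption stated in the text):
#   batch = model.predict_batch([...])
#   print(top_titles(batch, model.class_name))
demo_names = ["Movie A", "Movie B", "Movie C"]
print(top_titles([[0.1, 0.6, 0.3], [0.7, 0.2, 0.1]], lambda i: demo_names[i]))
```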