# Amazon Personalize - Trending Now Blog Post

This notebook helps you prepare an interactions dataset for use with the Trending-Now recipe in Amazon Personalize.

User interests can change based on a variety of factors, such as external events or the interests of other users. Websites and apps need to tailor their recommendations to these changing interests to improve user engagement. With Trending-Now, you can surface items from your catalog that are rising in popularity faster than other items, such as trending news, popular social content, or newly released movies, which helps users discover items that are engaging their peers.

Amazon Personalize also lets you define the time period over which trends are calculated, depending on your business context: every 30 minutes, 1 hour, 3 hours, or 1 day, based on the most recent interactions data from users. This notebook demonstrates how the new recipe aws-trending-now (or aws-vod-trending-now for recommenders) can help recommend the top trending items from the interactions dataset.
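To make "rising in popularity faster than other items" concrete, here is a toy sketch that counts interactions per item in two consecutive one-hour windows and ranks items by the change. The data and the `velocity` measure are made up for illustration; this is not how Amazon Personalize computes trends internally.

```python
import pandas as pd

# Hypothetical toy interactions (timestamps in seconds): item "B" picks up
# in the second hour while item "A" slows down.
toy = pd.DataFrame({
    "ITEM_ID":   ["A", "A", "A", "B", "A", "B", "B", "B"],
    "TIMESTAMP": [0, 600, 1200, 1800, 3600, 4000, 5000, 6000],
})

WINDOW_SECONDS = 3600  # one-hour windows, one of the frequencies listed above

# Assign each interaction to a window, then count interactions per item and window.
toy["WINDOW"] = toy["TIMESTAMP"] // WINDOW_SECONDS
counts = toy.groupby(["ITEM_ID", "WINDOW"]).size().unstack(fill_value=0)

# Illustrative "velocity": change in interaction count between the two windows.
velocity = (counts[1] - counts[0]).sort_values(ascending=False)
print(velocity)
```

In this toy data, item "B" ranks first because its interaction count grows between windows, even though both items have the same total number of interactions.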

The estimated time to run through this notebook is about 10 minutes.

## How to use the Notebook

The code is broken up into cells like the one below. There's a triangular Run button at the top of this page that you can click to execute each cell and move on to the next, or you can press `Shift` + `Enter` while in the cell to execute it and move on to the next one.

While a cell is executing, the marker beside it shows an `*`. Once all the code in the cell has finished executing, the `*` is replaced by a number that indicates the order in which the cell completed.

Simply follow the instructions below and execute the cells to get started.


## Imports
Python ships with a broad collection of libraries. We need to import some of those, along with installed packages such as [boto3](https://aws.amazon.com/sdk-for-python/) (the AWS SDK for Python) and [Pandas](https://pandas.pydata.org/)/[NumPy](https://numpy.org/), which are core data science tools.

In [None]:
# Imports
import boto3
import json
import numpy as np
import pandas as pd
import time
import datetime

## Download, Prepare, and Upload Training Data

For this notebook walkthrough, we will use the MovieLens public dataset, available at https://grouplens.org/datasets/movielens/. Follow the link to learn more about the data and its potential uses.

First we need to download the training data. For the interactions data, we use the ratings history from MovieLens, a movie review dataset. Each row records an interaction between a user and an item: the user_id, the item_id, the rating, and the time the interaction took place (a timestamp given as Unix epoch time). The dataset also contains movie title information that maps each movie id to its title and genres.
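Because the timestamps are Unix epoch seconds, they are hard to read directly. The sketch below converts them to UTC datetimes with pandas for inspection. The `sample` frame and its values are hypothetical stand-ins that mimic the column names of the ratings file.

```python
import pandas as pd

# Hypothetical rows shaped like the MovieLens ratings file.
sample = pd.DataFrame({
    "userId": [1, 1],
    "movieId": [296, 306],
    "rating": [5.0, 3.5],
    "timestamp": [1147880044, 1147868817],  # Unix epoch seconds
})

# Interpret the integers as seconds since 1970-01-01 UTC.
sample["datetime_utc"] = pd.to_datetime(sample["timestamp"], unit="s", utc=True)
print(sample[["timestamp", "datetime_utc"]])
```

The converted column is only for exploration; the original integer timestamps are what get written to the interactions file later in this notebook.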

### Download and Explore the Interactions Dataset

In [None]:
data_dir = "blog_data"
# -p avoids an error if the directory already exists when the cell is re-run
!mkdir -p $data_dir

In [None]:
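# Download the MovieLens 25M archive and extract it (-o overwrites on re-run)
!cd $data_dir && wget http://files.grouplens.org/datasets/movielens/ml-25m.zip
!cd $data_dir && unzip -o ml-25m.zip
# No trailing slash, because later cells append '/<file name>' to this path
dataset_dir = data_dir + "/ml-25m"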

The dataset has been downloaded and extracted.

Let's learn more about the dataset by viewing its characteristics.

In [None]:
!pygmentize $dataset_dir/README.txt

From the README, we see there is a file, ratings.csv, that should work as a proxy for our interactions data. After all, rating a film is a form of interacting with it. The dataset also has some genre information as well as some movie genome data. In this proof of concept, we will focus on the interactions data.

In [None]:
interactions_df = pd.read_csv(dataset_dir + '/ratings.csv')

In [None]:
interactions_df.head(10)

## Prepare the Interactions Data

### Drop Columns

Some columns in this dataset would not add value to our model and need to be dropped. Here, that is the *rating* column.

In [None]:
interactions_df.head(10)

In [None]:
# columns= already selects the axis, so axis=1 is redundant
interactions_df.drop(columns=['rating'], inplace=True)

In [None]:
interactions_df.head()

In [None]:
interactions_df = interactions_df.rename(columns = {'userId':'USER_ID', 'movieId':'ITEM_ID', 'timestamp':'TIMESTAMP'})

In [None]:
interactions_df.head()
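The renamed columns correspond to the required fields of an Amazon Personalize interactions schema (USER_ID, ITEM_ID, TIMESTAMP). As a minimal sketch, the cell below defines such a schema and checks a DataFrame against it before export. The helper `missing_schema_columns` and the `example_df` frame are hypothetical names introduced for illustration.

```python
import json
import pandas as pd

# Avro-style schema for an interactions dataset with only the required fields.
interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}

def missing_schema_columns(df, schema):
    """Return the schema field names that are absent from the DataFrame (hypothetical helper)."""
    required = [field["name"] for field in schema["fields"]]
    return [name for name in required if name not in df.columns]

# Hypothetical one-row frame standing in for interactions_df.
example_df = pd.DataFrame({"USER_ID": [1], "ITEM_ID": [296], "TIMESTAMP": [1147880044]})
print(missing_schema_columns(example_df, interactions_schema))
print(json.dumps(interactions_schema, indent=2))
```

An empty list from the helper means every schema field has a matching column; the same check can be run on `interactions_df` itself.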

In [None]:
interactions_file_path = 'curated_interactions_training_data.csv'

In the cell below, we write our cleaned data to a file named "curated_interactions_training_data.csv".

In [None]:
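# index=False keeps the pandas row index out of the CSV, so the file
# contains only the USER_ID, ITEM_ID, and TIMESTAMP columns
interactions_df.to_csv(interactions_file_path, index=False)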

A file named 'curated_interactions_training_data.csv' is now created in this notebook instance.

Download the file to your local machine and upload it to an S3 bucket before you import the data into Amazon Personalize.
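If the notebook's role has write access to a bucket, the upload can also be done from the notebook with boto3 instead of downloading the file first. This is a minimal sketch: the function name `upload_interactions`, the bucket name, and the object key are hypothetical placeholders, and the call itself is commented out because it needs AWS credentials and an existing bucket.

```python
def upload_interactions(local_path, bucket, key, s3_client=None):
    """Upload a local file to S3 and return its s3:// URI (hypothetical helper)."""
    if s3_client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3
        s3_client = boto3.client("s3")
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"

# Example call; replace the placeholder bucket name with your own bucket.
# upload_interactions(
#     "curated_interactions_training_data.csv",
#     "my-personalize-bucket",  # hypothetical bucket name
#     "interactions/curated_interactions_training_data.csv",
# )
```

The returned URI is the data location you would point an Amazon Personalize dataset import job at.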