# Problem 0 (FFT) 4 pts

* Write a naive 1D discrete Fourier transform (DFT) using the definition: $S[f] = \sum_{i=0}^{N - 1} s[i] e^{\frac{-j 2 \pi f i}{N}}$
* You can assume the length of the signal is a power of 2

In [None]:
import numpy as np

def ft(s: np.ndarray) -> np.ndarray:
    pass
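
As a reference point, the definition above can be sketched as a matrix-vector product. This is only a minimal illustration, not the expected solution; the name `naive_dft` is hypothetical and chosen so that it does not clash with the `ft` stub.

```python
import numpy as np

def naive_dft(s: np.ndarray) -> np.ndarray:
    """O(N^2) DFT computed directly from the definition (illustrative sketch)."""
    s = np.asarray(s, dtype=complex)
    n = s.shape[0]
    i = np.arange(n)          # time index, shape (N,)
    f = i.reshape(-1, 1)      # frequency index as a column, shape (N, 1)
    # w[f, i] = exp(-2j * pi * f * i / N), built by broadcasting
    w = np.exp(-2j * np.pi * f * i / n)
    return w @ s

signal = np.random.default_rng(0).standard_normal(8)
print(np.allclose(naive_dft(signal), np.fft.fft(signal)))
```

Building the full $N \times N$ matrix costs $O(N^2)$ time and memory, which is the baseline the FFT improves on.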

* Split the signal into odd and even parts and perform a single FFT iteration

In [None]:
def fft_iteration(s: np.ndarray) -> np.ndarray:
    pass
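
One possible reading of "a single FFT iteration" is one radix-2 decimation-in-time step: transform the even and odd samples separately with the direct DFT and combine them with twiddle factors. The sketch below assumes an even signal length; `dft_direct` and `fft_single_step` are hypothetical names.

```python
import numpy as np

def dft_direct(s):
    """Direct O(N^2) DFT, used here for the two half-length transforms."""
    n = len(s)
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) @ s

def fft_single_step(s):
    """One even/odd split followed by direct DFTs of the two halves."""
    s = np.asarray(s, dtype=complex)
    n = len(s)                      # assumed even
    even = dft_direct(s[0::2])      # DFT of samples 0, 2, 4, ...
    odd = dft_direct(s[1::2])       # DFT of samples 1, 3, 5, ...
    # twiddle[f] = exp(-2j * pi * f / N) for f = 0 .. N/2 - 1
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    # S[f] = E[f] + w^f O[f],  S[f + N/2] = E[f] - w^f O[f]
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(1).standard_normal(16)
print(np.allclose(fft_single_step(x), np.fft.fft(x)))
```

A single split already halves the work: two direct DFTs of length $N/2$ cost about $N^2/2$ operations instead of $N^2$.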

* Run the full recursion until the length reaches 1 and compare its performance with only 2 steps of recursion
* At each step, make sure that the result of your FFT is equivalent to `np.fft.fft`
* Compare the speed with `np.fft.fft` and explain whether it makes sense to write your own FFT
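
The steps above could be sketched as follows, assuming a power-of-2 signal length. The function name `fft_recursive` and the benchmark size are illustrative choices, and timings will vary by machine, so no particular numbers are claimed here.

```python
import timeit
import numpy as np

def fft_recursive(s):
    """Radix-2 decimation-in-time FFT, recursing down to length 1."""
    s = np.asarray(s, dtype=complex)
    n = len(s)
    if n == 1:
        return s                    # the DFT of a single sample is the sample itself
    even = fft_recursive(s[0::2])
    odd = fft_recursive(s[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(0).standard_normal(1024)
assert np.allclose(fft_recursive(x), np.fft.fft(x))

# Rough timing comparison over 20 repetitions each
t_own = timeit.timeit(lambda: fft_recursive(x), number=20)
t_np = timeit.timeit(lambda: np.fft.fft(x), number=20)
print(f"recursive: {t_own:.4f}s, np.fft.fft: {t_np:.4f}s")
```

The recursion has the same $O(N \log N)$ complexity as a library FFT, but each level pays Python function-call and array-allocation overhead, which is worth keeping in mind when interpreting the measured times.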

# Problem 1 (Building recommender systems with SVD) 3 pts

Recommender systems are gaining more and more popularity. They're used in a broad range of areas: music, movies, e-commerce, social and professional networks, and online news, to name a few. Web services like Pandora, Netflix and Amazon build and deploy their own recommender engines in order to make customers happier and generate additional revenue. Those companies, and many others, take the problem of building a high-quality recommender system very seriously. One of many examples of industry's strong interest in this area is the famous Netflix competition with a **$1 million** prize for the winners.

In this task you'll build a very simple yet powerful engine for a recommender system. Given the [Movielens 10M](https://grouplens.org/datasets/movielens/) data, you'll implement an SVD-based movie recommendation model. The SVD-based approach belongs to a family of collaborative filtering algorithms that use matrix factorization (MF). While there are many sophisticated algorithms for MF, pure SVD remains one of the [top-performers](http://www.researchgate.net/publication/221141030_Performance_of_recommender_algorithms_on_top-N_recommendation_tasks). The main idea behind these algorithms is to represent each user and each movie as a vector in some low-dimensional feature space. That low-dimensional space (or **latent factors space**) shows which features a user likes and which of them are present in a movie. Interpreting those features is a separate and hard task, so we will simply build the latent factors using SVD without focusing on their intrinsic meaning. With this model, the "likeability" of a movie for a particular user is estimated by a weighted inner product of their latent factors. The model should also respond fast and produce recommendations right after a new user has demonstrated some of their preferences. Recomputing the SVD for this task may take prohibitively long, which is why we'll use the folding-in technique to make fresh recommendations quickly.

![](../week2/data/recommender.jpeg)


## Task 0. Preparing the data.
You're given the convenience functions `get_movielens_data` and `split_data`.

* Use these functions to download the data, load it into memory and split it into training and testing sets with an 80/20 split (80% of the data goes into the training set and the rest into the test set).

Be aware that these sets are disjoint, i.e. users from the training set are not in the testing set and vice versa. This means that test users will be "new" (i.e. previously unseen) for the trained model. In order to produce reasonable recommendations for these "new" users we will use folding-in (see Task 2).

Note: downloading the dataset may take a couple of minutes. If you already have a copy of the MovieLens 10M data on your computer, you can force the program to use it by specifying the `local_file` argument of the `get_movielens_data` function. The `local_file` argument should be a full path to the zip file.

In [4]:
import pandas as pd
import zipfile
from requests import get

import numpy as np
import scipy as sp
from scipy import sparse

from collections import namedtuple
import sys

In [5]:
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import BytesIO

In [None]:
def get_movielens_data(local_file=None):
    '''Downloads movielens data, normalizes users and movies ids,
    returns data in sparse CSR format.
    '''
    if not local_file:
        print('Downloading data...')
        zip_file_url = 'http://files.grouplens.org/datasets/movielens/ml-10m.zip'
        zip_response = get(zip_file_url)
        
        if sys.version_info[0] < 3:
            zip_contents = StringIO(zip_response.content)
        else:
            zip_contents = BytesIO(zip_response.content)
        print('Done.')
    else:
        zip_contents = local_file
    
    print('Loading data into memory...')
    with zipfile.ZipFile(zip_contents) as zfile:
        zdata = zfile.read('ml-10M100K/ratings.dat')
        delimiter = ';'
        zdata = zdata.replace('::'.encode(), delimiter.encode()) # makes data compatible with pandas c-engine
        if sys.version_info[0] < 3:
            ml_data = pd.read_csv(StringIO(zdata), sep=delimiter, header=None, engine='c',
                                  names=['userid', 'movieid', 'rating', 'timestamp'],
                                  usecols=['userid', 'movieid', 'rating'])
        else:
            ml_data = pd.read_csv(BytesIO(zdata), sep=delimiter, header=None, engine='c',
                                  names=['userid', 'movieid', 'rating', 'timestamp'],
                                  usecols=['userid', 'movieid', 'rating'])
    
    # normalize indices to avoid gaps
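    # (pd.factorize assigns consecutive integer codes in order of first appearance;
    # the internal groupby(...).grouper attribute is deprecated in recent pandas)
    ml_data['movieid'] = pd.factorize(ml_data['movieid'])[0]
    ml_data['userid'] = pd.factorize(ml_data['userid'])[0]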
    
    # build sparse user-movie matrix
    data_shape = ml_data[['userid', 'movieid']].max() + 1
    data_matrix = sp.sparse.csr_matrix((ml_data['rating'],
                                       (ml_data['userid'], ml_data['movieid'])),
                                        shape=data_shape, dtype=np.float64)
    
    print('Done.')
    return data_matrix

In [None]:
def split_data(data, test_ratio=0.2):
    '''Randomly splits data into training and testing datasets. Default ratio is 80%/20%.
    Returns datasets in namedtuple format for convenience. Usage:
    train_data, test_data = split_data(data_matrix)
    or
    movielens_data = split_data(data_matrix)
    and later in code: 
    do smth with movielens_data.train 
    do smth with movielens_data.test
    '''
    
    num_users = data.shape[0]
    idx = np.zeros((num_users,), dtype=bool)
    sel = np.random.choice(num_users, int(test_ratio*num_users), replace=False)
    np.put(idx, sel, True)
    
    Movielens_data = namedtuple('MovieLens10M', ['train', 'test'])
    movielens_data = Movielens_data(train=data[~idx, :], test=data[idx, :])
    return movielens_data

## Task 1. Building the core of recommender.
* Build representation of users and movies in the latent factors space with help of SVD.
* Calculate the data sparsity (i.e. the number of nonzero elements divided by the total size of the matrix)
* Is it feasible to use regular SVD from numpy.linalg on your computer for this task?
* Fix the rank of approximation and compute truncated SVD using scipy.sparse.linalg.svds.
* Be aware that scipy returns singular values in ascending order (see the docs).
* Sort all your svd data in descending (by singular values) order without breaking the result of the product (i.e. without messing up the low-rank approximation).
* The data returned by sparse SVD is also not contiguous in memory, which may affect performance. Use `np.ascontiguousarray` to fix that.
* Plot singular values.
* Can you tell from the graph whether the data has a low-rank structure?
* Is it possible to estimate from the graph what SVD rank (or number of latent factors) will be sufficient for your model?
* Pick several users at random from the training set. Calculate recommendations for these users using truncated SVD. Compare movies that users rated with top-10 recommendations produced by your latent factors model. What can you say about produced recommendations?
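
The truncated-SVD steps above can be sketched on a small synthetic matrix. The toy matrix, its size and the rank are assumptions made purely for illustration; on the real data the training matrix from `split_data` would take the place of `toy`.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds

# Hypothetical stand-in for the training matrix: 200 users x 100 movies,
# roughly 5% of entries filled with ratings from 1 to 5.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(200, 100))
mask = rng.random((200, 100)) < 0.05
toy = sparse.csr_matrix((ratings * mask).astype(np.float64))

# Share of nonzero entries, as defined in the task
density = toy.nnz / (toy.shape[0] * toy.shape[1])
print(f"nonzero share: {density:.3f}")

rank = 10                                  # illustrative choice
u, s, vt = svds(toy, k=rank)

# Reorder all three factors consistently so singular values are descending,
# and make the arrays contiguous in memory.
order = np.argsort(s)[::-1]
s = np.ascontiguousarray(s[order])
u = np.ascontiguousarray(u[:, order])
vt = np.ascontiguousarray(vt[order, :])

# The product is unchanged by the reordering: rank-10 approximation of toy
approx = (u * s) @ vt
```

Sorting with one shared permutation applied to the columns of `u`, the entries of `s` and the rows of `vt` is what keeps the low-rank product intact.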

## Task 2. Evaluating performance of the recommender.

Evaluation of the model is done by splitting test dataset into 2 subsets:
* user's behaviour history - this is used to produce recommendations by your trained model
* evaluation data - considered as the ground truth and used to estimate the quality of recommendations

Overall performance is measured by the total number of correct predictions made by the model on the test set.

Your tasks:
* Set $N$ - size of evaluation set - equal to 3 (for building top-3 recommendations).
* Split the test dataset into history and evaluation subsets. The simplest way is to do it user by user (i.e. row by row) in a loop (see the pseudocode below). In each row, the last $N$ rated movies are used as evaluation data and all of the user's remaining ratings go into the history subset. The sparse matrix method `.nonzero()` or the CSR attribute `.indices` might be helpful.
* For each user from test set generate top-$N$ recommendations using the history subset and folding-in technique (described below).
    * What is the complexity of making recommendations for a new user? Compare it with calculation of full SVD.
    * Is it a good idea to use folding-in technique for all future users without recomputing SVD? Why?
* Calculate the number of correctly predicted recommendations (a.k.a. the number of hits), i.e. the number of recommended items that are also present in the evaluation subset for the selected user. You may want to use the `numpy.in1d` function (or `numpy.isin`, which supersedes it in recent NumPy releases).
* Report the total number of correct predictions over the full test set.

## Folding-in technique

![](../week2/data/folding.gif)

A new user can be considered an update to the original matrix (an appended row). Appending a row to the original matrix corresponds to appending a row to the users' latent factors matrix in the SVD. We can formulate the relation between these two updates as (see [here](https://dl.acm.org/doi/pdf/10.1145/224170.285569) for details and the picture above; for a single user, $q = 1$): $$
p^T = x^TVS^{-1}
$$ where $p$ is the update to the latent factors matrix and $x$ is the update to the original user-movie matrix (i.e. the user's preferences or behaviour history). Then, to compute recommendations for the new user, we simply restore the part of the original matrix corresponding to the update: $$
r^T = p^TSV^T = x^TVV^T
$$ where $r$ is the recommendations vector we are looking for: $$
r = VV^Tx
$$
Note that, since the columns of $V$ are orthonormal ($V^TV = I$), the matrix $P = VV^T$ satisfies $P^2 = VV^TVV^T = VV^T = P$, i.e. $P$ is a projector. This means that the folding-in procedure can be naturally described as a projection of the user preferences onto the latent factors space.
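
The folding-in formula can be checked numerically on a small example. Here $V$ is a hypothetical matrix with orthonormal columns obtained from a QR decomposition rather than from real ratings, and excluding already-seen movies from the recommendations is an added assumption, not something the formula itself requires.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical V: 50 movies x 5 latent factors with orthonormal columns
v, _ = np.linalg.qr(rng.standard_normal((50, 5)))

# Binary preference vector of a new user who interacted with three movies
seen = [3, 7, 19]
x = np.zeros(50)
x[seen] = 1.0

# r = V V^T x, computed as two thin products (no 50 x 50 matrix is formed)
r = v @ (v.T @ x)

# Explicit projector, built here only to verify P^2 = P
p_mat = v @ v.T
print(np.allclose(p_mat @ p_mat, p_mat))

# Assumption: never recommend movies the user has already seen
scores = r.copy()
scores[x > 0] = -np.inf
top3 = np.argsort(scores)[::-1][:3]
print(top3)
```

Evaluating `v @ (v.T @ x)` from right to left costs on the order of (number of movies) × (rank) operations per user, which is the quantity to compare against the cost of a full SVD.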

## Pseudocode for measuring recommender system quality

```
initialize total_score variable with 0
for each user in test-data:
    rating_history <-- all but the last N movies rated by user
    evaluation_set <-- the last N movies rated by user

    initialize user_preferences_vector with 1s using rating_history as indices of non-zero values
    top-N recommendations <-- folding-in applied to user_preferences_vector
    correct_predictions <-- # of intersections of recommendations and evaluation_set
    total_score <-- total_score + correct_predictions

return total_score

```
You may use a different implementation if you wish; the only two criteria are:
* it should be no slower than the reference implementation
* it should produce correct results
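
For illustration, the pseudocode above might translate to Python roughly as follows. This is a sketch on synthetic data, not the reference implementation: the function name `evaluate`, the toy matrix, skipping users with too few ratings, and masking already-seen movies are all assumptions.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds

def evaluate(test, vt, n=3):
    """Total number of hits over all test users (rows of a CSR matrix)."""
    test = sparse.csr_matrix(test)
    total_score = 0
    for row in range(test.shape[0]):
        # Column indices of the movies rated by this user
        rated = test.indices[test.indptr[row]:test.indptr[row + 1]]
        if len(rated) <= n:
            continue                      # assumption: skip users with too little history
        history, evaluation = rated[:-n], rated[-n:]
        x = np.zeros(test.shape[1])
        x[history] = 1.0                  # binary preference vector
        scores = vt.T @ (vt @ x)          # folding-in: r = V V^T x
        scores[history] = -np.inf         # assumption: never recommend seen movies
        top_n = np.argpartition(scores, -n)[-n:]
        total_score += np.isin(top_n, evaluation).sum()
    return int(total_score)

# Synthetic data: 200 users x 100 movies, about 20% of entries rated
rng = np.random.default_rng(0)
dense = rng.integers(1, 6, size=(200, 100)) * (rng.random((200, 100)) < 0.2)
data = sparse.csr_matrix(dense.astype(np.float64))
train, test = data[:160], data[160:]

_, _, vt = svds(train, k=10)
hits = evaluate(test, vt, n=3)
print(f"hits: {hits} out of at most {3 * test.shape[0]}")
```

`np.argpartition` selects the top-$N$ indices without fully sorting the score vector, which matters when the loop runs over many users. Because the toy ratings are random, the hit count here carries no meaning beyond exercising the code path.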

## Bonus task: Fine-tuning your model.

* Try to find the rank that produces the best evaluation score
* Plot the dependency of evaluation score on rank of SVD for all your trials in one graph
* Report the best result and the corresponding SVD rank
* Compare your model with the non-personalized recommender which simply recommends top-3 movies with highest average ratings.

Optionally: you may want to test your parameters with different data splits in order to minimize the risk of local effects. You're also free to modify your code to produce better results. Report what modifications you've made and what effect they had, if any.

# Problem 2 (SVD as a low pass filter), 3 pts

* Take your favorite picture with a permissive license. You can take a selfie or any other photo with your smartphone.
* Upload the image to your computer and plot its singular values
* Truncate the SVD to the following ranks: full rank, 10, 20, 30

In [None]:
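import numpy as np
import matplotlib.pyplot as plt

im = plt.imread('HERE IS THE PATH TO MY PHOTO')
# np.linalg.svd treats the last two axes as the matrix, so for an H x W x C
# colour image (assumed here) move the channel axis first to get one SVD per channel.
u, s, vT = np.linalg.svd(np.moveaxis(im, -1, 0), full_matrices=False)

# Truncation with reconstruction goes here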

* For each rank, compute the FFT averaged over the RGB channels
* Plot the power spectra, taking the absolute values (`np.abs(...)`) of the frequency responses, with `plt.subplots`
* Take care to shift the frequencies correctly when plotting the power spectra
* Explain the differences between the power spectra for different ranks
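
A possible shape for these computations is sketched below on a synthetic image, so that it runs without a photo. The helper names `truncate_channels` and `mean_spectrum` are hypothetical, plotting is left out, and averaging the complex FFT over channels before taking the magnitude is one interpretation of the first bullet.

```python
import numpy as np

def truncate_channels(im, rank):
    """Rank-`rank` SVD approximation of each channel of an H x W x C image."""
    chans = np.moveaxis(im.astype(float), -1, 0)            # C x H x W
    u, s, vt = np.linalg.svd(chans, full_matrices=False)    # batched over channels
    approx = (u[:, :, :rank] * s[:, None, :rank]) @ vt[:, :rank, :]
    return np.moveaxis(approx, 0, -1)                       # back to H x W x C

def mean_spectrum(im):
    """Magnitude of the 2D FFT averaged over channels, zero frequency centred."""
    spec = np.fft.fft2(im, axes=(0, 1)).mean(axis=-1)
    return np.abs(np.fft.fftshift(spec))

# Synthetic 64 x 48 RGB image standing in for a real photo
img = np.random.default_rng(0).random((64, 48, 3))

# Relative reconstruction error per rank; 48 is full rank for this image
for rank in (48, 10, 20, 30):
    low = truncate_channels(img, rank)
    err = np.linalg.norm(low - img) / np.linalg.norm(img)
    print(f"rank {rank:2d}: relative error {err:.3f}")

spectrum = mean_spectrum(truncate_channels(img, 10))
```

After `np.fft.fftshift` the zero-frequency component sits at the centre of the array, so the spectrum can be passed to an image plot such as `plt.imshow` (often on a logarithmic scale) for each rank.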