## Neighbor-Based Collaborative Filtering Using the *lastfm-360K* Dataset
#### The dataset can be downloaded [here](http://mtg.upf.edu/static/datasets/last.fm/lastfm-dataset-360K.tar.gz); details about its contents are [here](https://www.upf.edu/web/mtg/lastfm360k).

This notebook tries out neighbor-based collaborative filtering (CF) as part of a recommendation system. We will build both User-User and Item-Item CF models, using cosine similarity to measure how close two users or two items are.
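
To make the idea concrete before loading the real data, here is a minimal sketch on a toy play-count matrix. The matrix values and the variable names (`plays`, `user_knn`, `item_knn`) are made up for illustration and are not part of the dataset or the rest of this notebook. Cosine similarity is the dot product of two vectors divided by the product of their norms; scikit-learn's `NearestNeighbors` with `metric="cosine"` returns cosine *distance*, which is one minus that similarity.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy data: rows are users, columns are artists, values are play counts.
plays = csr_matrix(np.array([
    [10, 0, 5, 0],
    [8, 0, 3, 1],
    [0, 7, 0, 3],
], dtype=float))

# User-User: each row (user) is a vector over artists.
user_knn = NearestNeighbors(metric="cosine", algorithm="brute")
user_knn.fit(plays)
distances, indices = user_knn.kneighbors(plays[0], n_neighbors=2)
# Cosine distance = 1 - cosine similarity, so convert back.
user_similarities = 1.0 - distances.ravel()

# Item-Item: transpose so that each row (artist) is a vector over users.
item_matrix = plays.T.tocsr()
item_knn = NearestNeighbors(metric="cosine", algorithm="brute")
item_knn.fit(item_matrix)
item_distances, item_indices = item_knn.kneighbors(item_matrix[0], n_neighbors=2)
item_similarities = 1.0 - item_distances.ravel()

print("Nearest users to user 0:", indices.ravel(), user_similarities)
print("Nearest artists to artist 0:", item_indices.ravel(), item_similarities)
```

The query point is its own nearest neighbor here, so the second entry in each result is the closest *other* user or artist. The only difference between the two models in this sketch is whether the matrix is transposed before fitting.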

In [1]:
import pandas as pd
import numpy as np
import scipy.sparse as sparse
from scipy.sparse.linalg import spsolve
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
# Read the data
# Organized as --> user_id \t artist_id \t artist_name \t #_plays
# Info about the dataset: https://www.upf.edu/web/mtg/lastfm360k
raw_data = pd.read_table('lastfm-dataset-360K/usersha1-artmbid-artname-plays.tsv')
raw_data

| | 00000c289a1829a808ac09c00daf10bc3c4e223b | 3bd73256-3905-4f3a-97e2-8b341527f805 | betty blowtorch | 2137 |
|---|---|---|---|---|
| 0 | 00000c289a1829a808ac09c00daf10bc3c4e223b | f2fb0ff0-5679-42ec-a55c-15109ce6e320 | die Ärzte | 1099 |
| 1 | 00000c289a1829a808ac09c00daf10bc3c4e223b | b3ae82c2-e60b-4551-a76d-6620f1b456aa | melissa etheridge | 897 |
| 2 | 00000c289a1829a808ac09c00daf10bc3c4e223b | 3d6bbeb7-f90e-4d10-b440-e153c0d10b53 | elvenking | 717 |
| 3 | 00000c289a1829a808ac09c00daf10bc3c4e223b | bbd2ffd7-17f4-4506-8572-c1ea58c3f9a8 | juliette & the licks | 706 |
| 4 | 00000c289a1829a808ac09c00daf10bc3c4e223b | 8bfac288-ccc5-448d-9573-c33ea2aa5c30 | red hot chili peppers | 691 |
| ... | ... | ... | ... | ... |
| 17535649 | sep 20, 2008 | 7ffd711a-b34d-4739-8aab-25e045c246da | turbostaat | 12 |
| 17535650 | sep 20, 2008 | 9201190d-409f-426b-9339-9bd7492443e2 | cuba missouri | 11 |
| 17535651 | sep 20, 2008 | e7cf7ff9-ed2f-4315-aca8-bcbd3b2bfa71 | little man tate | 11 |
| 17535652 | sep 20, 2008 | f6f2326f-6b25-4170-b89d-e235b25508e8 | sigur rós | 10 |
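
The column labels in the output above are data values (a user hash, an artist ID, `betty blowtorch`, `2137`), which suggests the file has no header row and that `read_table` consumed the first record as the header. One possible way to avoid this, sketched on a small in-memory sample rather than the real file, is to pass `header=None` together with explicit `names`. The column names below follow the layout described in the code comment and are an assumption, as are the shortened sample IDs.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the same tab-separated layout as the dataset.
sample_tsv = (
    "user_a\tartist-1\tbetty blowtorch\t2137\n"
    "user_a\tartist-2\tdie Ärzte\t1099\n"
    "user_b\tartist-1\tbetty blowtorch\t12\n"
)

# Assumed column names, based on the layout comment in the cell above.
column_names = ["user_id", "artist_id", "artist_name", "plays"]

# header=None tells pandas that the first line is data, not column labels.
sample = pd.read_table(io.StringIO(sample_tsv), header=None, names=column_names)
print(sample)
```

With the real file, the same `header=None, names=column_names` arguments would be passed alongside the path used in the cell above, so the first listening record stays in the data.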
