# Decentralized Prediction

## Summary

This notebook shows how to use the decentralized prediction implementation.
An equivalent Python script is also available: decentralized.py

The notebook uses the User and Server classes to simulate a network of agents that want to build a similarity matrix without giving up their data.
These classes are documented in their class files.

In [1]:
import numpy as np
import pandas as pd
import os
from model.track_collection import TrackCollection
from utils.collection_splitter import splitter
from agent.server import Server
from agent.user import User

## Configuration

We built a way to extract the rating of each track from a user's library.
The problem we faced is that we have only one real user's library, so we cannot use it for prediction: we need more users, and splitting this one library into several would not be meaningful because the ratings would be the same.
So we created an entirely fake music library, plus 5 users who each hold part of the global library along with their ratings of it.
You can find the details in files:
* data/track_collection_test.json - The global library
* data/users/i.json - The library of the i-th user

In the configuration cell, the commented-out code is the dynamic version, which is not relevant here because we have only one real library. The active code is hardcoded for the 5 test users.
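
The configuration cell aligns each user's library on the global track list with a left merge followed by `fillna(0)`. The sketch below isolates that step on made-up data: the five track ids and the three ratings are hypothetical and stand in for the JSON files listed above.

```python
import pandas as pd

# Hypothetical global library of five track ids (stand-in for track_collection_test.json).
track_list = pd.DataFrame({"id": [0, 1, 2, 3, 4]})

# Hypothetical user library: three of the five tracks, with made-up ratings.
user_library = pd.DataFrame({"id": [0, 3, 4], "rating_score": [0.8, 0.2, 0.5]})

# Left-merge onto the global list so every track gets a row, then fill the
# tracks the user does not own with 0.
user_matrix = track_list.merge(user_library, on="id", how="left").fillna(0)
ratings = user_matrix[["rating_score"]]

print(ratings["rating_score"].tolist())  # [0.8, 0.0, 0.0, 0.2, 0.5]
```

After this step every user has a rating vector of the same length and in the same track order. A 0 then means "not in the library" rather than a low rating.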

In [2]:
#### CONFIG 
number_of_users = 5

### Loading the tracks data; and splitting them into number_of_users collections

track_collection = TrackCollection()
track_collection.load(os.path.join('data', 'track_collection_test.json'))
df_track_collection = track_collection.to_dataframe()
track_list = df_track_collection[['id']]

#user_collections = splitter(track_collection, number_of_users, 0.3)
#user_dfs = []

### Generating the users_dataframes vector with all tracks and their ratings

#for user_collection in user_collections:
#  ndf = user_collection.to_dataframe()[['id', 'rating_score']]
#  user_matrix = track_list.merge(ndf, on='id', how='left').fillna(0)
#  user_dfs.append(user_matrix[['rating_score']])


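### Hardcoded variant for the 5 test users: data/users/1.json to data/users/5.json
user_dfs = []
for i in range(number_of_users):
  tc = TrackCollection()
  tc.load(os.path.join('data', 'users', f'{i + 1}.json'))
  ndf = tc.to_dataframe()[['id', 'rating_score']]
  # Align on the global track list; tracks missing from the user's library get a rating of 0
  user_matrix = track_list.merge(ndf, on='id', how='left').fillna(0)
  user_dfs.append(user_matrix[['rating_score']])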

### Generating the users
users = [User(i, df) for i, df in enumerate(user_dfs)]

### Linking the users in a ring: each user has to know which one is next in order to run the decentralized computations
for user in users:
  user.nextNode = users[(user.id + 1) % number_of_users]
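
The user loop above is a ring: each user points to the next one, and the last user points back to the first. The sketch below shows that wiring in isolation. `RingNode`, `link_ring` and `walk_ring` are hypothetical names used only for illustration; the real `User` class is documented in its own file.

```python
class RingNode:
    """Hypothetical stand-in for User: only an id and a next_node link."""

    def __init__(self, node_id):
        self.id = node_id
        self.next_node = None


def link_ring(nodes):
    """Point each node at its successor; the last node wraps around to the first."""
    for index, node in enumerate(nodes):
        node.next_node = nodes[(index + 1) % len(nodes)]


def walk_ring(start):
    """Follow next_node links from start until the walk comes back to start."""
    visited = [start.id]
    node = start.next_node
    while node is not start:
        visited.append(node.id)
        node = node.next_node
    return visited


nodes = [RingNode(i) for i in range(5)]
link_ring(nodes)
print(walk_ring(nodes[2]))  # [2, 3, 4, 0, 1]
```

Starting from any node, one full walk visits every node exactly once, so a value passed along the ring reaches all users before it returns to its origin.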



In [3]:
df = user_dfs[0].astype("float")

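# Mean of user 0's ratings, ignoring the tracks missing from the library (stored as 0)
float(df.loc[df["rating_score"] > 0, "rating_score"].mean())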

0.43999999999999995

## Running the server

Now that all the users exist and are linked in a loop, we can create the server and run it.
Running it generates the similarity matrix and distributes it to all users.

At the end of this execution, every user holds a similarity matrix that was computed in a decentralized way.

In [4]:
### Generating the server
server = Server(users, track_list)

### Running the server
server.run()

### Printing the similarity matrix
users[0].similarity_matrix


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,1.0,-0.343876,-inf,0.752071,-0.1438235,0.0867391,-0.336336,0.08743854,-0.09818338,-0.197085,-0.3309113,0.085644,0.5783496,0.5745961,0.752071,0.312618,-0.2602099,-0.215667,0.557316,0.727986
1,-0.343876,1.0,-inf,-0.3265986,0.8295151,0.724381,0.821584,0.5424508,-0.1332427,0.098844,0.6531973,-0.158748,0.09561829,-0.03390318,-0.3265986,0.173415,-0.1579773,0.572346,-0.024398,0.181458
2,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf
3,0.752071,-0.326599,-inf,1.0,-1.630581e-11,0.2572827,0.0,0.2491364,-6.360811e-12,0.0,0.2,0.56939,0.644094,0.4982729,1.0,0.804046,3.105366e-12,0.223039,0.478091,0.493865
4,-0.143823,0.829515,-inf,0.0,1.0,0.9225351,0.935414,0.297775,-0.2786391,-0.110647,0.7171372,0.365174,0.0,-0.2729604,4.658803e-12,0.435178,-0.1496356,0.571247,0.392857,0.501739
5,0.086739,0.724381,-inf,0.2572827,0.9225351,1.0,0.833196,0.5120513,0.0,0.147834,0.7186173,0.431843,0.3264069,1.005125e-12,0.2572827,0.611719,0.07875871,0.473488,0.593816,0.630933
6,-0.336336,0.821584,-inf,0.0,0.9354143,0.8331956,1.0,0.3713907,-0.260643,-0.031846,0.8944272,0.403696,-1.98468e-12,-0.2553311,1.452637e-12,0.542763,-0.1017973,0.783718,0.133631,0.220863
7,0.087439,0.542451,-inf,0.2491364,0.297775,0.5120513,0.371391,1.0,0.4646419,0.674167,0.5813184,-0.167228,0.8347541,0.7241379,0.2491364,0.52284,0.2835493,0.436598,0.0,0.082026
8,-0.098183,-0.133243,-inf,-6.360811e-12,-0.2786391,0.0,-0.260643,0.4646419,1.0,0.952905,-2.827027e-12,-0.071226,0.4368151,0.3436414,0.0,0.0,0.9286466,-0.401113,0.153252,-0.218752
9,-0.197085,0.098844,-inf,0.0,-0.1106473,0.1478343,-0.031846,0.6741669,0.9529048,1.0,0.2563593,-0.087027,0.5337196,0.4198758,0.0,0.155566,0.8720676,-0.122524,0.034045,-0.267281
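
The metric behind this matrix is defined in the Server and User classes, which are not shown in this notebook. For comparison only, the sketch below computes a track-to-track matrix centrally, assuming Pearson correlation between track columns. That choice, the function name `track_similarity`, and the example ratings are all assumptions. Unlike the output above, the sketch marks undefined entries as NaN rather than -inf, and it treats a stored 0 as an ordinary rating.

```python
import numpy as np


def track_similarity(ratings):
    """Pearson correlation between the columns (tracks) of a users x tracks array.

    Tracks with zero variance (for example, no ratings at all) get NaN,
    because the correlation is undefined for them.
    """
    ratings = np.asarray(ratings, dtype=float)
    centered = ratings - ratings.mean(axis=0)
    norms = np.sqrt((centered ** 2).sum(axis=0))
    denom = np.outer(norms, norms)
    sim = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(centered.T @ centered, denom, out=sim, where=denom > 0)
    return sim


# Made-up ratings: 3 users (rows) x 3 tracks (columns); track 2 is rated by nobody.
example = [
    [1.0, 3.0, 0.0],
    [2.0, 2.0, 0.0],
    [3.0, 1.0, 0.0],
]
sim = track_similarity(example)
print(np.round(sim, 3))
```

In this example tracks 0 and 1 are rated in opposite order, so their similarity comes out close to -1, while every cell involving track 2 is undefined. In the notebook's test data, track 2 appears in the list of missing tracks for all five users in the prediction cell, which is consistent with its row and column holding no usable value.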


## Prediction

Now that the similarity matrix is created, we can make the predictions.
Each user can compute predictions locally, so no information is given to other users.
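
To illustrate why a local prediction needs nothing from other users, here is a minimal sketch that scores a track from the shared similarity matrix and the user's own ratings only. It assumes a similarity-weighted average. The actual rule used by `willILikeIt` lives in the User class and may differ, and `predict_rating` and the example values are hypothetical.

```python
import numpy as np


def predict_rating(similarity, own_ratings, track):
    """Similarity-weighted average of the user's own ratings.

    Only tracks the user has rated (rating > 0) whose similarity to `track`
    is finite and positive contribute. Returns 0.0 when no track qualifies.
    """
    similarity = np.asarray(similarity, dtype=float)
    own_ratings = np.asarray(own_ratings, dtype=float)
    weights = similarity[track]
    usable = (own_ratings > 0) & np.isfinite(weights) & (weights > 0)
    usable[track] = False  # never use the target track itself
    if not usable.any():
        return 0.0
    return float(np.dot(weights[usable], own_ratings[usable]) / weights[usable].sum())


# Made-up 4-track similarity matrix; track 3 has no usable similarity.
similarity = np.array([
    [1.0, 0.5, 0.25, -np.inf],
    [0.5, 1.0, 0.25, -np.inf],
    [0.25, 0.25, 1.0, -np.inf],
    [-np.inf] * 4,
])
own_ratings = [0.8, 0.0, 0.4, 0.0]  # this user owns tracks 0 and 2 only

print(predict_rating(similarity, own_ratings, 1))
print(predict_rating(similarity, own_ratings, 3))  # 0.0
```

For track 1 the result is (0.5 * 0.8 + 0.25 * 0.4) / 0.75, about 0.667. The only inputs are the matrix every user already holds and the user's private ratings, so nothing leaves the user's machine.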

In [8]:

### For each user, the tracks that are not in their library (the ones to predict)
userToPredict = []
userToPredict.append([2,3,4,5,6,10,11,14,15,18])
userToPredict.append([0,2,3,8,12,13,14,18,19])
userToPredict.append([2,3,4,5,6,7,10,12,13,14,15,17,19])
userToPredict.append([2,3,7,8,9,10,12,13,14,15,16])
userToPredict.append([1,2,4,5,6,7,8,9,10])

for i, uToPredict in enumerate(userToPredict):
    for j in uToPredict:
        if users[i].willILikeIt(j):
            print("User %d will probably like song %d" % (i, j))

print("Finish")

User 0 will not like song 2; note = 0.000000
User 0 will not like song 3; note = 0.281674
User 0 will not like song 4; note = 0.112668
User 0 will not like song 5; note = 0.149631
User 0 will not like song 6; note = 0.123621
User 0 will not like song 10; note = 0.177915
User 0 will not like song 11; note = 0.041045
User 0 will not like song 14; note = 0.281674
User 0 will not like song 15; note = 0.187636
User 0 will not like song 18; note = 0.130809
User 1 will not like song 0; note = 0.086931
User 1 will not like song 2; note = 0.000000
User 1 will not like song 3; note = 0.328197
User 1 will not like song 8; note = 0.143804
User 1 will not like song 12; note = 0.218155
User 1 will not like song 13; note = 0.135162
User 1 will not like song 14; note = 0.328197
User 1 will not like song 18; note = 0.259111
User 1 will not like song 19; note = 0.250011
User 2 will not like song 2; note = 0.000000
User 2 will not like song 3; note = 0.154211
User 2 will not like song 4; note = 0.089173
