#  Normalization test of split and metrics functions 

In this notebook we check the consistency of the split and metrics functions. We first check this on an artificial dataset and then on the movielens 100k data used in the recommender tests.

In [1]:
# set the environment path to find Recommenders
import sys
sys.path.append("../../")
import os

import numpy as np 
import pandas as pd
import itertools

from reco_utils.dataset import movielens
from reco_utils.dataset.python_splitters import python_random_split, python_stratified_split
from reco_utils.dataset.numpy_splitters import numpy_stratified_split
from reco_utils.dataset.sparse import AffinityMatrix

from reco_utils.evaluation.python_evaluation import map_at_k, ndcg_at_k, precision_at_k, recall_at_k

print("System version: {}".format(sys.version))
print("Pandas version: {}".format(pd.__version__))

System version: 3.6.6 |Anaconda custom (64-bit)| (default, Jun 28 2018, 11:07:29) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
Pandas version: 0.23.4


## 1 Artificial dataset 

For debugging purposes it is useful to generate random sparse matrices. The function `affinity_matrix()` generates a random rating matrix with a specified degree of sparsity. Realistic user/item affinity matrices are highly sparse; for example, the sparsity of the movielens datasets is $\ge 90$%, depending on the chosen data size, e.g. movielens 100k $\simeq 93$%, movielens 1m $\simeq 95$%, etc.

In [2]:
def affinity_matrix(users, items, ratings, spars):

    '''
    Generate a random user/item affinity matrix. By increasing the likelihood of 0 elements we simulate
    a typical recommending situation where the input matrix is highly sparse.

    Args:
        users (int): number of users (rows).
        items (int): number of items (columns).
        ratings (int): rating scale, e.g. 5 meaning ratings are from 1 to 5.
        spars (float): probability of obtaining zero. This roughly corresponds to the sparsity
               of the generated matrix. If spars = 0 then the affinity matrix is dense.

    Returns:
        X (np array, int): sparse user/item affinity matrix

    '''

    np.random.seed(123)

    # probability spars for 0 (unrated), uniform probability for each of the ratings 1..ratings
    P = [spars] + [(1 - spars) / ratings] * ratings

    # generate the user/item affinity matrix. Ratings are from 1 to `ratings`, with 0s denoting unrated items
    X = np.random.choice(ratings + 1, (users, items), p=P)

    return X

In [3]:
#generate the random sparse matrix. In this example we choose ~80% sparsity
X = affinity_matrix(users=20,items=50, ratings= 5, spars= 0.8)
X

array([[0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 4, 0, 0, 0, 0, 0,
        2, 0, 0, 5, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 3,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 2, 0, 0, 0, 0, 0, 5, 0, 0,
        0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 5, 3, 0,
        0, 0, 3, 5, 0, 0, 1, 0, 0, 0, 0, 2, 5, 0, 0, 0, 0, 0, 0, 0, 4, 0,
        0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 1, 0, 4, 0, 0, 0, 0,
        2, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 4, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0],
       [0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 5,
        0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 5, 0, 0, 0, 0, 5, 0,
        0, 0, 5, 0, 0, 0],
       [0, 0, 0, 0, 0, 5, 0, 2, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 5

In [4]:
#check the sparsity of the dataset

zero = (X == 0).sum()  # number of unrated items
total = X.shape[0] * X.shape[1]  # number of elements in the matrix
sparsness = zero / total * 100  # Percentage of zeros in the matrix

sparsness 

80.2

In [5]:
#number of ratings per user
rated = np.sum(X != 0, axis=1)
rated

array([ 6,  7, 12,  9, 10, 12, 12,  7, 12, 11, 10,  8, 14, 13, 11, 12, 11,
        8,  9,  4])

In [6]:
#Total number of rated items 
total_rated = rated.sum()
total_rated

198

In order to simulate the recommendation task, we split this dataset into train and test sets and evaluate precision@k on the test set alone. This introduces additional sparsity in the data, and we check how this affects the normalization of the @k results. Below, we compare two different split strategies: the first is the **global** split of `python_random_split` and the second is the **local** split of `numpy_stratified_split`. In order to apply the former splitter we first need to map X to a dataframe representation.

In [7]:
def sparse_to_df(X):

        """
        Map the user/affinity matrix to a pd dataframe

        """
        m, n = X.shape #obtain the matrix dimensions: m = #users, n=#items 

        userids = []

        for i in range(1, m+1):
            userids.extend([i]*n)


        itemids = [i for i in range(1, n+1)]*m
        ratings = np.reshape(X, -1)

        #create dataframe
        results = pd.DataFrame.from_dict(
                        {
                            'user': userids,
                            'item': itemids,
                            'rating': ratings,
                         }
                    )

        #here we drop the missing ratings so that the df has the same form as a real ratings dataframe.
        results = results[results.rating !=0]

        return results

In [8]:
#create a pandas df 
X_df = sparse_to_df(X)
X_df.head(10)

Unnamed: 0,user,item,rating
6,1,7,5
21,1,22,2
37,1,38,3
38,1,39,4
44,1,45,2
47,1,48,5
51,2,2,1
58,2,9,2
64,2,15,2
71,2,22,3
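
The same mapping can also be written without explicit Python loops. The following is a minimal sketch and not part of `reco_utils`; the name `sparse_to_df_vectorized` is hypothetical, and it assumes, as above, that 0 marks an unrated item and that user and item ids start at 1.

```python
import numpy as np
import pandas as pd


def sparse_to_df_vectorized(X):
    """Map a dense user/item affinity matrix to a long-format DataFrame.

    Hypothetical helper: zeros are treated as missing ratings and dropped.
    """
    n_items = X.shape[1]
    rows, cols = np.nonzero(X)  # positions of the rated entries, in row-major order
    return pd.DataFrame(
        {"user": rows + 1, "item": cols + 1, "rating": X[rows, cols]},
        index=rows * n_items + cols,  # flat position, as in the loop-based version
    )


# tiny example: 2 users, 3 items
X_small = np.array([[0, 5, 0],
                    [2, 0, 3]])
df_small = sparse_to_df_vectorized(X_small)
print(df_small)
```

The index is set to the flat position of each entry so that the result lines up with the output of the loop-based `sparse_to_df`.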


### 1.1 Splitting 

Splitting data in an unsupervised setting generally works differently than in the supervised one. 

Let us first consider a typical supervised learning problem, for example a binary classification problem. We are given a matrix $X^{\mu}_i$, where $\mu \in [1, m]$ is the example index and $i \in [1,n]$ is the feature index. We are also given a ground truth vector $y^{\mu}$. The matrix $X$ is generally dense and we want to keep a fraction `t` of the examples for the training set and `(1-t)` for the test set. In this case the training matrix $X_{tr}$ contains the **same** number of features (columns) as $X$ but only a subset of the examples (rows), and we split $y^{\mu}$ accordingly.

In the unsupervised case, no ground truth vector is provided. In the recommendation case, the user/item affinity matrix contains the ratings as training examples; the only way to verify whether a recommendation is correct is to hold out part of the ratings for the test set, so that the ratings themselves play the role of the examples. For each user we can then verify whether the prediction is correct by comparing it to the test set. Because the number of ratings varies from user to user, we need to make sure that each user contributes the same fraction of train/test examples.
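
To make the idea of a per-user ("local") split concrete, here is a minimal NumPy sketch. It is not the implementation of `numpy_stratified_split`; the function name `per_user_split`, its arguments and the rounding rule are assumptions made for illustration.

```python
import numpy as np


def per_user_split(X, ratio=0.75, seed=123):
    """Split each user's ratings between train and test (illustrative sketch).

    X is a dense user/item matrix in which 0 means "not rated".
    """
    rng = np.random.RandomState(seed)
    X_train = X.copy()
    X_test = np.zeros_like(X)

    for u in range(X.shape[0]):
        rated_idx = np.nonzero(X[u])[0]  # items rated by user u
        # assumption: the test size is the user's count minus the rounded train size
        n_test = len(rated_idx) - int(round(ratio * len(rated_idx)))
        test_idx = rng.choice(rated_idx, size=n_test, replace=False)
        X_test[u, test_idx] = X[u, test_idx]
        X_train[u, test_idx] = 0  # move the selected ratings out of the train set

    return X_train, X_test


# toy matrix: 3 users with 4, 8 and 4 ratings
X_toy = np.array([
    [1, 2, 3, 4, 0, 0, 0, 0],
    [5, 4, 3, 2, 1, 2, 3, 4],
    [0, 0, 0, 0, 2, 3, 1, 5],
])
X_toy_tr, X_toy_tst = per_user_split(X_toy)
print((X_toy_tr != 0).sum(axis=1), (X_toy_tst != 0).sum(axis=1))
```

Every rating ends up in exactly one of the two matrices, and each user keeps 75% of their own ratings for training, independently of how many ratings the other users have.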


Since the matrix is sparse and inhomogeneous,


### 1.1.1 Python Random splitter

In [9]:
#split data using the random split 
Xtr_random, Xtst_random = python_random_split(X_df, ratio = 0.75, seed= 123)

In [10]:
Xtst_random.head() 

Unnamed: 0,user,item,rating
767,16,18,3
165,4,16,1
331,7,32,3
183,4,34,1
128,3,29,1


In [11]:
print( 'global % of rated items in the train set', (len(Xtr_random.rating)/total_rated)*100 )
print( 'global % of rated items in the test set', (len(Xtst_random.rating)/total_rated)*100 )

global % of rated items in the train set 74.74747474747475
global % of rated items in the test set 25.252525252525253


Let us now check the per-user percentage of train/test set examples.

In [12]:
#per user % of training example 
(Xtr_random.groupby(by= 'user').rating.count()/total_rated)*100 

user
1     2.525253
2     3.535354
3     4.040404
4     3.030303
5     3.535354
6     5.050505
7     5.555556
8     2.525253
9     4.545455
10    4.040404
11    4.040404
12    2.020202
13    6.060606
14    5.555556
15    5.050505
16    3.030303
17    3.030303
18    2.020202
19    3.535354
20    2.020202
Name: rating, dtype: float64

In [13]:
#per user % of test example 
(Xtst_random.groupby(by= 'user').rating.count()/total_rated)*100 

user
1     0.505051
3     2.020202
4     1.515152
5     1.515152
6     1.010101
7     0.505051
8     1.010101
9     1.515152
10    1.515152
11    1.010101
12    2.020202
13    1.010101
14    1.010101
15    0.505051
16    3.030303
17    2.525253
18    2.020202
19    1.010101
Name: rating, dtype: float64

We can identify the following problems: 

1. each user, both in the train and in the test set, contributes a different number of examples.
2. the per-user train/test ratio is unbalanced, e.g. user 1 contributes 5 times more examples to the train set than to the test set, while for user 3 the factor is only 2.
3. the test set misses two users (2 and 20), since a global random split does not guarantee that every user appears in both sets.
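
These three problems can be checked programmatically. The sketch below uses a hypothetical helper, `per_user_split_report`, which is not part of `reco_utils`; it assumes the train and test sets are pandas DataFrames sharing a user column. The toy frames are made up for illustration.

```python
import pandas as pd


def per_user_split_report(train, test, col_user="user"):
    """Per-user train/test counts and train fraction (hypothetical helper)."""
    report = pd.DataFrame({
        "n_train": train.groupby(col_user).size(),
        "n_test": test.groupby(col_user).size(),
    }).fillna(0).astype(int)  # users absent from one set get a count of 0
    report["train_fraction"] = report["n_train"] / (report["n_train"] + report["n_test"])
    return report


# made-up example: user 2 ends up with no test ratings at all
toy_train = pd.DataFrame({"user": [1, 1, 1, 2, 2, 3], "item": [1, 2, 3, 1, 2, 1]})
toy_test = pd.DataFrame({"user": [1, 3, 3], "item": [4, 2, 3]})

toy_report = per_user_split_report(toy_train, toy_test)
print(toy_report)
print("users missing from the test set:",
      toy_report[toy_report["n_test"] == 0].index.tolist())
```

Normalizing by each user's own number of ratings, rather than by the global total, makes the imbalance between users directly comparable.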

### 1.1.2 Python Stratified splitter 

In [14]:
#split data using the stratified split
Xtr_strat, Xtst_strat = python_stratified_split(X_df, ratio = 0.75, seed= 123, col_user ='user', col_item= 'item')

In [15]:
Xtst_strat.head() 

Unnamed: 0,user,item,rating
37,1,38,3
47,1,48,5
85,2,36,2
91,2,42,5
125,3,26,5


In [16]:
print( 'global % of rated items in the train set', (len(Xtr_strat.rating)/total_rated)*100 )
print( 'global % of rated items in the test set', (len(Xtst_strat.rating)/total_rated)*100 )

global % of rated items in the train set 74.74747474747475
global % of rated items in the test set 25.252525252525253


In [17]:
#per user % of training example 
(Xtr_strat.groupby(by= 'user').rating.count()/total_rated)*100

user
1     2.020202
2     2.525253
3     4.545455
4     3.535354
5     4.040404
6     4.545455
7     4.545455
8     2.525253
9     4.545455
10    4.040404
11    4.040404
12    3.030303
13    5.050505
14    5.050505
15    4.040404
16    4.545455
17    4.040404
18    3.030303
19    3.535354
20    1.515152
Name: rating, dtype: float64

Note that the above percentages sum to about 74.75%, as they should. So both the `random` and the `stratified` split divide the dataset as a whole in the requested proportion, while the individual users receive unequal shares.

In [18]:
#per user % of test example 
(Xtst_strat.groupby(by= 'user').rating.count()/total_rated)*100 

user
1     1.010101
2     1.010101
3     1.515152
4     1.010101
5     1.010101
6     1.515152
7     1.515152
8     1.010101
9     1.515152
10    1.515152
11    1.010101
12    1.010101
13    2.020202
14    1.515152
15    1.515152
16    1.515152
17    1.515152
18    1.010101
19    1.010101
20    0.505051
Name: rating, dtype: float64

This time the same users appear in both the train and the test set, but again the per-user percentage of examples is not constant, nor is the per-user split ratio between train and test set. This clearly affects the statistical analysis of the model performance (the evaluation metrics).

### 1.1.3 Numpy stratified split 

In this case the data are split by keeping a per-user, constant percentage. 

In [19]:
Xtr_np, Xtst_np = numpy_stratified_split(X, ratio = 0.75, seed= 123)

In [20]:
#number of rated elements in the train/test set 
Xtr_rated = np.sum(Xtr_np != 0, axis=1)  # number of rated items in the train set
Xtst_rated = np.sum(Xtst_np != 0, axis=1)  # number of rated items in the test set

In [21]:
print( 'global % of rated items in the train set', (Xtr_rated.sum() / total_rated)*100 )
print( 'global % of rated items in the test set', (Xtst_rated.sum() / total_rated)*100 )

global % of rated items in the train set 74.74747474747475
global % of rated items in the test set 25.252525252525253


In [22]:
#per user percentage of training examples 
Xtr_rated/rated

array([0.66666667, 0.71428571, 0.75      , 0.77777778, 0.8       ,
       0.75      , 0.75      , 0.71428571, 0.75      , 0.72727273,
       0.8       , 0.75      , 0.71428571, 0.76923077, 0.72727273,
       0.75      , 0.72727273, 0.75      , 0.77777778, 0.75      ])

In [23]:
#per user percentage of test examples 
Xtst_rated/rated

array([0.33333333, 0.28571429, 0.25      , 0.22222222, 0.2       ,
       0.25      , 0.25      , 0.28571429, 0.25      , 0.27272727,
       0.2       , 0.25      , 0.28571429, 0.23076923, 0.27272727,
       0.25      , 0.27272727, 0.25      , 0.22222222, 0.25      ])

Also in this case the per-user training percentages are not exactly 75%, but they are much closer to it than in the previous cases. These fluctuations are due to rounding errors and can in principle be reduced.
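
The size of these fluctuations follows from simple arithmetic: a user with $n$ ratings can only contribute an integer number of training examples, so the achievable fractions are multiples of $1/n$. The sketch below assumes that the train count is obtained by rounding $0.75\,n$ to the nearest integer with `np.around`; this is an assumption for illustration and is not taken from the `numpy_stratified_split` source.

```python
import numpy as np

ratio = 0.75
n_ratings = np.array([4, 6, 10, 12, 14])  # example per-user rating counts

# assumption: the train count is ratio * n rounded to the nearest integer
# (np.around rounds exact halves to the nearest even integer)
n_train = np.around(ratio * n_ratings).astype(int)
achieved = n_train / n_ratings

for n, t, f in zip(n_ratings, n_train, achieved):
    print("n = {:2d}: train = {:2d}, fraction = {:.3f}".format(n, t, f))
```

Under this assumption, users with 6, 10 and 14 ratings get fractions of about 0.667, 0.8 and 0.714, which agrees with the values printed above for users with those counts. The deviation from 0.75 shrinks as the number of ratings per user grows.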

### 1.2 Evaluation

We now consider the evaluation metrics and their normalization. As an example, we consider precision@k and rmse. As a first step, we are interested in determining the maximum achievable value of precision@k; conventionally, this is set to one to denote a perfect score, in agreement with the normalization of the corresponding distribution. The definition is
$$ p_k = \frac{1}{m} \sum_{\mu=1}^m \frac{1}{k} \sum_{i=1}^{\min(k, |D_\mu|)} I(X_p = X_{tst})^{\mu}_i, $$
where $m$ is the number of users, $|D_\mu|$ is the number of test set elements of user $\mu$, and $I(\cdot)$ is the indicator function, equal to 1 when its argument holds and 0 otherwise; for discrete ratings it plays the role of a Kronecker delta $\delta_{X_p, X_{tst}}$. This definition has the (known) problem of not being normalized to 1 if the number of test set elements is less than $k$, as can be seen explicitly from the upper limit of the inner sum: a user with $|D_\mu| < k$ contributes at most $|D_\mu|/k < 1$. Below we show that this is often the case when working with sparse matrices.
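
The definition can be written out directly. The following is a minimal sketch of precision@k, not the `precision_at_k` function of `reco_utils`; the function name and the dictionary-based inputs are assumptions chosen for brevity. A hit is counted when a recommended item appears among the user's test items.

```python
def precision_at_k_sketch(recommended, relevant, k=10):
    """Mean over users of (number of hits in the top k) / k.

    recommended: dict user -> ranked list of recommended items
    relevant:    dict user -> set of items in that user's test set
    """
    per_user = []
    for user, ranked in recommended.items():
        hits = len(set(ranked[:k]) & relevant.get(user, set()))
        per_user.append(hits / k)
    return sum(per_user) / len(per_user)


# a "perfect" recommender that ranks every test item first
relevant = {"a": {1, 2, 3}, "b": {4, 5}}
recommended = {"a": [1, 2, 3, 7, 8], "b": [4, 5, 6, 9, 10]}

print(precision_at_k_sketch(recommended, relevant, k=5))
```

Even though every test item is recommended, the score is 0.5 rather than 1, because both users have fewer than $k = 5$ test items.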

The first thing to do is to simulate a recommendation of 10 elements, where 

In [24]:
Xtst_random.groupby(by='user')['item'].count()

user
1     1
3     4
4     3
5     3
6     2
7     1
8     2
9     3
10    3
11    2
12    4
13    2
14    2
15    1
16    6
17    5
18    4
19    2
Name: item, dtype: int64

In [25]:
Xtst_random[Xtst_random.user==3]

Unnamed: 0,user,item,rating
128,3,29,1
125,3,26,5
149,3,50,1
142,3,43,4


## 2 Movielens 100k dataset

Below we apply the above analysis to the movielens 100k dataset. The size of the dataset amplifies many of the features found above.

In [26]:
# Select Movielens data size: 100k, 1m, 10m, or 20m
MOVIELENS_DATA_SIZE = '100k'

data = movielens.load_pandas_df(
    size=MOVIELENS_DATA_SIZE,
    header=['userID','movieID','rating','timestamp']
)

# Convert to 32-bit in order to reduce memory consumption 
data.loc[:, 'rating'] = data['rating'].astype(np.int32) 

data.head()

Unnamed: 0,userID,movieID,rating,timestamp
0,196,242,3,881250949
1,186,302,3,891717742
2,22,377,1,878887116
3,244,51,2,880606923
4,166,346,1,886397596


In [31]:
total_rated_ml = data.rating.count()
total_rated_ml

100000

In [41]:
Nusers= len(data.userID.unique())
Nusers

943

In [43]:
Nitems= len(data.movieID.unique())
Nitems

1682

### 2.1 Split data

### 2.1.1 Random Splitter

In [28]:
#split data using the random split 
Ztr_random, Ztst_random = python_random_split(data, ratio = 0.75, seed= 123)

In [29]:
Ztst_random.head()

Unnamed: 0,userID,movieID,rating,timestamp
42083,600,651,4,888451492
71825,607,494,5,883879556
99535,875,1103,5,876465144
47879,648,238,3,882213535
36734,113,273,4,875935609


In [32]:
print( 'global % of rated items in the train set', (len(Ztr_random.rating)/total_rated_ml)*100 )
print( 'global % of rated items in the test set', (len(Ztst_random.rating)/total_rated_ml)*100 )

global % of rated items in the train set 75.0
global % of rated items in the test set 25.0


In [35]:
#per user % of training example 
(Ztr_random.groupby(by= 'userID').rating.count()/total_rated_ml)*100 

userID
1      0.203
2      0.045
3      0.037
4      0.018
5      0.133
6      0.161
7      0.303
8      0.046
9      0.014
10     0.148
11     0.130
12     0.039
13     0.454
14     0.077
15     0.080
16     0.111
17     0.018
18     0.197
19     0.015
20     0.036
21     0.136
22     0.096
23     0.105
24     0.055
25     0.066
26     0.085
27     0.017
28     0.058
29     0.025
30     0.027
       ...  
914    0.018
915    0.017
916    0.237
917    0.022
918    0.073
919    0.163
920    0.017
921    0.078
922    0.100
923    0.058
924    0.060
925    0.023
926    0.015
927    0.086
928    0.022
929    0.035
930    0.048
931    0.047
932    0.173
933    0.129
934    0.136
935    0.033
936    0.116
937    0.031
938    0.080
939    0.045
940    0.084
941    0.019
942    0.059
943    0.126
Name: rating, Length: 943, dtype: float64

As can be seen, most users get a share of the training examples $\ll 1$%. We can check this explicitly by printing the % of users getting at least 0.1% of the training examples.

In [46]:
( ((Ztr_random.groupby(by= 'userID').rating.count()/total_rated_ml)*100  >= 0.1).sum()/Nusers )*100  

28.738069989395548

In [36]:
#per user % of test example 
(Ztst_random.groupby(by= 'userID').rating.count()/total_rated_ml)*100 

userID
1      0.069
2      0.017
3      0.017
4      0.006
5      0.042
6      0.050
7      0.100
8      0.013
9      0.008
10     0.036
11     0.051
12     0.012
13     0.182
14     0.021
15     0.024
16     0.029
17     0.010
18     0.080
19     0.005
20     0.012
21     0.043
22     0.032
23     0.046
24     0.013
25     0.012
26     0.022
27     0.008
28     0.021
29     0.009
30     0.016
       ...  
914    0.005
915    0.009
916    0.080
917    0.013
918    0.030
919    0.054
920    0.009
921    0.032
922    0.027
923    0.016
924    0.022
925    0.009
926    0.005
927    0.034
928    0.010
929    0.014
930    0.015
931    0.014
932    0.068
933    0.055
934    0.038
935    0.006
936    0.026
937    0.009
938    0.028
939    0.004
940    0.023
941    0.003
942    0.020
943    0.042
Name: rating, Length: 943, dtype: float64

Also in this case, we can evaluate the % of users getting at least 0.1% of the test examples.

In [47]:
( ((Ztst_random.groupby(by= 'userID').rating.count()/total_rated_ml)*100  >= 0.1).sum()/Nusers )*100  

2.014846235418876

Since we are interested in evaluating the effect of the splitter on the @k metrics with $k=10$, we can find the fraction of users having at least 10 test examples. As explained in the previous section, this also limits the maximum achievable precision of the recommender, since only these users can reach a per-user precision of 1.

In [55]:
( ( (Ztst_random.groupby(by= 'userID').rating.count() ) >= 10 ).sum()/Nusers )

0.7073170731707317

In [89]:
print('average number of per user rated elements', np.mean((Ztst_random.groupby(by= 'userID').rating.count() )) )
print('standard deviation', np.std((Ztst_random.groupby(by= 'userID').rating.count() )) ) 

average number of per user rated elements 26.511134676564158
standard deviation 25.9943225934054


### 2.1.2 Python Stratified Splitter

In [56]:
#split data using the stratified split
Ztr_strat, Ztst_strat = python_stratified_split(data, ratio = 0.75, seed= 123, col_user ='userID', col_item= 'movieID')

In [57]:
Ztst_strat.head()

Unnamed: 0,userID,movieID,rating,timestamp
15764,1,196,5,874965677
14792,1,103,1,878542845
8737,1,209,4,888732908
62069,1,191,5,875072956
25721,1,141,3,878542608


In [58]:
print( 'global % of rated items in the train set', (len(Ztr_strat.rating)/total_rated_ml)*100 )
print( 'global % of rated items in the test set', (len(Ztst_strat.rating)/total_rated_ml)*100 )

global % of rated items in the train set 74.992
global % of rated items in the test set 25.008000000000003


In [59]:
#per user % of training example 
(Ztr_strat.groupby(by= 'userID').rating.count()/total_rated_ml)*100 

userID
1      0.204
2      0.046
3      0.040
4      0.018
5      0.131
6      0.158
7      0.302
8      0.044
9      0.016
10     0.138
11     0.136
12     0.038
13     0.477
14     0.074
15     0.078
16     0.105
17     0.021
18     0.208
19     0.015
20     0.036
21     0.134
22     0.096
23     0.113
24     0.051
25     0.058
26     0.080
27     0.019
28     0.059
29     0.026
30     0.032
       ...  
914    0.017
915    0.020
916    0.238
917    0.026
918    0.077
919    0.163
920    0.020
921    0.082
922    0.095
923    0.056
924    0.062
925    0.024
926    0.015
927    0.090
928    0.024
929    0.037
930    0.047
931    0.046
932    0.181
933    0.138
934    0.130
935    0.029
936    0.106
937    0.030
938    0.081
939    0.037
940    0.080
941    0.016
942    0.059
943    0.126
Name: rating, Length: 943, dtype: float64

Print the % of users getting at least 0.1% of the training examples.

In [60]:
( ((Ztr_strat.groupby(by= 'userID').rating.count()/total_rated_ml)*100  >= 0.1).sum()/Nusers )*100  

29.056203605514312

In [61]:
#per user % of test example 
(Ztst_strat.groupby(by= 'userID').rating.count()/total_rated_ml)*100 

userID
1      0.068
2      0.016
3      0.014
4      0.006
5      0.044
6      0.053
7      0.101
8      0.015
9      0.006
10     0.046
11     0.045
12     0.013
13     0.159
14     0.024
15     0.026
16     0.035
17     0.007
18     0.069
19     0.005
20     0.012
21     0.045
22     0.032
23     0.038
24     0.017
25     0.020
26     0.027
27     0.006
28     0.020
29     0.008
30     0.011
       ...  
914    0.006
915    0.006
916    0.079
917    0.009
918    0.026
919    0.054
920    0.006
921    0.028
922    0.032
923    0.018
924    0.020
925    0.008
926    0.005
927    0.030
928    0.008
929    0.012
930    0.016
931    0.015
932    0.060
933    0.046
934    0.044
935    0.010
936    0.036
937    0.010
938    0.027
939    0.012
940    0.027
941    0.006
942    0.020
943    0.042
Name: rating, Length: 943, dtype: float64

In [62]:
( ((Ztst_strat.groupby(by= 'userID').rating.count()/total_rated_ml)*100  >= 0.1).sum()/Nusers )*100  

1.8027571580063628

In [63]:
( ( (Ztst_strat.groupby(by= 'userID').rating.count() ) >= 10 ).sum()/Nusers )

0.7020148462354189

In [87]:
print('average number of per user rated elements', np.mean((Ztst_strat.groupby(by= 'userID').rating.count() )) )
print('standard deviation', np.std((Ztst_strat.groupby(by= 'userID').rating.count() )) ) 

average number of per user rated elements 26.51961823966066
standard deviation 25.218943440036778


### 2.1.3 Numpy stratified split 

In [64]:
#to use standard names across the analysis 
header = {
        "col_user": "userID",
        "col_item": "movieID",
        "col_rating": "rating",
    }

#instantiate the AffinityMatrix class
am = AffinityMatrix(DF = data, **header)

#obtain the sparse matrix 
Z = am.gen_affinity_matrix()
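
Conceptually, this step maps the long-format ratings table to a dense users × items matrix. The sketch below shows one way to do that with pandas and NumPy; it is not the `AffinityMatrix` implementation, and the helper name `to_affinity_matrix` is hypothetical. Row and column order follow first appearance in the DataFrame, which is an assumption of this sketch.

```python
import numpy as np
import pandas as pd


def to_affinity_matrix(df, col_user="userID", col_item="movieID", col_rating="rating"):
    """Build a dense user/item matrix with 0 for unrated items (illustrative sketch)."""
    user_idx, users = pd.factorize(df[col_user])  # contiguous row indices
    item_idx, items = pd.factorize(df[col_item])  # contiguous column indices
    M = np.zeros((len(users), len(items)), dtype=np.int32)
    M[user_idx, item_idx] = df[col_rating].to_numpy()
    return M, users, items


toy = pd.DataFrame({
    "userID": [196, 186, 196, 22],
    "movieID": [242, 302, 302, 242],
    "rating": [3, 3, 5, 1],
})
M_toy, toy_users, toy_items = to_affinity_matrix(toy)
print(M_toy)
```

The returned `users` and `items` arrays record which original id corresponds to each row and column, so that predictions on the matrix can be mapped back to the DataFrame.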

In [66]:
Ztr_np, Ztst_np = numpy_stratified_split(Z, ratio=0.75, seed=123)

In [70]:
#number of rated elements in the train/test set 
Ztr_np_rated = np.sum(Ztr_np != 0, axis=1)  # number of rated items in the train set
Ztst_np_rated = np.sum(Ztst_np != 0, axis=1)  # number of rated items in the test set

#number of ratings per user
Zrated = np.sum(Z != 0, axis=1)

In [69]:
print( 'global % of rated items in the train set', (Ztr_np_rated.sum() / total_rated_ml)*100 )
print( 'global % of rated items in the test set', (Ztst_np_rated.sum() / total_rated_ml)*100 )

global % of rated items in the train set 74.992
global % of rated items in the test set 25.008000000000003


In [72]:
#per user percentage of training examples 
Ztr_np_rated/Zrated

array([0.75      , 0.74193548, 0.74074074, 0.75      , 0.74857143,
       0.74881517, 0.74937965, 0.74576271, 0.72727273, 0.75      ,
       0.75138122, 0.74509804, 0.75      , 0.75510204, 0.75      ,
       0.75      , 0.75      , 0.75090253, 0.75      , 0.75      ,
       0.74860335, 0.75      , 0.74834437, 0.75      , 0.74358974,
       0.74766355, 0.76      , 0.74683544, 0.76470588, 0.74418605,
       0.75      , 0.75609756, 0.75      , 0.75      , 0.76      ,
       0.75      , 0.75438596, 0.75206612, 0.72727273, 0.74285714,
       0.75      , 0.74863388, 0.75113122, 0.74834437, 0.75      ,
       0.74074074, 0.76      , 0.75757576, 0.74883721, 0.75      ,
       0.73913043, 0.75      , 0.75      , 0.75384615, 0.76190476,
       0.7486631 , 0.75471698, 0.75324675, 0.7486911 , 0.75      ,
       0.76190476, 0.75      , 0.75268817, 0.75      , 0.75      ,
       0.73684211, 0.73333333, 0.76470588, 0.75384615, 0.7480916 ,
       0.73684211, 0.75182482, 0.75757576, 0.74358974, 0.74683

In [73]:
#per user percentage of test examples 
Ztst_np_rated/Zrated

array([0.25      , 0.25806452, 0.25925926, 0.25      , 0.25142857,
       0.25118483, 0.25062035, 0.25423729, 0.27272727, 0.25      ,
       0.24861878, 0.25490196, 0.25      , 0.24489796, 0.25      ,
       0.25      , 0.25      , 0.24909747, 0.25      , 0.25      ,
       0.25139665, 0.25      , 0.25165563, 0.25      , 0.25641026,
       0.25233645, 0.24      , 0.25316456, 0.23529412, 0.25581395,
       0.25      , 0.24390244, 0.25      , 0.25      , 0.24      ,
       0.25      , 0.24561404, 0.24793388, 0.27272727, 0.25714286,
       0.25      , 0.25136612, 0.24886878, 0.25165563, 0.25      ,
       0.25925926, 0.24      , 0.24242424, 0.25116279, 0.25      ,
       0.26086957, 0.25      , 0.25      , 0.24615385, 0.23809524,
       0.2513369 , 0.24528302, 0.24675325, 0.2513089 , 0.25      ,
       0.23809524, 0.25      , 0.24731183, 0.25      , 0.25      ,
       0.26315789, 0.26666667, 0.23529412, 0.24615385, 0.2519084 ,
       0.26315789, 0.24817518, 0.24242424, 0.25641026, 0.25316

In [76]:
(Ztst_np_rated >=10).sum()/Nusers 

0.7020148462354189

In [88]:
print('average number of per user rated elements', np.mean(Ztst_np_rated) )
print('standard deviation', np.std(Ztst_np_rated) ) 

average number of per user rated elements 26.51961823966066
standard deviation 25.21894344003685
