K-NN Classification validation notebook.

Prakash Dhimal
George Mason University
CS 584 Theory and Applications of Data Mining
Homework 1

Imports

In [2]:
import math
import multiprocessing
import random
import string
import time
from collections import Counter

import nltk
import numpy as np
import pandas as pd
from nltk.corpus import stopwords

# from scipy.sparse import csr_matrix

In [3]:
nltk.download('punkt')
nltk.download('stopwords')
stemmer = nltk.PorterStemmer()
stop_words = stopwords.words('english')

[nltk_data] Downloading package punkt to /home/dhimal/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /home/dhimal/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Read the training file

In [4]:
train_file = "../../data/1580449515_4035058_train_file.dat"
# training file
train = pd.read_table(train_file, header=None, skip_blank_lines=False)
train.columns = ["label", "data"]
train = train.dropna()  # TODO - should we do this?
train = train.reset_index(drop=True)

In [5]:
train

Unnamed: 0,label,data
0,1,This book is such a life saver. It has been s...
1,1,I bought this a few times for my older son and...
2,1,"This is great for basics, but I wish the space..."
3,1,This book is perfect! I'm a first time new mo...
4,1,During your postpartum stay at the hospital th...
...,...,...
18492,-1,"I really liked this monitor at first, but the ..."
18493,-1,Apparently you get what you pay for. I've use...
18494,-1,The old saying holds true with this product --...
18495,-1,We did a great deal of research before purchas...


Shuffle

In [7]:
# one shuffle already produces a uniformly random order
train = train.sample(frac=1).reset_index(drop=True)
train

Unnamed: 0,label,data
0,-1,We absolutely NEVER used this once. I would se...
1,1,We have really enjoyed having this since my so...
2,1,I absolutely love ALL Avent products... their ...
3,-1,"I have mixed feelings about this bed, love it ..."
4,1,I am giving this five stars although it broke ...
...,...,...
18492,-1,We have had this stupid car seat for over a ye...
18493,1,The product itself is great! Works like it's s...
18494,1,"After reading the reviews, I decided to get th..."
18495,-1,My son's bottles leak when they are tipped ove...


Get the data

In [8]:
data = train.data

In [9]:
len(data)

18497

In [12]:
data_length = int(0.75 * len(data))
test_length = len(data) - data_length
print(data_length)
print(test_length)
print(data_length + test_length)

13872
4625
18497


In [13]:
# split the dataset
data_train = data[:data_length]
data_test = data[data_length:]
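
The slicing above depends on the earlier shuffle for its randomness, so the split changes on every run. As a minimal sketch (the helper `holdout_split` and its `seed` and `train_fraction` parameters are illustrative assumptions, not part of this notebook), the same 75/25 hold-out split can be made reproducible by shuffling indices with a fixed seed:

```python
import random


def holdout_split(items, labels, train_fraction=0.75, seed=0):
    """Shuffle indices with a fixed seed, then cut at train_fraction.

    Hypothetical helper: `seed` and `train_fraction` are assumptions
    added for illustration.
    """
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    cut = int(train_fraction * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    return ([items[i] for i in train_idx], [labels[i] for i in train_idx],
            [items[i] for i in test_idx], [labels[i] for i in test_idx])


reviews = [f"review {i}" for i in range(8)]
ratings = [1, -1, 1, 1, -1, -1, 1, -1]
x_train, y_train, x_test, y_test = holdout_split(reviews, ratings)
print(len(x_train), len(x_test))  # 6 2
```

Shuffling indices rather than the data keeps each review paired with its label.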

In [14]:
data_train

0        We absolutely NEVER used this once. I would se...
1        We have really enjoyed having this since my so...
2        I absolutely love ALL Avent products... their ...
3        I have mixed feelings about this bed, love it ...
4        I am giving this five stars although it broke ...
                               ...                        
13867    Poorly made and it does not hold heavy objects...
13868    I've been using these for years and have yet t...
13869    These sleeves are so tight, it is such a hassl...
13870    I was really excited to use this bottle, becau...
13871    I bought this toy for my 10 month old niece as...
Name: data, Length: 13872, dtype: object

In [15]:
data_test

13872    The magnet works, the adhesive works.  The loc...
13873    These are a must if you use Dr Brown's bottles...
13874    We've used this blanket for the beach and for ...
13875    I love these bottles!  My 3 month old baby has...
13876    I wanted to try bathing my baby with these, af...
                               ...                        
18492    We have had this stupid car seat for over a ye...
18493    The product itself is great! Works like it's s...
18494    After reading the reviews, I decided to get th...
18495    My son's bottles leak when they are tipped ove...
18496    The picture on the monitor is great. I like th...
Name: data, Length: 4625, dtype: object

Split the labels

In [16]:
label = train.label

In [17]:
label_train = label[:data_length]
label_test = label[data_length:]

In [18]:
label_train

0       -1
1        1
2        1
3       -1
4        1
        ..
13867   -1
13868    1
13869   -1
13870   -1
13871   -1
Name: label, Length: 13872, dtype: int64

In [19]:
label_test

13872   -1
13873    1
13874    1
13875    1
13876   -1
        ..
18492   -1
18493    1
18494    1
18495   -1
18496   -1
Name: label, Length: 4625, dtype: int64

For preprocessing:
   * Convert numeric star ratings in the document (e.g. "5 star") to single words
   * For each word in the document
       * lowercase the word
       * remove the word if it is a number or punctuation
       * remove the word if it is a stopword
       * remove the word if its length is less than or equal to 4
       * use the stemmer (PorterStemmer) to stem the word

We need `punctuation` and `stopwords` for the English language.
We also need a stemmer to do the stemming.

In [20]:
def preprocess(document):
    # preserve star ratings: people usually say "I give this product 1 star", "2 star", ..., "5 star"
    # (str.replace is a no-op when the phrase is absent, so no membership check is needed)
    document = document.replace("1 star", "onestar")
    document = document.replace("2 star", "twostars")
    document = document.replace("3 star", "threestars")
    document = document.replace("4 star", "fourstars")
    document = document.replace("5 star", "fivestars")

    tokens = nltk.word_tokenize(document)
    # delete any digits
    tokens = [word for word in tokens if not word.isdigit()]
    # lower case
    tokens = [word.lower() for word in tokens]
    # remove punctuation
    tokens = [word for word in tokens if word not in string.punctuation]
    # remove stopwords
    tokens = [word for word in tokens if word not in stop_words]
    # remove any words of length 4 or less
    tokens = [word for word in tokens if len(word) > 4]
    # stemming
    tokens = [stemmer.stem(word) for word in tokens]
    return tokens
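
To see what the filtering steps do to a single review without downloading any NLTK data, here is a simplified stand-in. It is a sketch under stated assumptions: `preprocess_simple`, `STOP_WORDS` and `STAR_WORDS` are hypothetical names, the stop-word set is a tiny illustrative subset, tokens are split on whitespace rather than with `nltk.word_tokenize`, and stemming is skipped.

```python
import string

# Tiny illustrative stop-word set (assumption); the notebook uses NLTK's English list.
STOP_WORDS = {"i", "a", "the", "this", "these", "but", "it", "about"}

# Same replacements as in preprocess() above.
STAR_WORDS = {"1": "onestar", "2": "twostars", "3": "threestars",
              "4": "fourstars", "5": "fivestars"}


def preprocess_simple(document, min_length=5):
    # turn star ratings into single tokens before digits are dropped
    for digit, word in STAR_WORDS.items():
        document = document.replace(f"{digit} star", word)
    # whitespace tokenization, lowercasing and stripping of surrounding punctuation
    tokens = [w.strip(string.punctuation) for w in document.lower().split()]
    # drop empty tokens and pure digits
    tokens = [w for w in tokens if w and not w.isdigit()]
    # drop stop words
    tokens = [w for w in tokens if w not in STOP_WORDS]
    # drop words of length 4 or less
    return [w for w in tokens if len(w) >= min_length]


sample = "I would give these bottles a 5 star rating, but 2 of them leaked!"
print(preprocess_simple(sample))
# ['would', 'bottles', 'fivestars', 'rating', 'leaked']
```

The rating survives as the token `fivestars`, while the bare digit `2` and the short words are removed.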

Let's pre-process the data. We only need to pre-process the training data here.

In [21]:
data_processed = [preprocess(document) for document in data_train]

In [22]:
data_processed

[['absolut',
  'never',
  'would',
  'serious',
  'recommend',
  'get',
  'phone',
  'great',
  'folk',
  'smart',
  'phone',
  'whose',
  'spous',
  'equal',
  'amount',
  'watch',
  'littl',
  'daughter',
  'husband',
  'phone',
  'track',
  'everyth',
  'trend'],
 ['realli',
  'enjoy',
  'sinc',
  'heartbeat',
  'music',
  'set',
  'seem',
  'set',
  "'natur",
  'orient',
  'automat',
  'shut-off',
  'voice-activ',
  'modes.w',
  'batteri',
  'alreadi',
  'seven',
  'week',
  'includ',
  'nights/nap',
  'forgot',
  'press',
  'automat',
  'shut-off',
  'overal',
  'great',
  'product',
  'use'],
 ['absolut',
  'avent',
  'product',
  'bottl',
  'bottle-warm',
  'pacifi',
  'etc.both',
  'avent',
  'bottl',
  'first-born',
  'still',
  'second',
  'sometim',
  'slight',
  'shake',
  'found',
  'formula',
  'powder',
  'get',
  'stuck',
  'nippl',
  'bottl',
  'break',
  'happen',
  'bottl',
  'nippl',
  'rins',
  'nippl',
  'problem',
  'bottl',
  'virtual',
  'indestruct',
  'discol

In [23]:
len(data_processed)

13872

#### TF-IDF stuff

This function builds a dictionary that maps each word to the set of indices of the documents it appears in.

In [24]:
def df_list(document, index, DF_list):
    for token in document:
        try:
            DF_list[token].add(index)
        except KeyError:
            # first time we see this token: start its set of document indices
            DF_list[token] = {index}

This function replaces each word's set of document indices in the dictionary built above with the number of documents the word appears in (its document frequency). After this step we no longer need to keep track of the document indices.

In [25]:
def update_DF_list(word, DF_list):
    DF_list[word] = len(DF_list[word])
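
The two helpers above can be condensed into a single pass. The following is a minimal sketch, not the notebook's code; `document_frequencies` is a hypothetical name, and it uses `collections.Counter` over the set of tokens in each document:

```python
from collections import Counter


def document_frequencies(documents):
    """Map each token to the number of documents that contain it."""
    df = Counter()
    for tokens in documents:
        # set() ensures a token is counted once per document
        df.update(set(tokens))
    return df


docs = [["bottl", "leak", "bottl"], ["bottl", "great"], ["great", "monitor"]]
df = document_frequencies(docs)
print(df["bottl"], df["great"], df["leak"])  # 2 2 1
```

A `Counter` returns 0 for unseen tokens, so no separate lookup helper is needed in this variant.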

This function creates the TF-IDF vector for a single document.
The equations used here are taken from the class slides.

In [26]:
def document_vector(
        document,
        total_vocabulary,
        N,
        DF_list):
    """
    Returns the TF-IDF vector for a single document
    """
    # each document vector has the length of the total vocabulary
    doc_vector = np.zeros((len(total_vocabulary)))

    counter = Counter(document)
    words_count = len(document)

    for token in np.unique(document):
        tf = counter[token] / words_count
        df = doc_freq(DF_list, token)
        # the +1 in the denominator avoids division by zero for tokens
        # that never appeared in the training data (df = 0)
        idf = math.log((N + 1) / (df + 1))
        try:
            ind = total_vocabulary.index(token)
            doc_vector[ind] = tf * idf
        except ValueError:
            # token is not in the vocabulary; leave its weight out
            pass
    return doc_vector
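
As a worked example of the weighting, each entry is `tf * idf` with `tf = count / len(document)` and `idf = log((N + 1) / (df + 1))`. The sketch below reproduces that formula on toy data. It is an illustration under assumptions: `tf_idf_vector` and `vocab_index` are hypothetical names, and the vocabulary is a dictionary from token to position rather than a list, which avoids the linear `list.index` search.

```python
import math
from collections import Counter


def tf_idf_vector(tokens, vocab_index, n_documents, doc_freq):
    """TF-IDF with tf = count / len(tokens), idf = log((N + 1) / (df + 1))."""
    vector = [0.0] * len(vocab_index)
    for token, count in Counter(tokens).items():
        position = vocab_index.get(token)
        if position is None:
            continue  # token is not in the training vocabulary
        tf = count / len(tokens)
        idf = math.log((n_documents + 1) / (doc_freq.get(token, 0) + 1))
        vector[position] = tf * idf
    return vector


vocab_index = {"bottl": 0, "great": 1, "leak": 2}
doc_freq = {"bottl": 2, "great": 2, "leak": 1}
vector = tf_idf_vector(["bottl", "leak", "bottl", "cheap"], vocab_index, 3, doc_freq)
print([round(w, 4) for w in vector])  # [0.1438, 0.0, 0.1733]
```

Here `bottl` gets 2/4 * log(4/3) and `leak` gets 1/4 * log(4/2); the out-of-vocabulary token `cheap` still counts toward the document length but has no slot in the vector.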

This function looks up the document frequency of a word, i.e. the number of training documents it appears in (0 if the word was never seen).

In [27]:
def doc_freq(DF_matrix, word):
    # words missing from the dictionary have a document frequency of 0
    return DF_matrix.get(word, 0)

Make a document frequency list for each word that appears in the training data set only.

In [29]:
DF_list_training = {}
# collect, for every word, the set of documents it appears in
for index, document in enumerate(data_processed):
    df_list(document, index, DF_list_training)
# replace each set with its size, i.e. the document frequency
for word in DF_list_training:
    update_DF_list(word, DF_list_training)

Get our vocabulary list from the DF_list

In [30]:
vocab_training = list(DF_list_training)

In [31]:
vocab_training

['absolut',
 'never',
 'would',
 'serious',
 'recommend',
 'get',
 'phone',
 'great',
 'folk',
 'smart',
 'whose',
 'spous',
 'equal',
 'amount',
 'watch',
 'littl',
 'daughter',
 'husband',
 'track',
 'everyth',
 'trend',
 'realli',
 'enjoy',
 'sinc',
 'heartbeat',
 'music',
 'set',
 'seem',
 "'natur",
 'orient',
 'automat',
 'shut-off',
 'voice-activ',
 'modes.w',
 'batteri',
 'alreadi',
 'seven',
 'week',
 'includ',
 'nights/nap',
 'forgot',
 'press',
 'overal',
 'product',
 'use',
 'avent',
 'bottl',
 'bottle-warm',
 'pacifi',
 'etc.both',
 'first-born',
 'still',
 'second',
 'sometim',
 'slight',
 'shake',
 'found',
 'formula',
 'powder',
 'stuck',
 'nippl',
 'break',
 'happen',
 'rins',
 'problem',
 'virtual',
 'indestruct',
 'discolor',
 'anyth',
 'although',
 'believ',
 'mix',
 'feel',
 'time.pro',
 'studi',
 'beauti',
 'affordable.con',
 'slat',
 'toddler',
 'unfortun',
 'realiz',
 'start',
 'sleep',
 'turn',
 'middl',
 'longer',
 'caus',
 'outward',
 'result',
 'remain',
 'pr

Create a TF-IDF matrix

In [32]:
N = len(data_processed)
tf_idf_matrix_training = [document_vector(
    document,
    vocab_training,
    N,
    DF_list_training) for document in data_processed]

In [33]:
tf_idf_matrix_training

[array([0.16540689, 0.10934051, 0.05264961, ..., 0.        , 0.        ,
        0.        ]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0.09510896, 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ]),
 array([0.        , 0.        , 0.07568381, ..., 0.        , 0.        ,
        0.        ]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0.        , 0.        , 0.12974368, ..., 0.        , 0.        ,
        0.        ]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0., 0., 0., ..., 0., 0., 0.]),
 array([0.        , 0.        , 0.04484967, ..., 0.        , 0.        ,
        0.        ]),
 array([0., 0., 0.

Before we get into K-NN, some multiprocessing stuff

#### Multiprocessing

This method calls the label method to label the test instance whose index is taken from the task queue.

In [35]:
import queue


def process_tasks(
        task_queue,
        k,
        total_vocabulary,
        train_matrix,
        train_label,
        test_data_raw,
        DF_list,
        test_y):
    """
    Worker loop: take test document indices off the queue and label them
    """
    while True:
        try:
            # empty() followed by get() can block forever if another worker
            # takes the last task in between, so use a timeout instead
            task = task_queue.get(timeout=1)
        except queue.Empty:
            break
        label(
            vocabulary=total_vocabulary,
            train_data_matrix=train_matrix,
            train_label=train_label,
            test_document_index=task,  # should have the document index
            test_data_raw=test_data_raw,
            DF_list_train=DF_list,
            test_y=test_y,
            k=k)
    return True

This method adds all the tasks (labeling the test instances) to a queue at the beginning of the K-NN process.

In [36]:
def add_tasks(
        task_queue,
        test_data_list):
    """
    This method is to support multi-processing
    """
    for test_document_index in range(len(test_data_list)):
        task_queue.put(test_document_index)
        print(test_document_index)
    return task_queue

#### K-Nearest Neighbor

This is where the magic happens. Given:
    * the TF-IDF matrix
    * the training labels
    * the test instance index
    * k

this method computes the TF-IDF vector of the current test instance and its cosine similarity (our measure of closeness) with every training instance. The results are saved to a list of (index, similarity) pairs.

After sorting that list, this method selects the k most similar neighbors and looks at their labels.

In [37]:
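def cosine_simimarity(document_a, document_b):
    # normalized cosine similarity
    norm = np.linalg.norm(document_a) * np.linalg.norm(document_b)
    if norm == 0:
        # an all-zero vector (e.g. an empty document) has no direction;
        # return 0 instead of dividing by zero and producing NaN
        return 0.0
    return np.dot(document_a, document_b) / norm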
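
To make the measure concrete: cosine similarity depends only on the angle between two vectors, not on their lengths, and for non-negative TF-IDF weights it falls between 0 and 1. The dependency-free sketch below illustrates this on toy vectors. The function name and the convention of returning 0 for an all-zero vector are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # assumption: an all-zero vector is treated as similar to nothing
    return dot / norm if norm else 0.0


print(cosine_similarity([1, 0], [1, 0]))            # 1.0 (same direction)
print(cosine_similarity([1, 0], [0, 1]))            # 0.0 (no shared terms)
print(round(cosine_similarity([1, 1], [1, 0]), 4))  # 0.7071 (45 degree angle)
print(cosine_similarity([0, 0], [1, 0]))            # 0.0 (zero vector)
```

Because higher values mean closer documents, the nearest neighbors are the entries with the largest similarity, not the smallest.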

In [38]:
def label(
        vocabulary,
        train_data_matrix,  # This is a TF-IDF matrix
        train_label,
        test_document_index,
        test_data_raw,  # this is the raw test data, not pre-processed yet
        DF_list_train,
        test_y,
        k):
    test_raw = test_data_raw[test_document_index]

    test_processed = preprocess(test_raw)

    # Generate the TF_IDF for the test instance
    test_TF_IDF_vector = document_vector(document=test_processed,
                                         N=len(train_label) + 1,  # because we are adding 1 more?
                                         DF_list=DF_list_train,
                                         total_vocabulary=vocabulary)

    print("Finding label for test item ", test_document_index)

    # cosine similarity
    distances = [(index, (cosine_simimarity(test_TF_IDF_vector, doc_vector))) for index, doc_vector in
                 enumerate(train_data_matrix)]

    # get k-nearest neighbors
    # the closer the documents are by angle, the higher the cosine similarity,
    # so sort ascending and reverse to put the most similar documents first
    distances.sort(key=lambda x: x[1])
    neighbor_indices = distances[::-1]
    neighbor_indices = neighbor_indices[:k]

    # look up the labels of the k nearest neighbors
    label_list = [train_label[index] for index, _ in neighbor_indices]

    # labels are +1 / -1, so the sign of the sum is the majority vote
    label_sum = sum(label_list)

    # classify based on label_sum (a tie, possible for even k, goes to -1)
    if label_sum > 0:
        label_this = +1
    else:
        label_this = -1

    test_y[test_document_index] = label_this
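
The selection and voting step can be isolated in a few lines. The sketch below is an alternative formulation, not the notebook's code: `knn_vote` is a hypothetical helper, and it uses `heapq.nlargest` to pick the k highest similarities without sorting the full list.

```python
import heapq


def knn_vote(similarities, labels, k):
    """Majority vote over the k most similar training documents.

    similarities[i] is the similarity to training document i and
    labels[i] is its +1 / -1 label.
    """
    # indices of the k largest similarities
    top = heapq.nlargest(k, range(len(similarities)), key=similarities.__getitem__)
    # the sign of the label sum is the majority; ties go to -1
    return 1 if sum(labels[i] for i in top) > 0 else -1


sims = [0.10, 0.80, 0.05, 0.60, 0.70]
labels = [-1, 1, -1, -1, 1]
print(knn_vote(sims, labels, k=3))  # 1
```

With k = 3 the neighbors are documents 1, 4 and 3 (labels +1, +1, -1), so the vote is +1. An odd k avoids ties between the two classes.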

Manages multiple processes calling the label method (which is really where the K-NN happens).

Uses `multiprocessing.Array` to store the labels of the test instances. This was one of the options for sharing a data structure between multiple processes.

`num_cpus` is set to the number of CPUs available minus 2 (leaving room for the OS and other tasks).


In [39]:
def k_nearest_neighbor(
        k,
        vocabulary,
        test_data_raw,
        DF_list,
        train_matrix,
        train_label):
    test_y = multiprocessing.Array('i', len(test_data_raw))

    # keep at least one worker even on machines with one or two CPUs
    num_cpus = max(1, multiprocessing.cpu_count() - 2)
    queue = multiprocessing.Queue()
    full_task_queue = add_tasks(
        queue,
        test_data_raw)

    processes = []
    print(f'Running with {num_cpus} processes!')
    start = time.time()
    for n in range(num_cpus):
        process = multiprocessing.Process(
            target=process_tasks, args=(
                full_task_queue,
                k,
                vocabulary,
                train_matrix,
                train_label,
                test_data_raw,
                DF_list,
                test_y,))
        processes.append(process)
        process.start()
    for process in processes:
        process.join()
    print(f'Time taken = {time.time() - start:.10f}')

    review_ratings = []
    for rating in test_y:
        if rating == 1:
            review_ratings.append("+1")
        else:
            review_ratings.append("-1")
    return review_ratings
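
The shared-array idea can be shown on its own. This is a minimal sketch with hypothetical helper names (`store_label`, `label_in_parallel`): each process writes one integer into a `multiprocessing.Array`, and the parent reads the results after joining the processes.

```python
import multiprocessing


def store_label(shared_labels, index, value):
    # each worker writes only to its own slot of the shared array
    shared_labels[index] = value


def label_in_parallel(values):
    """Start one process per value and collect the results from a shared array."""
    shared_labels = multiprocessing.Array('i', len(values))  # zero-initialised ints
    workers = [
        multiprocessing.Process(target=store_label, args=(shared_labels, i, v))
        for i, v in enumerate(values)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return shared_labels[:]


if __name__ == "__main__":
    print(label_in_parallel([1, -1, 1]))
```

One process per item is only reasonable for a toy example; the function above instead starts a fixed number of workers that pull indices from a queue.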

#### Let's classify

Look at our data again

In [45]:
len(vocab_training)

22467

Run the K-NN for k = 15

In [49]:
print("INFO: Starting K-NN...")
k = 15
# returns a list of "+1" / "-1" rating strings, one per test document
ratings = k_nearest_neighbor(
    k=k,
    vocabulary=vocab_training,
    test_data_raw=np.array(data_test),
    DF_list=DF_list_training,
    train_matrix=tf_idf_matrix_training,
    train_label=np.array(label_train))
print("INFO: K-nearest neighbor done!")

INFO: Starting K-NN...
0
1
2
...


Finding label for test item  6
Finding label for test item  7
Finding label for test item  8
...
Finding label for test item  2219
Finding label for test item  2220
Finding label for test item  2221
Finding label for test item  2222
Finding label for test item  2223
Finding label for test item  2224
Finding label for test item  2225
Finding label for test item  2226
Finding label for test item  2227
Finding label for test item  2229
Finding label for test item  2228
Finding label for test item  2230
Finding label for test item  2231
Finding label for test item  2232
Finding label for test item  2233
Finding label for test item  2234
Finding label for test item  2235
Finding label for test item  2236
Finding label for test item  2237
Finding label for test item  2238
Finding label 

Finding label for test item  2451
Finding label for test item  2452
Finding label for test item  2453
Finding label for test item  2454
Finding label for test item  2455
Finding label for test item  2456
Finding label for test item  2457
Finding label for test item  2458
Finding label for test item  2459
Finding label for test item  2460
Finding label for test item  2461
Finding label for test item  2462
Finding label for test item  2463
Finding label for test item  2464
Finding label for test item  2465
Finding label for test item  2466
Finding label for test item  2467
Finding label for test item  2468
Finding label for test item  2469
Finding label for test item  2470
Finding label for test item  2471
Finding label for test item  2472
Finding label for test item  2473
Finding label for test item  2474
Finding label for test item  2475
Finding label for test item  2476
Finding label for test item  2477
Finding label for test item  2478
Finding label for test item  2479
Finding label 

Finding label for test item  2692
Finding label for test item  2693
Finding label for test item  2694
Finding label for test item  2696
Finding label for test item  2695
Finding label for test item  2697
Finding label for test item  2698
Finding label for test item  2699
Finding label for test item  2700
Finding label for test item  2701
Finding label for test item  2702
Finding label for test item  2703
Finding label for test item  2704
Finding label for test item  2705
Finding label for test item  2706
Finding label for test item  2707
Finding label for test item  2708
Finding label for test item  2709
Finding label for test item  2710
Finding label for test item  2711
Finding label for test item  2712
Finding label for test item  2713
Finding label for test item  2714
Finding label for test item  2715
Finding label for test item  2716
Finding label for test item  2717
Finding label for test item  2718
Finding label for test item  2719
Finding label for test item  2720
Finding label 

Finding label for test item  2933
Finding label for test item  2934
Finding label for test item  2935
Finding label for test item  2936
Finding label for test item  2937
Finding label for test item  2938
Finding label for test item  2939
Finding label for test item  2940
Finding label for test item  2941
Finding label for test item  2942
Finding label for test item  2943
Finding label for test item  2944
Finding label for test item  2945
Finding label for test item  2946
Finding label for test item  2947
Finding label for test item  2948
Finding label for test item  2949
Finding label for test item  2950
Finding label for test item  2951
Finding label for test item  2952
Finding label for test item  2953
Finding label for test item  2954
Finding label for test item  2955
Finding label for test item  2956
Finding label for test item  2957
Finding label for test item  2958
Finding label for test item  2959
Finding label for test item  2960
Finding label for test item  2961
Finding label 

Finding label for test item  3175
Finding label for test item  3174
Finding label for test item  3176
Finding label for test item  3177
Finding label for test item  3178
Finding label for test item  3179
Finding label for test item  3180
Finding label for test item  3181
Finding label for test item  3182
Finding label for test item  3183
Finding label for test item  3184
Finding label for test item  3185
Finding label for test item  3186
Finding label for test item  3187
Finding label for test item  3188
Finding label for test item  3189
Finding label for test item  3190
Finding label for test item  3191
Finding label for test item  3192
Finding label for test item  3193
Finding label for test item  3194
Finding label for test item  3195
Finding label for test item  3196
Finding label for test item  3197
Finding label for test item  3198
Finding label for test item  3199
Finding label for test item  3200
Finding label for test item  3201
Finding label for test item  3202
Finding label 

Finding label for test item  3415
Finding label for test item  3416
Finding label for test item  3417
Finding label for test item  3418
Finding label for test item  3419
Finding label for test item  3420
Finding label for test item  3421
Finding label for test item  3422
Finding label for test item  3423
Finding label for test item  3424
Finding label for test item  3425
Finding label for test item  3426
Finding label for test item  3427
Finding label for test item  3428
Finding label for test item  3429
Finding label for test item  3430
Finding label for test item  3431
Finding label for test item  3432
Finding label for test item  3433
Finding label for test item  3434
Finding label for test item  3435
Finding label for test item  3436
Finding label for test item  3437
Finding label for test item  3438
Finding label for test item  3439
Finding label for test item  3440
Finding label for test item  3441
Finding label for test item  3442
Finding label for test item  3443
Finding label 

Finding label for test item  3656
Finding label for test item  3657
Finding label for test item  3658
Finding label for test item  3659
Finding label for test item  3660
Finding label for test item  3661
Finding label for test item  3662
Finding label for test item  3663
Finding label for test item  3664
Finding label for test item  3665
Finding label for test item  3666
Finding label for test item  3667
Finding label for test item  3668
Finding label for test item  3669
Finding label for test item  3670
Finding label for test item  3671
Finding label for test item  3672
Finding label for test item  3673
Finding label for test item  3674
Finding label for test item  3675
Finding label for test item  3676
Finding label for test item  3677
Finding label for test item  3678
Finding label for test item  3679
Finding label for test item  3680
Finding label for test item  3681
Finding label for test item  3682
Finding label for test item  3683
Finding label for test item  3684
Finding label 

Finding label for test item  3897
Finding label for test item  3898
Finding label for test item  3899
Finding label for test item  3900
Finding label for test item  3901
Finding label for test item  3902
Finding label for test item  3903
Finding label for test item  3904
Finding label for test item  3905
Finding label for test item  3906
Finding label for test item  3907
Finding label for test item  3908
Finding label for test item  3909
Finding label for test item  3910
Finding label for test item  3911
Finding label for test item  3912
Finding label for test item  3913
Finding label for test item  3914
Finding label for test item  3915
Finding label for test item  3916
Finding label for test item  3917
Finding label for test item  3918
Finding label for test item  3919
Finding label for test item  3920
Finding label for test item  3921
Finding label for test item  3922
Finding label for test item  3923
Finding label for test item  3924
Finding label for test item  3925
Finding label 

Finding label for test item  4138
Finding label for test item  4139
Finding label for test item  4140
Finding label for test item  4141
Finding label for test item  4142
Finding label for test item  4143
Finding label for test item  4144
Finding label for test item  4145
Finding label for test item  4146
Finding label for test item  4147
Finding label for test item  4148
Finding label for test item  4149
Finding label for test item  4150
Finding label for test item  4151
Finding label for test item  4152
Finding label for test item  4153
Finding label for test item  4154
Finding label for test item  4155
Finding label for test item  4156
Finding label for test item  4157
Finding label for test item  4158
Finding label for test item  4159
Finding label for test item  4160
Finding label for test item  4162
Finding label for test item  4161
Finding label for test item  4163
Finding label for test item  4164
Finding label for test item  4165
Finding label for test item  4166
Finding label 

Finding label for test item  4379
Finding label for test item  4380
Finding label for test item  4381
Finding label for test item  4382
Finding label for test item  4383
Finding label for test item  4384
Finding label for test item  4385
Finding label for test item  4386
Finding label for test item  4387
Finding label for test item  4388
Finding label for test item  4389
Finding label for test item  4390
Finding label for test item  4391
Finding label for test item  4392
Finding label for test item  4393
Finding label for test item  4394
Finding label for test item  4395
Finding label for test item  4396
Finding label for test item  4397
Finding label for test item  4398
Finding label for test item  4399
Finding label for test item  4400
Finding label for test item  4401
Finding label for test item  4402
Finding label for test item  4403
Finding label for test item  4404
Finding label for test item  4405
Finding label for test item  4406
Finding label for test item  4407
Finding label 

Finding label for test item  4620
Finding label for test item  4621
Finding label for test item  4622
Finding label for test item  4623
Finding label for test item  4624
Time taken = 1690.1177847385
INFO: K-nearest neighbor done!



Let's look at the ratings

In [50]:
ratings

['-1',
 '-1',
 '+1',
 '+1',
 '-1',
 '-1',
 '-1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '-1',
 '-1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
 '-1',
 '+1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '-1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
 '-1',
 '+1',
 '-1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '-1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '+1',
 '-1',
 '-1',
 '-1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '+1',
 '-1',
 '+1',
 '+1',
 '-1',
 '+1',
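Scrolling through thousands of predicted labels is hard to read, so a count per class may be a quicker sanity check. The sketch below is an illustration only: `summarize_labels` is a hypothetical helper, not part of the notebook, and it assumes the predictions are strings such as '+1' and '-1' as in the output above.

```python
from collections import Counter


def summarize_labels(labels):
    """Return {label: (count, fraction)} for an iterable of '+1'/'-1' style labels.

    Hypothetical helper for illustration; not defined in the notebook.
    """
    # int() accepts both '+1' and '-1', so string and integer labels are unified.
    counts = Counter(int(label) for label in labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Toy data only; these are not the notebook's real predictions.
example = ['-1', '-1', '+1', '+1', '+1']
for label, (n, fraction) in sorted(summarize_labels(example).items()):
    print(f"{label:+d}: {n} predictions ({fraction:.0%})")
```

A heavily skewed distribution would suggest comparing accuracy against the majority-class baseline before reading too much into the score.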

In [75]:
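label_test = np.array(label_test)

# Count the predictions that disagree with the true labels.
missed = 0
for i in range(len(ratings)):
    rating = int(ratings[i])
    truth = int(label_test[i])
    # Compare by value; `is not` tests object identity, which is unreliable for ints.
    if rating != truth:
        missed = missed + 1
missed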

1425

Calculate the % accuracy

In [80]:
correct = (1 - (missed/len(label_test))) * 100
correct

69.1891891891892
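The miss count and the accuracy can also be computed in one vectorized step. This is a minimal sketch, assuming the predictions and true labels are sequences of the same length that `int()` can parse; `accuracy_percent` is a hypothetical name that does not appear in the notebook.

```python
import numpy as np


def accuracy_percent(predicted, truth):
    """Return the percentage of positions where predicted equals truth.

    Hypothetical helper for illustration; not defined in the notebook.
    """
    # Normalize both inputs to integers so '+1' and 1 compare equal by value.
    predicted = np.asarray([int(p) for p in predicted])
    truth = np.asarray([int(t) for t in truth])
    if predicted.shape != truth.shape:
        raise ValueError("predicted and truth must have the same length")
    # The mean of a boolean array is the fraction of True entries.
    return float(np.mean(predicted == truth) * 100)


# Toy data only; these are not the notebook's real predictions.
example_ratings = ['-1', '+1', '+1', '-1']
example_truth = [-1, 1, -1, -1]
print(accuracy_percent(example_ratings, example_truth))  # 75.0
```

Comparing by value here avoids the identity-versus-equality pitfall of a manual loop, and the length check catches a mismatch between predictions and labels early.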

Run K-NN with K = 5

In [83]:
print("INFO: Starting K-NN...")
k = 5
# returns the predicted ratings, one per test item
ratings_k5 = k_nearest_neighbor(
    k=k,
    vocabulary=vocab_training,
    test_data_raw=np.array(data_test),
    DF_list=DF_list_training,
    train_matrix=tf_idf_matrix_training,
    train_label=np.array(label_train))
print("INFO: K-nearest neighbor done!")

INFO: Starting K-NN...
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271

3469
3470
3471
3472
3473
3474
3475
3476
3477
3478
3479
3480
3481
3482
3483
3484
3485
3486
3487
3488
3489
3490
3491
3492
3493
3494
3495
3496
3497
3498
3499
3500
3501
3502
3503
3504
3505
3506
3507
3508
3509
3510
3511
3512
3513
3514
3515
3516
3517
3518
3519
3520
3521
3522
3523
3524
3525
3526
3527
3528
3529
3530
3531
3532
3533
3534
3535
3536
3537
3538
3539
3540
3541
3542
3543
3544
3545
3546
3547
3548
3549
3550
3551
3552
3553
3554
3555
3556
3557
3558
3559
3560
3561
3562
3563
3564
3565
3566
3567
3568
3569
3570
3571
3572
3573
3574
3575
3576
3577
3578
3579
3580
3581
3582
3583
3584
3585
3586
3587
3588
3589
3590
3591
3592
3593
3594
3595
3596
3597
3598
3599
3600
3601
3602
3603
3604
3605
3606
3607
3608
3609
3610
3611
3612
3613
3614
3615
3616
3617
3618
3619
3620
3621
3622
3623
3624
3625
3626
3627
3628
3629
3630
3631
3632
3633
3634
3635
3636
3637
3638
3639
3640
3641
3642
3643
3644
3645
3646
3647
3648
3649
3650
3651
3652
3653
3654
3655
3656
3657
3658
3659
3660
3661
3662
3663
3664
3665
3666
3667
3668




Finding label for test item  6
Finding label for test item  7
Finding label for test item  8
Finding label for test item  9
Finding label for test item  10
Finding label for test item  11
Finding label for test item  12
Finding label for test item  13
Finding label for test item  15
Finding label for test item  14
Finding label for test item  16
Finding label for test item  17
Finding label for test item  18
Finding label for test item  19
Finding label for test item  20
Finding label for test item  21
Finding label for test item  22
Finding label for test item  23
Finding label for test item  24
Finding label for test item  25
Finding label for test item  26
Finding label for test item  27
Finding label for test item  28
Finding label for test item  29
Finding label for test item  30
Finding label for test item  31
Finding label for test item  32
Finding label for test item  33
Finding label for test item  34
Finding label for test item  35
Finding label for test item  36
Finding labe

Finding label for test item  258
Finding label for test item  259
Finding label for test item  260
Finding label for test item  261
Finding label for test item  262
Finding label for test item  263
Finding label for test item  264
Finding label for test item  265
Finding label for test item  266
Finding label for test item  267
Finding label for test item  268
Finding label for test item  269
Finding label for test item  270
Finding label for test item  271
Finding label for test item  272
Finding label for test item  273
Finding label for test item  274
Finding label for test item  275
Finding label for test item  276
Finding label for test item  277
Finding label for test item  278
Finding label for test item  279
Finding label for test item  280
Finding label for test item  281
Finding label for test item  282
Finding label for test item  283
Finding label for test item  284
Finding label for test item  285
Finding label for test item  286
Finding label for test item  287
Finding la

Finding label for test item  507
Finding label for test item  508
Finding label for test item  509
Finding label for test item  510
Finding label for test item  511
Finding label for test item  512
Finding label for test item  513
Finding label for test item  514
Finding label for test item  515
Finding label for test item  516
Finding label for test item  517
Finding label for test item  518
Finding label for test item  519
Finding label for test item  520
Finding label for test item  521
Finding label for test item  522
Finding label for test item  523
Finding label for test item  524
Finding label for test item  525
Finding label for test item  526
Finding label for test item  527
Finding label for test item  528
Finding label for test item  529
Finding label for test item  530
Finding label for test item  531
Finding label for test item  532
Finding label for test item  533
Finding label for test item  534
Finding label for test item  535
Finding label for test item  536
Finding la

Finding label for test item  756
Finding label for test item  757
Finding label for test item  758
Finding label for test item  759
Finding label for test item  760
Finding label for test item  761
Finding label for test item  762
Finding label for test item  763
Finding label for test item  764
Finding label for test item  765
Finding label for test item  766
Finding label for test item  767
Finding label for test item  768
Finding label for test item  769
Finding label for test item  770
Finding label for test item  771
Finding label for test item  772
Finding label for test item  773
Finding label for test item  774
Finding label for test item  775
Finding label for test item  776
Finding label for test item  777
Finding label for test item  778
Finding label for test item  779
Finding label for test item  780
Finding label for test item  781
Finding label for test item  782
Finding label for test item  783
Finding label for test item  784
Finding label for test item  785
Finding la

Finding label for test item  1004
Finding label for test item  1006
Finding label for test item  1007
Finding label for test item  1008
Finding label for test item  1009
Finding label for test item  1010
Finding label for test item  1011
Finding label for test item  1012
Finding label for test item  1013
Finding label for test item  1014
Finding label for test item  1015
Finding label for test item  1016
Finding label for test item  1017
Finding label for test item  1018
Finding label for test item  1019
Finding label for test item  1020
Finding label for test item  1021
Finding label for test item  1022
Finding label for test item  1023
Finding label for test item  1024
Finding label for test item  1025
Finding label for test item  1026
Finding label for test item  1027
Finding label for test item  1028
Finding label for test item  1029
Finding label for test item  1030
Finding label for test item  1031
Finding label for test item  1032
Finding label for test item  1034
Finding label 

Finding label for test item  1246
Finding label for test item  1247
Finding label for test item  1248
Finding label for test item  1249
Finding label for test item  1250
Finding label for test item  1251
Finding label for test item  1252
Finding label for test item  1253
Finding label for test item  1254
Finding label for test item  1255
Finding label for test item  1256
Finding label for test item  1257
Finding label for test item  1258
Finding label for test item  1259
Finding label for test item  1260
Finding label for test item  1261
Finding label for test item  1262
Finding label for test item  1263
Finding label for test item  1264
Finding label for test item  1265
Finding label for test item  1266
Finding label for test item  1267
Finding label for test item  1268
Finding label for test item  1269
Finding label for test item  1270
Finding label for test item  1271
Finding label for test item  1272
Finding label for test item  1273
Finding label for test item  1274
Finding label 

Finding label for test item  1487
Finding label for test item  1488
Finding label for test item  1489
Finding label for test item  1490
Finding label for test item  1491
Finding label for test item  1492
Finding label for test item  1493
Finding label for test item  1494
Finding label for test item  1495
Finding label for test item  1496
Finding label for test item  1497
Finding label for test item  1498
Finding label for test item  1499
Finding label for test item  1500
Finding label for test item  1501
Finding label for test item  1502
Finding label for test item  1503
Finding label for test item  1504
Finding label for test item  1505
Finding label for test item  1506
Finding label for test item  1507
Finding label for test item  1508
Finding label for test item  1509
Finding label for test item  1510
Finding label for test item  1511
Finding label for test item  1512
Finding label for test item  1514
Finding label for test item  1513
Finding label for test item  1515
Finding label 

Finding label for test item  1728
Finding label for test item  1729
Finding label for test item  1730
Finding label for test item  1731
Finding label for test item  1732
Finding label for test item  1733
Finding label for test item  1734
Finding label for test item  1735
Finding label for test item  1736
Finding label for test item  1737
Finding label for test item  1738
Finding label for test item  1739
Finding label for test item  1740
Finding label for test item  1741
Finding label for test item  1742
Finding label for test item  1743
Finding label for test item  1744
Finding label for test item  1745
Finding label for test item  1746
Finding label for test item  1747
Finding label for test item  1748
Finding label for test item  1749
Finding label for test item  1750
Finding label for test item  1751
Finding label for test item  1752
Finding label for test item  1753
Finding label for test item  1754
Finding label for test item  1755
Finding label for test item  1756
Finding label 

Finding label for test item  1969
Finding label for test item  1970
Finding label for test item  1971
Finding label for test item  1972
Finding label for test item  1973
Finding label for test item  1974
Finding label for test item  1975
Finding label for test item  1976
Finding label for test item  1977
Finding label for test item  1978
Finding label for test item  1979
Finding label for test item  1980
Finding label for test item  1981
Finding label for test item  1982
Finding label for test item  1983
Finding label for test item  1984
Finding label for test item  1985
Finding label for test item  1986
Finding label for test item  1987
Finding label for test item  1988
Finding label for test item  1989
Finding label for test item  1990
Finding label for test item  1991
Finding label for test item  1992
Finding label for test item  1993
Finding label for test item  1994
Finding label for test item  1995
Finding label for test item  1996
Finding label for test item  1997
Finding label 

Finding label for test item  2210
Finding label for test item  2211
Finding label for test item  2212
Finding label for test item  2213
Finding label for test item  2214
Finding label for test item  2215
Finding label for test item  2216
Finding label for test item  2217
Finding label for test item  2218
Finding label for test item  2219
Finding label for test item  2220
Finding label for test item  2221
Finding label for test item  2222
Finding label for test item  2223
Finding label for test item  2224
Finding label for test item  2225
Finding label for test item  2226
Finding label for test item  2227
Finding label for test item  2228
Finding label for test item  2229
Finding label for test item  2230
Finding label for test item  2231
Finding label for test item  2232
Finding label for test item  2233
Finding label for test item  2234
Finding label for test item  2235
Finding label for test item  2237
Finding label for test item  2236
Finding label for test item  2238
Finding label 

Finding label for test item  2451
Finding label for test item  2452
Finding label for test item  2453
Finding label for test item  2454
Finding label for test item  2455
Finding label for test item  2456
Finding label for test item  2457
Finding label for test item  2458
Finding label for test item  2459
Finding label for test item  2460
Finding label for test item  2461
Finding label for test item  2462
Finding label for test item  2463
Finding label for test item  2464
Finding label for test item  2465
Finding label for test item  2466
Finding label for test item  2467
Finding label for test item  2468
Finding label for test item  2469
Finding label for test item  2470
Finding label for test item  2471
Finding label for test item  2472
Finding label for test item  2473
Finding label for test item  2474
Finding label for test item  2475
Finding label for test item  2476
Finding label for test item  2477
Finding label for test item  2478
Finding label for test item  2479
Finding label 

Finding label for test item  2692
Finding label for test item  2693
Finding label for test item  2694
Finding label for test item  2695
Finding label for test item  2696
Finding label for test item  2697
Finding label for test item  2698
Finding label for test item  2699
Finding label for test item  2700
Finding label for test item  2701
Finding label for test item  2702
Finding label for test item  2703
Finding label for test item  2704
Finding label for test item  2705
Finding label for test item  2706
Finding label for test item  2707
Finding label for test item  2708
Finding label for test item  2709
Finding label for test item  2710
Finding label for test item  2711
Finding label for test item  2712
Finding label for test item  2713
Finding label for test item  2714
Finding label for test item  2715
Finding label for test item  2716
Finding label for test item  2717
Finding label for test item  2718
Finding label for test item  2719
Finding label for test item  2720
Finding label 

Finding label for test item  2933
Finding label for test item  2934
Finding label for test item  2935
Finding label for test item  2936
Finding label for test item  2937
Finding label for test item  2938
Finding label for test item  2939
Finding label for test item  2940
Finding label for test item  2941
Finding label for test item  2943
Finding label for test item  2942
Finding label for test item  2944
Finding label for test item  2946
Finding label for test item  2945
Finding label for test item  2947
Finding label for test item  2948
Finding label for test item  2949
Finding label for test item  2950
Finding label for test item  2951
Finding label for test item  2952
Finding label for test item  2953
Finding label for test item  2954
Finding label for test item  2955
Finding label for test item  2956
Finding label for test item  2957
Finding label for test item  2958
Finding label for test item  2959
Finding label for test item  2960
Finding label for test item  2961
Finding label 

Finding label for test item  3174
Finding label for test item  3175
Finding label for test item  3176
Finding label for test item  3177
Finding label for test item  3178
Finding label for test item  3179
Finding label for test item  3180
Finding label for test item  3181
Finding label for test item  3182
Finding label for test item  3183
Finding label for test item  3184
Finding label for test item  3185
Finding label for test item  3186
Finding label for test item  3187
Finding label for test item  3188
Finding label for test item  3189
Finding label for test item  3190
Finding label for test item  3191
Finding label for test item  3192
Finding label for test item  3193
Finding label for test item  3194
Finding label for test item  3195
Finding label for test item  3196
Finding label for test item  3197
Finding label for test item  3198
Finding label for test item  3199
Finding label for test item  3200
Finding label for test item  3201
Finding label for test item  3202
Finding label 

Finding label for test item  3415
Finding label for test item  3416
Finding label for test item  3417
Finding label for test item  3418
Finding label for test item  3419
Finding label for test item  3420
Finding label for test item  3421
Finding label for test item  3422
Finding label for test item  3423
Finding label for test item  3424
Finding label for test item  3425
Finding label for test item  3426
Finding label for test item  3427
Finding label for test item  3428
Finding label for test item  3429
Finding label for test item  3430
Finding label for test item  3431
Finding label for test item  3433
Finding label for test item  3432
Finding label for test item  3434
Finding label for test item  3435
Finding label for test item  3436
Finding label for test item  3437
Finding label for test item  3438
Finding label for test item  3439
Finding label for test item  3440
Finding label for test item  3442
Finding label for test item  3441
Finding label for test item  3443
Finding label 

Finding label for test item  3656
Finding label for test item  3657
Finding label for test item  3658
Finding label for test item  3659
Finding label for test item  3660
Finding label for test item  3661
Finding label for test item  3662
Finding label for test item  3663
Finding label for test item  3664
Finding label for test item  3665
Finding label for test item  3666
Finding label for test item  3667
Finding label for test item  3668
Finding label for test item  3669
Finding label for test item  3670
Finding label for test item  3671
Finding label for test item  3672
Finding label for test item  3673
Finding label for test item  3674
Finding label for test item  3675
Finding label for test item  3676
Finding label for test item  3677
Finding label for test item  3678
Finding label for test item  3679
Finding label for test item  3680
Finding label for test item  3681
Finding label for test item  3682
Finding label for test item  3683
Finding label for test item  3684
Finding label 

Finding label for test item  3897
Finding label for test item  3898
Finding label for test item  3899
Finding label for test item  3900
Finding label for test item  3901
Finding label for test item  3902
Finding label for test item  3903
Finding label for test item  3904
Finding label for test item  3905
Finding label for test item  3906
Finding label for test item  3907
Finding label for test item  3908
Finding label for test item  3909
Finding label for test item  3910
Finding label for test item  3911
Finding label for test item  3912
Finding label for test item  3913
Finding label for test item  3914
Finding label for test item  3915
Finding label for test item  3916
Finding label for test item  3917
Finding label for test item  3918
Finding label for test item  3919
Finding label for test item  3920
Finding label for test item  3921
Finding label for test item  3922
Finding label for test item  3923
Finding label for test item  3924
Finding label for test item  3925
Finding label 

Finding label for test item  4138
Finding label for test item  4139
Finding label for test item  4140
Finding label for test item  4141
Finding label for test item  4142
Finding label for test item  4143
Finding label for test item  4144
Finding label for test item  4145
Finding label for test item  4146
Finding label for test item  4147
Finding label for test item  4149
Finding label for test item  4148
Finding label for test item  4150
Finding label for test item  4151
Finding label for test item  4152
Finding label for test item  4153
Finding label for test item  4154
Finding label for test item  4155
Finding label for test item  4156
Finding label for test item  4157
Finding label for test item  4158
Finding label for test item  4159
Finding label for test item  4160
Finding label for test item  4161
Finding label for test item  4162
Finding label for test item  4163
Finding label for test item  4164
Finding label for test item  4165
Finding label for test item  4166
Finding label 

Finding label for test item  4379
Finding label for test item  4380
Finding label for test item  4381
Finding label for test item  4382
Finding label for test item  4383
Finding label for test item  4384
Finding label for test item  4385
Finding label for test item  4386
Finding label for test item  4387
Finding label for test item  4388
Finding label for test item  4389
Finding label for test item  4390
Finding label for test item  4391
Finding label for test item  4392
Finding label for test item  4393
Finding label for test item  4394
Finding label for test item  4395
Finding label for test item  4396
Finding label for test item  4397
Finding label for test item  4398
Finding label for test item  4399
Finding label for test item  4400
Finding label for test item  4401
Finding label for test item  4402
Finding label for test item  4403
Finding label for test item  4404
Finding label for test item  4405
Finding label for test item  4406
Finding label for test item  4407
Finding label 

Finding label for test item  4620
Finding label for test item  4621
Finding label for test item  4622
Finding label for test item  4623
Finding label for test item  4624
Time taken = 1755.0066757202
INFO: K-nearest neighbor done!


Time taken = 1755 seconds

Let's look at the accuracy rate with k=5

In [85]:
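label_test = np.array(label_test)

# Count the test items whose predicted rating differs from the true label.
missed = 0
for i in range(len(ratings_k5)):
    rating = int(ratings_k5[i])
    truth = int(label_test[i])
    # Compare by value; `is not` tests object identity, which is not reliable for ints.
    if rating != truth:
        missed = missed + 1
# Accuracy as a percentage of the test set.
correct = (1 - (missed / len(label_test))) * 100
correct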

67.93513513513514
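
The element-by-element loop above can also be written as a single NumPy comparison. The following is a minimal sketch, not the notebook's own code: `accuracy_percent` is a hypothetical helper name, and the toy labels are made up for illustration rather than taken from the review data.

```python
import numpy as np


def accuracy_percent(predicted, truth):
    """Return the percentage of positions where predicted equals truth."""
    predicted = np.asarray(predicted, dtype=int)
    truth = np.asarray(truth, dtype=int)
    if predicted.shape != truth.shape:
        raise ValueError("predicted and truth must have the same shape")
    # The mean of a boolean array is the fraction of True entries.
    return float(np.mean(predicted == truth) * 100)


# Toy labels for illustration only (hypothetical, not the notebook's data).
toy_predicted = [1, -1, 1, 1]
toy_truth = [1, -1, -1, 1]
print(accuracy_percent(toy_predicted, toy_truth))  # 75.0
```

Assuming `ratings_k5` and `label_test` have the same length, a call such as `accuracy_percent(ratings_k5, label_test)` should give the same figure as the loop.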

Run it for k=25

In [86]:
print("INFO: Starting K-NN...")
k = 25
# returns the predicted ratings, indexable by test item position
ratings_k25 = k_nearest_neighbor(
    k=k,
    vocabulary=vocab_training,
    test_data_raw=np.array(data_test),
    DF_list=DF_list_training,
    train_matrix=tf_idf_matrix_training,
    train_label=np.array(label_train))
print("INFO: K-nearest neighbor done!")

INFO: Starting K-NN...
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271

3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
3399
3400
3401
3402
3403
3404
3405
3406
3407
3408
3409
3410
3411
3412
3413
3414
3415
3416
3417
3418
3419
3420
3421
3422
3423
3424
3425
3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
3441
3442
3443
3444
3445
3446
3447
3448
3449
3450
3451
3452
3453
3454
3455
3456
3457
3458
3459
3460
3461
3462
3463
3464
3465
3466
3467
3468
3469
3470
3471
3472
3473
3474
3475
3476
3477
3478
3479
3480
3481
3482
3483
3484
3485
3486
3487
3488
3489
3490
3491
3492
3493
3494
3495
3496
3497
3498
3499
3500
3501
3502
3503
3504
3505
3506
3507
3508
3509
3510
3511
3512
3513
3514
3515
3516
3517
3518
3519
3520


  This is separate from the ipykernel package so we can avoid doing imports until


Finding label for test item  6
Finding label for test item  7
Finding label for test item  8
Finding label for test item  9
Finding label for test item  10
Finding label for test item  11
Finding label for test item  12
Finding label for test item  13
Finding label for test item  15
Finding label for test item  14
Finding label for test item  16
Finding label for test item  17
Finding label for test item  18
Finding label for test item  19
Finding label for test item  20
Finding label for test item  21
Finding label for test item  22
Finding label for test item  23
Finding label for test item  24
Finding label for test item  25
Finding label for test item  26
Finding label for test item  27
Finding label for test item  28
Finding label for test item  29
Finding label for test item  30
Finding label for test item  31
Finding label for test item  32
Finding label for test item  33
Finding label for test item  34
Finding label for test item  35
Finding label for test item  36
Finding labe

Finding label for test item  258
Finding label for test item  259
Finding label for test item  260
Finding label for test item  262
Finding label for test item  261
Finding label for test item  263
Finding label for test item  264
Finding label for test item  265
Finding label for test item  266
Finding label for test item  267
Finding label for test item  268
Finding label for test item  269
Finding label for test item  270
Finding label for test item  271
Finding label for test item  272
Finding label for test item  273
Finding label for test item  274
Finding label for test item  275
Finding label for test item  276
Finding label for test item  277
Finding label for test item  278
Finding label for test item  279
Finding label for test item  280
Finding label for test item  281
Finding label for test item  282
Finding label for test item  283
Finding label for test item  284
Finding label for test item  285
Finding label for test item  286
Finding label for test item  287
Finding la

Finding label for test item  506
Finding label for test item  508
Finding label for test item  509
Finding label for test item  510
Finding label for test item  511
Finding label for test item  512
Finding label for test item  513
Finding label for test item  514
Finding label for test item  515
Finding label for test item  516
Finding label for test item  518
Finding label for test item  517
Finding label for test item  519
Finding label for test item  520
Finding label for test item  521
Finding label for test item  522
Finding label for test item  523
Finding label for test item  524
Finding label for test item  525
Finding label for test item  526
Finding label for test item  527
Finding label for test item  528
Finding label for test item  529
Finding label for test item  530
Finding label for test item  531
Finding label for test item  532
Finding label for test item  534
Finding label for test item  533
Finding label for test item  535
Finding label for test item  536
Finding la

Finding label for test item  756
Finding label for test item  757
Finding label for test item  758
Finding label for test item  761
Finding label for test item  760
Finding label for test item  759
Finding label for test item  762
Finding label for test item  763
Finding label for test item  764
Finding label for test item  765
Finding label for test item  766
Finding label for test item  767
Finding label for test item  768
Finding label for test item  769
Finding label for test item  770
Finding label for test item  771
Finding label for test item  772
Finding label for test item  773
Finding label for test item  774
Finding label for test item  775
Finding label for test item  776
Finding label for test item  777
Finding label for test item  778
Finding label for test item  779
Finding label for test item  780
Finding label for test item  781
Finding label for test item  782
Finding label for test item  783
Finding label for test item  784
Finding label for test item  785
Finding la

Finding label for test item  1005
Finding label for test item  1006
Finding label for test item  1007
Finding label for test item  1008
Finding label for test item  1009
Finding label for test item  1010
Finding label for test item  1011
Finding label for test item  1012
Finding label for test item  1013
Finding label for test item  1014
Finding label for test item  1015
Finding label for test item  1016
Finding label for test item  1017
Finding label for test item  1018
Finding label for test item  1019
Finding label for test item  1020
Finding label for test item  1021
Finding label for test item  1022
Finding label for test item  1023
Finding label for test item  1024
Finding label for test item  1025
Finding label for test item  1026
Finding label for test item  1027
Finding label for test item  1028
Finding label for test item  1029
Finding label for test item  1030
Finding label for test item  1031
Finding label for test item  1032
Finding label for test item  1035
Finding label 

Finding label for test item  1246
Finding label for test item  1247
Finding label for test item  1248
Finding label for test item  1249
Finding label for test item  1250
Finding label for test item  1251
Finding label for test item  1252
Finding label for test item  1253
Finding label for test item  1254
Finding label for test item  1255
Finding label for test item  1256
Finding label for test item  1257
Finding label for test item  1258
Finding label for test item  1259
Finding label for test item  1260
Finding label for test item  1261
Finding label for test item  1262
Finding label for test item  1263
Finding label for test item  1264
Finding label for test item  1265
Finding label for test item  1266
Finding label for test item  1267
Finding label for test item  1268
Finding label for test item  1269
Finding label for test item  1271
Finding label for test item  1270
Finding label for test item  1272
Finding label for test item  1273
Finding label for test item  1274
Finding label 

Finding label for test item  1487
Finding label for test item  1488
Finding label for test item  1489
Finding label for test item  1490
Finding label for test item  1491
Finding label for test item  1492
Finding label for test item  1493
Finding label for test item  1494
Finding label for test item  1495
Finding label for test item  1496
Finding label for test item  1497
Finding label for test item  1498
Finding label for test item  1499
Finding label for test item  1500
Finding label for test item  1501
Finding label for test item  1502
Finding label for test item  1503
Finding label for test item  1504
Finding label for test item  1505
Finding label for test item  1506
Finding label for test item  1507
Finding label for test item  1509
Finding label for test item  1508
Finding label for test item  1510
Finding label for test item  1511
Finding label for test item  1512
Finding label for test item  1513
Finding label for test item  1514
Finding label for test item  1515
Finding label 

Finding label for test item  1728
Finding label for test item  1729
Finding label for test item  1730
Finding label for test item  1731
Finding label for test item  1732
Finding label for test item  1733
Finding label for test item  1734
Finding label for test item  1735
Finding label for test item  1736
Finding label for test item  1737
Finding label for test item  1738
Finding label for test item  1739
Finding label for test item  1740
Finding label for test item  1741
Finding label for test item  1742
Finding label for test item  1743
Finding label for test item  1744
Finding label for test item  1745
Finding label for test item  1746
Finding label for test item  1747
Finding label for test item  1748
Finding label for test item  1749
Finding label for test item  1750
Finding label for test item  1751
Finding label for test item  1752
Finding label for test item  1753
Finding label for test item  1754
Finding label for test item  1755
Finding label for test item  1756
Finding label 

Finding label for test item  1969
Finding label for test item  1970
Finding label for test item  1971
Finding label for test item  1972
Finding label for test item  1973
Finding label for test item  1974
Finding label for test item  1975
Finding label for test item  1976
Finding label for test item  1977
Finding label for test item  1978
Finding label for test item  1979
Finding label for test item  1980
Finding label for test item  1981
Finding label for test item  1982
Finding label for test item  1983
Finding label for test item  1984
Finding label for test item  1985
Finding label for test item  1986
Finding label for test item  1987
Finding label for test item  1988
Finding label for test item  1989
Finding label for test item  1990
Finding label for test item  1991
Finding label for test item  1992
Finding label for test item  1993
Finding label for test item  1994
Finding label for test item  1995
Finding label for test item  1996
Finding label for test item  1997
Finding label 

Finding label for test item  2210
Finding label for test item  2211
Finding label for test item  2212
Finding label for test item  2213
Finding label for test item  2214
Finding label for test item  2215
Finding label for test item  2216
Finding label for test item  2217
Finding label for test item  2218
Finding label for test item  2220
Finding label for test item  2219
Finding label for test item  2221
Finding label for test item  2222
Finding label for test item  2223
Finding label for test item  2224
Finding label for test item  2225
Finding label for test item  2226
Finding label for test item  2227
Finding label for test item  2228
Finding label for test item  2229
Finding label for test item  2230
Finding label for test item  2231
Finding label for test item  2232
Finding label for test item  2233
Finding label for test item  2234
Finding label for test item  2235
Finding label for test item  2236
Finding label for test item  2237
Finding label for test item  2238
Finding label 

Finding label for test item  2451
Finding label for test item  2452
Finding label for test item  2453
Finding label for test item  2454
Finding label for test item  2455
Finding label for test item  2456
Finding label for test item  2457
Finding label for test item  2458
Finding label for test item  2459
Finding label for test item  2460
Finding label for test item  2461
Finding label for test item  2462
Finding label for test item  2463
Finding label for test item  2464
Finding label for test item  2465
Finding label for test item  2466
Finding label for test item  2467
Finding label for test item  2468
Finding label for test item  2469
Finding label for test item  2470
Finding label for test item  2471
Finding label for test item  2472
Finding label for test item  2473
Finding label for test item  2474
Finding label for test item  2475
Finding label for test item  2476
Finding label for test item  2477
Finding label for test item  2478
Finding label for test item  2479
Finding label 

Finding label for test item  2692
Finding label for test item  2693
Finding label for test item  2694
Finding label for test item  2695
Finding label for test item  2696
Finding label for test item  2697
Finding label for test item  2698
Finding label for test item  2699
Finding label for test item  2700
Finding label for test item  2701
Finding label for test item  2702
Finding label for test item  2703
Finding label for test item  2704
Finding label for test item  2705
Finding label for test item  2706
Finding label for test item  2707
Finding label for test item  2708
Finding label for test item  2709
Finding label for test item  2710
Finding label for test item  2711
Finding label for test item  2712
Finding label for test item  2713
Finding label for test item  2714
Finding label for test item  2715
Finding label for test item  2716
Finding label for test item  2717
Finding label for test item  2718
Finding label for test item  2719
Finding label for test item  2720
Finding label 

Finding label for test item  2933
Finding label for test item  2934
Finding label for test item  2935
Finding label for test item  2936
Finding label for test item  2937
Finding label for test item  2938
Finding label for test item  2939
Finding label for test item  2940
Finding label for test item  2941
Finding label for test item  2942
Finding label for test item  2943
Finding label for test item  2944
Finding label for test item  2945
Finding label for test item  2946
Finding label for test item  2947
Finding label for test item  2948
Finding label for test item  2949
Finding label for test item  2950
Finding label for test item  2951
Finding label for test item  2952
Finding label for test item  2953
Finding label for test item  2954
Finding label for test item  2955
Finding label for test item  2956
Finding label for test item  2957
Finding label for test item  2958
Finding label for test item  2959
Finding label for test item  2960
Finding label for test item  2961
Finding label 

Finding label for test item  3174
Finding label for test item  3175
Finding label for test item  3176
Finding label for test item  3177
Finding label for test item  3178
Finding label for test item  3179
Finding label for test item  3180
Finding label for test item  3181
Finding label for test item  3182
Finding label for test item  3183
Finding label for test item  3184
Finding label for test item  3185
Finding label for test item  3186
Finding label for test item  3187
Finding label for test item  3188
Finding label for test item  3189
Finding label for test item  3190
Finding label for test item  3191
Finding label for test item  3192
Finding label for test item  3193
Finding label for test item  3194
Finding label for test item  3195
Finding label for test item  3196
Finding label for test item  3197
Finding label for test item  3198
Finding label for test item  3199
Finding label for test item  3200
Finding label for test item  3201
Finding label for test item  3202
Finding label 

Finding label for test item  3415
Finding label for test item  3416
Finding label for test item  3417
Finding label for test item  3418
Finding label for test item  3419
Finding label for test item  3420
Finding label for test item  3421
Finding label for test item  3422
Finding label for test item  3423
Finding label for test item  3424
Finding label for test item  3425
Finding label for test item  3426
Finding label for test item  3427
Finding label for test item  3428
Finding label for test item  3429
Finding label for test item  3430
Finding label for test item  3431
Finding label for test item  3432
Finding label for test item  3433
Finding label for test item  3434
Finding label for test item  3435
Finding label for test item  3436
Finding label for test item  3437
Finding label for test item  3438
Finding label for test item  3439
Finding label for test item  3440
Finding label for test item  3442
Finding label for test item  3441
Finding label for test item  3443
Finding label 

Finding label for test item  3656
Finding label for test item  3657
Finding label for test item  3658
Finding label for test item  3659
Finding label for test item  3660
Finding label for test item  3661
Finding label for test item  3662
Finding label for test item  3663
Finding label for test item  3664
Finding label for test item  3665
Finding label for test item  3666
Finding label for test item  3668
Finding label for test item  3667
Finding label for test item  3669
Finding label for test item  3670
Finding label for test item  3671
Finding label for test item  3672
Finding label for test item  3673
Finding label for test item  3674
Finding label for test item  3675
Finding label for test item  3676
Finding label for test item  3677
Finding label for test item  3678
Finding label for test item  3679
Finding label for test item  3680
Finding label for test item  3681
Finding label for test item  3682
Finding label for test item  3683
Finding label for test item  3684
Finding label 

Finding label for test item  3897
Finding label for test item  3898
Finding label for test item  3899
Finding label for test item  3901
Finding label for test item  3900
Finding label for test item  3902
Finding label for test item  3903
Finding label for test item  3904
Finding label for test item  3905
Finding label for test item  3906
Finding label for test item  3907
Finding label for test item  3908
Finding label for test item  3909
Finding label for test item  3910
Finding label for test item  3911
Finding label for test item  3912
Finding label for test item  3914
Finding label for test item  3913
Finding label for test item  3915
Finding label for test item  3916
Finding label for test item  3917
Finding label for test item  3918
Finding label for test item  3919
Finding label for test item  3920
Finding label for test item  3921
Finding label for test item  3922
Finding label for test item  3923
Finding label for test item  3925
Finding label for test item  3924
Finding label 

Finding label for test item  4138
Finding label for test item  4139
Finding label for test item  4140
Finding label for test item  4141
Finding label for test item  4142
Finding label for test item  4143
Finding label for test item  4144
Finding label for test item  4146
Finding label for test item  4145
Finding label for test item  4147
Finding label for test item  4148
Finding label for test item  4149
Finding label for test item  4150
Finding label for test item  4151
Finding label for test item  4152
Finding label for test item  4153
Finding label for test item  4154
Finding label for test item  4155
Finding label for test item  4156
Finding label for test item  4157
Finding label for test item  4158
Finding label for test item  4159
Finding label for test item  4160
Finding label for test item  4161
Finding label for test item  4162
Finding label for test item  4163
Finding label for test item  4164
Finding label for test item  4165
Finding label for test item  4166
Finding label 

Finding label for test item  4379
Finding label for test item  4380
Finding label for test item  4381
Finding label for test item  4382
Finding label for test item  4383
Finding label for test item  4384
Finding label for test item  4385
Finding label for test item  4386
Finding label for test item  4387
Finding label for test item  4388
Finding label for test item  4389
Finding label for test item  4390
Finding label for test item  4391
Finding label for test item  4392
Finding label for test item  4393
Finding label for test item  4394
Finding label for test item  4395
Finding label for test item  4396
Finding label for test item  4397
Finding label for test item  4398
Finding label for test item  4399
Finding label for test item  4400
Finding label for test item  4401
Finding label for test item  4402
Finding label for test item  4403
Finding label for test item  4404
Finding label for test item  4405
Finding label for test item  4406
Finding label for test item  4407
Finding label 

Finding label for test item  4620
Finding label for test item  4621
Finding label for test item  4622
Finding label for test item  4623
Finding label for test item  4624
Time taken = 1746.6559686661
INFO: K-nearest neighbor done!


Time taken = 1746.65596 seconds

In [87]:
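label_test = np.array(label_test)

# Count the test items whose predicted rating differs from the true label.
missed = 0
for i in range(len(ratings_k25)):
    rating = int(ratings_k25[i])
    truth = int(label_test[i])
    # Compare by value; `is not` tests object identity, which is not reliable for ints.
    if rating != truth:
        missed = missed + 1
# Accuracy as a percentage of the test set.
correct = (1 - (missed / len(label_test))) * 100
correct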

70.11891891891892
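
A single accuracy figure does not show whether the errors fall mostly on the +1 or the -1 reviews. As a rough sketch under stated assumptions, the pairs of true and predicted labels can be tallied with `collections.Counter`. `confusion_counts` is a hypothetical helper, and the toy labels below are invented for illustration, not results from this notebook.

```python
from collections import Counter


def confusion_counts(predicted, truth):
    """Count how often each (truth, predicted) label pair occurs."""
    return Counter((int(t), int(p)) for p, t in zip(predicted, truth))


# Toy labels for illustration only (hypothetical, not the notebook's data).
toy_predicted = [1, -1, 1, 1, -1]
toy_truth = [1, -1, -1, 1, 1]

counts = confusion_counts(toy_predicted, toy_truth)
for (t, p), n in sorted(counts.items()):
    print(f"truth={t:+d} predicted={p:+d}: {n}")
```

Assuming the predictions and `label_test` are aligned by position, calling `confusion_counts(ratings_k25, label_test)` would give the same breakdown for the k=25 run.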

Let's do it again with k=55

In [88]:
print("INFO: Starting K-NN...")
k = 55
# returns the predicted ratings, indexable by test item position
ratings_k55 = k_nearest_neighbor(
    k=k,
    vocabulary=vocab_training,
    test_data_raw=np.array(data_test),
    DF_list=DF_list_training,
    train_matrix=tf_idf_matrix_training,
    train_label=np.array(label_train))
print("INFO: K-nearest neighbor done!")

INFO: Starting K-NN...
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271

3251
3252
3253
3254
3255
3256
3257
3258
3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
3273
3274
3275
3276
3277
3278
3279
3280
3281
3282
3283
3284
3285
3286
3287
3288
3289
3290
3291
3292
3293
3294
3295
3296
3297
3298
3299
3300
3301
3302
3303
3304
3305
3306
3307
3308
3309
3310
3311
3312
3313
3314
3315
3316
3317
3318
3319
3320
3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
3399
3400
3401
3402
3403
3404
3405
3406
3407
3408
3409
3410
3411
3412
3413
3414
3415
3416
3417
3418
3419
3420
3421
3422
3423
3424
3425
3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
3441
3442
3443
3444
3445
3446
3447
3448
3449
3450


  This is separate from the ipykernel package so we can avoid doing imports until


Finding label for test item  6
Finding label for test item  7
Finding label for test item  8
Finding label for test item  9
Finding label for test item  10
Finding label for test item  11
Finding label for test item  12
Finding label for test item  13
Finding label for test item  15
Finding label for test item  16
Finding label for test item  14
Finding label for test item  17
Finding label for test item  18
Finding label for test item  19
Finding label for test item  20
Finding label for test item  21
Finding label for test item  22
Finding label for test item  23
Finding label for test item  24
Finding label for test item  25
Finding label for test item  26
Finding label for test item  27
Finding label for test item  28
Finding label for test item  29
Finding label for test item  30
Finding label for test item  31
Finding label for test item  32
Finding label for test item  33
Finding label for test item  34
Finding label for test item  35
Finding label for test item  36
Finding labe

Finding label for test item  258
Finding label for test item  259
Finding label for test item  260
Finding label for test item  261
Finding label for test item  262
Finding label for test item  263
Finding label for test item  264
Finding label for test item  265
Finding label for test item  266
Finding label for test item  267
Finding label for test item  268
Finding label for test item  269
Finding label for test item  270
Finding label for test item  271
Finding label for test item  272
Finding label for test item  273
Finding label for test item  274
Finding label for test item  275
Finding label for test item  276
Finding label for test item  277
Finding label for test item  278
Finding label for test item  279
Finding label for test item  280
Finding label for test item  281
Finding label for test item  282
Finding label for test item  283
Finding label for test item  284
Finding label for test item  285
Finding label for test item  286
Finding label for test item  287
Finding la

Finding label for test item  507
Finding label for test item  508
Finding label for test item  509
Finding label for test item  510
Finding label for test item  511
Finding label for test item  512
Finding label for test item  513
Finding label for test item  514
Finding label for test item  515
Finding label for test item  516
Finding label for test item  517
Finding label for test item  518
Finding label for test item  520
Finding label for test item  519
Finding label for test item  521
Finding label for test item  522
Finding label for test item  523
Finding label for test item  524
Finding label for test item  525
Finding label for test item  526
Finding label for test item  527
Finding label for test item  528
Finding label for test item  529
Finding label for test item  530
Finding label for test item  531
Finding label for test item  532
Finding label for test item  533
Finding label for test item  534
Finding label for test item  535
Finding label for test item  536
...
Finding label for test item  756
Finding label for test item  757
Finding label for test item  758
Finding label for test item  759
Finding label for test item  760
Finding label for test item  761
Finding label for test item  762
Finding label for test item  763
Finding label for test item  764
Finding label for test item  765
Finding label for test item  766
Finding label for test item  767
Finding label for test item  768
Finding label for test item  769
Finding label for test item  771
Finding label for test item  770
Finding label for test item  772
Finding label for test item  773
Finding label for test item  774
Finding label for test item  775
Finding label for test item  776
Finding label for test item  777
Finding label for test item  778
Finding label for test item  779
Finding label for test item  780
Finding label for test item  781
Finding label for test item  782
Finding label for test item  783
Finding label for test item  784
Finding label for test item  785
...
Finding label for test item  1005
Finding label for test item  1006
Finding label for test item  1008
Finding label for test item  1007
Finding label for test item  1009
Finding label for test item  1010
Finding label for test item  1011
Finding label for test item  1012
Finding label for test item  1013
Finding label for test item  1014
Finding label for test item  1015
Finding label for test item  1016
Finding label for test item  1017
Finding label for test item  1018
Finding label for test item  1019
Finding label for test item  1020
Finding label for test item  1021
Finding label for test item  1022
Finding label for test item  1023
Finding label for test item  1024
Finding label for test item  1025
Finding label for test item  1027
Finding label for test item  1026
Finding label for test item  1028
Finding label for test item  1029
Finding label for test item  1030
Finding label for test item  1031
Finding label for test item  1032
Finding label for test item  1033
...
Finding label for test item  1246
Finding label for test item  1247
Finding label for test item  1248
Finding label for test item  1249
Finding label for test item  1250
Finding label for test item  1251
Finding label for test item  1252
Finding label for test item  1253
Finding label for test item  1254
Finding label for test item  1255
Finding label for test item  1256
Finding label for test item  1257
Finding label for test item  1258
Finding label for test item  1259
Finding label for test item  1260
Finding label for test item  1261
Finding label for test item  1262
Finding label for test item  1263
Finding label for test item  1264
Finding label for test item  1265
Finding label for test item  1266
Finding label for test item  1267
Finding label for test item  1268
Finding label for test item  1269
Finding label for test item  1270
Finding label for test item  1271
Finding label for test item  1272
Finding label for test item  1273
Finding label for test item  1274
...
Finding label for test item  1487
Finding label for test item  1488
Finding label for test item  1491
Finding label for test item  1489
Finding label for test item  1490
Finding label for test item  1492
Finding label for test item  1493
Finding label for test item  1494
Finding label for test item  1495
Finding label for test item  1496
Finding label for test item  1497
Finding label for test item  1498
Finding label for test item  1499
Finding label for test item  1500
Finding label for test item  1501
Finding label for test item  1502
Finding label for test item  1503
Finding label for test item  1504
Finding label for test item  1505
Finding label for test item  1506
Finding label for test item  1507
Finding label for test item  1508
Finding label for test item  1509
Finding label for test item  1510
Finding label for test item  1511
Finding label for test item  1512
Finding label for test item  1513
Finding label for test item  1514
Finding label for test item  1515
...
Finding label for test item  1728
Finding label for test item  1729
Finding label for test item  1730
Finding label for test item  1731
Finding label for test item  1732
Finding label for test item  1733
Finding label for test item  1734
Finding label for test item  1735
Finding label for test item  1736
Finding label for test item  1737
Finding label for test item  1738
Finding label for test item  1739
Finding label for test item  1740
Finding label for test item  1741
Finding label for test item  1742
Finding label for test item  1743
Finding label for test item  1744
Finding label for test item  1745
Finding label for test item  1746
Finding label for test item  1747
Finding label for test item  1748
Finding label for test item  1749
Finding label for test item  1750
Finding label for test item  1751
Finding label for test item  1752
Finding label for test item  1753
Finding label for test item  1754
Finding label for test item  1755
Finding label for test item  1756
...
Finding label for test item  1969
Finding label for test item  1970
Finding label for test item  1971
Finding label for test item  1972
Finding label for test item  1973
Finding label for test item  1974
Finding label for test item  1975
Finding label for test item  1976
Finding label for test item  1977
Finding label for test item  1978
Finding label for test item  1979
Finding label for test item  1980
Finding label for test item  1981
Finding label for test item  1982
Finding label for test item  1983
Finding label for test item  1984
Finding label for test item  1985
Finding label for test item  1986
Finding label for test item  1987
Finding label for test item  1988
Finding label for test item  1989
Finding label for test item  1990
Finding label for test item  1991
Finding label for test item  1993
Finding label for test item  1992
Finding label for test item  1995
Finding label for test item  1994
Finding label for test item  1996
Finding label for test item  1997
...
Finding label for test item  2210
Finding label for test item  2211
Finding label for test item  2212
Finding label for test item  2213
Finding label for test item  2214
Finding label for test item  2215
Finding label for test item  2216
Finding label for test item  2217
Finding label for test item  2218
Finding label for test item  2219
Finding label for test item  2220
Finding label for test item  2221
Finding label for test item  2222
Finding label for test item  2223
Finding label for test item  2224
Finding label for test item  2225
Finding label for test item  2226
Finding label for test item  2227
Finding label for test item  2228
Finding label for test item  2229
Finding label for test item  2230
Finding label for test item  2231
Finding label for test item  2232
Finding label for test item  2233
Finding label for test item  2234
Finding label for test item  2235
Finding label for test item  2236
Finding label for test item  2237
Finding label for test item  2238
...
Finding label for test item  2451
Finding label for test item  2452
Finding label for test item  2453
Finding label for test item  2454
Finding label for test item  2455
Finding label for test item  2456
Finding label for test item  2457
Finding label for test item  2458
Finding label for test item  2459
Finding label for test item  2460
Finding label for test item  2461
Finding label for test item  2462
Finding label for test item  2463
Finding label for test item  2464
Finding label for test item  2465
Finding label for test item  2466
Finding label for test item  2467
Finding label for test item  2468
Finding label for test item  2469
Finding label for test item  2470
Finding label for test item  2471
Finding label for test item  2472
Finding label for test item  2473
Finding label for test item  2474
Finding label for test item  2475
Finding label for test item  2476
Finding label for test item  2477
Finding label for test item  2478
Finding label for test item  2479
...
Finding label for test item  2692
Finding label for test item  2693
Finding label for test item  2694
Finding label for test item  2695
Finding label for test item  2696
Finding label for test item  2697
Finding label for test item  2698
Finding label for test item  2699
Finding label for test item  2700
Finding label for test item  2701
Finding label for test item  2702
Finding label for test item  2703
Finding label for test item  2704
Finding label for test item  2705
Finding label for test item  2706
Finding label for test item  2707
Finding label for test item  2708
Finding label for test item  2709
Finding label for test item  2710
Finding label for test item  2711
Finding label for test item  2712
Finding label for test item  2713
Finding label for test item  2714
Finding label for test item  2715
Finding label for test item  2716
Finding label for test item  2717
Finding label for test item  2718
Finding label for test item  2719
Finding label for test item  2720
...
Finding label for test item  2933
Finding label for test item  2934
Finding label for test item  2935
Finding label for test item  2936
Finding label for test item  2937
Finding label for test item  2938
Finding label for test item  2939
Finding label for test item  2940
Finding label for test item  2941
Finding label for test item  2942
Finding label for test item  2943
Finding label for test item  2944
Finding label for test item  2945
Finding label for test item  2946
Finding label for test item  2947
Finding label for test item  2948
Finding label for test item  2949
Finding label for test item  2950
Finding label for test item  2952
Finding label for test item  2951
Finding label for test item  2953
Finding label for test item  2954
Finding label for test item  2955
Finding label for test item  2956
Finding label for test item  2957
Finding label for test item  2958
Finding label for test item  2959
Finding label for test item  2960
Finding label for test item  2961
...
Finding label for test item  3174
Finding label for test item  3175
Finding label for test item  3176
Finding label for test item  3177
Finding label for test item  3178
Finding label for test item  3179
Finding label for test item  3180
Finding label for test item  3181
Finding label for test item  3182
Finding label for test item  3183
Finding label for test item  3184
Finding label for test item  3185
Finding label for test item  3186
Finding label for test item  3187
Finding label for test item  3188
Finding label for test item  3189
Finding label for test item  3190
Finding label for test item  3191
Finding label for test item  3194
Finding label for test item  3192
Finding label for test item  3193
Finding label for test item  3195
Finding label for test item  3196
Finding label for test item  3197
Finding label for test item  3198
Finding label for test item  3199
Finding label for test item  3200
Finding label for test item  3201
Finding label for test item  3202
...
Finding label for test item  3415
Finding label for test item  3416
Finding label for test item  3417
Finding label for test item  3418
Finding label for test item  3419
Finding label for test item  3420
Finding label for test item  3421
Finding label for test item  3422
Finding label for test item  3424
Finding label for test item  3423
Finding label for test item  3425
Finding label for test item  3426
Finding label for test item  3427
Finding label for test item  3428
Finding label for test item  3429
Finding label for test item  3430
Finding label for test item  3431
Finding label for test item  3432
Finding label for test item  3433
Finding label for test item  3434
Finding label for test item  3435
Finding label for test item  3436
Finding label for test item  3437
Finding label for test item  3438
Finding label for test item  3439
Finding label for test item  3440
Finding label for test item  3441
Finding label for test item  3442
Finding label for test item  3443
...
Finding label for test item  3656
Finding label for test item  3657
Finding label for test item  3658
Finding label for test item  3659
Finding label for test item  3660
Finding label for test item  3661
Finding label for test item  3662
Finding label for test item  3663
Finding label for test item  3664
Finding label for test item  3665
Finding label for test item  3666
Finding label for test item  3667
Finding label for test item  3668
Finding label for test item  3669
Finding label for test item  3670
Finding label for test item  3671
Finding label for test item  3672
Finding label for test item  3673
Finding label for test item  3674
Finding label for test item  3675
Finding label for test item  3676
Finding label for test item  3677
Finding label for test item  3678
Finding label for test item  3679
Finding label for test item  3680
Finding label for test item  3682
Finding label for test item  3683
Finding label for test item  3681
Finding label for test item  3684
...
Finding label for test item  3897
Finding label for test item  3898
Finding label for test item  3899
Finding label for test item  3900
Finding label for test item  3901
Finding label for test item  3902
Finding label for test item  3903
Finding label for test item  3904
Finding label for test item  3905
Finding label for test item  3906
Finding label for test item  3907
Finding label for test item  3908
Finding label for test item  3909
Finding label for test item  3910
Finding label for test item  3911
Finding label for test item  3912
Finding label for test item  3913
Finding label for test item  3914
Finding label for test item  3915
Finding label for test item  3916
Finding label for test item  3917
Finding label for test item  3918
Finding label for test item  3919
Finding label for test item  3920
Finding label for test item  3921
Finding label for test item  3922
Finding label for test item  3923
Finding label for test item  3924
Finding label for test item  3925
...
Finding label for test item  4138
Finding label for test item  4139
Finding label for test item  4140
Finding label for test item  4141
Finding label for test item  4142
Finding label for test item  4143
Finding label for test item  4144
Finding label for test item  4145
Finding label for test item  4146
Finding label for test item  4147
Finding label for test item  4148
Finding label for test item  4149
Finding label for test item  4150
Finding label for test item  4151
Finding label for test item  4152
Finding label for test item  4153
Finding label for test item  4154
Finding label for test item  4155
Finding label for test item  4156
Finding label for test item  4157
Finding label for test item  4158
Finding label for test item  4159
Finding label for test item  4160
Finding label for test item  4161
Finding label for test item  4162
Finding label for test item  4163
Finding label for test item  4164
Finding label for test item  4165
Finding label for test item  4166
...
Finding label for test item  4379
Finding label for test item  4380
Finding label for test item  4381
Finding label for test item  4382
Finding label for test item  4383
Finding label for test item  4384
Finding label for test item  4385
Finding label for test item  4386
Finding label for test item  4387
Finding label for test item  4388
Finding label for test item  4389
Finding label for test item  4390
Finding label for test item  4391
Finding label for test item  4392
Finding label for test item  4393
Finding label for test item  4394
Finding label for test item  4395
Finding label for test item  4396
Finding label for test item  4398
Finding label for test item  4397
Finding label for test item  4399
Finding label for test item  4400
Finding label for test item  4401
Finding label for test item  4402
Finding label for test item  4403
Finding label for test item  4404
Finding label for test item  4405
Finding label for test item  4406
Finding label for test item  4407
...
Finding label for test item  4620
Finding label for test item  4621
Finding label for test item  4622
Finding label for test item  4623
Finding label for test item  4624
Time taken = 1732.4060928822
INFO: K-nearest neighbor done!


In [89]:
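label_test = np.array(label_test)

# Count the predictions that differ from the true label.
missed = 0
for i in range(len(ratings_k55)):
    rating = int(ratings_k55[i])
    truth = int(label_test[i])
    # Compare by value; `is not` tests object identity and is only
    # reliable for ints by accident of CPython's small-integer cache.
    if rating != truth:
        missed = missed + 1

# Accuracy as a percentage of the validation set.
correct = (1 - (missed / len(label_test))) * 100
correct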

68.3027027027027
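
The same accuracy figure can be computed without an explicit loop. The cell below is a minimal sketch, not part of the original run: `accuracy_percent` is a hypothetical helper name, and the toy label lists stand in for `ratings_k55` and `label_test`. It assumes both inputs hold integer-like labels (+1 / -1) of equal length.

```python
import numpy as np


def accuracy_percent(predicted, truth):
    """Return the percentage of positions where predicted == truth.

    Hypothetical helper; assumes both inputs are equal-length
    sequences of integer-like labels such as +1 / -1.
    """
    predicted = np.asarray(predicted, dtype=int)
    truth = np.asarray(truth, dtype=int)
    if predicted.shape != truth.shape:
        raise ValueError("predicted and truth must have the same shape")
    # Element-wise value comparison yields a boolean array;
    # its mean is the fraction of correct predictions.
    return float(np.mean(predicted == truth) * 100)


# Toy data for illustration only (not the notebook's results).
example_pred = [1, -1, 1, 1]
example_true = [1, -1, -1, 1]
print(accuracy_percent(example_pred, example_true))  # 75.0
```

With the notebook's own variables, the call would presumably be `accuracy_percent(ratings_k55, label_test)`. Comparing by value (`==`) rather than identity also keeps the result correct if the label set ever grows beyond small integers.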