In [4]:
from src.classifiers.knn import Knn
from src.evaluation.evaluation import Evaluator
from src.preprocessing.preprocessing import Preprocessor
from src.data_representations.bow import BOW

# Bag of words experiments

This notebook contains experiments on artist classification with k-nearest neighbors, using a set-based bag-of-words representation of the lyrics (i.e. each lyric is represented by the set of its unique words). The experiments vary the training and test set sizes as well as the distance metric.

The first part provides a tutorial on how to run the experiments and specify core hyperparameters and settings.

The second part showcases experiments with their results.

-------------------
# Tutorial

To run the experiments, either add your datasets to the data folder or change the variables below to point to your datasets.

In [None]:
filepath_train = "./data/songs_train.txt"
filepath_test = "./data/songs_test.txt"

To run experiments with custom settings, set `read_limit` in the respective `Preprocessor` to the desired training or test size.

In [None]:
dataset_train = Preprocessor(filepath=filepath_train, read_limit=10000)
dataset_test = Preprocessor(filepath=filepath_test, read_limit=100)

Create numerical representations of labels for mapping

In [None]:
label_to_num = {artist:i for i, artist in enumerate(set(dataset_train.artists) | set(dataset_test.artists))}
num_to_label = {value:key for key, value in label_to_num.items()}
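Iterating over a `set` of strings does not follow a fixed order across Python processes (string hashing is randomized by default), so the numeric ids produced by the cell above may differ between runs. If reproducible ids matter, sorting the union first is one option. A minimal sketch, with made-up artist lists standing in for `dataset_train.artists` and `dataset_test.artists`:

```python
# Hypothetical artist lists; in the notebook these come from the Preprocessor objects.
train_artists = ["Queen", "ABBA", "Queen", "Nirvana"]
test_artists = ["ABBA", "Blur"]

# Sorting the union gives every artist the same id on every run.
all_artists = sorted(set(train_artists) | set(test_artists))
label_to_num = {artist: i for i, artist in enumerate(all_artists)}
num_to_label = {i: artist for artist, i in label_to_num.items()}

print(label_to_num)  # {'ABBA': 0, 'Blur': 1, 'Nirvana': 2, 'Queen': 3}
```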

Create training and testing examples and labels

In [None]:
training_examples = [BOW(tok) for tok in dataset_train.tokenized]
training_labels = [label_to_num[label] for label in dataset_train.artists]
test_examples = dataset_test.BOW()
test_labels = [label_to_num[label] for label in dataset_test.artists]
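As a rough illustration of the set-based representation: the implementation of `BOW` is not shown in this notebook, so the sketch below only assumes that it behaves like a set of a lyric's unique tokens. It uses a plain `frozenset` and a hypothetical whitespace tokenizer (`to_word_set`) instead of the project's `Preprocessor`.

```python
def to_word_set(lyric: str) -> frozenset:
    """Lowercase, split on whitespace, and keep each word once (illustrative tokenizer)."""
    return frozenset(lyric.lower().split())

a = to_word_set("Love me love me say that you love me")
b = to_word_set("All you need is love")

print(sorted(a))      # word counts and word order are discarded
print(sorted(a & b))  # shared vocabulary of the two lyrics
```

Because repeated words collapse into one element, two lyrics are compared only by which words they share, not by how often those words occur.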

Initialize the classifier

In [None]:
classifier = Knn(training_examples, training_labels)

To use multiprocessing for the classifier, specify `multi_process` (the default is 1).

In [None]:
classifier = Knn(training_examples, training_labels, multi_process=8)

Choose the distance metric by setting the `measure` argument of `classifier.predict` to one of the following options:

- Overlap coefficient: `"overlap"`

- Jaccard index: `"jaccard"`

- Sørensen-Dice coefficient: `"dsc"`

In [None]:
# Test predictions
predictions = classifier.predict(test_examples, k=4, measure="jaccard")

The Tversky index, `"tversky"`, is also available and additionally takes $\alpha$ and $\beta$:

In [None]:
# Test predictions
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=1, beta=1)
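For orientation, the general idea behind a set-based k-nearest-neighbors prediction can be sketched as follows. This is not the `Knn` class from `src.classifiers.knn`, whose internals are not shown here; the function `knn_predict`, its tie-breaking rule, and the toy data are assumptions made for illustration.

```python
from collections import Counter

def jaccard(x, y):
    """Jaccard index of two sets; 0.0 for two empty sets (a convention chosen here)."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0

def knn_predict(train_sets, train_labels, query, k, similarity=jaccard):
    """Label `query` by majority vote among its k most similar training sets."""
    # Rank training examples from most to least similar to the query.
    ranked = sorted(
        range(len(train_sets)),
        key=lambda i: similarity(query, train_sets[i]),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    # On equal vote counts, most_common keeps first-seen order,
    # so the label of the more similar neighbor wins.
    return votes.most_common(1)[0][0]

train_sets = [{"love", "you", "baby"}, {"love", "me", "baby"}, {"rain", "road", "night"}]
train_labels = [0, 0, 1]
print(knn_predict(train_sets, train_labels, {"love", "baby", "night"}, k=3))  # 0
```

A brute-force scan like this compares every query against every training example, which is one reason a multiprocessing option is useful for the full dataset.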

Initialize evaluation with the following

In [None]:
evaluator = Evaluator(test_labels, predictions)

and evaluate with accuracy as well as micro- and macro-averaged precision, recall and $F_1$ as follows:

In [None]:
evaluator.accuracy()
evaluator.micro_precision()
evaluator.micro_recall()
evaluator.micro_fscore()
evaluator.macro_precision()
evaluator.macro_recall()
evaluator.macro_f1()
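The difference between micro and macro averaging can be sketched with the textbook definitions below. The function `averaged_scores` is hypothetical and is not meant to reproduce the exact counting rules of the project's `Evaluator`, which are not shown in this notebook. Macro $F_1$ is taken here as the mean of the per-class $F_1$ values, which is one common convention.

```python
from collections import Counter

def averaged_scores(gold, pred, average="micro"):
    """Return (precision, recall, F1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label but was missed

    def prf(t, f_pos, f_neg):
        precision = t / (t + f_pos) if t + f_pos else 0.0
        recall = t / (t + f_neg) if t + f_neg else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    if average == "micro":
        # Pool the counts of all classes, then compute the scores once.
        return prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute the scores per class, then take the unweighted mean.
    classes = sorted(set(gold) | set(pred))
    per_class = [prf(tp[c], fp[c], fn[c]) for c in classes]
    return tuple(sum(col) / len(classes) for col in zip(*per_class))

gold = [0, 0, 1, 2]
pred = [0, 1, 1, 1]
print(averaged_scores(gold, pred, "micro"))  # (0.5, 0.5, 0.5)
print(averaged_scores(gold, pred, "macro"))
```

Micro averaging weights every example equally, while macro averaging weights every class equally, so rare artists influence the macro scores much more strongly.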

------------
# Experiments

In [None]:
# Define filepaths
filepath_train = "./data/songs_train.txt"
filepath_test = "./data/songs_test.txt"

Experiments 1) and 2) are conducted with $k=4$. Experiment 3) is conducted with $k=4$ for Sørensen-Dice and $k \in \{1,2,...,5\}$ for Jaccard.

The distance metrics used here are:

- Tversky index: $S(X,Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X \setminus Y| + \beta |Y \setminus X|}$

- Overlap coefficient: $\operatorname{overlap}(X,Y) = \frac{|X \cap Y|}{\min(|X|,|Y|)}$

- Jaccard index: $J(X,Y) = \frac{|X \cap Y|}{|X \cup Y|} = \frac{|X \cap Y|}{|X|+|Y|-|X \cap Y|}$

- Sørensen-Dice coefficient: $DSC(X,Y) = \frac{2|X \cap Y|}{|X|+|Y|}$
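The four formulas translate directly into operations on Python sets. The sketch below is independent of the project's own implementation; returning 0.0 when a denominator is zero is a convention chosen here, not something the formulas specify.

```python
def tversky(x, y, alpha, beta):
    inter = len(x & y)
    denom = inter + alpha * len(x - y) + beta * len(y - x)
    return inter / denom if denom else 0.0

def overlap(x, y):
    smaller = min(len(x), len(y))
    return len(x & y) / smaller if smaller else 0.0

def jaccard(x, y):
    union = len(x | y)
    return len(x & y) / union if union else 0.0

def dsc(x, y):
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 0.0

x = {"love", "me", "say", "you"}
y = {"all", "you", "need", "is", "love"}

print(overlap(x, y))  # 2 shared words / smaller set of 4 = 0.5
print(jaccard(x, y), tversky(x, y, alpha=1, beta=1))   # both 2/7
print(dsc(x, y), tversky(x, y, alpha=0.5, beta=0.5))   # both 4/9
```

Two special cases are worth noting: the Tversky index equals the Jaccard index for $\alpha=\beta=1$ and the Sørensen-Dice coefficient for $\alpha=\beta=0.5$. In addition, $DSC = \frac{2J}{1+J}$ increases monotonically with $J$, so both measures rank neighbors in the same order, which would be consistent with the identical Jaccard and Sørensen-Dice scores reported in the experiments.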



## 1) 10k training/100 test sizes

### Loading data and initializing KNN

In [2]:
# Read datasets
dataset_train = Preprocessor(filepath=filepath_train, read_limit=10000)
dataset_test = Preprocessor(filepath=filepath_test, read_limit=100)
# Create numerical representations of labels for mapping
label_to_num = {artist:i for i, artist in enumerate(set(dataset_train.artists) | set(dataset_test.artists))}
num_to_label = {value:key for key, value in label_to_num.items()}
# Initiate Knn classifier
training_examples = [BOW(tok) for tok in dataset_train.tokenized]
training_labels = [label_to_num[label] for label in dataset_train.artists]
classifier = Knn(training_examples, training_labels)

### Jaccard

In [3]:
# Test predictions
test_examples = dataset_test.BOW()
test_labels = [label_to_num[label] for label in dataset_test.artists]
predictions = classifier.predict(test_examples, k=4, measure="jaccard")
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.07
Micro Precision:
 0.3181818181818182
Micro Recall:
 0.07
Micro F-Score:
 0.11475409836065574


### Sørensen-Dice

In [4]:
predictions = classifier.predict(test_examples, k=4, measure="dsc")
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.07
Micro Precision:
 0.3181818181818182
Micro Recall:
 0.07
Micro F-Score:
 0.11475409836065574


### Overlap coefficient

In [5]:
predictions = classifier.predict(test_examples, k=4, measure="overlap")
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.01
Micro Precision:
 0.047619047619047616
Micro Recall:
 0.01
Micro F-Score:
 0.01652892561983471


### Different Tversky settings
Experiments with different settings of $\alpha$ and $\beta$ for the Tversky index
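To see what shifting weight between $\alpha$ and $\beta$ does, consider the toy example below with $\beta = 1 - \alpha$, as in the settings that follow. It compares a query set $X$ against a candidate that is a subset of it and a candidate that is a superset of it. The sets are made up, and which argument the `Knn` class treats as $X$ (query or training example) is not shown in this notebook, so the roles here are an assumption.

```python
def tversky(x, y, alpha, beta):
    inter = len(x & y)
    denom = inter + alpha * len(x - y) + beta * len(y - x)
    return inter / denom if denom else 0.0

query = {"love", "you", "baby", "night"}          # plays the role of X
subset = {"love", "you"}                          # misses two query words
superset = query | {"rain", "road", "city", "home"}  # adds four extra words

for alpha in (0.1, 0.5, 0.9):
    beta = 1 - alpha
    s_sub = tversky(query, subset, alpha, beta)
    s_sup = tversky(query, superset, alpha, beta)
    print(f"alpha={alpha:.1f} beta={beta:.1f} subset={s_sub:.3f} superset={s_sup:.3f}")
```

With a small $\alpha$, words of $X$ that are missing from $Y$ cost little, so the subset candidate scores higher. With a large $\alpha$ the superset candidate scores higher, and at $\alpha=\beta=0.5$ the two candidates tie in this example.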

#### $\alpha=0.1$, $\beta=0.9$

In [6]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.1, beta=0.9)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.03
Micro Precision:
 0.15789473684210525
Micro Recall:
 0.03
Micro F-Score:
 0.050420168067226885


#### $\alpha=0.2$, $\beta=0.8$

In [7]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.2, beta=0.8)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.3125
Micro Recall:
 0.05
Micro F-Score:
 0.08620689655172414


#### $\alpha=0.3$, $\beta=0.7$

In [8]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.3, beta=0.7)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.3125
Micro Recall:
 0.05
Micro F-Score:
 0.08620689655172414


#### $\alpha=\frac{1}{3}$, $\beta=\frac{2}{3}$

In [9]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=1/3, beta=2/3)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.2777777777777778
Micro Recall:
 0.05
Micro F-Score:
 0.08474576271186442


#### $\alpha=0.4$, $\beta=0.6$

In [10]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.4, beta=0.6)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.22727272727272727
Micro Recall:
 0.05
Micro F-Score:
 0.08196721311475409


#### $\alpha=0.6$, $\beta=0.4$

In [11]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.6, beta=0.4)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.20833333333333334
Micro Recall:
 0.05
Micro F-Score:
 0.08064516129032258


#### $\alpha=\frac{2}{3}$, $\beta=\frac{1}{3}$

In [12]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=2/3, beta=1/3)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.20833333333333334
Micro Recall:
 0.05
Micro F-Score:
 0.08064516129032258


#### $\alpha=0.7$, $\beta=0.3$

In [13]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.7, beta=0.3)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.06
Micro Precision:
 0.2222222222222222
Micro Recall:
 0.06
Micro F-Score:
 0.09448818897637795


#### $\alpha=0.8$, $\beta=0.2$

In [14]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.8, beta=0.2)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.06
Micro Precision:
 0.18181818181818182
Micro Recall:
 0.06
Micro F-Score:
 0.09022556390977443


#### $\alpha=0.9$, $\beta=0.1$

In [15]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.9, beta=0.1)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.01
Micro Precision:
 0.03125
Micro Recall:
 0.01
Micro F-Score:
 0.015151515151515152


#### $\alpha=1$, $\beta=0$

In [17]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=1, beta=0)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.01
Micro Precision:
 0.03225806451612903
Micro Recall:
 0.01
Micro F-Score:
 0.015267175572519083


#### $\alpha=0$, $\beta=1$

In [16]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0, beta=1)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.02
Micro Precision:
 0.11764705882352941
Micro Recall:
 0.02
Micro F-Score:
 0.03418803418803419


## 2) 20k training/100 test sizes

### Loading data and initializing KNN

In [None]:
# Read datasets
dataset_train = Preprocessor(filepath=filepath_train, read_limit=20000)
dataset_test = Preprocessor(filepath=filepath_test, read_limit=100)
# Create numerical representations of labels for mapping
label_to_num = {artist:i for i, artist in enumerate(set(dataset_train.artists) | set(dataset_test.artists))}
num_to_label = {value:key for key, value in label_to_num.items()}
# Initiate Knn classifier
training_examples = [BOW(tok) for tok in dataset_train.tokenized]
training_labels = [label_to_num[label] for label in dataset_train.artists]
classifier = Knn(training_examples, training_labels)

### Jaccard

In [None]:
# Test predictions
test_examples = dataset_test.BOW()
test_labels = [label_to_num[label] for label in dataset_test.artists]
predictions = classifier.predict(test_examples, k=4, measure="jaccard")
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.09
Micro Precision:
 0.42857142857142855
Micro Recall:
 0.09
Micro F-Score:
 0.1487603305785124


### Sørensen-Dice

In [None]:
# Test predictions
test_examples = dataset_test.BOW()
test_labels = [label_to_num[label] for label in dataset_test.artists]
predictions = classifier.predict(test_examples, k=4, measure="dsc")
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.09
Micro Precision:
 0.42857142857142855
Micro Recall:
 0.09
Micro F-Score:
 0.1487603305785124


### Overlap coefficient

In [None]:
# Test predictions
test_examples = dataset_test.BOW()
test_labels = [label_to_num[label] for label in dataset_test.artists]
predictions = classifier.predict(test_examples, k=4, measure="overlap")
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.01
Micro Precision:
 0.05555555555555555
Micro Recall:
 0.01
Micro F-Score:
 0.016949152542372885


### Tversky index

In [None]:
predictions = classifier.predict(test_examples, k=4, measure="tversky", alpha=0.7, beta=0.3)
# Run evaluation of the algorithm's performance
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05
Micro Precision:
 0.25
Micro Recall:
 0.05
Micro F-Score:
 0.08333333333333334


## 3) Full training/test sizes (46,120/5,765)

### Loading data and initializing KNN

In [None]:
# Read datasets
dataset_train = Preprocessor(filepath=filepath_train, read_limit=46120)
dataset_test = Preprocessor(filepath=filepath_test, read_limit=5765)
# Create numerical representations of labels for mapping
label_to_num = {artist:i for i, artist in enumerate(set(dataset_train.artists) | set(dataset_test.artists))}
num_to_label = {value:key for key, value in label_to_num.items()}
# Initiate Knn classifier
training_examples = [BOW(tok) for tok in dataset_train.tokenized]
training_labels = [label_to_num[label] for label in dataset_train.artists]
classifier = Knn(training_examples, training_labels)

### Sørensen-Dice

In [4]:
# Create test examples and labels, predict, and evaluate the algorithm's performance
test_examples = [BOW(tok) for tok in dataset_test.tokenized]
test_labels = [label_to_num[label] for label in dataset_test.artists]
predictions = classifier.predict(test_examples, k=4, measure="dsc")
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05013009540329575
Micro Precision:
 0.05029585798816568
Micro Recall:
 0.05013009540329575
Micro F-Score:
 0.05021283989227695


### Jaccard
Experiments with $k \in \{1,2,...,5\}$

In [5]:
# Read dataset
filepath_train = "./data/songs_train.txt"
dataset_train = Preprocessor(filepath=filepath_train, read_limit=46120)
filepath_test = "./data/songs_test.txt"
dataset_test = Preprocessor(filepath=filepath_test, read_limit=5765)
# Create numerical representations of labels for mapping
label_to_num = {artist:i for i, artist in enumerate(set(dataset_train.artists) | set(dataset_test.artists))}
num_to_label = {value:key for key, value in label_to_num.items()}
# Initiate Knn classifier
training_examples = [BOW(tok) for tok in dataset_train.tokenized]
training_labels = [label_to_num[label] for label in dataset_train.artists]
classifier = Knn(training_examples, training_labels)
# Create test examples and labels
test_examples = [BOW(tok) for tok in dataset_test.tokenized]
test_labels = [label_to_num[label] for label in dataset_test.artists]

In [2]:
predictions = classifier.predict(test_examples, k=4, measure="jaccard")
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05013009540329575
Micro Precision:
 0.05029585798816568
Micro Recall:
 0.05013009540329575
Micro F-Score:
 0.05021283989227695


In [6]:
predictions = classifier.predict(test_examples, k=3, measure="jaccard")
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.049262792714657416
Micro Precision:
 0.04941708717591787
Micro Recall:
 0.049262792714657416
Micro F-Score:
 0.04933981931897151


In [7]:
predictions = classifier.predict(test_examples, k=2, measure="jaccard")
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.04700780572419774
Micro Precision:
 0.04715503741082304
Micro Recall:
 0.04700780572419774
Micro F-Score:
 0.0470813064628214


In [8]:
predictions = classifier.predict(test_examples, k=1, measure="jaccard")
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.04700780572419774
Micro Precision:
 0.04715503741082304
Micro Recall:
 0.04700780572419774
Micro F-Score:
 0.0470813064628214


In [9]:
predictions = classifier.predict(test_examples, k=5, measure="jaccard")
evaluator = Evaluator(test_labels, predictions)
print("Accuracy:\n", evaluator.accuracy())
print("Micro Precision:\n", evaluator.micro_precision())
print("Micro Recall:\n", evaluator.micro_recall())
print("Micro F-Score:\n", evaluator.micro_fscore())

Accuracy:
 0.05151777970511708
Micro Precision:
 0.051688130873651233
Micro Recall:
 0.05151777970511708
Micro F-Score:
 0.051602814698983576
