This Python program takes the maharshipandya/spotify-tracks-dataset dataset and trains genre-specific Gaussian Naive Bayes classifiers that predict a song's popularity. Popularity is divided into three categories: low, medium, and high. The entire script can be run by calling the full_feature_workflow() function after the other functions are defined.

This script works with Python 3.11.3 and uses the datasets, numpy, and sklearn libraries. The sklearn library is used only for evaluation of the model by generating accuracy, recall, precision, and f1 scores.

Each function is described in detail.

In [5]:
from datasets import load_dataset
import numpy as np
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score

The load_data function takes the dataset name and returns two datasets: the model data, which includes the track id, the popularity and genre columns, and all continuous features used to train the classifier, and a lookup dataset that contains the track id, the track name, the names of the artists on the track, and the name of the album.

In [6]:
def load_data(dataset_name="maharshipandya/spotify-tracks-dataset"):
    """
    Load and preprocess the dataset.
    """
    model = load_dataset(dataset_name, split="train")
    model = model.remove_columns("track_id")
    model = model.rename_column("Unnamed: 0", "track_id")
    lookup = model
    model = model.remove_columns(["artists", "album_name", "track_name", "explicit",
                                  "key", "time_signature", "mode"])
    lookup = lookup.select_columns(["track_id", "track_name", "artists", "album_name"])
    
    return model, lookup

The encode_genres function takes the dataset and encodes each genre from a string to a unique integer. It also returns a dictionary that maps genre names to integers, which can be inverted to decode the genres later.

In [7]:
def encode_genres(dataset):
    """
    Encode genres into numeric values and add as a new column.

    Parameters:
    dataset (Dataset): The dataset to be processed.

    Returns:
    tuple: The dataset with the new encoded column added, and the genre-to-id mapping (dict).
    """
    dataset = dataset.rename_column('track_genre', 'track_genre_name')
    # Extract unique genres
    unique_genres = set(dataset['track_genre_name'])
    # Create a mapping from genre to unique integer
    genre_to_id = {genre: idx for idx, genre in enumerate(unique_genres)}

    # Function to encode genre and add as a new column
    def add_encoded_genre(record):
        record['track_genre'] = genre_to_id[record['track_genre_name']]
        return record

    # Apply the function to each record
    encoded_dataset = dataset.map(add_encoded_genre)
    return encoded_dataset, genre_to_id
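
To illustrate how the returned dictionary can be used for decoding, here is a minimal sketch on a toy list of genre names. The names toy_genres and id_to_genre are illustrative and not part of the script, and sorting the unique genres is an assumption made here so that the ids are the same on every run.

```python
# Toy genre column; the real column holds 114 distinct genre names.
toy_genres = ["rock", "jazz", "rock", "samba", "jazz"]

# Map each unique genre to an integer id (sorted here so the ids are stable).
genre_to_id = {genre: idx for idx, genre in enumerate(sorted(set(toy_genres)))}

# Invert the mapping to decode ids back to genre names.
id_to_genre = {idx: genre for genre, idx in genre_to_id.items()}

encoded = [genre_to_id[g] for g in toy_genres]
decoded = [id_to_genre[i] for i in encoded]

print(encoded)                # [1, 0, 1, 2, 0]
print(decoded == toy_genres)  # True
```

The same inversion is what full_feature_workflow performs before printing genre names.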

The bin_popularity function splits the 0-100 popularity score of every song in a given dataset into three roughly equal-width bins, 0 (low), 1 (medium), and 2 (high), and returns the dataset with binned popularity ratings.

In [8]:
def bin_popularity(dataset):
    """
    Bin the popularity into three categories: low (0), medium (1), and high (2).
    """
    def binning(record):
        popularity = record['popularity']
        if popularity < 34:   # Assuming 0-33 as low
            record['popularity'] = 0
        elif popularity < 67: # Assuming 34-66 as medium
            record['popularity'] = 1
        else:                # Assuming 67-100 as high
            record['popularity'] = 2
        return record

    binned_dataset = dataset.map(binning)
    return binned_dataset
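
As a quick check of the thresholds, the sketch below applies the same rule to the boundary scores. The helper bin_value is a hypothetical standalone copy of the inner binning logic, written for plain integers rather than dataset records.

```python
def bin_value(popularity):
    """Return 0 (low), 1 (medium), or 2 (high) for a 0-100 popularity score."""
    if popularity < 34:
        return 0
    if popularity < 67:
        return 1
    return 2


# Scores on either side of each threshold.
boundary_scores = [0, 33, 34, 66, 67, 100]
binned = [bin_value(p) for p in boundary_scores]
print(binned)  # [0, 0, 1, 1, 2, 2]
```

Note that the bins are equal in score width, not in the number of tracks they contain, so the three classes can be heavily imbalanced within a genre.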

The split_dataset_by_genre function takes the dataset and returns a dictionary, where the key is the genre identifier and the value is the genre-specific dataset. Given the entire dataset of 114 genres, this will return a dictionary with keys 0 to 113 and their respective genres, each with 1000 tracks.

In [9]:
def split_dataset_by_genre(dataset):
    """
    Split the dataset into separate datasets for each genre.

    Parameters:
    dataset (Dataset): The model dataset to be split.

    Returns:
    dict: A dictionary where keys are genres and values are datasets for each genre.
    """
    genre_datasets = {}

    # Get unique genres
    unique_genres = set(record['track_genre'] for record in dataset)

    # Split the dataset by genre
    for genre in unique_genres:
        genre_dataset = dataset.filter(lambda record: record['track_genre'] == genre)

        # Store each genre-specific dataset
        genre_datasets[genre] = genre_dataset

    return genre_datasets
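
The loop above calls filter once per genre, so the full dataset is scanned once for every genre. A possible alternative, sketched here on plain Python dictionaries rather than on a Dataset, is to collect the row indices of each genre in a single pass. The names records and indices_by_genre are illustrative; with the datasets library, such index lists could then be passed to Dataset.select, which is not shown here.

```python
from collections import defaultdict

# Toy records standing in for dataset rows.
records = [
    {"track_id": 0, "track_genre": 2},
    {"track_id": 1, "track_genre": 0},
    {"track_id": 2, "track_genre": 2},
]

# One pass over the rows: remember which row indices belong to each genre.
indices_by_genre = defaultdict(list)
for row_index, record in enumerate(records):
    indices_by_genre[record["track_genre"]].append(row_index)

print(dict(indices_by_genre))  # {2: [0, 2], 0: [1]}
```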

The normalize_features function normalizes the given dataset using min/max normalization. It should be invoked after the dataset is divided by genre, because training the model on combinations of genres results in poor model performance; the model performs well only when trained on tracks that belong to the same genre. The normalize_features function is invoked by the preprocess_genre_datasets function, which applies it to each genre-specific dataset.

In [10]:
def normalize_features(model_ds):
    """
    Normalize the continuous features in the modeling dataset to a range between 0 and 1.
    """
    features_to_normalize = ['duration_ms', 'loudness', 'tempo',
                             'danceability', 'speechiness', 'acousticness',
                             'instrumentalness', 'energy', 'liveness', 'valence'] 

    # Calculate min and max for each feature
    min_max_values = {}
    for feature in features_to_normalize:
        values = [record[feature] for record in model_ds]
        min_max_values[feature] = (min(values), max(values))

    # Normalize the features
    def normalize(record):
        for feature in features_to_normalize:
            min_val, max_val = min_max_values[feature]
            record[feature] = (record[feature] - min_val) / (max_val - min_val) if max_val != min_val else 0
        return record

    normalized_dataset = model_ds.map(normalize)

    return normalized_dataset
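
The min/max rule can be seen on a single toy feature column. This is a minimal sketch; the helper min_max_normalize and the loudness values are made up for illustration and are not part of the script.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


toy_loudness = [-20.0, -10.0, -5.0, 0.0]
print(min_max_normalize(toy_loudness))     # [0.0, 0.5, 0.75, 1.0]
print(min_max_normalize([3.0, 3.0, 3.0]))  # [0, 0, 0]
```

In the workflow as written, the minimum and maximum are computed over each whole genre dataset before it is split, so the test rows also contribute to the scaling.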

As mentioned, the preprocess_genre_datasets function takes the dictionary of genre-specific datasets, with the genre as the key and the dataset as the value, normalizes each dataset, and returns a dictionary of the same form.

In [11]:
def preprocess_genre_datasets(genre_datasets):
    """
    Apply normalization to continuous features for each genre-specific dataset.

    Parameters:
    genre_datasets (dict): Dictionary with genres as keys and datasets as values.

    Returns:
    dict: Updated dictionary with preprocessed datasets.
    """
    processed_datasets = {}

    for genre, dataset in genre_datasets.items():
        # Apply normalization to the dataset
        processed_dataset = normalize_features(dataset)

        # Store the preprocessed dataset in the dictionary
        processed_datasets[genre] = processed_dataset

    return processed_datasets

The split_test_dataset function takes a given dataset and divides it into a training and testing set. This same function can be used for the training / validation split as well.

In [12]:
def split_test_dataset(dataset, test_size=0.1):
    """
    Split a dataset into training and test subsets.

    Parameters:
    dataset (Dataset): The dataset to be split.
    test_size (float): Fraction of rows held out for testing (default 0.1, a 90/10 split).
    
    Returns:
    Two datasets: the training split and the test split.
    """
    # Shuffle and split the dataset into train and test subsets
    split = dataset.train_test_split(test_size = test_size)
    train_set = split['train']
    test_set = split['test']

    return train_set, test_set
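
Because the split is random and no seed is passed, the reported metrics can vary from run to run. The sketch below shows the idea of a seeded 90/10 split on a plain list; split_rows is a hypothetical helper, not the datasets implementation. In the script itself, passing a seed argument to train_test_split would be one way to get the same effect.

```python
import random


def split_rows(rows, test_size=0.1, seed=0):
    """Shuffle a copy of rows with a fixed seed and hold out a test fraction."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(1000))  # stands in for the 1000 tracks of one genre
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))  # 900 100

# The same seed reproduces the same split.
print(split_rows(rows) == (train_rows, test_rows))  # True
```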

The train_popularity_classifier_continuous function trains a Gaussian Naive Bayes classifier to predict the popularity of a song within a genre from several continuous features. It first populates X_train with the values of each feature. It then uses the popularity labels of the training data to group the feature values by popularity class, with three distinct classes: low (0), medium (1), and high (2). Finally, it calculates the mean and variance of each feature within each popularity class.

It also calculates the log of the prior probability of each class, that is, the probability that a track from the given genre has a particular popularity. The function returns a dictionary of parameters, where the key is the popularity class and the value holds the per-feature means and variances and the log prior of that class.

In [13]:
def train_popularity_classifier_continuous(train_set, features):
    """
    Train a Gaussian Naive Bayes classifier for popularity prediction using continuous features.

    Parameters:
    train_set (Dataset): The training dataset.
    features (list): List of feature names to be used for training.

    Returns:
    dict: Trained model parameters.
    """
    # Extract popularity classes and features from the dataset
    X_train = []
    for record in train_set:
        record_features = []
        for feature in features:
            if feature in record:
                record_features.append(record[feature])
        X_train.append(record_features)
    Y_train = [record['popularity'] for record in train_set]

    # Calculate mean, variance, and prior probabilities for each popularity class
    unique_classes = set(Y_train)
    model_params = {}

    for cls in unique_classes:
        class_params = {}
        class_indices = [i for i, y in enumerate(Y_train) if y == cls]
        class_data = [X_train[i] for i in class_indices]
        for i, feature in enumerate(features):
            means = np.mean([x[i] for x in class_data], axis=0)
            variances = np.var([x[i] for x in class_data], axis=0)
            class_params[feature] = (means, variances)  # Use feature name as key
        prior_probability = np.log(len(class_indices) / len(Y_train))
        model_params[cls] = {'params': class_params, 'prior': prior_probability}

    return model_params
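
The quantities estimated during training can be checked by hand on a toy example with one feature and two classes. This is a minimal sketch using only the standard library; the values and labels are made up. statistics.pvariance is used because np.var computes the population variance by default.

```python
import math
from statistics import fmean, pvariance

# One toy feature column and its popularity labels.
values = [0.2, 0.4, 0.6, 0.8, 1.0]
labels = [0, 0, 1, 1, 1]

params = {}
for cls in sorted(set(labels)):
    class_values = [v for v, y in zip(values, labels) if y == cls]
    params[cls] = {
        "mean": fmean(class_values),
        "variance": pvariance(class_values),  # population variance, like np.var
        "log_prior": math.log(len(class_values) / len(labels)),
    }

for cls, p in params.items():
    print(cls, round(p["mean"], 3), round(p["variance"], 4), round(p["log_prior"], 3))
```

Class 0 has two of the five samples, so its prior is 0.4 and its log prior is log(0.4); class 1 has the remaining three, giving log(0.6).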

The predict_popularity_continuous function takes the parameters calculated by the train_popularity_classifier_continuous function, the input feature values, and the names of the input features. It uses those parameters to predict the popularity class of a given test song.

It first initializes a dictionary of log probabilities to 0. For each class, it calculates the log-likelihood contribution of each feature using the Gaussian probability density function, skipping any feature whose variance is 0, sums the contributions of all features, and adds the log prior. The class with the highest resulting score, which is the log posterior up to an additive constant, is selected as the predicted class.

In [14]:
def predict_popularity_continuous(model_params, input_features, features):
    log_probabilities = {popularity: 0 for popularity in model_params.keys()}
    
    for popularity, params in model_params.items():
        means_variances = params['params']
        log_prior = params['prior']
        log_likelihood = 0
        for feature, value in zip(features, input_features):
            mean, variance = means_variances[feature]
            if variance != 0:
                log_likelihood += -0.5 * np.log(2 * np.pi * variance) - 0.5 * ((value - mean) ** 2 / variance)
        log_probabilities[popularity] = log_likelihood + log_prior
    
    return max(log_probabilities, key=log_probabilities.get)
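
The decision rule can be traced on a toy model with a single feature. This is a sketch under made-up parameters; gaussian_log_pdf, toy_model, and predict are illustrative names and not part of the script.

```python
import math


def gaussian_log_pdf(value, mean, variance):
    """Log of the Gaussian density, the per-feature term summed during prediction."""
    return -0.5 * math.log(2 * math.pi * variance) - 0.5 * (value - mean) ** 2 / variance


# Hypothetical per-class parameters for one feature.
toy_model = {
    0: {"mean": 0.3, "variance": 0.01, "log_prior": math.log(0.4)},
    1: {"mean": 0.8, "variance": 0.04, "log_prior": math.log(0.6)},
}


def predict(value):
    """Return the class with the highest log prior plus log likelihood."""
    scores = {
        cls: p["log_prior"] + gaussian_log_pdf(value, p["mean"], p["variance"])
        for cls, p in toy_model.items()
    }
    return max(scores, key=scores.get)


print(predict(0.35), predict(0.9))  # 0 1
```

Working in log space turns the product of per-feature densities into a sum, which avoids numerical underflow when many small probabilities are multiplied.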

The full_feature_workflow function is essentially the main function. It uses the functions defined above to train a model on 90% of each genre's tracks and test it on the remaining 10%. It outputs the average accuracy, precision, recall, and f1 score across genres, as well as those metrics for the 10 top-performing and 10 worst-performing genres according to their respective models.

In [15]:
def full_feature_workflow():
    # Load Data
    dataset_name = "maharshipandya/spotify-tracks-dataset"
    dataset, lookup = load_data(dataset_name)

    # Encode Genres and Bin Popularity
    dataset, genre_to_id = encode_genres(dataset)
    dataset = bin_popularity(dataset)

    # Split by Genre
    genre_datasets = split_dataset_by_genre(dataset)

    # Preprocess Genre Datasets
    preprocessed_datasets = preprocess_genre_datasets(genre_datasets)

    # Define the continuous features used by the model
    features_cont = ['duration_ms', 'loudness', 'tempo', 'danceability', 'energy', 'speechiness', 'acousticness', 'instrumentalness', 'liveness', 'valence']

    # Initialize containers for final evaluation metrics
    final_results = {}
    total_accuracy, total_recall, total_precision, total_f1 = 0, 0, 0, 0
    num_genres = 0

    # Per-genre training and testing
    for genre, dataset in preprocessed_datasets.items():
        # Split into training and test sets
        train_set, test_set = split_test_dataset(dataset)

        # Train Models
        model_params = train_popularity_classifier_continuous(train_set, features_cont)

        # Evaluate on Test Set
        y_test = [record['popularity'] for record in test_set]
        y_pred = [predict_popularity_continuous(model_params, [record[feature] for feature in features_cont], features_cont) for record in test_set]

        # Calculate metrics
        accuracy = accuracy_score(y_test, y_pred)
        recall = recall_score(y_test, y_pred, average='macro', zero_division=0)
        precision = precision_score(y_test, y_pred, average='macro', zero_division=0)
        f1 = f1_score(y_test, y_pred, average='macro', zero_division=0)
        final_results[genre] = (format(accuracy, '.2f'), format(recall, '.2f'), format(precision, '.2f'), format(f1, '.2f'))
        
        # Accumulate total metrics for averaging
        total_accuracy += accuracy
        total_recall += recall
        total_precision += precision
        total_f1 += f1
        num_genres += 1
        
    # Calculate average metrics
    avg_accuracy = format(total_accuracy / num_genres, '.2f')
    avg_recall = format(total_recall / num_genres, '.2f')
    avg_precision = format(total_precision / num_genres, '.2f')
    avg_f1 = format(total_f1 / num_genres, '.2f')
    print(f"Average Metrics across all genres - Accuracy: {avg_accuracy}, Recall: {avg_recall}, Precision: {avg_precision}, F1 Score: {avg_f1}\n")
    
    id_to_genre_name = {v: k for k, v in genre_to_id.items()}
    
    # Sort genres by accuracy and print top 10 and bottom 10
    sorted_genres = sorted(final_results.items(), key=lambda x: float(x[1][0]), reverse=True)  # Sort by accuracy (stored as a formatted string)
    print("Top 10 Genres:")
    for genre_id, metrics in sorted_genres[:10]:
        genre_name = id_to_genre_name.get(genre_id, "Unknown Genre")
        print(f"{genre_name}: Accuracy: {metrics[0]}, Recall: {metrics[1]}, Precision: {metrics[2]}, F1 Score: {metrics[3]}")

    print("\nBottom 10 Genres:")
    for genre_id, metrics in sorted_genres[-10:]:
        genre_name = id_to_genre_name.get(genre_id, "Unknown Genre")
        print(f"{genre_name}: Accuracy: {metrics[0]}, Recall: {metrics[1]}, Precision: {metrics[2]}, F1 Score: {metrics[3]}")
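
Since recall, precision, and f1 are macro-averaged, every popularity class counts equally regardless of how many tracks it contains. The sketch below uses a made-up label distribution (not taken from the dataset) to show how a model that always predicts the majority class can reach high accuracy while its macro recall stays at one third. The helper macro_recall is a hand-written stand-in for recall_score with average='macro'.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for y in y_true if y == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical test set: 96 low, 3 medium, and 1 high popularity track.
y_true = [0] * 96 + [1] * 3 + [2] * 1
# A model that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                                # 0.96
print(round(macro_recall(y_true, y_pred), 2))  # 0.33
```

For this reason, accuracy alone can overstate how well a genre's model separates the three classes.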

In [16]:
# Execute the main workflow
full_feature_workflow()

Average Metrics across all genres - Accuracy: 0.66, Recall: 0.46, Precision: 0.46, F1 Score: 0.43

Top 10 Genres:
forro: Accuracy: 1.00, Recall: 1.00, Precision: 1.00, F1 Score: 1.00
iranian: Accuracy: 1.00, Recall: 1.00, Precision: 1.00, F1 Score: 1.00
sertanejo: Accuracy: 1.00, Recall: 1.00, Precision: 1.00, F1 Score: 1.00
samba: Accuracy: 0.99, Recall: 0.75, Precision: 0.99, F1 Score: 0.83
mpb: Accuracy: 0.97, Recall: 0.49, Precision: 0.49, F1 Score: 0.49
romance: Accuracy: 0.96, Recall: 0.48, Precision: 0.50, F1 Score: 0.49
pagode: Accuracy: 0.96, Recall: 0.33, Precision: 0.33, F1 Score: 0.33
tango: Accuracy: 0.96, Recall: 0.49, Precision: 0.49, F1 Score: 0.49
brazil: Accuracy: 0.94, Recall: 0.68, Precision: 0.68, F1 Score: 0.68
detroit-techno: Accuracy: 0.94, Recall: 0.65, Precision: 0.59, F1 Score: 0.61

Bottom 10 Genres:
acoustic: Accuracy: 0.41, Recall: 0.45, Precision: 0.48, F1 Score: 0.36
j-rock: Accuracy: 0.40, Recall: 0.31, Precision: 0.40, F1 Score: 0.30
folk: Accuracy: 0.