# 1. Setting Up the Working Environment

In [None]:
from google.colab import drive
drive.mount("/content/drive/")

%cd '/content/drive/MyDrive/DS340W'

# 2. Feature Extraction from Audio Files

The function below uses the librosa library to extract musical features from an audio file: tempo, chroma (pitch-class) features, mel-frequency cepstral coefficients (MFCCs), and spectral contrast. Each time-varying feature is then summarized by its per-row mean and standard deviation, so every song yields a vector of the same length.

In [17]:
import librosa
import numpy as np

# Function to calculate multiple features of a song
def extract_features(file_path):
    # Load at the file's native sampling rate, mixed down to mono
    src, sr = librosa.load(file_path, sr=None, mono=True)
    tempo, _ = librosa.beat.beat_track(y=src, sr=sr)
    chroma_stft = librosa.feature.chroma_stft(y=src, sr=sr)
    mfcc = librosa.feature.mfcc(y=src, sr=sr)
    spectral_contrast = librosa.feature.spectral_contrast(y=src, sr=sr)

    # Previously we flattened the feature matrices; now we only compute
    # per-row statistics over time (axis=1)
    chroma_stft_mean = np.mean(chroma_stft, axis=1)
    chroma_stft_std = np.std(chroma_stft, axis=1)
    mfcc_mean = np.mean(mfcc, axis=1)
    mfcc_std = np.std(mfcc, axis=1)
    spectral_contrast_mean = np.mean(spectral_contrast, axis=1)
    spectral_contrast_std = np.std(spectral_contrast, axis=1)

    # Create a feature vector containing the statistics
    features_vector = np.hstack([tempo, chroma_stft_mean, chroma_stft_std,
                                 mfcc_mean, mfcc_std,
                                 spectral_contrast_mean, spectral_contrast_std])
    return features_vector

This function reduces each audio file to a fixed-length numeric vector that machine learning algorithms can process.
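To make the length of that vector concrete, the sketch below repeats the summarization step on random stand-in matrices instead of real audio. The row counts are assumptions based on librosa's default settings (12 chroma bins, 20 MFCCs, 7 spectral-contrast bands); `n_frames`, the `tempo` value, and the `summarize` helper are hypothetical and exist only for illustration.

```python
import numpy as np

# Stand-in matrices with the row counts librosa is assumed to produce by default.
# The number of frames (columns) depends on the length of the audio clip.
rng = np.random.default_rng(0)
n_frames = 500          # hypothetical clip length in frames
tempo = 120.0           # hypothetical tempo in beats per minute
chroma_stft = rng.random((12, n_frames))
mfcc = rng.random((20, n_frames))
spectral_contrast = rng.random((7, n_frames))


def summarize(matrix):
    """Return the per-row mean and standard deviation, concatenated."""
    return np.hstack([matrix.mean(axis=1), matrix.std(axis=1)])


features_vector = np.hstack([
    tempo,
    summarize(chroma_stft),
    summarize(mfcc),
    summarize(spectral_contrast),
])

# 1 + 2*12 + 2*20 + 2*7 = 79 entries, regardless of n_frames
print(features_vector.shape)
```

Under these assumptions the vector has 79 entries, which agrees with the 79 feature columns printed in Section 4.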

# 3. Assembling the Dataset

Here, we gather the extracted features from both AI-generated and human-created music files, compiling them into a comprehensive dataset with corresponding labels.

In [18]:
import os
import matplotlib.pyplot as plt
import librosa
import librosa.display
import pandas as pd

#Create an empty list to store features and labels
data = []

# Traverse the music created by AI
for file in os.listdir('/content/drive/MyDrive/DS340W/AI_Music'):
     file_path = os.path.join('/content/drive/MyDrive/DS340W/AI_Music', file)
     features_vector = extract_features(file_path)
     # Create a tuple containing feature vectors and labels
     sample = (features_vector, 'AI')
     data.append(sample)

# Traverse music created by humans
for file in os.listdir('/content/drive/MyDrive/DS340W/Human_Music'):
     file_path = os.path.join('/content/drive/MyDrive/DS340W/Human_Music', file)
     features_vector = extract_features(file_path)
     # Create a tuple containing feature vectors and labels
     sample = (features_vector, 'Human')
     data.append(sample)

# We first separate the feature vectors and labels
features_list = [sample[0] for sample in data] # Get all feature vectors here
labels_list = [sample[1] for sample in data] # Get all labels here

# Now convert the feature vector to a DataFrame
df_features = pd.DataFrame(features_list)

# Add label as a new column of DataFrame
df_features['label'] = labels_list

df = df_features

This code walks through both music folders and turns each file into one row of a labeled DataFrame, ready for model training.
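The two loops above pass every entry returned by `os.listdir` to `extract_features`, including any non-audio file that happens to sit in the folder. A minimal sketch of one way to guard against that is shown here; the helper names and the extension list are hypothetical, not part of the original notebook.

```python
import os

# Hypothetical set of extensions; adjust to the formats actually stored in the folders.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac", ".ogg")


def list_audio_files(directory, extensions=AUDIO_EXTENSIONS):
    """Return sorted full paths of regular files in `directory` with an audio extension."""
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.lower().endswith(extensions):
            paths.append(path)
    return paths


def labeled_paths(folders):
    """Turn a {label: directory} mapping into a list of (path, label) pairs."""
    return [(path, label)
            for label, directory in folders.items()
            for path in list_audio_files(directory)]
```

With these helpers, both loops could be replaced by a single pass over `labeled_paths({'AI': '/content/drive/MyDrive/DS340W/AI_Music', 'Human': '/content/drive/MyDrive/DS340W/Human_Music'})`, calling `extract_features` on each path.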

# 4. Preparing Data for Model Training

This step splits the dataset into training and test sets, so that the models learn from one portion of the data and are evaluated on a separate, held-out portion.

In [19]:
from sklearn.model_selection import train_test_split

# df is the DataFrame of features and labels built in Section 3
X = df.drop('label', axis=1)  # features
y = df['label']  # labels

# Split the data set, 80% for training and 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Check the size of the split data set
print("Training set size:", X_train.shape)
print("Test set size:", X_test.shape)

Training set size: (80, 79)
Test set size: (20, 79)


The dataset contains 100 songs in total, so the split leaves 80 for training and 20 for evaluation. With only 20 test songs, each one accounts for 5 percentage points of accuracy.
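With a test set this small, the class proportions in the split can drift from those of the full dataset. One option is the `stratify` argument of `train_test_split`, which keeps the class ratio the same in both parts. The sketch below uses synthetic data; the 50/50 class balance and the `_demo` names are assumptions, since the notebook does not state how many files each folder holds.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 100 samples, 79 features,
# with a hypothetical 50/50 class balance.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 79))
y_demo = np.array(["AI"] * 50 + ["Human"] * 50)

# stratify=y_demo preserves the class ratio in both the train and test parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

labels, counts = np.unique(y_te, return_counts=True)
print(dict(zip(labels, counts)))
```

On the real data this would amount to adding `stratify=y` to the existing call.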

# 5. Random Forest Classification

This cell trains a Random Forest classifier on the training data and evaluates it on the test data. Random Forest is an ensemble learning method that combines the votes of many decision trees and is often effective for classification tasks.

In [20]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

# Initialize the random forest classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42)

# Train the model
clf.fit(X_train, y_train)

# Use the model to make predictions
y_pred = clf.predict(X_test)

# Evaluate the model
print("Random Forest Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

Random Forest Accuracy: 1.0
              precision    recall  f1-score   support

          AI       1.00      1.00      1.00        12
       Human       1.00      1.00      1.00         8

    accuracy                           1.00        20
   macro avg       1.00      1.00      1.00        20
weighted avg       1.00      1.00      1.00        20



# 6. Logistic Regression Analysis

A Logistic Regression model is applied to the dataset to predict whether the music was created by AI or a human, and its performance metrics are calculated.

In [21]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Initialize the logistic regression model
log_reg = LogisticRegression(max_iter=1000)

# Train the model
log_reg.fit(X_train, y_train)

# Use the model to make predictions
y_pred_log_reg = log_reg.predict(X_test)

# Evaluate the model
print("Logistic Regression Accuracy:", accuracy_score(y_test, y_pred_log_reg))
print(classification_report(y_test, y_pred_log_reg))

Logistic Regression Accuracy: 0.8
              precision    recall  f1-score   support

          AI       0.83      0.83      0.83        12
       Human       0.75      0.75      0.75         8

    accuracy                           0.80        20
   macro avg       0.79      0.79      0.79        20
weighted avg       0.80      0.80      0.80        20
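Logistic regression is sensitive to feature scale, and the feature vector mixes quantities on different scales (tempo in beats per minute alongside MFCC and chroma statistics). A common option is to standardize the features inside a pipeline, so the scaler is fitted on the training data only. The sketch below uses synthetic data from `make_classification`; the `_demo` names are hypothetical, and whether scaling changes the results on the music dataset has not been tested here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the music dataset (100 x 79)
X_demo, y_demo = make_classification(
    n_samples=100, n_features=79, n_informative=10, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# The scaler learns means and variances from X_tr only, then is reused on X_te
scaled_log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_log_reg.fit(X_tr, y_tr)

demo_accuracy = scaled_log_reg.score(X_te, y_te)
print("Accuracy on synthetic data:", demo_accuracy)
```

The same pipeline pattern applies to the SVM and K-NN models in the following sections, which also depend on distances between feature vectors.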



# 7. Support Vector Machine (SVM) Classifier

The code demonstrates the training of a Support Vector Machine (SVM) with a linear kernel to classify the music pieces. The SVM's performance is then evaluated.

In [22]:
from sklearn.svm import SVC

svm_model = SVC(kernel='linear')  # You can choose other kernels such as 'rbf', 'poly', etc.
svm_model.fit(X_train, y_train)
y_pred_svm = svm_model.predict(X_test)

print("SVM Accuracy:", accuracy_score(y_test, y_pred_svm))
print(classification_report(y_test, y_pred_svm))


SVM Accuracy: 0.85
              precision    recall  f1-score   support

          AI       0.85      0.92      0.88        12
       Human       0.86      0.75      0.80         8

    accuracy                           0.85        20
   macro avg       0.85      0.83      0.84        20
weighted avg       0.85      0.85      0.85        20



The SVM classifier reached an accuracy of 0.85. Precision was nearly identical for the two classes (0.85 for AI, 0.86 for human), but recall was considerably higher for AI-created music (0.92 versus 0.75): the model found almost all of the AI tracks while missing a quarter of the human ones.
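The figures in the SVM report can be reproduced by hand from the underlying confusion counts. The counts below are inferred from the printed supports and recalls (11 of 12 AI samples and 6 of 8 human samples classified correctly); the notebook itself does not print a confusion matrix, so treat them as a reconstruction.

```python
# Confusion counts inferred from the SVM classification report
ai_as_ai, ai_as_human = 11, 1        # the 12 AI test samples
human_as_human, human_as_ai = 6, 2   # the 8 human test samples


def precision(true_pos, false_pos):
    """Share of predictions for a class that were correct."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Share of a class's true samples that were found."""
    return true_pos / (true_pos + false_neg)


ai_precision = precision(ai_as_ai, human_as_ai)
ai_recall = recall(ai_as_ai, ai_as_human)
human_precision = precision(human_as_human, ai_as_human)
human_recall = recall(human_as_human, human_as_ai)
accuracy = (ai_as_ai + human_as_human) / 20

print(f"AI:    precision={ai_precision:.2f} recall={ai_recall:.2f}")
print(f"Human: precision={human_precision:.2f} recall={human_recall:.2f}")
print(f"Accuracy: {accuracy:.2f}")
```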

# 8. K-Nearest Neighbors (K-NN) Classification

This part uses the K-Nearest Neighbors algorithm to classify the music samples. It determines the label of a sample based on the majority vote of its nearest neighbors.

In [23]:
from sklearn.neighbors import KNeighborsClassifier

knn_model = KNeighborsClassifier(n_neighbors=3)  # The number of neighbors can be tuned
knn_model.fit(X_train, y_train)
y_pred_knn = knn_model.predict(X_test)

print("K-NN Accuracy:", accuracy_score(y_test, y_pred_knn))
print(classification_report(y_test, y_pred_knn))


K-NN Accuracy: 0.8
              precision    recall  f1-score   support

          AI       0.83      0.83      0.83        12
       Human       0.75      0.75      0.75         8

    accuracy                           0.80        20
   macro avg       0.79      0.79      0.79        20
weighted avg       0.80      0.80      0.80        20



With an accuracy of 0.8, K-NN matched Logistic Regression: precision and recall were both 0.83 for AI-created music and both 0.75 for human-created music. It trails the ensemble methods and the decision tree, but it still classified 16 of the 20 test samples correctly.
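The majority-vote rule described at the start of this section can be written out in a few lines of plain Python. This is an illustrative sketch using Euclidean distance on made-up 2-D points, not scikit-learn's implementation; `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2-D data: three "AI" points near the origin, two "Human" points near (1, 1)
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
train_labels = ["AI", "AI", "AI", "Human", "Human"]

print(knn_predict(train_points, train_labels, (0.15, 0.15)))
print(knn_predict(train_points, train_labels, (0.95, 0.9)))
```

For the second query, the three nearest neighbors are the two "Human" points and one "AI" point, so the vote goes to "Human" by two to one.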

# 9. Decision Tree Classifier

This cell uses a Decision Tree classifier, which predicts the class of a sample by learning simple decision rules from the training data.

In [24]:
from sklearn.tree import DecisionTreeClassifier

tree_model = DecisionTreeClassifier(random_state=42)
tree_model.fit(X_train, y_train)
y_pred_tree = tree_model.predict(X_test)

print("Decision Tree Accuracy:", accuracy_score(y_test, y_pred_tree))
print(classification_report(y_test, y_pred_tree))


Decision Tree Accuracy: 0.95
              precision    recall  f1-score   support

          AI       1.00      0.92      0.96        12
       Human       0.89      1.00      0.94         8

    accuracy                           0.95        20
   macro avg       0.94      0.96      0.95        20
weighted avg       0.96      0.95      0.95        20



The decision tree classifier also performed very well, with an accuracy of 0.95. It achieved perfect recall for human-created music and perfect precision for AI-created music, which means its single error was an AI track labeled as human. This is very close to the ensemble methods, suggesting that the dataset might have distinct, well-defined patterns.
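The learned decision rules can be printed with `sklearn.tree.export_text`. The sketch below fits a tree on six made-up samples with two features, where only the first feature separates the classes; the data and the names `f0`, `f1`, and `demo_tree` are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: f0 separates the classes, f1 takes the same values in both
X_toy = [[0.0, 5.0], [1.0, 3.0], [2.0, 4.0],
         [10.0, 5.0], [11.0, 3.0], [12.0, 4.0]]
y_toy = ["AI", "AI", "AI", "Human", "Human", "Human"]

demo_tree = DecisionTreeClassifier(random_state=42)
demo_tree.fit(X_toy, y_toy)

# One line per split or leaf, indented by depth
rules = export_text(demo_tree, feature_names=["f0", "f1"])
print(rules)
```

Applied to the notebook's model, `export_text(tree_model)` would list the splits it learned, which is one way to inspect which of the 79 features drive its predictions.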

# 10. Gradient Boosting Classifier

Gradient Boosting is an ensemble technique that builds trees one at a time, fitting each new tree to correct the errors made by the trees before it.

In [25]:
from sklearn.ensemble import GradientBoostingClassifier

gb_model = GradientBoostingClassifier(random_state=42)
gb_model.fit(X_train, y_train)
y_pred_gb = gb_model.predict(X_test)

print("Gradient Boosting Accuracy:", accuracy_score(y_test, y_pred_gb))
print(classification_report(y_test, y_pred_gb))


Gradient Boosting Accuracy: 1.0
              precision    recall  f1-score   support

          AI       1.00      1.00      1.00        12
       Human       1.00      1.00      1.00         8

    accuracy                           1.00        20
   macro avg       1.00      1.00      1.00        20
weighted avg       1.00      1.00      1.00        20
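The idea stated at the start of this section, each new tree correcting the mistakes of the previous ones, can be sketched in plain Python. For simplicity, this illustration boosts one-split "stumps" on a 1-D regression problem with squared error, which differs from the classification loss that `GradientBoostingClassifier` optimizes; the data, `fit_stump`, and `learning_rate` value are all hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on 1-D inputs that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(ys, preds):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 0.9, 1.1, 3.0, 3.2, 2.9]

learning_rate = 0.5
preds = [sum(ys) / len(ys)] * len(ys)   # start from the mean of the targets
errors = [mse(ys, preds)]

for _ in range(5):
    # Each stump is fitted to what the current ensemble still gets wrong
    residuals = [y - p for y, p in zip(ys, preds)]
    stump = fit_stump(xs, residuals)
    preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    errors.append(mse(ys, preds))

print([round(e, 4) for e in errors])
```

The training error shrinks after every round because each stump targets the current residuals rather than the original labels.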



# 11. Conclusion

Ranked by accuracy on the test set, the classifiers are:

Gradient Boosting & Random Forest (1.0 accuracy)

Decision Tree (0.95 accuracy)

SVM (0.85 accuracy)

Logistic Regression & K-NN (0.8 accuracy)
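Because the test set has only 20 samples, it helps to translate these accuracies into counts of misclassified songs. The short sketch below does so using the accuracies reported in the earlier sections; the variable names are illustrative.

```python
# Test accuracies as reported in Sections 5 to 10
accuracies = {
    "Random Forest": 1.0,
    "Gradient Boosting": 1.0,
    "Decision Tree": 0.95,
    "SVM": 0.85,
    "Logistic Regression": 0.8,
    "K-NN": 0.8,
}
n_test = 20  # size of the test set

# sorted() is stable, so ties keep their insertion order
ranking = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
misclassified = {name: round((1 - acc) * n_test) for name, acc in accuracies.items()}

for name, acc in ranking:
    print(f"{name:<20} accuracy={acc:.2f}  misclassified={misclassified[name]}/{n_test}")
```

Seen this way, the gap between SVM and Logistic Regression or K-NN is a single test song, and the gap between the decision tree and the ensemble methods is also a single song.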



On this test set, the ensemble methods (Gradient Boosting and Random Forest) are the most accurate for this particular task. The decision tree is nearly as accurate and may offer advantages in terms of simplicity and interpretability.

The Logistic Regression model, while not as accurate as the ensemble methods or the decision tree, still provides reasonable accuracy and might be preferred when interpretability or computational simplicity is more critical.


It is important to note that while accuracy is a useful metric, it is not the only consideration. The choice of model can depend on various factors including interpretability, computational cost, and the specific requirements of the task at hand. Additionally, care must be taken to ensure that the models are not overfitting to the training data and that they will generalize well to new, unseen data. Therefore, further evaluation using cross-validation or on an independent test set would be advisable before deploying any of these models into production.
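As a starting point for the cross-validation suggested above, the sketch below runs stratified 5-fold cross-validation with `cross_val_score`. It uses synthetic data so that it is self-contained; the `_demo` names, the choice of five folds, and the choice of Random Forest as the model are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in with the same shape as the music dataset (100 x 79)
X_demo, y_demo = make_classification(
    n_samples=100, n_features=79, n_informative=10, random_state=42
)

# Each sample is used for testing exactly once across the five folds
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
clf_demo = RandomForestClassifier(n_estimators=100, random_state=42)

scores = cross_val_score(clf_demo, X_demo, y_demo, cv=cv, scoring="accuracy")
print("Fold accuracies:", scores)
print("Mean:", scores.mean(), "Std:", scores.std())
```

On the real data, passing `X` and `y` from Section 4 in place of the synthetic arrays would give five accuracy estimates per model instead of one, along with a sense of how much they vary between folds.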