# Music Emotion Recognition Using Backpropagation Neural Network and ABC Algorithm

**Brief**:   
In this project, we explore how to better classify an arbitrary piece of audio into one of the emotion regions of Russell's Emotion Quadrant.  

![Russell's Emotion Quadrant](https://www.researchgate.net/profile/Yi-Hsuan-Yang/publication/254004106/figure/fig1/AS:298208942149638@1448109960909/The-2D-valence-arousal-emotion-space-Russell-1980-the-position-of-the-affective.pngv)
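
To make the target labels concrete, here is a minimal sketch of how a point in the valence-arousal plane maps to a quadrant label. It assumes both axes are centered at 0 and follows the usual convention for Russell's model (Q1: positive valence and high arousal, Q2: negative valence and high arousal, Q3: negative valence and low arousal, Q4: positive valence and low arousal). The function name `quadrant_of`, the example coordinates, and the handling of points lying exactly on an axis are illustrative assumptions, not part of the dataset.

```python
def quadrant_of(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) point to a quadrant label.

    Assumption: both axes are centered at 0, and points exactly on an
    axis are assigned to the non-negative side.
    """
    if arousal >= 0:
        return "Q1" if valence >= 0 else "Q2"
    return "Q4" if valence >= 0 else "Q3"


# Hypothetical coordinates, chosen only to land in each quadrant.
examples = {
    "happy / excited": (0.7, 0.6),
    "angry / tense": (-0.6, 0.7),
    "sad / depressed": (-0.7, -0.5),
    "calm / relaxed": (0.6, -0.6),
}
for name, (v, a) in examples.items():
    print(f"{name}: {quadrant_of(v, a)}")
```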



## Procedures Overview

1. Data Processing
2. Featurization using Librosa
3. Model training
4. Evaluation

In [None]:
##### xiyah #####
# mount Google Drive,
# since all the data was uploaded to Google Drive.
from google.colab import drive
drive.mount('/content/drive')


# install librosa
# after installation can comment out
# "-q > /dev/null" helps hide the install message
!pip install librosa -q > /dev/null
##### xiyah #####

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# import packages
import os
import librosa
import glob
import pandas as pd
import numpy as np

# used for storing features into files
import pickle

# used for calculating statistical features
from scipy.stats import skew, kurtosis

## Small Demo Section

In [None]:
##### xiyah #####
# demo on a piece of audio
# !! Wherever a path is used (data loading and elsewhere),
# comment out my folder path and add your own instead of deleting it. !!
audio_demo_dir = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/MER_audio_taffc_dataset/Q1/MT0000040632.mp3"

# Load the audio as a waveform `y` - 1D Numpy Array, shape (662823,)
#      represents the amplitude of the sound signal at different times
# Store the sampling rate as `sr`  - expressed in Hertz (Hz),
#      is the number of samples of audio carried per second.
#      It defines how many data points are recorded in the audio per second.
# Note: librosa.load accepts a `duration` argument (in seconds) to limit how much audio is read.
#      For example, librosa.load(audio_dir, duration=15) loads only the first 15 seconds.
y_demo, sr_demo = librosa.load(audio_demo_dir)
print(f"waveform: {y_demo}")
print(f"sampling rate: {sr_demo}")
##### xiyah #####

waveform: [ 1.4363665e-14  1.7338858e-14 -1.7440724e-15 ... -2.9188269e-07
 -3.2757399e-07  5.5034837e-07]
sampling rate: 22050


In [None]:
# compute the chromagram (extract the chroma spectral features of the music clip)
chroma_stft_demo = librosa.feature.chroma_stft(y=y_demo, sr=sr_demo)
print(f"Shape of feature chroma stft for demo: {chroma_stft_demo.shape}")
print(chroma_stft_demo)

Shape of feature chroma stft for demo: (12, 1295)
[[0.35400018 0.31895828 0.15409946 ... 0.13093439 0.08016849 0.4637019 ]
 [0.58096725 0.44877502 0.24988233 ... 0.20016244 0.18250564 0.6199982 ]
 [0.6362316  0.24465618 0.15558887 ... 0.1040313  0.12916775 0.47586495]
 ...
 [0.44917685 0.27947944 0.35670373 ... 0.8553974  0.9214992  0.913139  ]
 [0.3240944  0.28253257 0.27381223 ... 0.3078815  0.37076196 0.5504805 ]
 [0.24673125 0.2669974  0.15903513 ... 0.14373718 0.10572162 0.3213499 ]]


Comments: the shape (12, 1295) means the chromagram has 12 pitch classes (one per row) and 1295 time frames; each frame has its own 12-dimensional chroma feature vector.
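
The waveform length, the sampling rate, and the number of frames are linked by simple arithmetic. As a rough check, assuming `librosa.feature.chroma_stft` ran with librosa's default `hop_length=512` and centered framing (the demo cell does not override them), the frame count is `1 + len(y) // hop_length`. The variable names below are illustrative.

```python
n_samples = 662823      # length of y_demo, from the shape noted in the demo cell
sr = 22050              # sampling rate printed by the demo cell
hop_length = 512        # assumed: librosa's default hop length

duration_s = n_samples / sr
n_frames = 1 + n_samples // hop_length   # frame count with centered framing

print(f"duration: {duration_s:.2f} s")
print(f"frames:   {n_frames}")
```

This gives a clip of about 30.06 seconds and 1295 frames, which agrees with the second dimension of the chromagram.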

## Data Preparation

In [None]:
##### xiyah #####
# I would like to know which top-100 features Panda R. and his group used
# (uses features.csv and top100_features.csv)
# !! Wherever a path is used (data loading and elsewhere),
# comment out my folder path and add your own instead of deleting it. !!
top100 = pd.read_csv("/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/Features - Audio MER/top100_features.csv")
##### xiyah #####

In [None]:
##### xiyah #####
print("Take a look on top100 dataset")
top100.head()
##### xiyah #####

Take a look on top100 dataset


Unnamed: 0,SongID,F0525,F1152,ORIGINAL-TEXTURE-Musical Layers (Mean),F1166,F0133,F0136,F0507,ORIGINAL-TEXTURE-Musical Layers (Std),F0404,...,F0011,F0905,F0899,ORIGINAL-EXPRESSIVE_TECHNIQUES-Glissando Length (Std),F0328,F1101,F0782,F0923,ORIGINAL-EXPRESSIVE_TECHNIQUES-Glissando Slope (Std),Quadrant
0,MT0000004637,4.6433,9.2878,0.92722,700750.0,0.46626,3.1386,0.16951,0.74902,0.085051,...,4.1456,0.32337,0.47764,0.074137,0.70133,5.8488,0.57884,0.2881,1044.4,Q3
1,MT0000011357,1.5461,29.366,0.98409,1817800.0,0.3293,1.7539,0.30058,0.70333,0.084305,...,0.86509,-0.4802,-0.44169,0.0,0.43577,5.5128,0.65528,-0.26923,0.0,Q2
2,MT0000011975,2.1486,40.437,1.0521,1277900.0,0.33701,2.247,0.16411,0.72762,0.10638,...,4.2906,-0.2206,-0.27754,0.0,0.44737,4.7568,0.68354,-0.095701,0.0,Q2
3,MT0000040632,4.6632,25.739,2.6838,600090.0,0.65021,3.9683,0.075348,1.1579,0.085314,...,5.2102,-0.10119,-0.10105,0.038622,0.65352,6.1356,0.64734,0.03194,937.15,Q1
4,MT0000044741,3.5518,24.134,1.8191,1186400.0,0.38171,2.4845,0.18646,0.89834,0.10498,...,3.6752,0.051116,-0.039488,0.09964,0.77563,5.7044,0.66908,0.46226,1162.1,Q3


In [None]:
##### xiyah #####
# audios with top100 features
# type(features)
# pandas.core.indexes.base.Index
features = top100.columns[1:-1]
print(f"shape of top100 dataset: {top100.shape}")
print(f"top100 features names: {features}")
##### xiyah #####

shape of top100 dataset: (900, 102)
top100 features names: Index(['F0525', 'F1152', 'ORIGINAL-TEXTURE-Musical Layers (Mean)', 'F1166',
       'F0133', 'F0136', 'F0507', 'ORIGINAL-TEXTURE-Musical Layers (Std)',
       'F0404', 'ORIGINAL-EXPRESSIVE_TECHNIQUES-Tremolo Notes in Cents (Mean)',
       'F0945', 'F1151', 'F1497', 'F0529',
       'ORIGINAL-TEXTURE-State Transitions ML1 -> ML0 (Per Sec)', 'F0151',
       'F1194', 'F0121', 'F0927',
       'ORIGINAL-EXPRESSIVE_TECHNIQUES-Vibrato Extent (Std)',
       'ORIGINAL-EXPRESSIVE_TECHNIQUES-Tremolo Higher Notes Coverage (C4+)',
       'F0438', 'F1489', 'F0247', 'F0021', 'F0246', 'F1089', 'F0469',
       'ORIGINAL-EXPRESSIVE_TECHNIQUES-Tremolo Notes in Cents (Max)',
       'ORIGINAL-EXPRESSIVE_TECHNIQUES-Vibrato Rate (Kurtosis)', 'F0772',
       'ORIGINAL-EXPRESSIVE_TECHNIQUES-Vibrato Base Freq (Kurtosis)',
       'ORIGINAL-EXPRESSIVE_TECHNIQUES-Vibrato to Non Vibrato Notes Ratio',
       'F0909', 'F1071', 'F0466', 'F1077',
       'ORIGINAL

In [None]:
##### xiyah #####
feature_lookup = pd.read_csv("/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/Features - Audio MER/features.csv", sep=';')
##### xiyah #####

In [None]:
##### xiyah #####
print("Take a look on feature_lookup dataset")
feature_lookup.head()
##### xiyah #####

Take a look on feature_lookup dataset


Unnamed: 0,Feature,Name,Toolbox,Category,Family,Parent,Description
0,F0001,Root-Mean-Square Energy (mean),MIR Toolbox 1.6.1,Dynamics,Root-Mean-Square Energy,Root-Mean-Square Energy,The root mean square energy calculates the glo...
1,F0002,Root-Mean-Square Energy (std),MIR Toolbox 1.6.1,Dynamics,Root-Mean-Square Energy,Root-Mean-Square Energy,The root mean square energy calculates the glo...
2,F0003,Root-Mean-Square Energy (skewness),MIR Toolbox 1.6.1,Dynamics,Root-Mean-Square Energy,Root-Mean-Square Energy,The root mean square energy calculates the glo...
3,F0004,Root-Mean-Square Energy (kurtosis),MIR Toolbox 1.6.1,Dynamics,Root-Mean-Square Energy,Root-Mean-Square Energy,The root mean square energy calculates the glo...
4,F0005,Root-Mean-Square Energy (max),MIR Toolbox 1.6.1,Dynamics,Root-Mean-Square Energy,Root-Mean-Square Energy,The root mean square energy calculates the glo...


In [None]:
##### xiyah #####
# feature_lookup
lookup_table = feature_lookup.iloc[:, 0:2]
print(f"shape of features dataset: {feature_lookup.shape}")
print("features and their meanings:")
lookup_table.head()
##### xiyah #####

shape of features dataset: (1603, 7)
features and their meanings:


Unnamed: 0,Feature,Name
0,F0001,Root-Mean-Square Energy (mean)
1,F0002,Root-Mean-Square Energy (std)
2,F0003,Root-Mean-Square Energy (skewness)
3,F0004,Root-Mean-Square Energy (kurtosis)
4,F0005,Root-Mean-Square Energy (max)


In [None]:
##### xiyah #####
# join features and feature_lookup to see what exactly are the top 100 features.
features_df = pd.DataFrame(features, columns=['Feature'])
merged_df = pd.merge(features_df, feature_lookup, how='left', on='Feature')[["Feature", "Name", "Toolbox", "Category", "Description"]]
merged_df.head()

Unnamed: 0,Feature,Name,Toolbox,Category,Description
0,F0525,MFCC1 (mean),Marsyas,Tone Color,to be added later
1,F1152,FFT Spectrum - Average Power Spectrum (median),PsySound 3,Tone Color,to be added later
2,ORIGINAL-TEXTURE-Musical Layers (Mean),,,,
3,F1166,FFT Spectrum - Spectral 2nd Moment (median),PsySound 3,Tone Color,to be added later
4,F0133,Spectral Skewness (std),MIR Toolbox 1.6.1,Tone Color,Coefficient of skewness. The third central mom...


In [None]:
##### xiyah #####
# Find how many features have no corresponding name
num_null = merged_df["Name"].isna().sum()
print(f"There are {num_null} NaNs in the 'Name' columns, out of 100 in total")

# examine which features they are (rows with NaN in Name)
merged_df[merged_df["Name"].isna()].head()
##### xiyah #####

There are 29 NaNs in the 'Name' columns, out of 100 in total


Unnamed: 0,Feature,Name,Toolbox,Category,Description
2,ORIGINAL-TEXTURE-Musical Layers (Mean),,,,
7,ORIGINAL-TEXTURE-Musical Layers (Std),,,,
9,ORIGINAL-EXPRESSIVE_TECHNIQUES-Tremolo Notes i...,,,,
14,ORIGINAL-TEXTURE-State Transitions ML1 -> ML0 ...,,,,
19,ORIGINAL-EXPRESSIVE_TECHNIQUES-Vibrato Extent ...,,,,


Looking back at the literature, all of these features come from the three algorithms described by Renato Panda and co-authors in section 3.3.3 of the paper _"MUSICAL TEXTURE AND EXPRESSIVITY FEATURES FOR MUSIC EMOTION RECOGNITION"_. Since they cannot be generated by Librosa, we decided to drop these features for now.
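
The NaN check relies on how a left join behaves: rows of the left table that find no match keep `NaN` in the columns that come from the right table. A minimal sketch on toy data (the feature IDs and names below are made up for illustration and are not taken from the dataset):

```python
import pandas as pd

# Toy stand-ins for the top-100 list and the lookup table (hypothetical values).
toy_features = pd.DataFrame({"Feature": ["F0001", "ORIGINAL-X (Mean)", "F0002"]})
toy_lookup = pd.DataFrame({
    "Feature": ["F0001", "F0002", "F0003"],
    "Name": ["Energy (mean)", "Energy (std)", "Energy (max)"],
})

# A left join keeps every row of toy_features; unmatched rows get NaN in "Name".
toy_merged = pd.merge(toy_features, toy_lookup, how="left", on="Feature")

unmatched = toy_merged[toy_merged["Name"].isna()]
matched = toy_merged[toy_merged["Name"].notna()]

print(f"unmatched: {len(unmatched)}, matched: {len(matched)}")
print(unmatched["Feature"].tolist())
```

Using `.notna()` (or `~mask`) is the usual way to select the complementary rows of a boolean mask.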

In [None]:
##### xiyah #####
# examine those with Name != NaN (use ~ to invert a boolean mask)
merged_df[~merged_df["Name"].isna()].head()
##### xiyah #####

Unnamed: 0,Feature,Name,Toolbox,Category,Description
0,F0525,MFCC1 (mean),Marsyas,Tone Color,to be added later
1,F1152,FFT Spectrum - Average Power Spectrum (median),PsySound 3,Tone Color,to be added later
3,F1166,FFT Spectrum - Spectral 2nd Moment (median),PsySound 3,Tone Color,to be added later
4,F0133,Spectral Skewness (std),MIR Toolbox 1.6.1,Tone Color,Coefficient of skewness. The third central mom...
5,F0136,Spectral Skewness (max),MIR Toolbox 1.6.1,Tone Color,Coefficient of skewness. The third central mom...


In [None]:
folder_path_list = []
for i in range(1,5):
  folder_path_list.append('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/MER_audio_taffc_dataset/{}'.format("Q"+str(i)))


**After examining the features, we noticed that only a few features overlap between the work of Renato Panda and his team and what librosa provides. Therefore, we decided to extract our own features with librosa, 19 in total, instead of the top 100 we originally wanted to use.**

## Featurization

In [None]:
# ##### yulu, xiyah, xiangyu #####

# ######!!!!!! Don't run this code !!!!!!######

# # This chunk of code is designed to run ONLY one-time
# # to store all the data. If you run this, the original data will be
# # replaced.

# # store

# #### function: get_audio_files ####
# # return audio files
# def get_audio_files(folder_path):
#     audio_files = []
#     audio_extensions = ['*.mp3']
#     for extension in audio_extensions:
#         audio_files.extend(glob.glob(os.path.join(folder_path, extension)))
#     return audio_files

# #### change to process q1-q4 simultaneously ####
# folder_path_list = []
# for i in range(1,5):
#   folder_path_list.append('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/MER_audio_taffc_dataset/{}'.format("Q"+str(i)))

# #### initialization ####
# label_index = 0

# # record the progress
# from tqdm.notebook import tqdm

# #### run for loop to do featurization and store ####
# for folder_path in tqdm(folder_path_list, desc='Processing folders'):
#   label_index += 1
#   files = get_audio_files(folder_path)

#   #### initialization ####
#   # store waveform x's
#   xs=[]
#   # store sampling rate sr's
#   srs=[]

#   # get spectral feature and rhythm feature
#   chroma_stft_demos=[]
#   chroma_cqt_demos=[]
#   chroma_vqt_demos=[]
#   chroma_cens_demos=[] # added a list, Haoruo

#   melspectrogram_demos=[]
#   mfcc_demos=[]
#   rms_demos=[]

#   spectral_centroid_demos=[]
#   spectral_bandwidth_demos=[]
#   spectral_contrast_demos=[]
#   spectral_flatness_demos=[]
#   spectral_rolloff_demos=[]

#   # poly_features = [] -- not used: it only returns the coefficients of an nth-order polynomial fitted to the columns of a spectrogram.
#   tonnetz_demos=[]
#   zero_crossing_rate_demos=[]

#   tempo = []
#   tempogram = []
#   fourier_tempogram = []
#   tempogram_ratio = []

#   labels = []

#   if label_index <= 2:
#     continue

#   # for files in each of the Q1, Q2, Q3, Q4 folder
#   for file in tqdm(files, desc=f'Folder {label_index}/{len(folder_path_list)}: Processing files', leave=False):
#     # load audio file
#     # x: waveform; sr: sampling rate
#     x, sr = librosa.load(file)
#     xs.append(x)
#     srs.append(sr)

#     # convert audio to features
#     chroma_stft_demos.append(librosa.feature.chroma_stft(y=x, sr=sr))
#     chroma_cqt_demos.append(librosa.feature.chroma_cqt(y=x, sr=sr))
#     chroma_cens_demos.append(librosa.feature.chroma_cens(y=x, sr=sr))
#     chroma_vqt_demos.append(librosa.feature.chroma_vqt(y=x, sr=sr,intervals='ji5'))  #4

#     melspectrogram_demos.append(librosa.feature.melspectrogram(y=x, sr=sr))
#     mfcc_demos.append(librosa.feature.mfcc(y=x, sr=sr))
#     rms_demos.append(librosa.feature.rms(y=x))                                       #7

#     spectral_centroid_demos.append(librosa.feature.spectral_centroid(y=x, sr=sr))
#     spectral_bandwidth_demos.append(librosa.feature.spectral_bandwidth(y=x, sr=sr))
#     spectral_contrast_demos.append(librosa.feature.spectral_contrast(y=x, sr=sr))
#     spectral_flatness_demos.append(librosa.feature.spectral_flatness(y=x))
#     spectral_rolloff_demos.append(librosa.feature.spectral_rolloff(y=x, sr=sr))     #12

#     tonnetz_demos.append(librosa.feature.tonnetz(y=x, sr=sr))
#     zero_crossing_rate_demos.append(librosa.feature.zero_crossing_rate(y=x))        #14

#     tempo.append(librosa.feature.tempo(y=x, sr=sr))
#     tempogram.append(librosa.feature.tempogram(y=x, sr=sr))
#     fourier_tempogram.append(librosa.feature.fourier_tempogram(y=x, sr=sr))
#     tempogram_ratio.append(librosa.feature.tempogram_ratio(y=x, sr=sr))       #18

#     # # store converted features into list

#     # chroma_stft_demos.append(chroma_stft_demo)
#     # chroma_cqt_demos.append(chroma_cqt_demo)
#     # chroma_vqt_demos.append(chroma_vqt_demo)
#     # chroma_cens_demos.append(chroma_cens_demo) # added one, Haoruo

#     # melspectrogram_demos.append(melspectrogram_demo)
#     # mfcc_demos.append(mfcc_demo)
#     # rms_demos.append(rms_demo)

#     # spectral_centroid_demos.append(spectral_centroid_demo)
#     # spectral_bandwidth_demos.append(spectral_bandwidth_demo)
#     # spectral_contrast_demos.append(spectral_contrast_demo)
#     # spectral_flatness_demos.append(spectral_flatness_demo)
#     # spectral_rolloff_demos.append(spectral_rolloff_demo)

#     # tonnetz_demos.append(tonnetz_demo)
#     # zero_crossing_rate_demos.append(zero_crossing_rate_demo)

#     # tempo.append(tempo_file)
#     # tempogram.append(tempogram_file)
#     # fourier_tempogram.append(fourier_tempogram_file)
#     # tempogram_ratio.append(tempogram_ratio_file)
#     labels.append(label_index)                                               #19

#   # outer for loop
#   features_dict = {
#     'xs': xs,
#     'srs': srs,
#     'chroma_stft_demos': chroma_stft_demos,
#     'chroma_cqt_demos': chroma_cqt_demos,
#     'chroma_vqt_demos': chroma_vqt_demos,
#     'chroma_cens_demos': chroma_cens_demos,
#     'melspectrogram_demos': melspectrogram_demos,
#     'mfcc_demos': mfcc_demos,
#     'rms_demos': rms_demos,
#     'spectral_centroid_demos': spectral_centroid_demos,
#     'spectral_bandwidth_demos': spectral_bandwidth_demos,
#     'spectral_contrast_demos': spectral_contrast_demos,
#     'spectral_flatness_demos': spectral_flatness_demos,
#     'spectral_rolloff_demos': spectral_rolloff_demos,
#     'tonnetz_demos': tonnetz_demos,
#     'zero_crossing_rate_demos': zero_crossing_rate_demos,
#     'tempo': tempo,
#     'tempogram': tempogram,
#     'fourier_tempogram': fourier_tempogram,
#     'tempogram_ratio': tempogram_ratio,
#     'labels': labels
#   }

#   # save by segments to release RAM
#   np.save(f"/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q{label_index}.npy", features_dict, allow_pickle=True)

#   del features_dict

#   ##### yulu, xiyah, xiangyu#####

Processing folders:   0%|          | 0/4 [00:00<?, ?it/s]

Folder 3/4: Processing files:   0%|          | 0/225 [00:00<?, ?it/s]

Folder 4/4: Processing files:   0%|          | 0/225 [00:00<?, ?it/s]
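
The labelling logic of the featurization loop can be illustrated without decoding any audio: each file inherits an integer label from its quadrant folder (Q1 gives 1, up to Q4 giving 4). The sketch below builds a temporary directory tree with empty placeholder files; `collect_labeled_files` is a hypothetical helper, not a function from this notebook.

```python
import glob
import os
import tempfile


def collect_labeled_files(root, quadrants=("Q1", "Q2", "Q3", "Q4")):
    """Return (path, label) pairs, where label is 1..4 for Q1..Q4."""
    pairs = []
    for label, quadrant in enumerate(quadrants, start=1):
        pattern = os.path.join(root, quadrant, "*.mp3")
        for path in sorted(glob.glob(pattern)):
            pairs.append((path, label))
    return pairs


# Build a throwaway directory tree with empty placeholder files (not real audio).
with tempfile.TemporaryDirectory() as root:
    for quadrant in ("Q1", "Q2", "Q3", "Q4"):
        os.makedirs(os.path.join(root, quadrant))
        for i in range(2):
            open(os.path.join(root, quadrant, f"clip{i}.mp3"), "w").close()

    pairs = collect_labeled_files(root)
    labels = [label for _, label in pairs]
    print(labels)
```

Keeping the path and the label together in one pair avoids the two lists drifting out of step if a file is skipped.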

In [None]:
#### Xiangyu ####
# Rhythm features

# q1_tempo = []
# q1_tempogram = []
# q1_fourier_tempogram = []
# q1_tempogram_ratio = []

# for demo in q1_files:
#   x, sr = librosa.load(demo)
#   tempo = librosa.feature.tempo(y=x, sr=sr)
#   tempogram = librosa.feature.tempogram(y=x, sr=sr)
#   fourier_tempogram = librosa.feature.fourier_tempogram(y=x, sr=sr)
#   tempogram_ratio = librosa.feature.tempogram_ratio(y=x, sr=sr)

#   q1_xs.append(x)
#   q1_srs.append(sr)

#   q1_tempo.append(tempo)
#   q1_tempogram.append(tempogram)
#   q1_fourier_tempogram.append(fourier_tempogram)
#   q1_tempogram_ratio.append(tempogram_ratio)

#### save the feature array in a separate file ####
# xs and srs + 18 features + 1 label
# features_dict = {
#     'xs': xs,
#     'srs': srs,
#     'chroma_stft_demos': chroma_stft_demos,
#     'chroma_cqt_demos': chroma_cqt_demos,
#     'chroma_vqt_demos': chroma_vqt_demos,
#     'chroma_cens_demos': chroma_cens_demos,
#     'melspectrogram_demos': melspectrogram_demos,
#     'mfcc_demos': mfcc_demos,
#     'rms_demos': rms_demos,
#     'spectral_centroid_demos': spectral_centroid_demos,
#     'spectral_bandwidth_demos': spectral_bandwidth_demos,
#     'spectral_contrast_demos': spectral_contrast_demos,
#     'spectral_flatness_demos': spectral_flatness_demos,
#     'spectral_rolloff_demos': spectral_rolloff_demos,
#     'tonnetz_demos': tonnetz_demos,
#     'zero_crossing_rate_demos': zero_crossing_rate_demos,
#     'tempo': tempo,
#     'tempogram': tempogram,
#     'fourier_tempogram': fourier_tempogram,
#     'tempogram_ratio': tempogram_ratio,
#     'labels': labels
# }


# file_path = '/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/MER_audio_taffc_dataset/audio_features.pkl'
# with open(file_path, 'wb') as file:
#     pickle.dump(features_dict, file)
#### Xiangyu ####

#### xiyah ####
# the pickle file is hard to read: we keep getting a "Ran out of input" error,
# so we decided to save a .npy file instead
# np.save('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features.npy', features_dict, allow_pickle=True)
#### xiyah ####
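
For reference, Python raises `EOFError: Ran out of input` when `pickle.load` is given an empty or truncated file, which is one possible cause of the error mentioned above. The `.npy` route can be sketched as a small round trip, assuming a toy dictionary in place of the real features (all names and values below are hypothetical):

```python
import os
import tempfile

import numpy as np

# Toy feature dictionary with arrays of different shapes (hypothetical data).
toy_features = {
    "mfcc_demos": [np.zeros((20, 5)), np.ones((20, 7))],
    "labels": [1, 1],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_features.npy")

    # A dict is not an array, so NumPy wraps it in a 0-d object array and pickles it.
    np.save(path, toy_features, allow_pickle=True)

    loaded = np.load(path, allow_pickle=True)   # 0-d object array
    restored = loaded.item()                    # back to the original dict

print(loaded.shape, type(restored).__name__)
print(restored["mfcc_demos"][1].shape, restored["labels"])
```

Because the file still contains pickled objects, `allow_pickle=True` is needed when loading, and `.item()` is what unwraps the dictionary.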

## Adding Features Based on Basic Features

In [None]:
### Xiangyu ###
# When we need the raw feature array
# Load the feature dictionary from a file

# loaded_features_dict = {}
# file_path = '/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/MER_audio_taffc_dataset/audio_features.pkl'

# if os.path.getsize(file_path) > 0: # make sure the file is not empty
#   with open(file_path, 'rb') as file:
#     unpickler = pickle.Unpickler(file)
#     loaded_features_dict = unpickler.load()
#     #loaded_features_dict = pickle.load(file)
### Xiangyu ###

### Xiyah ###
# q1 data
data_q1 = np.load('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q1.npy', allow_pickle=True)
#print(f"q1 data: {data_q1.item()}")
# q2 data
data_q2 = np.load('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q2.npy', allow_pickle=True)
#print(f"q2 data: {data_q2.item()}")
# q3 data
data_q3 = np.load('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q3.npy', allow_pickle=True)
#print(f"q3 data: {data_q3.item()}")
# q4 data
data_q4 = np.load('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q4.npy', allow_pickle=True)
#print(f"q4 data: {data_q4.item()}")
# data.item()['xs'] # list of numpy array
### Xiyah ###

In [None]:
### Xiyah ###
# define a columns' names variable
df_columns = data_q1.item().keys()
df_columns
### Xiyah ###

dict_keys(['xs', 'srs', 'chroma_stft_demos', 'chroma_cqt_demos', 'chroma_vqt_demos', 'chroma_cens_demos', 'melspectrogram_demos', 'mfcc_demos', 'rms_demos', 'spectral_centroid_demos', 'spectral_bandwidth_demos', 'spectral_contrast_demos', 'spectral_flatness_demos', 'spectral_rolloff_demos', 'tonnetz_demos', 'zero_crossing_rate_demos', 'tempo', 'tempogram', 'fourier_tempogram', 'tempogram_ratio', 'labels'])

In [None]:
### Xiyah ###
# convert all .npy to pd.DataFrame
df_q1 = pd.DataFrame(data_q1.item(), columns=df_columns)
df_q2 = pd.DataFrame(data_q2.item(), columns=df_columns)
df_q3 = pd.DataFrame(data_q3.item(), columns=df_columns)
df_q4 = pd.DataFrame(data_q4.item(), columns=df_columns)
### Xiyah ###

In [None]:
### Xiyah ###
# preview all dataframe
df_q1.head()
### Xiyah ###

Unnamed: 0,xs,srs,chroma_stft_demos,chroma_cqt_demos,chroma_vqt_demos,chroma_cens_demos,melspectrogram_demos,mfcc_demos,rms_demos,spectral_centroid_demos,...,spectral_contrast_demos,spectral_flatness_demos,spectral_rolloff_demos,tonnetz_demos,zero_crossing_rate_demos,tempo,tempogram,fourier_tempogram,tempogram_ratio,labels
0,"[2.5885166e-15, -3.28127e-16, 1.3183922e-14, 2...",22050,"[[0.64465576, 0.5051329, 0.6534084, 0.679588, ...","[[0.5660721, 0.5691674, 0.5468383, 0.45331684,...","[[0.848989, 0.8981467, 0.73216736, 0.62612474,...","[[0.15784575, 0.15223381, 0.14725037, 0.142912...","[[1.3533841e-08, 3.5398793e-06, 0.00014950961,...","[[-556.7651, -511.8035, -422.1127, -383.47598,...","[[0.0005485677, 0.0020291524, 0.0029873797, 0....","[[2600.362152387303, 2068.7657054944157, 1942....",...,"[[3.8653191299605147, 17.12577068070992, 14.64...","[[0.041992728, 0.00600211, 0.004467108, 0.0047...","[[5264.8681640625, 4166.6748046875, 3929.80957...","[[-0.07177943340504166, -0.06567968650338962, ...","[[0.0654296875, 0.087890625, 0.1142578125, 0.1...",[129.19921875],"[[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,...","[[(166.63167+0j), (168.14557+0j), (169.65591+0...","[[0.5631348673732673, 0.5627342679353834, 0.56...",1
1,"[-2.860234e-14, 2.5666e-14, 1.2625685e-14, 5.9...",22050,"[[0.54650044, 0.23805048, 0.119608514, 0.28680...","[[0.3125394, 0.27293065, 0.23888744, 0.2133149...","[[0.5071617, 0.41103512, 0.4162637, 0.29573256...","[[0.08206277, 0.08034206, 0.07872083, 0.077143...","[[8.769024e-09, 3.6914357e-06, 2.9017758e-05, ...","[[-502.408, -498.12592, -482.74106, -464.25244...","[[0.00015566013, 0.00077594863, 0.0017064415, ...","[[3185.2521348690234, 2261.5056267393707, 1867...",...,"[[9.966209676668395, 20.626088089396767, 19.39...","[[0.118682206, 0.009566744, 0.0024827898, 0.00...","[[6815.2587890625, 5469.43359375, 4306.640625,...","[[-0.06426588180285142, -0.08388056436385227, ...","[[0.08251953125, 0.09375, 0.1015625, 0.0893554...",[161.4990234375],"[[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,...","[[(152.49486+0j), (154.0267+0j), (155.5576+0j)...","[[0.2510897665830859, 0.2515960540053639, 0.25...",1
2,"[1.4363665e-14, 1.7338858e-14, -1.7440724e-15,...",22050,"[[0.35400018, 0.31895828, 0.15409946, 0.094428...","[[0.14923647, 0.15812233, 0.16552325, 0.166325...","[[0.24184267, 0.19096148, 0.16628368, 0.155282...","[[0.0, 0.0, 0.00010032334, 0.0004665737, 0.001...","[[2.2634141e-07, 0.00012189508, 0.00091573584,...","[[-533.9376, -523.7987, -495.95483, -466.6701,...","[[0.0001440642, 0.0008707615, 0.0016941759, 0....","[[3886.240396747828, 2602.2479592907566, 2583....",...,"[[11.01802515115709, 4.337050531702491, 13.062...","[[0.175672, 0.025252823, 0.010537109, 0.008343...","[[8559.4482421875, 6933.69140625, 6912.1582031...","[[0.08830271464977643, 0.1027511107443902, 0.1...","[[0.10107421875, 0.12109375, 0.14892578125, 0....",[99.38401442307692],"[[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,...","[[(160.38797+0j), (162.057+0j), (163.72563+0j)...","[[0.447771511272498, 0.4476321622012972, 0.447...",1
3,"[3.948538e-14, 2.4908268e-14, 2.3690624e-14, 3...",22050,"[[0.1104949, 0.054469876, 0.04289725, 0.038043...","[[0.11308139, 0.11504669, 0.10645489, 0.100842...","[[0.22638988, 0.19970289, 0.25529122, 0.247800...","[[0.021481985, 0.02449442, 0.027625265, 0.0307...","[[3.1630414e-06, 0.00047005177, 0.0046568206, ...","[[-567.22174, -488.08493, -394.5883, -341.3959...","[[0.0010538041, 0.0049711214, 0.011370153, 0.0...","[[1260.2291554594187, 1460.3761663407904, 1368...",...,"[[2.145884914440444, 11.173691122415924, 26.00...","[[0.0030659311, 0.0018041453, 0.0006550406, 0....","[[2368.65234375, 2702.4169921875, 2487.0849609...","[[0.13569760071274048, 0.1758088452068583, 0.2...","[[0.04638671875, 0.06884765625, 0.09521484375,...",[123.046875],"[[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,...","[[(142.60179+0j), (143.91019+0j), (145.21745+0...","[[0.5211342742544658, 0.522310718300107, 0.523...",1
4,"[1.5478067e-15, -4.7274973e-16, 3.9259797e-16,...",22050,"[[0.5735273, 0.44081706, 0.2916847, 0.1408742,...","[[0.4814759, 0.54028213, 0.64797264, 0.4830193...","[[0.37801835, 0.5129215, 0.62981725, 0.7347456...","[[0.51716256, 0.5277147, 0.538432, 0.5492283, ...","[[8.511694e-11, 3.06712e-08, 4.1023623e-07, 2....","[[-546.37854, -546.37854, -545.20844, -534.487...","[[1.50278765e-05, 6.5735556e-05, 0.00023117632...","[[2549.7987453674305, 1975.966686504611, 1500....",...,"[[9.420645564884744, 14.93348029801513, 11.337...","[[0.26400164, 0.018927738, 0.0046294658, 0.000...","[[5609.3994140625, 3779.0771484375, 2411.71875...","[[0.0640847017379245, 0.07782309922824504, 0.1...","[[0.07470703125, 0.09375, 0.11279296875, 0.128...",[103.359375],"[[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,...","[[(133.48157+0j), (135.05981+0j), (136.63963+0...","[[0.5156068248079573, 0.5133771694838867, 0.51...",1


In [None]:
df_q1["chroma_stft_demos"][0].mean(axis=1) # row-wise: mean over time frames for each chroma bin

array([0.3402438 , 0.4108848 , 0.32207942, 0.43924117, 0.29412463,
       0.38178524, 0.39732665, 0.29183748, 0.34578878, 0.32593077,
       0.47563004, 0.4188336 ], dtype=float32)
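
The role of the `axis` argument can be checked on a small toy array. This sketch assumes the same layout as the chromagram (rows are chroma bins, columns are time frames); the values are arbitrary.

```python
import numpy as np

# Toy "chromagram": 12 chroma bins (rows) x 5 time frames (columns).
toy_chroma = np.arange(60, dtype=float).reshape(12, 5)

per_bin_mean = toy_chroma.mean(axis=1)     # average over frames -> one value per bin
per_frame_mean = toy_chroma.mean(axis=0)   # average over bins   -> one value per frame

print(per_bin_mean.shape, per_frame_mean.shape)
print(per_bin_mean[:3])
```

With `axis=1` the result always has 12 entries regardless of clip length, which is what makes it usable as a fixed-size feature vector.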

In [None]:
### Xiangyu, Xiyah ###
# The following adds more statistical parameters
# mean, max, min, std, skewness, kurtosis
# Skewness measures the asymmetry of the probability distribution of a real-valued random variable about its mean.
#   In simpler terms, it indicates whether the data points are skewed to one side (left or right) of the distribution.
# Kurtosis measures the "tailedness" of the probability distribution of a real-valued random variable.
#   It provides insights into the shape of the tails and the peak of the distribution compared to a normal distribution.

### 1.chroma_stft ###
# (12, 1295)
# 12 represents the chroma bins (one for each of the 12 different pitch classes in Western music)
# 1295 represents the number of time frames in your analysis
# uses the Short-Time Fourier Transform which is computationally efficient but less musically intuitive.
# Calculate mean, std, sum
# The average energy in each chroma bin across all frames can give you a sense of the predominant pitches in the piece.
df_q1['chroma_stft_mean'] = df_q1['chroma_stft_demos'].apply(lambda x: x.mean(axis=1))
# This measures the variability of the energy in each chroma bin, which can indicate how consistently a pitch class is used.
df_q1['chroma_stft_std'] = df_q1['chroma_stft_demos'].apply(lambda x: x.std(axis=1))
# Summing up the energy across all bins and frames can give you an overall intensity measure of the chromatic content.
df_q1['chroma_stft_sum'] = df_q1['chroma_stft_demos'].apply(lambda x: x.sum())

# Calculate skewness and kurtosis
# Note: scipy.stats.skew() and kurtosis() reduce along axis=0 by default.
# Each entry of 'chroma_stft_demos' is a 2-D array (chroma bins x frames), so we apply
# them entry by entry with axis=1 to get one value per chroma bin.
# kurtosis() returns excess (Fisher) kurtosis by default, i.e. 0 for a normal distribution.
df_q1['chroma_stft_skew'] = df_q1['chroma_stft_demos'].apply(lambda x: skew(x, axis=1))
df_q1['chroma_stft_kurtosis'] = df_q1['chroma_stft_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['chroma_stft_mean'] = df_q2['chroma_stft_demos'].apply(lambda x: x.mean(axis=1))
df_q2['chroma_stft_std'] = df_q2['chroma_stft_demos'].apply(lambda x: x.std(axis=1))
df_q2['chroma_stft_sum'] = df_q2['chroma_stft_demos'].apply(lambda x: x.sum())
df_q2['chroma_stft_skew'] = df_q2['chroma_stft_demos'].apply(lambda x: skew(x, axis=1))
df_q2['chroma_stft_kurtosis'] = df_q2['chroma_stft_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['chroma_stft_mean'] = df_q3['chroma_stft_demos'].apply(lambda x: x.mean(axis=1))
df_q3['chroma_stft_std'] = df_q3['chroma_stft_demos'].apply(lambda x: x.std(axis=1))
df_q3['chroma_stft_sum'] = df_q3['chroma_stft_demos'].apply(lambda x: x.sum())
df_q3['chroma_stft_skew'] = df_q3['chroma_stft_demos'].apply(lambda x: skew(x, axis=1))
df_q3['chroma_stft_kurtosis'] = df_q3['chroma_stft_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['chroma_stft_mean'] = df_q4['chroma_stft_demos'].apply(lambda x: x.mean(axis=1))
df_q4['chroma_stft_std'] = df_q4['chroma_stft_demos'].apply(lambda x: x.std(axis=1))
df_q4['chroma_stft_sum'] = df_q4['chroma_stft_demos'].apply(lambda x: x.sum())
df_q4['chroma_stft_skew'] = df_q4['chroma_stft_demos'].apply(lambda x: skew(x, axis=1))
df_q4['chroma_stft_kurtosis'] = df_q4['chroma_stft_demos'].apply(lambda x: kurtosis(x, axis=1))

print("1. Finish adding chroma_stft's additional features.")
### Xiangyu, Xiyah ###

1. Finish adding chroma_stft's additional features.
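
The same five statistics are added to every DataFrame and every feature, so the repetition could be folded into a helper. The sketch below is one possible shape for it, not the notebook's actual code: `frame_stats` and `add_stat_columns` are hypothetical names, the data is random toy data, and skewness and kurtosis are computed from central moments with NumPy, which corresponds to the default (biased, excess-kurtosis) definitions of `scipy.stats.skew` and `scipy.stats.kurtosis`.

```python
import numpy as np
import pandas as pd


def frame_stats(feature: np.ndarray) -> dict:
    """Summarize a (n_bins, n_frames) array along the time axis."""
    mean = feature.mean(axis=1)
    centered = feature - mean[:, None]
    m2 = (centered ** 2).mean(axis=1)   # biased variance per bin
    m3 = (centered ** 3).mean(axis=1)   # third central moment
    m4 = (centered ** 4).mean(axis=1)   # fourth central moment
    return {
        "mean": mean,
        "std": np.sqrt(m2),
        "sum": feature.sum(),
        "skew": m3 / m2 ** 1.5,          # biased sample skewness
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis (0 for a normal distribution)
    }


def add_stat_columns(df: pd.DataFrame, source_col: str, prefix: str) -> None:
    """Add <prefix>_mean, _std, _sum, _skew, _kurtosis columns in place."""
    stats = df[source_col].apply(frame_stats)
    for name in ("mean", "std", "sum", "skew", "kurtosis"):
        df[f"{prefix}_{name}"] = stats.apply(lambda s, key=name: s[key])


# Toy data: two clips, each with a random (12, 40) "chromagram".
rng = np.random.default_rng(0)
toy_df = pd.DataFrame(
    {"chroma_stft_demos": [rng.random((12, 40)) for _ in range(2)]}
)
add_stat_columns(toy_df, "chroma_stft_demos", "chroma_stft")
print(sorted(c for c in toy_df.columns if c != "chroma_stft_demos"))
```

With such a helper, the per-quadrant blocks could become a loop over `(df_q1, df_q2, df_q3, df_q4)` and over the feature names, which makes it harder for one copy to drift from the others.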


In [None]:
### 2.chroma_cqt ###
# (12, 1295)
# is a chromagram derived from the Constant-Q transform (CQT) of an audio signal.
# uses the Constant-Q Transform which is more aligned with musical theory.
# calculate mean, std, sum, skewness, kurtosis
df_q1['chroma_cqt_mean'] = df_q1['chroma_cqt_demos'].apply(lambda x: x.mean(axis=1))
df_q1['chroma_cqt_std'] = df_q1['chroma_cqt_demos'].apply(lambda x: x.std(axis=1))
df_q1['chroma_cqt_sum'] = df_q1['chroma_cqt_demos'].apply(lambda x: x.sum())
df_q1['chroma_cqt_skew'] = df_q1['chroma_cqt_demos'].apply(lambda x: skew(x, axis=1))
df_q1['chroma_cqt_kurtosis'] = df_q1['chroma_cqt_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['chroma_cqt_mean'] = df_q2['chroma_cqt_demos'].apply(lambda x: x.mean(axis=1))
df_q2['chroma_cqt_std'] = df_q2['chroma_cqt_demos'].apply(lambda x: x.std(axis=1))
df_q2['chroma_cqt_sum'] = df_q2['chroma_cqt_demos'].apply(lambda x: x.sum())
df_q2['chroma_cqt_skew'] = df_q2['chroma_cqt_demos'].apply(lambda x: skew(x, axis=1))
df_q2['chroma_cqt_kurtosis'] = df_q2['chroma_cqt_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['chroma_cqt_mean'] = df_q3['chroma_cqt_demos'].apply(lambda x: x.mean(axis=1))
df_q3['chroma_cqt_std'] = df_q3['chroma_cqt_demos'].apply(lambda x: x.std(axis=1))
df_q3['chroma_cqt_sum'] = df_q3['chroma_cqt_demos'].apply(lambda x: x.sum())
df_q3['chroma_cqt_skew'] = df_q3['chroma_cqt_demos'].apply(lambda x: skew(x, axis=1))
df_q3['chroma_cqt_kurtosis'] = df_q3['chroma_cqt_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['chroma_cqt_mean'] = df_q4['chroma_cqt_demos'].apply(lambda x: x.mean(axis=1))
df_q4['chroma_cqt_std'] = df_q4['chroma_cqt_demos'].apply(lambda x: x.std(axis=1))
df_q4['chroma_cqt_sum'] = df_q4['chroma_cqt_demos'].apply(lambda x: x.sum())
df_q4['chroma_cqt_skew'] = df_q4['chroma_cqt_demos'].apply(lambda x: skew(x, axis=1))
df_q4['chroma_cqt_kurtosis'] = df_q4['chroma_cqt_demos'].apply(lambda x: kurtosis(x, axis=1))

print("2. Finish adding chroma_cqt's additional features.")

2. Finish adding chroma_cqt's additional features.
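
In the chroma cells, `mean`, `std`, `skew` and `kurtosis` are taken along `axis=1`, while `sum` is called without an axis. The sketch below, on a random stand-in array (an assumption, not real chroma data), shows how the resulting shapes differ. `per_class_sum` is included only as an illustrative alternative.

```python
import numpy as np

# Hypothetical stand-in for one clip's chromagram: 12 pitch classes x 1295 frames.
chroma = np.random.default_rng(0).random((12, 1295))

per_class_mean = chroma.mean(axis=1)  # one value per pitch class, shape (12,)
per_class_std = chroma.std(axis=1)    # one value per pitch class, shape (12,)
total_sum = chroma.sum()              # no axis: a single number for the whole clip
per_class_sum = chroma.sum(axis=1)    # alternative: one value per pitch class

print(per_class_mean.shape, per_class_std.shape)
print(np.ndim(total_sum), per_class_sum.shape)
```

The per-row statistics contribute 12 values per clip to the feature set, whereas the whole-array sum contributes one.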


In [None]:
### 3.chroma_vqt ###
# (12, 1295)
# Variable-Q chromagram
# This differs from CQT-based chroma by supporting non-equal temperament intervals.
# Note: unlike CQT- and STFT-based chroma, VQT chroma does not aggregate energy from neighboring frequency bands.
#   As a result, the number of chroma features produced is equal to the number of intervals used, or equivalently,
#   the number of bins per octave in the underlying VQT representation.
# calculate mean, std, skewness, kurtosis per chroma bin (axis=1); sum is taken over the whole array
df_q1['chroma_vqt_mean'] = df_q1['chroma_vqt_demos'].apply(lambda x: x.mean(axis=1))
df_q1['chroma_vqt_std'] = df_q1['chroma_vqt_demos'].apply(lambda x: x.std(axis=1))
df_q1['chroma_vqt_sum'] = df_q1['chroma_vqt_demos'].apply(lambda x: x.sum())
df_q1['chroma_vqt_skew'] = df_q1['chroma_vqt_demos'].apply(lambda x: skew(x, axis=1))
df_q1['chroma_vqt_kurtosis'] = df_q1['chroma_vqt_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['chroma_vqt_mean'] = df_q2['chroma_vqt_demos'].apply(lambda x: x.mean(axis=1))
df_q2['chroma_vqt_std'] = df_q2['chroma_vqt_demos'].apply(lambda x: x.std(axis=1))
df_q2['chroma_vqt_sum'] = df_q2['chroma_vqt_demos'].apply(lambda x: x.sum())
df_q2['chroma_vqt_skew'] = df_q2['chroma_vqt_demos'].apply(lambda x: skew(x, axis=1))
df_q2['chroma_vqt_kurtosis'] = df_q2['chroma_vqt_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['chroma_vqt_mean'] = df_q3['chroma_vqt_demos'].apply(lambda x: x.mean(axis=1))
df_q3['chroma_vqt_std'] = df_q3['chroma_vqt_demos'].apply(lambda x: x.std(axis=1))
df_q3['chroma_vqt_sum'] = df_q3['chroma_vqt_demos'].apply(lambda x: x.sum())
df_q3['chroma_vqt_skew'] = df_q3['chroma_vqt_demos'].apply(lambda x: skew(x, axis=1))
df_q3['chroma_vqt_kurtosis'] = df_q3['chroma_vqt_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['chroma_vqt_mean'] = df_q4['chroma_vqt_demos'].apply(lambda x: x.mean(axis=1))
df_q4['chroma_vqt_std'] = df_q4['chroma_vqt_demos'].apply(lambda x: x.std(axis=1))
df_q4['chroma_vqt_sum'] = df_q4['chroma_vqt_demos'].apply(lambda x: x.sum())
df_q4['chroma_vqt_skew'] = df_q4['chroma_vqt_demos'].apply(lambda x: skew(x, axis=1))
df_q4['chroma_vqt_kurtosis'] = df_q4['chroma_vqt_demos'].apply(lambda x: kurtosis(x, axis=1))

print("3. Finish adding chroma_vqt's additional features.")

3. Finish adding chroma_vqt's additional features.


In [None]:
### 4.melspectrogram ###
# (128, 1295)
# A mel-scaled spectrogram: the short-term power spectrum of the sound mapped onto the mel scale.
# The mel scale is a perceptual scale on which pitches judged by listeners to be equally far apart
# are equally spaced, so it matches human auditory perception more closely
# than the linearly spaced frequency bands of a standard spectrogram.
# calculate mean, std, min, max, skewness, kurtosis per mel band (axis=1); sum is taken over the whole array
df_q1['melspectrogram_mean'] = df_q1['melspectrogram_demos'].apply(lambda x: x.mean(axis=1))
df_q1['melspectrogram_std'] = df_q1['melspectrogram_demos'].apply(lambda x: x.std(axis=1))
df_q1['melspectrogram_min'] = df_q1['melspectrogram_demos'].apply(lambda x: x.min(axis=1))
df_q1['melspectrogram_max'] = df_q1['melspectrogram_demos'].apply(lambda x: x.max(axis=1))
df_q1['melspectrogram_sum'] = df_q1['melspectrogram_demos'].apply(lambda x: x.sum())
df_q1['melspectrogram_skew'] = df_q1['melspectrogram_demos'].apply(lambda x: skew(x, axis=1))
df_q1['melspectrogram_kurtosis'] = df_q1['melspectrogram_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['melspectrogram_mean'] = df_q2['melspectrogram_demos'].apply(lambda x: x.mean(axis=1))
df_q2['melspectrogram_std'] = df_q2['melspectrogram_demos'].apply(lambda x: x.std(axis=1))
df_q2['melspectrogram_min'] = df_q2['melspectrogram_demos'].apply(lambda x: x.min(axis=1))
df_q2['melspectrogram_max'] = df_q2['melspectrogram_demos'].apply(lambda x: x.max(axis=1))
df_q2['melspectrogram_sum'] = df_q2['melspectrogram_demos'].apply(lambda x: x.sum())
df_q2['melspectrogram_skew'] = df_q2['melspectrogram_demos'].apply(lambda x: skew(x, axis=1))
df_q2['melspectrogram_kurtosis'] = df_q2['melspectrogram_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['melspectrogram_mean'] = df_q3['melspectrogram_demos'].apply(lambda x: x.mean(axis=1))
df_q3['melspectrogram_std'] = df_q3['melspectrogram_demos'].apply(lambda x: x.std(axis=1))
df_q3['melspectrogram_min'] = df_q3['melspectrogram_demos'].apply(lambda x: x.min(axis=1))
df_q3['melspectrogram_max'] = df_q3['melspectrogram_demos'].apply(lambda x: x.max(axis=1))
df_q3['melspectrogram_sum'] = df_q3['melspectrogram_demos'].apply(lambda x: x.sum())
df_q3['melspectrogram_skew'] = df_q3['melspectrogram_demos'].apply(lambda x: skew(x, axis=1))
df_q3['melspectrogram_kurtosis'] = df_q3['melspectrogram_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['melspectrogram_mean'] = df_q4['melspectrogram_demos'].apply(lambda x: x.mean(axis=1))
df_q4['melspectrogram_std'] = df_q4['melspectrogram_demos'].apply(lambda x: x.std(axis=1))
df_q4['melspectrogram_min'] = df_q4['melspectrogram_demos'].apply(lambda x: x.min(axis=1))
df_q4['melspectrogram_max'] = df_q4['melspectrogram_demos'].apply(lambda x: x.max(axis=1))
df_q4['melspectrogram_sum'] = df_q4['melspectrogram_demos'].apply(lambda x: x.sum())
df_q4['melspectrogram_skew'] = df_q4['melspectrogram_demos'].apply(lambda x: skew(x, axis=1))
df_q4['melspectrogram_kurtosis'] = df_q4['melspectrogram_demos'].apply(lambda x: kurtosis(x, axis=1))

print("4. Finish adding melspectrogram's additional features.")

4. Finish adding melspectrogram's additional features.
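
To make the mel scale concrete, here is a minimal sketch of the widely used HTK-style conversion, `mel = 2595 * log10(1 + f / 700)`. This is an illustration only: librosa's default conversion is the Slaney variant (`htk=False`), so the exact numbers differ from what the extraction step used. The function names and the 0 to 11025 Hz range (half of an assumed 22050 Hz sampling rate) are assumptions.

```python
import numpy as np


def hz_to_mel_htk(freq_hz):
    """HTK-style Hz -> mel conversion."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=float) / 700.0)


def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


# 128 points equally spaced on the mel axis between 0 Hz and 11025 Hz.
mel_points = np.linspace(hz_to_mel_htk(0.0), hz_to_mel_htk(11025.0), 128)
points_hz = mel_to_hz_htk(mel_points)

# Equal steps in mel become wider and wider steps in Hz.
print(np.diff(points_hz)[:3])
print(np.diff(points_hz)[-3:])
```

The growing spacing in Hz is why a mel spectrogram gives low frequencies finer resolution than high frequencies.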


In [None]:
### 5.mfcc ###
# (20, 1295)
# MFCCs are a cepstral representation of the audio,
#   where the term "cepstrum" refers to the inverse Fourier transform (IFT) of the logarithm of the estimated spectrum of a signal.
#   The mel-frequency warping mimics the human ear's nonlinear perception of pitch,
#   which makes MFCCs effective for audio and speech analysis.
# shape (n_mfcc, n_frames), where n_mfcc is the number of MFCCs extracted per frame (typically 13-20; here 20).
# calculate mean, std, min, max, skewness, kurtosis, median
df_q1['mfcc_mean'] = df_q1['mfcc_demos'].apply(lambda x: x.mean(axis=1))
df_q1['mfcc_std'] = df_q1['mfcc_demos'].apply(lambda x: x.std(axis=1))
df_q1['mfcc_min'] = df_q1['mfcc_demos'].apply(lambda x: x.min(axis=1))
df_q1['mfcc_max'] = df_q1['mfcc_demos'].apply(lambda x: x.max(axis=1))
df_q1['mfcc_skew'] = df_q1['mfcc_demos'].apply(lambda x: skew(x, axis=1))
df_q1['mfcc_kurtosis'] = df_q1['mfcc_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q1['mfcc_median'] = df_q1['mfcc_demos'].apply(lambda x: np.median(x, axis=1))

# df_q2
df_q2['mfcc_mean'] = df_q2['mfcc_demos'].apply(lambda x: x.mean(axis=1))
df_q2['mfcc_std'] = df_q2['mfcc_demos'].apply(lambda x: x.std(axis=1))
df_q2['mfcc_min'] = df_q2['mfcc_demos'].apply(lambda x: x.min(axis=1))
df_q2['mfcc_max'] = df_q2['mfcc_demos'].apply(lambda x: x.max(axis=1))
df_q2['mfcc_skew'] = df_q2['mfcc_demos'].apply(lambda x: skew(x, axis=1))
df_q2['mfcc_kurtosis'] = df_q2['mfcc_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q2['mfcc_median'] = df_q2['mfcc_demos'].apply(lambda x: np.median(x, axis=1))

# df_q3
df_q3['mfcc_mean'] = df_q3['mfcc_demos'].apply(lambda x: x.mean(axis=1))
df_q3['mfcc_std'] = df_q3['mfcc_demos'].apply(lambda x: x.std(axis=1))
df_q3['mfcc_min'] = df_q3['mfcc_demos'].apply(lambda x: x.min(axis=1))
df_q3['mfcc_max'] = df_q3['mfcc_demos'].apply(lambda x: x.max(axis=1))
df_q3['mfcc_skew'] = df_q3['mfcc_demos'].apply(lambda x: skew(x, axis=1))
df_q3['mfcc_kurtosis'] = df_q3['mfcc_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q3['mfcc_median'] = df_q3['mfcc_demos'].apply(lambda x: np.median(x, axis=1))

# df_q4
df_q4['mfcc_mean'] = df_q4['mfcc_demos'].apply(lambda x: x.mean(axis=1))
df_q4['mfcc_std'] = df_q4['mfcc_demos'].apply(lambda x: x.std(axis=1))
df_q4['mfcc_min'] = df_q4['mfcc_demos'].apply(lambda x: x.min(axis=1))
df_q4['mfcc_max'] = df_q4['mfcc_demos'].apply(lambda x: x.max(axis=1))
df_q4['mfcc_skew'] = df_q4['mfcc_demos'].apply(lambda x: skew(x, axis=1))
df_q4['mfcc_kurtosis'] = df_q4['mfcc_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q4['mfcc_median'] = df_q4['mfcc_demos'].apply(lambda x: np.median(x, axis=1))

print("5. Finish adding mfcc's additional features.")

5. Finish adding mfcc's additional features.
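
The following sketch illustrates the usual recipe behind MFCCs: take a mel power spectrogram, convert it to decibels, and apply a type-II discrete cosine transform along the mel axis, keeping the first few coefficients. It is a simplified illustration on random data, assuming an orthonormal DCT written out in NumPy. librosa's own pipeline additionally applies a reference level and clipping in its decibel conversion, so the values are not comparable to `mfcc_demos`.

```python
import numpy as np


def dct2_ortho(x):
    """Orthonormal DCT-II along axis 0 (written out in NumPy for illustration)."""
    n = x.shape[0]
    k = np.arange(n)[:, None]  # output coefficient index
    m = np.arange(n)[None, :]  # input sample index
    basis = np.cos(np.pi * (m + 0.5) * k / n) * np.sqrt(2.0 / n)
    basis[0] *= np.sqrt(0.5)   # scale the constant row so the basis is orthonormal
    return basis @ x


rng = np.random.default_rng(0)
mel_power = rng.random((128, 50)) + 1e-3  # toy mel spectrogram: 128 bands x 50 frames
log_mel = 10.0 * np.log10(mel_power)      # power -> decibels
mfcc_like = dct2_ortho(log_mel)[:20]      # keep the first 20 coefficients

print(mfcc_like.shape)
```

The first coefficient is proportional to the average log energy of the frame, and higher coefficients describe progressively finer detail of the spectral envelope.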


In [None]:
### 6.rms ###
# (1, 1295)
# Root-mean-square (RMS) value of each frame: the square root of the mean of the squared
#   sample values, which measures the frame's average energy (a proxy for loudness).
# calculate mean, std, min, max, skewness, kurtosis
df_q1['rms_mean'] = df_q1['rms_demos'].apply(lambda x: x.mean(axis=1))
df_q1['rms_std'] = df_q1['rms_demos'].apply(lambda x: x.std(axis=1))
df_q1['rms_min'] = df_q1['rms_demos'].apply(lambda x: x.min(axis=1))
df_q1['rms_max'] = df_q1['rms_demos'].apply(lambda x: x.max(axis=1))
df_q1['rms_skew'] = df_q1['rms_demos'].apply(lambda x: skew(x, axis=1))
df_q1['rms_kurtosis'] = df_q1['rms_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['rms_mean'] = df_q2['rms_demos'].apply(lambda x: x.mean(axis=1))
df_q2['rms_std'] = df_q2['rms_demos'].apply(lambda x: x.std(axis=1))
df_q2['rms_min'] = df_q2['rms_demos'].apply(lambda x: x.min(axis=1))
df_q2['rms_max'] = df_q2['rms_demos'].apply(lambda x: x.max(axis=1))
df_q2['rms_skew'] = df_q2['rms_demos'].apply(lambda x: skew(x, axis=1))
df_q2['rms_kurtosis'] = df_q2['rms_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['rms_mean'] = df_q3['rms_demos'].apply(lambda x: x.mean(axis=1))
df_q3['rms_std'] = df_q3['rms_demos'].apply(lambda x: x.std(axis=1))
df_q3['rms_min'] = df_q3['rms_demos'].apply(lambda x: x.min(axis=1))
df_q3['rms_max'] = df_q3['rms_demos'].apply(lambda x: x.max(axis=1))
df_q3['rms_skew'] = df_q3['rms_demos'].apply(lambda x: skew(x, axis=1))
df_q3['rms_kurtosis'] = df_q3['rms_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['rms_mean'] = df_q4['rms_demos'].apply(lambda x: x.mean(axis=1))
df_q4['rms_std'] = df_q4['rms_demos'].apply(lambda x: x.std(axis=1))
df_q4['rms_min'] = df_q4['rms_demos'].apply(lambda x: x.min(axis=1))
df_q4['rms_max'] = df_q4['rms_demos'].apply(lambda x: x.max(axis=1))
df_q4['rms_skew'] = df_q4['rms_demos'].apply(lambda x: skew(x, axis=1))
df_q4['rms_kurtosis'] = df_q4['rms_demos'].apply(lambda x: kurtosis(x, axis=1))

print("6. Finish adding rms's additional features.")

6. Finish adding rms's additional features.
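
As a minimal sketch of the RMS definition, the function below computes one RMS value per frame of a synthetic sine wave. The name `frame_rms`, the frame and hop sizes, and the test signal are assumptions. Unlike librosa's default, this sketch does not pad the signal to centre the frames, so the frame count differs slightly.

```python
import numpy as np


def frame_rms(y, frame_length=2048, hop_length=512):
    """RMS of each frame: sqrt(mean(samples ** 2)). No padding is applied."""
    n_frames = 1 + (len(y) - frame_length) // hop_length
    return np.array([
        np.sqrt(np.mean(y[i * hop_length: i * hop_length + frame_length] ** 2))
        for i in range(n_frames)
    ])


sr = 22050
t = np.arange(sr) / sr                    # one second of samples
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)   # 440 Hz sine with amplitude 0.5

rms = frame_rms(y)
print(rms.shape, rms.mean())
```

For a sine wave the RMS is the amplitude divided by the square root of two, which gives a quick sanity check on the result.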


In [None]:
### 7.spectral_centroid ###
# (1, 1295)
# Each value in the array represents the spectral centroid of a corresponding frame of the audio signal,
#   measured in Hertz (Hz). Higher values indicate that the energy of the spectrum is concentrated towards
#   higher frequencies, which is typically perceived as a brighter or sharper sound.
# calculate mean, std, min, max, skewness, kurtosis
df_q1['spectral_centroid_mean'] = df_q1['spectral_centroid_demos'].apply(lambda x: x.mean(axis=1))
df_q1['spectral_centroid_std'] = df_q1['spectral_centroid_demos'].apply(lambda x: x.std(axis=1))
df_q1['spectral_centroid_min'] = df_q1['spectral_centroid_demos'].apply(lambda x: x.min(axis=1))
df_q1['spectral_centroid_max'] = df_q1['spectral_centroid_demos'].apply(lambda x: x.max(axis=1))
df_q1['spectral_centroid_skew'] = df_q1['spectral_centroid_demos'].apply(lambda x: skew(x, axis=1))
df_q1['spectral_centroid_kurtosis'] = df_q1['spectral_centroid_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['spectral_centroid_mean'] = df_q2['spectral_centroid_demos'].apply(lambda x: x.mean(axis=1))
df_q2['spectral_centroid_std'] = df_q2['spectral_centroid_demos'].apply(lambda x: x.std(axis=1))
df_q2['spectral_centroid_min'] = df_q2['spectral_centroid_demos'].apply(lambda x: x.min(axis=1))
df_q2['spectral_centroid_max'] = df_q2['spectral_centroid_demos'].apply(lambda x: x.max(axis=1))
df_q2['spectral_centroid_skew'] = df_q2['spectral_centroid_demos'].apply(lambda x: skew(x, axis=1))
df_q2['spectral_centroid_kurtosis'] = df_q2['spectral_centroid_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['spectral_centroid_mean'] = df_q3['spectral_centroid_demos'].apply(lambda x: x.mean(axis=1))
df_q3['spectral_centroid_std'] = df_q3['spectral_centroid_demos'].apply(lambda x: x.std(axis=1))
df_q3['spectral_centroid_min'] = df_q3['spectral_centroid_demos'].apply(lambda x: x.min(axis=1))
df_q3['spectral_centroid_max'] = df_q3['spectral_centroid_demos'].apply(lambda x: x.max(axis=1))
df_q3['spectral_centroid_skew'] = df_q3['spectral_centroid_demos'].apply(lambda x: skew(x, axis=1))
df_q3['spectral_centroid_kurtosis'] = df_q3['spectral_centroid_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['spectral_centroid_mean'] = df_q4['spectral_centroid_demos'].apply(lambda x: x.mean(axis=1))
df_q4['spectral_centroid_std'] = df_q4['spectral_centroid_demos'].apply(lambda x: x.std(axis=1))
df_q4['spectral_centroid_min'] = df_q4['spectral_centroid_demos'].apply(lambda x: x.min(axis=1))
df_q4['spectral_centroid_max'] = df_q4['spectral_centroid_demos'].apply(lambda x: x.max(axis=1))
df_q4['spectral_centroid_skew'] = df_q4['spectral_centroid_demos'].apply(lambda x: skew(x, axis=1))
df_q4['spectral_centroid_kurtosis'] = df_q4['spectral_centroid_demos'].apply(lambda x: kurtosis(x, axis=1))

print("7. Finish adding spectral_centroid's additional features.")

7. Finish adding spectral_centroid's additional features.
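
The spectral centroid of a frame is the magnitude-weighted mean of the frequencies in its spectrum. The sketch below computes it for a single synthetic frame containing two tones. The sampling rate, frame size, window, and tone frequencies are assumptions chosen for illustration.

```python
import numpy as np

sr = 22050
n_fft = 2048
t = np.arange(n_fft) / sr

# Two tones: 1000 Hz at amplitude 1.0 and 3000 Hz at amplitude 0.5.
frame = np.sin(2 * np.pi * 1000.0 * t) + 0.5 * np.sin(2 * np.pi * 3000.0 * t)

magnitude = np.abs(np.fft.rfft(frame * np.hanning(n_fft)))
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

# Weighted mean of the bin frequencies, with the magnitudes as weights.
centroid = np.sum(freqs * magnitude) / np.sum(magnitude)
print(centroid)
```

Because the lower tone is stronger, the centroid lands between the two tones but closer to 1000 Hz. Raising the amplitude of the 3000 Hz tone would pull it upward, which is the "brighter sound" effect described in the comments.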


In [None]:
### 8.spectral_bandwidth ###
# (1, 1295)
# The spectral bandwidth is often defined with respect to the spectral centroid
#   and indicates the frequency range within which a significant portion of the signal's energy is contained.
# calculate mean, std, min, max, skewness, kurtosis
df_q1['spectral_bandwidth_mean'] = df_q1['spectral_bandwidth_demos'].apply(lambda x: x.mean(axis=1))
df_q1['spectral_bandwidth_std'] = df_q1['spectral_bandwidth_demos'].apply(lambda x: x.std(axis=1))
df_q1['spectral_bandwidth_min'] = df_q1['spectral_bandwidth_demos'].apply(lambda x: x.min(axis=1))
df_q1['spectral_bandwidth_max'] = df_q1['spectral_bandwidth_demos'].apply(lambda x: x.max(axis=1))
df_q1['spectral_bandwidth_skew'] = df_q1['spectral_bandwidth_demos'].apply(lambda x: skew(x, axis=1))
df_q1['spectral_bandwidth_kurtosis'] = df_q1['spectral_bandwidth_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q2
df_q2['spectral_bandwidth_mean'] = df_q2['spectral_bandwidth_demos'].apply(lambda x: x.mean(axis=1))
df_q2['spectral_bandwidth_std'] = df_q2['spectral_bandwidth_demos'].apply(lambda x: x.std(axis=1))
df_q2['spectral_bandwidth_min'] = df_q2['spectral_bandwidth_demos'].apply(lambda x: x.min(axis=1))
df_q2['spectral_bandwidth_max'] = df_q2['spectral_bandwidth_demos'].apply(lambda x: x.max(axis=1))
df_q2['spectral_bandwidth_skew'] = df_q2['spectral_bandwidth_demos'].apply(lambda x: skew(x, axis=1))
df_q2['spectral_bandwidth_kurtosis'] = df_q2['spectral_bandwidth_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q3
df_q3['spectral_bandwidth_mean'] = df_q3['spectral_bandwidth_demos'].apply(lambda x: x.mean(axis=1))
df_q3['spectral_bandwidth_std'] = df_q3['spectral_bandwidth_demos'].apply(lambda x: x.std(axis=1))
df_q3['spectral_bandwidth_min'] = df_q3['spectral_bandwidth_demos'].apply(lambda x: x.min(axis=1))
df_q3['spectral_bandwidth_max'] = df_q3['spectral_bandwidth_demos'].apply(lambda x: x.max(axis=1))
df_q3['spectral_bandwidth_skew'] = df_q3['spectral_bandwidth_demos'].apply(lambda x: skew(x, axis=1))
df_q3['spectral_bandwidth_kurtosis'] = df_q3['spectral_bandwidth_demos'].apply(lambda x: kurtosis(x, axis=1))

# df_q4
df_q4['spectral_bandwidth_mean'] = df_q4['spectral_bandwidth_demos'].apply(lambda x: x.mean(axis=1))
df_q4['spectral_bandwidth_std'] = df_q4['spectral_bandwidth_demos'].apply(lambda x: x.std(axis=1))
df_q4['spectral_bandwidth_min'] = df_q4['spectral_bandwidth_demos'].apply(lambda x: x.min(axis=1))
df_q4['spectral_bandwidth_max'] = df_q4['spectral_bandwidth_demos'].apply(lambda x: x.max(axis=1))
df_q4['spectral_bandwidth_skew'] = df_q4['spectral_bandwidth_demos'].apply(lambda x: skew(x, axis=1))
df_q4['spectral_bandwidth_kurtosis'] = df_q4['spectral_bandwidth_demos'].apply(lambda x: kurtosis(x, axis=1))

print("8. Finish adding spectral_bandwidth's additional features.")

8. Finish adding spectral_bandwidth's additional features.
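
Spectral bandwidth measures how widely the spectrum spreads around its centroid. A minimal sketch of the second-order version (a magnitude-weighted standard deviation of frequency, with the magnitudes normalised to sum to one) is shown here on the same kind of synthetic two-tone frame. All signal parameters are assumptions.

```python
import numpy as np

sr = 22050
n_fft = 2048
t = np.arange(n_fft) / sr
frame = np.sin(2 * np.pi * 1000.0 * t) + 0.5 * np.sin(2 * np.pi * 3000.0 * t)

magnitude = np.abs(np.fft.rfft(frame * np.hanning(n_fft)))
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

weights = magnitude / magnitude.sum()       # normalise so the weights sum to 1
centroid = np.sum(freqs * weights)          # weighted mean frequency
bandwidth = np.sqrt(np.sum(weights * (freqs - centroid) ** 2))  # weighted spread

print(centroid, bandwidth)
```

A single pure tone would give a bandwidth close to zero, while moving the two tones further apart increases it.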


In [None]:
### 9.spectral_contrast ###
# (7, 1295)
# shape (n_bands + 1, n_frames): librosa's default n_bands=6 gives 7 rows, one per frequency sub-band.
#  Each element in the array represents the contrast value (in dB) for a specific sub-band and time frame.
#   Higher values indicate a greater contrast between the spectral peaks and valleys within that band,
#   suggesting more pronounced frequency components.
# calculate mean, std, min, max, skewness, kurtosis, range
df_q1['spectral_contrast_mean'] = df_q1['spectral_contrast_demos'].apply(lambda x: x.mean(axis=1))
df_q1['spectral_contrast_std'] = df_q1['spectral_contrast_demos'].apply(lambda x: x.std(axis=1))
df_q1['spectral_contrast_min'] = df_q1['spectral_contrast_demos'].apply(lambda x: x.min(axis=1))
df_q1['spectral_contrast_max'] = df_q1['spectral_contrast_demos'].apply(lambda x: x.max(axis=1))
df_q1['spectral_contrast_skew'] = df_q1['spectral_contrast_demos'].apply(lambda x: skew(x, axis=1))
df_q1['spectral_contrast_kurtosis'] = df_q1['spectral_contrast_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q1['spectral_contrast_range'] = df_q1['spectral_contrast_max'] - df_q1['spectral_contrast_min']

# df_q2
df_q2['spectral_contrast_mean'] = df_q2['spectral_contrast_demos'].apply(lambda x: x.mean(axis=1))
df_q2['spectral_contrast_std'] = df_q2['spectral_contrast_demos'].apply(lambda x: x.std(axis=1))
df_q2['spectral_contrast_min'] = df_q2['spectral_contrast_demos'].apply(lambda x: x.min(axis=1))
df_q2['spectral_contrast_max'] = df_q2['spectral_contrast_demos'].apply(lambda x: x.max(axis=1))
df_q2['spectral_contrast_skew'] = df_q2['spectral_contrast_demos'].apply(lambda x: skew(x, axis=1))
df_q2['spectral_contrast_kurtosis'] = df_q2['spectral_contrast_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q2['spectral_contrast_range'] = df_q2['spectral_contrast_max'] - df_q2['spectral_contrast_min']

# df_q3
df_q3['spectral_contrast_mean'] = df_q3['spectral_contrast_demos'].apply(lambda x: x.mean(axis=1))
df_q3['spectral_contrast_std'] = df_q3['spectral_contrast_demos'].apply(lambda x: x.std(axis=1))
df_q3['spectral_contrast_min'] = df_q3['spectral_contrast_demos'].apply(lambda x: x.min(axis=1))
df_q3['spectral_contrast_max'] = df_q3['spectral_contrast_demos'].apply(lambda x: x.max(axis=1))
df_q3['spectral_contrast_skew'] = df_q3['spectral_contrast_demos'].apply(lambda x: skew(x, axis=1))
df_q3['spectral_contrast_kurtosis'] = df_q3['spectral_contrast_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q3['spectral_contrast_range'] = df_q3['spectral_contrast_max'] - df_q3['spectral_contrast_min']

# df_q4
df_q4['spectral_contrast_mean'] = df_q4['spectral_contrast_demos'].apply(lambda x: x.mean(axis=1))
df_q4['spectral_contrast_std'] = df_q4['spectral_contrast_demos'].apply(lambda x: x.std(axis=1))
df_q4['spectral_contrast_min'] = df_q4['spectral_contrast_demos'].apply(lambda x: x.min(axis=1))
df_q4['spectral_contrast_max'] = df_q4['spectral_contrast_demos'].apply(lambda x: x.max(axis=1))
df_q4['spectral_contrast_skew'] = df_q4['spectral_contrast_demos'].apply(lambda x: skew(x, axis=1))
df_q4['spectral_contrast_kurtosis'] = df_q4['spectral_contrast_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q4['spectral_contrast_range'] = df_q4['spectral_contrast_max'] - df_q4['spectral_contrast_min']

print("9. Finish adding spectral_contrast's additional features.")

9. Finish adding spectral_contrast's additional features.


In [None]:
### 10.spectral_flatness ###
# (1, 1295)
# Each value in the array represents the spectral flatness of the corresponding frame,
#   typically ranging from 0 to 1. Values close to 1 suggest a very flat (noise-like) spectrum,
#   whereas values close to 0 suggest a peaky (tonal) spectrum.
# calculate mean, std, min, max, skewness, kurtosis, range
df_q1['spectral_flatness_mean'] = df_q1['spectral_flatness_demos'].apply(lambda x: x.mean(axis=1))
df_q1['spectral_flatness_std'] = df_q1['spectral_flatness_demos'].apply(lambda x: x.std(axis=1))
df_q1['spectral_flatness_min'] = df_q1['spectral_flatness_demos'].apply(lambda x: x.min(axis=1))
df_q1['spectral_flatness_max'] = df_q1['spectral_flatness_demos'].apply(lambda x: x.max(axis=1))
df_q1['spectral_flatness_skew'] = df_q1['spectral_flatness_demos'].apply(lambda x: skew(x, axis=1))
df_q1['spectral_flatness_kurtosis'] = df_q1['spectral_flatness_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q1['spectral_flatness_range'] = df_q1['spectral_flatness_max'] - df_q1['spectral_flatness_min']

# df_q2
df_q2['spectral_flatness_mean'] = df_q2['spectral_flatness_demos'].apply(lambda x: x.mean(axis=1))
df_q2['spectral_flatness_std'] = df_q2['spectral_flatness_demos'].apply(lambda x: x.std(axis=1))
df_q2['spectral_flatness_min'] = df_q2['spectral_flatness_demos'].apply(lambda x: x.min(axis=1))
df_q2['spectral_flatness_max'] = df_q2['spectral_flatness_demos'].apply(lambda x: x.max(axis=1))
df_q2['spectral_flatness_skew'] = df_q2['spectral_flatness_demos'].apply(lambda x: skew(x, axis=1))
df_q2['spectral_flatness_kurtosis'] = df_q2['spectral_flatness_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q2['spectral_flatness_range'] = df_q2['spectral_flatness_max'] - df_q2['spectral_flatness_min']

# df_q3
df_q3['spectral_flatness_mean'] = df_q3['spectral_flatness_demos'].apply(lambda x: x.mean(axis=1))
df_q3['spectral_flatness_std'] = df_q3['spectral_flatness_demos'].apply(lambda x: x.std(axis=1))
df_q3['spectral_flatness_min'] = df_q3['spectral_flatness_demos'].apply(lambda x: x.min(axis=1))
df_q3['spectral_flatness_max'] = df_q3['spectral_flatness_demos'].apply(lambda x: x.max(axis=1))
df_q3['spectral_flatness_skew'] = df_q3['spectral_flatness_demos'].apply(lambda x: skew(x, axis=1))
df_q3['spectral_flatness_kurtosis'] = df_q3['spectral_flatness_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q3['spectral_flatness_range'] = df_q3['spectral_flatness_max'] - df_q3['spectral_flatness_min']

# df_q4
df_q4['spectral_flatness_mean'] = df_q4['spectral_flatness_demos'].apply(lambda x: x.mean(axis=1))
df_q4['spectral_flatness_std'] = df_q4['spectral_flatness_demos'].apply(lambda x: x.std(axis=1))
df_q4['spectral_flatness_min'] = df_q4['spectral_flatness_demos'].apply(lambda x: x.min(axis=1))
df_q4['spectral_flatness_max'] = df_q4['spectral_flatness_demos'].apply(lambda x: x.max(axis=1))
df_q4['spectral_flatness_skew'] = df_q4['spectral_flatness_demos'].apply(lambda x: skew(x, axis=1))
df_q4['spectral_flatness_kurtosis'] = df_q4['spectral_flatness_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q4['spectral_flatness_range'] = df_q4['spectral_flatness_max'] - df_q4['spectral_flatness_min']

print("10. Finish adding spectral_flatness's additional features.")

10. Finish adding spectral_flatness's additional features.
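
Spectral flatness is commonly defined as the geometric mean of the power spectrum divided by its arithmetic mean. The sketch below applies that definition to one frame of white noise and one frame of a pure tone. The function name, the Hann window, the `amin` floor, and the signals are assumptions for illustration.

```python
import numpy as np


def spectral_flatness(frame, amin=1e-10):
    """Geometric mean / arithmetic mean of the frame's power spectrum."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    power = np.maximum(power, amin)  # avoid log(0)
    geometric_mean = np.exp(np.mean(np.log(power)))
    return geometric_mean / np.mean(power)


sr = 22050
n = 2048
rng = np.random.default_rng(0)

noise = rng.standard_normal(n)                          # noise-like frame
tone = np.sin(2 * np.pi * 440.0 * np.arange(n) / sr)    # tonal frame

noise_flatness = spectral_flatness(noise)
tone_flatness = spectral_flatness(tone)
print(noise_flatness, tone_flatness)
```

The noise frame scores far higher than the tone, matching the interpretation in the comments: flat spectra approach 1, peaky spectra approach 0.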


In [None]:
### 11.spectral_rolloff ###
# (1, 1295)
# Each value in the array represents the spectral rolloff frequency
#   for a corresponding frame, measured in Hertz (Hz). It is the frequency below which
#   a given percentage of the spectral energy lies (85% by default in librosa), counting from the lowest frequency.
# calculate mean, std, min, max, skewness, kurtosis, range
df_q1['spectral_rolloff_mean'] = df_q1['spectral_rolloff_demos'].apply(lambda x: x.mean(axis=1))
df_q1['spectral_rolloff_std'] = df_q1['spectral_rolloff_demos'].apply(lambda x: x.std(axis=1))
df_q1['spectral_rolloff_min'] = df_q1['spectral_rolloff_demos'].apply(lambda x: x.min(axis=1))
df_q1['spectral_rolloff_max'] = df_q1['spectral_rolloff_demos'].apply(lambda x: x.max(axis=1))
df_q1['spectral_rolloff_skew'] = df_q1['spectral_rolloff_demos'].apply(lambda x: skew(x, axis=1))
df_q1['spectral_rolloff_kurtosis'] = df_q1['spectral_rolloff_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q1['spectral_rolloff_range'] = df_q1['spectral_rolloff_max'] - df_q1['spectral_rolloff_min']

# df_q2
df_q2['spectral_rolloff_mean'] = df_q2['spectral_rolloff_demos'].apply(lambda x: x.mean(axis=1))
df_q2['spectral_rolloff_std'] = df_q2['spectral_rolloff_demos'].apply(lambda x: x.std(axis=1))
df_q2['spectral_rolloff_min'] = df_q2['spectral_rolloff_demos'].apply(lambda x: x.min(axis=1))
df_q2['spectral_rolloff_max'] = df_q2['spectral_rolloff_demos'].apply(lambda x: x.max(axis=1))
df_q2['spectral_rolloff_skew'] = df_q2['spectral_rolloff_demos'].apply(lambda x: skew(x, axis=1))
df_q2['spectral_rolloff_kurtosis'] = df_q2['spectral_rolloff_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q2['spectral_rolloff_range'] = df_q2['spectral_rolloff_max'] - df_q2['spectral_rolloff_min']

# df_q3
df_q3['spectral_rolloff_mean'] = df_q3['spectral_rolloff_demos'].apply(lambda x: x.mean(axis=1))
df_q3['spectral_rolloff_std'] = df_q3['spectral_rolloff_demos'].apply(lambda x: x.std(axis=1))
df_q3['spectral_rolloff_min'] = df_q3['spectral_rolloff_demos'].apply(lambda x: x.min(axis=1))
df_q3['spectral_rolloff_max'] = df_q3['spectral_rolloff_demos'].apply(lambda x: x.max(axis=1))
df_q3['spectral_rolloff_skew'] = df_q3['spectral_rolloff_demos'].apply(lambda x: skew(x, axis=1))
df_q3['spectral_rolloff_kurtosis'] = df_q3['spectral_rolloff_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q3['spectral_rolloff_range'] = df_q3['spectral_rolloff_max'] - df_q3['spectral_rolloff_min']

# df_q4
df_q4['spectral_rolloff_mean'] = df_q4['spectral_rolloff_demos'].apply(lambda x: x.mean(axis=1))
df_q4['spectral_rolloff_std'] = df_q4['spectral_rolloff_demos'].apply(lambda x: x.std(axis=1))
df_q4['spectral_rolloff_min'] = df_q4['spectral_rolloff_demos'].apply(lambda x: x.min(axis=1))
df_q4['spectral_rolloff_max'] = df_q4['spectral_rolloff_demos'].apply(lambda x: x.max(axis=1))
df_q4['spectral_rolloff_skew'] = df_q4['spectral_rolloff_demos'].apply(lambda x: skew(x, axis=1))
df_q4['spectral_rolloff_kurtosis'] = df_q4['spectral_rolloff_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q4['spectral_rolloff_range'] = df_q4['spectral_rolloff_max'] - df_q4['spectral_rolloff_min']

print("11. Finish adding spectral_rolloff's additional features.")

11. Finish adding spectral_rolloff's additional features.
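
A minimal sketch of the rolloff idea: accumulate the spectrum from the lowest bin upward and report the first frequency at which the running total reaches a chosen fraction of the whole. The 0.85 fraction mirrors librosa's default `roll_percent`. The synthetic two-tone frame and its parameters are assumptions.

```python
import numpy as np

sr = 22050
n_fft = 2048
t = np.arange(n_fft) / sr
frame = np.sin(2 * np.pi * 1000.0 * t) + 0.5 * np.sin(2 * np.pi * 3000.0 * t)

magnitude = np.abs(np.fft.rfft(frame * np.hanning(n_fft)))
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

roll_percent = 0.85
cumulative = np.cumsum(magnitude)
threshold = roll_percent * cumulative[-1]

# First bin whose cumulative magnitude reaches the threshold.
rolloff_bin = np.searchsorted(cumulative, threshold)
rolloff_hz = freqs[rolloff_bin]
print(rolloff_hz)
```

Here the 1000 Hz tone holds roughly two thirds of the total magnitude, so the 85% mark is only reached inside the 3000 Hz peak.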


In [None]:
### 12.tonnetz ###
# (6, 1295)
# The six dimensions correspond to two features each for fifths, minor thirds,
#   and major thirds, reflecting the presence and changes of these intervals over time.
# Each value in the array represents the presence and movement of
#   harmonic intervals in the respective dimension, measured across the time frames
#   of the audio signal. These values can indicate shifts in tonality, key changes,
#   and the harmonic complexity of the piece.
# calculate mean, std, min, max, skewness, kurtosis, range
df_q1['tonnetz_mean'] = df_q1['tonnetz_demos'].apply(lambda x: x.mean(axis=1))
df_q1['tonnetz_std'] = df_q1['tonnetz_demos'].apply(lambda x: x.std(axis=1))
df_q1['tonnetz_min'] = df_q1['tonnetz_demos'].apply(lambda x: x.min(axis=1))
df_q1['tonnetz_max'] = df_q1['tonnetz_demos'].apply(lambda x: x.max(axis=1))
df_q1['tonnetz_skew'] = df_q1['tonnetz_demos'].apply(lambda x: skew(x, axis=1))
df_q1['tonnetz_kurtosis'] = df_q1['tonnetz_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q1['tonnetz_range'] = df_q1['tonnetz_max'] - df_q1['tonnetz_min']

# df_q2
df_q2['tonnetz_mean'] = df_q2['tonnetz_demos'].apply(lambda x: x.mean(axis=1))
df_q2['tonnetz_std'] = df_q2['tonnetz_demos'].apply(lambda x: x.std(axis=1))
df_q2['tonnetz_min'] = df_q2['tonnetz_demos'].apply(lambda x: x.min(axis=1))
df_q2['tonnetz_max'] = df_q2['tonnetz_demos'].apply(lambda x: x.max(axis=1))
df_q2['tonnetz_skew'] = df_q2['tonnetz_demos'].apply(lambda x: skew(x, axis=1))
df_q2['tonnetz_kurtosis'] = df_q2['tonnetz_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q2['tonnetz_range'] = df_q2['tonnetz_max'] - df_q2['tonnetz_min']

# df_q3
df_q3['tonnetz_mean'] = df_q3['tonnetz_demos'].apply(lambda x: x.mean(axis=1))
df_q3['tonnetz_std'] = df_q3['tonnetz_demos'].apply(lambda x: x.std(axis=1))
df_q3['tonnetz_min'] = df_q3['tonnetz_demos'].apply(lambda x: x.min(axis=1))
df_q3['tonnetz_max'] = df_q3['tonnetz_demos'].apply(lambda x: x.max(axis=1))
df_q3['tonnetz_skew'] = df_q3['tonnetz_demos'].apply(lambda x: skew(x, axis=1))
df_q3['tonnetz_kurtosis'] = df_q3['tonnetz_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q3['tonnetz_range'] = df_q3['tonnetz_max'] - df_q3['tonnetz_min']

# df_q4
df_q4['tonnetz_mean'] = df_q4['tonnetz_demos'].apply(lambda x: x.mean(axis=1))
df_q4['tonnetz_std'] = df_q4['tonnetz_demos'].apply(lambda x: x.std(axis=1))
df_q4['tonnetz_min'] = df_q4['tonnetz_demos'].apply(lambda x: x.min(axis=1))
df_q4['tonnetz_max'] = df_q4['tonnetz_demos'].apply(lambda x: x.max(axis=1))
df_q4['tonnetz_skew'] = df_q4['tonnetz_demos'].apply(lambda x: skew(x, axis=1))
df_q4['tonnetz_kurtosis'] = df_q4['tonnetz_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q4['tonnetz_range'] = df_q4['tonnetz_max'] - df_q4['tonnetz_min']

print("12. Finish adding tonnetz's additional features.")

12. Finish adding tonnetz's additional features.


In [None]:
### 13.zero_crossing_rate ###
# (1, 1295)
# Each value in the array represents the zero-crossing rate of one frame:
#   the number of sign changes between consecutive samples divided by the frame length.
# calculate mean, std, min, max, skewness, kurtosis, range
df_q1['zero_crossing_rate_mean'] = df_q1['zero_crossing_rate_demos'].apply(lambda x: x.mean(axis=1))
df_q1['zero_crossing_rate_std'] = df_q1['zero_crossing_rate_demos'].apply(lambda x: x.std(axis=1))
df_q1['zero_crossing_rate_min'] = df_q1['zero_crossing_rate_demos'].apply(lambda x: x.min(axis=1))
df_q1['zero_crossing_rate_max'] = df_q1['zero_crossing_rate_demos'].apply(lambda x: x.max(axis=1))
df_q1['zero_crossing_rate_skew'] = df_q1['zero_crossing_rate_demos'].apply(lambda x: skew(x, axis=1))
df_q1['zero_crossing_rate_kurtosis'] = df_q1['zero_crossing_rate_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q1['zero_crossing_rate_range'] = df_q1['zero_crossing_rate_max'] - df_q1['zero_crossing_rate_min']

# df_q2
df_q2['zero_crossing_rate_mean'] = df_q2['zero_crossing_rate_demos'].apply(lambda x: x.mean(axis=1))
df_q2['zero_crossing_rate_std'] = df_q2['zero_crossing_rate_demos'].apply(lambda x: x.std(axis=1))
df_q2['zero_crossing_rate_min'] = df_q2['zero_crossing_rate_demos'].apply(lambda x: x.min(axis=1))
df_q2['zero_crossing_rate_max'] = df_q2['zero_crossing_rate_demos'].apply(lambda x: x.max(axis=1))
df_q2['zero_crossing_rate_skew'] = df_q2['zero_crossing_rate_demos'].apply(lambda x: skew(x, axis=1))
df_q2['zero_crossing_rate_kurtosis'] = df_q2['zero_crossing_rate_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q2['zero_crossing_rate_range'] = df_q2['zero_crossing_rate_max'] - df_q2['zero_crossing_rate_min']

# df_q3
df_q3['zero_crossing_rate_mean'] = df_q3['zero_crossing_rate_demos'].apply(lambda x: x.mean(axis=1))
df_q3['zero_crossing_rate_std'] = df_q3['zero_crossing_rate_demos'].apply(lambda x: x.std(axis=1))
df_q3['zero_crossing_rate_min'] = df_q3['zero_crossing_rate_demos'].apply(lambda x: x.min(axis=1))
df_q3['zero_crossing_rate_max'] = df_q3['zero_crossing_rate_demos'].apply(lambda x: x.max(axis=1))
df_q3['zero_crossing_rate_skew'] = df_q3['zero_crossing_rate_demos'].apply(lambda x: skew(x, axis=1))
df_q3['zero_crossing_rate_kurtosis'] = df_q3['zero_crossing_rate_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q3['zero_crossing_rate_range'] = df_q3['zero_crossing_rate_max'] - df_q3['zero_crossing_rate_min']

# df_q4
df_q4['zero_crossing_rate_mean'] = df_q4['zero_crossing_rate_demos'].apply(lambda x: x.mean(axis=1))
df_q4['zero_crossing_rate_std'] = df_q4['zero_crossing_rate_demos'].apply(lambda x: x.std(axis=1))
df_q4['zero_crossing_rate_min'] = df_q4['zero_crossing_rate_demos'].apply(lambda x: x.min(axis=1))
df_q4['zero_crossing_rate_max'] = df_q4['zero_crossing_rate_demos'].apply(lambda x: x.max(axis=1))
df_q4['zero_crossing_rate_skew'] = df_q4['zero_crossing_rate_demos'].apply(lambda x: skew(x, axis=1))
df_q4['zero_crossing_rate_kurtosis'] = df_q4['zero_crossing_rate_demos'].apply(lambda x: kurtosis(x, axis=1))
df_q4['zero_crossing_rate_range'] = df_q4['zero_crossing_rate_max'] - df_q4['zero_crossing_rate_min']

print("13. Finish adding zero_crossing_rate's additional features.")

13. Finish adding zero_crossing_rate's additional features.
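
The cell above repeats the same seven assignments for each of the four DataFrames. A minimal sketch of how that pattern could be folded into one helper is shown here, assuming each cell of the source column holds a 2-D array of shape `(1, n_frames)`. The names `add_row_stats` and `demo_df` are hypothetical and exist only for this illustration; the data is synthetic.

```python
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

def add_row_stats(df, source_col, prefix):
    """Add mean/std/min/max/skew/kurtosis/range columns computed along axis 1."""
    stats = {
        "mean": lambda x: x.mean(axis=1),
        "std": lambda x: x.std(axis=1),
        "min": lambda x: x.min(axis=1),
        "max": lambda x: x.max(axis=1),
        "skew": lambda x: skew(x, axis=1),
        "kurtosis": lambda x: kurtosis(x, axis=1),
    }
    for name, fn in stats.items():
        df[f"{prefix}_{name}"] = df[source_col].apply(fn)
    # range is derived from the max and min columns, as in the cell above
    df[f"{prefix}_range"] = df[f"{prefix}_max"] - df[f"{prefix}_min"]
    return df

# Synthetic stand-in for one quadrant's DataFrame (3 clips, 50 frames each)
rng = np.random.default_rng(0)
cells = np.empty(3, dtype=object)
for i in range(3):
    cells[i] = rng.random((1, 50))
demo_df = pd.DataFrame({"zero_crossing_rate_demos": cells})

demo_df = add_row_stats(demo_df, "zero_crossing_rate_demos", "zero_crossing_rate")
print(list(demo_df.columns))
```

With such a helper, the four quadrants could be handled in a single loop, for example `for df in (df_q1, df_q2, df_q3, df_q4): add_row_stats(df, ...)`.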


In [None]:
### 14.tempo ###
# (1,)
# Estimate the tempo (beats per minute)
print("14. Using tempo itself as feature is good. No need to add other mathematical features based on this.")

14. Using tempo itself as feature is good. No need to add other mathematical features based on this.


In [None]:
### 15.tempogram ###
# (384, 1295)
# shape (tempo_bins, n_frames), where tempo_bins represents the number of
#   tempo bins (each corresponding to a specific tempo in BPM) considered in the analysis
# Each element in the array represents the strength or likelihood of a particular
#   tempo at a specific time frame. Higher values indicate a stronger presence of
#   that tempo at the given time, suggesting that the music or audio is likely following that tempo at that moment.
# calculate mean, std, min, max, skewness, kurtosis, range
# df_q1['tempogram_mean'] = df_q1['tempogram'].apply(lambda x: x.mean(axis=1))
# df_q1['tempogram_std'] = df_q1['tempogram'].apply(lambda x: x.std(axis=1))
# df_q1['tempogram_min'] = df_q1['tempogram'].apply(lambda x: x.min(axis=1))
# df_q1['tempogram_max'] = df_q1['tempogram'].apply(lambda x: x.max(axis=1))
# df_q1['tempogram_skew'] = df_q1['tempogram'].apply(lambda x: skew(x, axis=1))
# df_q1['tempogram_kurtosis'] = df_q1['tempogram'].apply(lambda x: kurtosis(x, axis=1))
# df_q1['tempogram_range'] = df_q1['tempogram_max'] - df_q1['tempogram_min']

# drop feature
df_q1.drop(columns=["tempogram"], inplace=True) # inplace=True modifies the DataFrame in place instead of returning a copy
df_q2.drop(columns=["tempogram"], inplace=True)
df_q3.drop(columns=["tempogram"], inplace=True)
df_q4.drop(columns=["tempogram"], inplace=True)
print("15. Finish adding tempogram's additional features. Since causing catastrophic cancellation, decide to DROP this feature.")

15. Finish adding tempogram's additional features. Since causing catastrophic cancellation, decide to DROP this feature.
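
The "catastrophic cancellation" wording in the message above matches a warning that recent SciPy versions can emit from `skew` and `kurtosis` when the values in a row are nearly identical. Whether that is exactly what happened with `tempogram` is an assumption, not something the notebook confirms. A minimal sketch of how such rows could be detected before computing higher moments is given here; `nearly_constant_rows` and the tolerance `tol` are hypothetical, and the array is synthetic.

```python
import numpy as np

def nearly_constant_rows(feature, tol=1e-12):
    """Boolean mask of rows whose peak-to-peak spread across frames is at most `tol`."""
    return np.ptp(feature, axis=1) <= tol

rng = np.random.default_rng(0)
demo_feature = rng.random((4, 100))   # synthetic stand-in, shape (bins, n_frames)
demo_feature[0, :] = 1.0              # make the first row constant on purpose

mask = nearly_constant_rows(demo_feature)
usable = demo_feature[~mask]          # rows left after removing the constant ones
print(mask, usable.shape)
```

Filtering rows this way is one option; dropping the whole feature, as done above, is the simpler choice when many rows are affected.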


In [None]:
### 16.fourier_tempogram ###
# (193, 1296)
# shape (tempo_bins, n_frames), where tempo_bins represents the number of tempo frequency bins analyzed
# Each element in the array represents the magnitude of a particular
#   tempo frequency at a specific time frame. High values indicate strong periodicities
#   or consistent tempos at those frequency bins, suggesting that the audio signal exhibits strong rhythmic patterns at those tempos.
# calculate mean, std, min, max, skewness, kurtosis, range
# df_q1['fourier_tempogram_mean'] = df_q1['fourier_tempogram'].apply(lambda x: x.mean(axis=1))
# df_q1['fourier_tempogram_std'] = df_q1['fourier_tempogram'].apply(lambda x: x.std(axis=1))
# df_q1['fourier_tempogram_min'] = df_q1['fourier_tempogram'].apply(lambda x: x.min(axis=1))
# df_q1['fourier_tempogram_max'] = df_q1['fourier_tempogram'].apply(lambda x: x.max(axis=1))
# df_q1['fourier_tempogram_skew'] = df_q1['fourier_tempogram'].apply(lambda x: skew(x, axis=1))
# df_q1['fourier_tempogram_kurtosis'] = df_q1['fourier_tempogram'].apply(lambda x: kurtosis(x, axis=1))
# df_q1['fourier_tempogram_range'] = df_q1['fourier_tempogram_max'] - df_q1['fourier_tempogram_min']

# # df_q2
# df_q2['fourier_tempogram_mean'] = df_q2['fourier_tempogram'].apply(lambda x: x.mean(axis=1))
# df_q2['fourier_tempogram_std'] = df_q2['fourier_tempogram'].apply(lambda x: x.std(axis=1))
# df_q2['fourier_tempogram_min'] = df_q2['fourier_tempogram'].apply(lambda x: x.min(axis=1))
# df_q2['fourier_tempogram_max'] = df_q2['fourier_tempogram'].apply(lambda x: x.max(axis=1))
# df_q2['fourier_tempogram_skew'] = df_q2['fourier_tempogram'].apply(lambda x: skew(x, axis=1))
# df_q2['fourier_tempogram_kurtosis'] = df_q2['fourier_tempogram'].apply(lambda x: kurtosis(x, axis=1))
# df_q2['fourier_tempogram_range'] = df_q2['fourier_tempogram_max'] - df_q2['fourier_tempogram_min']

# # df_q3
# df_q3['fourier_tempogram_mean'] = df_q3['fourier_tempogram'].apply(lambda x: x.mean(axis=1))
# df_q3['fourier_tempogram_std'] = df_q3['fourier_tempogram'].apply(lambda x: x.std(axis=1))
# df_q3['fourier_tempogram_min'] = df_q3['fourier_tempogram'].apply(lambda x: x.min(axis=1))
# df_q3['fourier_tempogram_max'] = df_q3['fourier_tempogram'].apply(lambda x: x.max(axis=1))
# df_q3['fourier_tempogram_skew'] = df_q3['fourier_tempogram'].apply(lambda x: skew(x, axis=1))
# df_q3['fourier_tempogram_kurtosis'] = df_q3['fourier_tempogram'].apply(lambda x: kurtosis(x, axis=1))
# df_q3['fourier_tempogram_range'] = df_q3['fourier_tempogram_max'] - df_q3['fourier_tempogram_min']

# # df_q4
# df_q4['fourier_tempogram_mean'] = df_q4['fourier_tempogram'].apply(lambda x: x.mean(axis=1))
# df_q4['fourier_tempogram_std'] = df_q4['fourier_tempogram'].apply(lambda x: x.std(axis=1))
# df_q4['fourier_tempogram_min'] = df_q4['fourier_tempogram'].apply(lambda x: x.min(axis=1))
# df_q4['fourier_tempogram_max'] = df_q4['fourier_tempogram'].apply(lambda x: x.max(axis=1))
# df_q4['fourier_tempogram_skew'] = df_q4['fourier_tempogram'].apply(lambda x: skew(x, axis=1))
# df_q4['fourier_tempogram_kurtosis'] = df_q4['fourier_tempogram'].apply(lambda x: kurtosis(x, axis=1))
# df_q4['fourier_tempogram_range'] = df_q4['fourier_tempogram_max'] - df_q4['fourier_tempogram_min']

# drop feature
df_q1.drop(columns=["fourier_tempogram"], inplace=True)
df_q2.drop(columns=["fourier_tempogram"], inplace=True)
df_q3.drop(columns=["fourier_tempogram"], inplace=True)
df_q4.drop(columns=["fourier_tempogram"], inplace=True)

print("16. Finish adding fourier_tempogram's additional features. Not useful, therefore DROP this feature")

16. Finish adding fourier_tempogram's additional features. Not useful, therefore DROP this feature
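
A commented-out note later in this notebook reports that `fourier_tempogram` produced complex values. If the feature were ever kept, one common option would be to take its magnitude before computing statistics. The sketch below illustrates that idea on a synthetic complex array; it is not what the notebook does, and the names `ft` and `magnitude` are hypothetical.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
# Synthetic complex-valued stand-in with shape (tempo_bins, n_frames)
ft = rng.normal(size=(5, 200)) + 1j * rng.normal(size=(5, 200))

magnitude = np.abs(ft)               # real-valued and non-negative
ft_mean = magnitude.mean(axis=1)     # one value per tempo bin
ft_skew = skew(magnitude, axis=1)
print(np.iscomplexobj(ft), np.iscomplexobj(magnitude))
```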


In [None]:
print(f"df_q1.shape is: {df_q1.shape}") # (225, 101)
print(f"df_q2.shape is: {df_q2.shape}") # (225, 101)
print(f"df_q3.shape is: {df_q3.shape}") # (225, 101)
print(f"df_q4.shape is: {df_q4.shape}") # (225, 101)

df_q1.shape is: (225, 101)
df_q2.shape is: (225, 101)
df_q3.shape is: (225, 101)
df_q4.shape is: (225, 101)


In [None]:
df_q1.head()

Unnamed: 0,xs,srs,chroma_stft_demos,chroma_cqt_demos,chroma_vqt_demos,chroma_cens_demos,melspectrogram_demos,mfcc_demos,rms_demos,spectral_centroid_demos,...,tonnetz_skew,tonnetz_kurtosis,tonnetz_range,zero_crossing_rate_mean,zero_crossing_rate_std,zero_crossing_rate_min,zero_crossing_rate_max,zero_crossing_rate_skew,zero_crossing_rate_kurtosis,zero_crossing_rate_range
0,"[2.5885166e-15, -3.28127e-16, 1.3183922e-14, 2...",22050,"[[0.64465576, 0.5051329, 0.6534084, 0.679588, ...","[[0.5660721, 0.5691674, 0.5468383, 0.45331684,...","[[0.848989, 0.8981467, 0.73216736, 0.62612474,...","[[0.15784575, 0.15223381, 0.14725037, 0.142912...","[[1.3533841e-08, 3.5398793e-06, 0.00014950961,...","[[-556.7651, -511.8035, -422.1127, -383.47598,...","[[0.0005485677, 0.0020291524, 0.0029873797, 0....","[[2600.362152387303, 2068.7657054944157, 1942....",...,"[-0.051441124412621844, -0.2763099231427766, 0...","[-0.2676932002114798, -0.2611242959606561, -0....","[0.39503816324089674, 0.44382569880070666, 0.6...",[0.1439822635135135],[0.06052925895431856],[0.03125],[0.564453125],[2.299913592613744],[9.972581240443636],[0.533203125]
1,"[-2.860234e-14, 2.5666e-14, 1.2625685e-14, 5.9...",22050,"[[0.54650044, 0.23805048, 0.119608514, 0.28680...","[[0.3125394, 0.27293065, 0.23888744, 0.2133149...","[[0.5071617, 0.41103512, 0.4162637, 0.29573256...","[[0.08206277, 0.08034206, 0.07872083, 0.077143...","[[8.769024e-09, 3.6914357e-06, 2.9017758e-05, ...","[[-502.408, -498.12592, -482.74106, -464.25244...","[[0.00015566013, 0.00077594863, 0.0017064415, ...","[[3185.2521348690234, 2261.5056267393707, 1867...",...,"[-0.08358419526538056, -0.24758178264490532, 0...","[-0.4546138798763306, -0.5757695864820596, 0.0...","[0.42942833816117487, 0.4362426048761109, 0.52...",[0.10062665902509653],[0.06195558099185238],[0.0126953125],[0.53857421875],[2.4103148531571437],[10.632752706921927],[0.52587890625]
2,"[1.4363665e-14, 1.7338858e-14, -1.7440724e-15,...",22050,"[[0.35400018, 0.31895828, 0.15409946, 0.094428...","[[0.14923647, 0.15812233, 0.16552325, 0.166325...","[[0.24184267, 0.19096148, 0.16628368, 0.155282...","[[0.0, 0.0, 0.00010032334, 0.0004665737, 0.001...","[[2.2634141e-07, 0.00012189508, 0.00091573584,...","[[-533.9376, -523.7987, -495.95483, -466.6701,...","[[0.0001440642, 0.0008707615, 0.0016941759, 0....","[[3886.240396747828, 2602.2479592907566, 2583....",...,"[0.4483331808652231, -0.09338588894483782, 0.0...","[0.07777779783759664, -0.66986980775311, -0.57...","[0.549337563990827, 0.45787447752394766, 0.692...",[0.06854375301640926],[0.03186446827069103],[0.01806640625],[0.21875],[1.0653329036995505],[1.313176804637732],[0.20068359375]
3,"[3.948538e-14, 2.4908268e-14, 2.3690624e-14, 3...",22050,"[[0.1104949, 0.054469876, 0.04289725, 0.038043...","[[0.11308139, 0.11504669, 0.10645489, 0.100842...","[[0.22638988, 0.19970289, 0.25529122, 0.247800...","[[0.021481985, 0.02449442, 0.027625265, 0.0307...","[[3.1630414e-06, 0.00047005177, 0.0046568206, ...","[[-567.22174, -488.08493, -394.5883, -341.3959...","[[0.0010538041, 0.0049711214, 0.011370153, 0.0...","[[1260.2291554594187, 1460.3761663407904, 1368...",...,"[0.46187238116049745, 0.7032068902094887, 0.64...","[-0.689955821707545, 0.011593890255965356, -0....","[0.6551601962522167, 0.8039702548013667, 0.817...",[0.0696255127895753],[0.016080018069265088],[0.02978515625],[0.1484375],[0.7492532964882583],[1.9720264141033086],[0.11865234375]
4,"[1.5478067e-15, -4.7274973e-16, 3.9259797e-16,...",22050,"[[0.5735273, 0.44081706, 0.2916847, 0.1408742,...","[[0.4814759, 0.54028213, 0.64797264, 0.4830193...","[[0.37801835, 0.5129215, 0.62981725, 0.7347456...","[[0.51716256, 0.5277147, 0.538432, 0.5492283, ...","[[8.511694e-11, 3.06712e-08, 4.1023623e-07, 2....","[[-546.37854, -546.37854, -545.20844, -534.487...","[[1.50278765e-05, 6.5735556e-05, 0.00023117632...","[[2549.7987453674305, 1975.966686504611, 1500....",...,"[-0.07228530989872117, -0.033910242081109286, ...","[2.0776852407507347, 1.3959966015028629, 0.106...","[0.9353934797999381, 0.8043349604683768, 1.031...",[0.09968026061776061],[0.039801127600770446],[0.025390625],[0.23291015625],[0.24549962817541723],[-0.5180193190257767],[0.20751953125]


In [None]:
print(list(df_q1.columns))

['xs', 'srs', 'chroma_stft_demos', 'chroma_cqt_demos', 'chroma_vqt_demos', 'chroma_cens_demos', 'melspectrogram_demos', 'mfcc_demos', 'rms_demos', 'spectral_centroid_demos', 'spectral_bandwidth_demos', 'spectral_contrast_demos', 'spectral_flatness_demos', 'spectral_rolloff_demos', 'tonnetz_demos', 'zero_crossing_rate_demos', 'tempo', 'tempogram_ratio', 'labels', 'chroma_stft_mean', 'chroma_stft_std', 'chroma_stft_sum', 'chroma_stft_skew', 'chroma_stft_kurtosis', 'chroma_cqt_mean', 'chroma_cqt_std', 'chroma_cqt_sum', 'chroma_cqt_skew', 'chroma_cqt_kurtosis', 'chroma_vqt_mean', 'chroma_vqt_std', 'chroma_vqt_sum', 'chroma_vqt_skew', 'chroma_vqt_kurtosis', 'melspectrogram_mean', 'melspectrogram_std', 'melspectrogram_min', 'melspectrogram_max', 'melspectrogram_sum', 'melspectrogram_skew', 'melspectrogram_kurtosis', 'mfcc_mean', 'mfcc_std', 'mfcc_min', 'mfcc_max', 'mfcc_skew', 'mfcc_kurtosis', 'mfcc_median', 'rms_mean', 'rms_std', 'rms_min', 'rms_max', 'rms_skew', 'rms_kurtosis', 'spectral_c

In [None]:
# Pad each clip's chroma_stft matrix (12, n_frames) with trailing zeros so that
# all clips share the same number of frames and can be stacked into one array.
sequences = list(df_q1['chroma_stft_demos'].values)
max_len = max(s.shape[1] for s in sequences)

padded_sequences_all = []
for seq in sequences:
  padded_sequences = []
  for s in seq:  # s is one chroma row of length n_frames
    padded_sequences.append(np.append(s, np.zeros((max_len - len(s),))))
  padded_sequences_all.append(np.array(padded_sequences))
padded_shape = np.array(padded_sequences_all).shape
print(padded_shape)



(225, 12, 1297)
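
The padding in the cell above can also be expressed with `np.pad`, which pads a whole matrix along the frame axis in one call. This is a minimal sketch on synthetic arrays; `pad_and_stack` and `clips` are hypothetical names and the shapes are chosen only for illustration.

```python
import numpy as np

def pad_and_stack(matrices):
    """Zero-pad 2-D arrays along the frame axis (axis 1) and stack them."""
    max_len = max(m.shape[1] for m in matrices)
    padded = [np.pad(m, ((0, 0), (0, max_len - m.shape[1]))) for m in matrices]
    return np.stack(padded)

# Three synthetic "clips" with 12 chroma bins and different frame counts
clips = [np.ones((12, 5)), np.ones((12, 7)), np.ones((12, 6))]
stacked = pad_and_stack(clips)
print(stacked.shape)  # (3, 12, 7)
```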


In [None]:
### xiyah ###
# saved df_q1 having shape (225, 26)
# Can save RAM by directly reading the saved csv file
# Save df_q1 after adding features to CSV
# df_q1_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q1_mod.csv"
# df_q2_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q2_mod.csv"
# df_q3_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q3_mod.csv"
# df_q4_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q4_mod.csv"
# df_q1.to_csv(df_q1_path, index=False)
# df_q2.to_csv(df_q2_path, index=False)
# df_q3.to_csv(df_q3_path, index=False)
# df_q4.to_csv(df_q4_path, index=False)

# save to .npy
# df_q1_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q1_mod.npy"
# df_q2_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q2_mod.npy"
# df_q3_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q3_mod.npy"
# df_q4_path = "/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/all_features_q4_mod.npy"
# # to array
# arr_q1 = df_q1.to_numpy()
# arr_q2 = df_q2.to_numpy()
# arr_q3 = df_q3.to_numpy()
# arr_q4 = df_q4.to_numpy()
# # save
# np.save(df_q1_path, arr_q1, allow_pickle=True)
# np.save(df_q2_path, arr_q2, allow_pickle=True)
# np.save(df_q3_path, arr_q3, allow_pickle=True)
# np.save(df_q4_path, arr_q4, allow_pickle=True)
### xiyah ###
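
The cell above keeps both a CSV and a `.npy` variant of the save step. The sketch below shows, on a tiny synthetic DataFrame, why the two behave differently for array-valued columns: CSV turns each array into its string representation, while `np.save` with `allow_pickle=True` keeps the arrays but loses the column names. The names `demo_df`, `csv_path`, and `npy_path` are hypothetical.

```python
import os
import tempfile
import numpy as np
import pandas as pd

# Build an object column whose cells are NumPy arrays
cells = np.empty(2, dtype=object)
cells[0] = np.array([0.1, 0.2, 0.3])
cells[1] = np.array([0.4, 0.5, 0.6])
demo_df = pd.DataFrame({"rms_demos": cells, "labels": ["Q1", "Q1"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "demo.csv")
    npy_path = os.path.join(tmp_dir, "demo.npy")

    demo_df.to_csv(csv_path, index=False)
    np.save(npy_path, demo_df.to_numpy(), allow_pickle=True)

    from_csv = pd.read_csv(csv_path)
    from_npy = np.load(npy_path, allow_pickle=True)

print(type(from_csv["rms_demos"].iloc[0]))  # the array came back as text
print(type(from_npy[0, 0]))                 # the array survived the round trip
```

Because the `.npy` file stores only the values, the column order (for example `list(df_q1.columns)`) would need to be saved separately to rebuild the DataFrame.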

In [None]:
# ### Xiangyu ###
# # The following is to add more statistical parameters
# # mean, max, min, std, skewness, kurtosis
# features_df = pd.DataFrame()

# # save the mean to the file
# features_df['chroma_stft_mean'] = [np.mean(feat) for feat in chroma_stft_demos]
# features_df['chroma_cqt_mean'] = [np.mean(feat) for feat in chroma_cqt_demos]
# features_df['chroma_vqt_mean'] = [np.mean(feat) for feat in chroma_vqt_demos]

# features_df['melspectrogram_mean'] = [np.mean(feat) for feat in melspectrogram_demos]
# features_df['mfcc_mean'] = [np.mean(feat, axis=1).mean() for feat in mfcc_demos]  # MFCCs need to be averaged along both axes
# features_df['rms_mean'] = [np.mean(feat) for feat in rms_demos]

# features_df['spectral_centroid_mean'] = [np.mean(feat) for feat in spectral_centroid_demos]
# features_df['spectral_bandwidth_mean'] = [np.mean(feat) for feat in spectral_bandwidth_demos]
# features_df['spectral_contrast_mean'] = [np.mean(feat, axis=1).mean() for feat in spectral_contrast_demos]
# features_df['spectral_flatness_mean'] = [np.mean(feat) for feat in spectral_flatness_demos]
# features_df['spectral_rolloff_mean'] = [np.mean(feat) for feat in spectral_rolloff_demos]
# features_df['tonnetz_mean'] = [np.mean(feat) for feat in tonnetz_demos]
# features_df['zero_crossing_rate_mean'] = [np.mean(feat) for feat in zero_crossing_rate_demos]

# features_df['tempo_mean'] = [np.mean(feat) for feat in tempo]
# features_df['tempogram_mean'] = [np.mean(feat) for feat in tempogram]

# # by BEI: this variable comes out as complex (imaginary) numbers, so it was removed
# #q1_features_df['fourier_tempogram'] = [np.mean(feat) for feat in q1_fourier_tempogram]
# features_df['tempogram_ratio_mean'] = [np.mean(feat) for feat in tempogram_ratio]

# # save the max to the file

# features_df['chroma_stft_max'] = [np.max(feat) for feat in chroma_stft_demos]
# features_df['chroma_cqt_max'] = [np.max(feat) for feat in chroma_cqt_demos]
# features_df['chroma_vqt_max'] = [np.max(feat) for feat in chroma_vqt_demos]

# features_df['melspectrogram_max'] = [np.max(feat) for feat in melspectrogram_demos]
# features_df['mfcc_max'] = [np.max(feat) for feat in mfcc_demos]
# features_df['rms_max'] = [np.max(feat) for feat in rms_demos]

# features_df['spectral_centroid_max'] = [np.max(feat) for feat in spectral_centroid_demos]
# features_df['spectral_bandwidth_max'] = [np.max(feat) for feat in spectral_bandwidth_demos]
# features_df['spectral_contrast_max'] = [np.max(feat, axis=1).mean() for feat in spectral_contrast_demos]
# features_df['spectral_flatness_max'] = [np.max(feat) for feat in spectral_flatness_demos]
# features_df['spectral_rolloff_max'] = [np.max(feat) for feat in spectral_rolloff_demos]
# features_df['tonnetz_max'] = [np.max(feat) for feat in tonnetz_demos]
# features_df['zero_crossing_rate_max'] = [np.max(feat) for feat in zero_crossing_rate_demos]

# features_df['tempo_max'] = [np.max(feat) for feat in tempo]
# features_df['tempogram_max'] = [np.max(feat) for feat in tempogram]

# # save the min to the file
# features_df['chroma_stft_min'] = [np.min(feat) for feat in chroma_stft_demos]
# features_df['chroma_cqt_min'] = [np.min(feat) for feat in chroma_cqt_demos]
# features_df['chroma_vqt_min'] = [np.min(feat) for feat in chroma_vqt_demos]

# features_df['melspectrogram_min'] = [np.min(feat) for feat in melspectrogram_demos]
# features_df['mfcc_min'] = [np.min(feat) for feat in mfcc_demos]
# features_df['rms_min'] = [np.min(feat) for feat in rms_demos]

# features_df['spectral_centroid_min'] = [np.min(feat) for feat in spectral_centroid_demos]
# features_df['spectral_bandwidth_min'] = [np.min(feat) for feat in spectral_bandwidth_demos]
# features_df['spectral_contrast_min'] = [np.min(feat, axis=1).mean() for feat in spectral_contrast_demos]
# features_df['spectral_flatness_min'] = [np.min(feat) for feat in spectral_flatness_demos]
# features_df['spectral_rolloff_min'] = [np.min(feat) for feat in spectral_rolloff_demos]
# features_df['tonnetz_min'] = [np.min(feat) for feat in tonnetz_demos]
# features_df['zero_crossing_rate_min'] = [np.min(feat) for feat in zero_crossing_rate_demos]

# features_df['tempo_min'] = [np.min(feat) for feat in tempo]
# features_df['tempogram_min'] = [np.min(feat) for feat in tempogram]

# # standard deviation
# features_df['chroma_stft_std'] = [np.std(feat) for feat in chroma_stft_demos]
# features_df['chroma_cqt_std'] = [np.std(feat) for feat in chroma_cqt_demos]
# features_df['chroma_vqt_std'] = [np.std(feat) for feat in chroma_vqt_demos]

# features_df['melspectrogram_std'] = [np.std(feat) for feat in melspectrogram_demos]
# features_df['mfcc_std'] = [np.std(feat) for feat in mfcc_demos]
# features_df['rms_std'] = [np.std(feat) for feat in rms_demos]

# features_df['spectral_centroid_std'] = [np.std(feat) for feat in spectral_centroid_demos]
# features_df['spectral_bandwidth_std'] = [np.std(feat) for feat in spectral_bandwidth_demos]
# features_df['spectral_contrast_std'] = [np.std(feat, axis=1).mean() for feat in spectral_contrast_demos]
# features_df['spectral_flatness_std'] = [np.std(feat) for feat in spectral_flatness_demos]
# features_df['spectral_rolloff_std'] = [np.std(feat) for feat in spectral_rolloff_demos]
# features_df['tonnetz_std'] = [np.std(feat) for feat in tonnetz_demos]
# features_df['zero_crossing_rate_std'] = [np.std(feat) for feat in zero_crossing_rate_demos]

# features_df['tempo_std'] = [np.std(feat) for feat in tempo]
# features_df['tempogram_std'] = [np.std(feat) for feat in tempogram]


# # skewness
# features_df['chroma_stft_skew'] = [skew(feat.flatten()) for feat in chroma_stft_demos]
# features_df['chroma_cqt_skew'] = [skew(feat.flatten()) for feat in chroma_cqt_demos]
# features_df['chroma_vqt_skew'] = [skew(feat.flatten()) for feat in chroma_vqt_demos]

# features_df['melspectrogram_skew'] = [skew(feat.flatten()) for feat in melspectrogram_demos]
# features_df['mfcc_skew'] = [skew(feat.flatten()) for feat in mfcc_demos]
# features_df['rms_skew'] = [skew(feat.flatten()) for feat in rms_demos]

# features_df['spectral_centroid_skew'] = [skew(feat.flatten()) for feat in spectral_centroid_demos]
# features_df['spectral_bandwidth_skew'] = [skew(feat.flatten()) for feat in spectral_bandwidth_demos]
# features_df['spectral_contrast_skew'] = [skew(feat, axis=1).mean() for feat in spectral_contrast_demos]  # Consider how to best apply skew here
# features_df['spectral_flatness_skew'] = [skew(feat.flatten()) for feat in spectral_flatness_demos]
# features_df['spectral_rolloff_skew'] = [skew(feat.flatten()) for feat in spectral_rolloff_demos]
# features_df['tonnetz_skew'] = [skew(feat.flatten()) for feat in tonnetz_demos]
# features_df['zero_crossing_rate_skew'] = [skew(feat.flatten()) for feat in zero_crossing_rate_demos]

# features_df['tempo_skew'] = [skew(feat.flatten()) for feat in tempo]
# features_df['tempogram_skew'] = [skew(feat.flatten()) for feat in tempogram]

# # kurtosis
# features_df['chroma_stft_kurtosis'] = [kurtosis(feat.flatten()) for feat in chroma_stft_demos]
# features_df['chroma_cqt_kurtosis'] = [kurtosis(feat.flatten()) for feat in chroma_cqt_demos]
# features_df['chroma_vqt_kurtosis'] = [kurtosis(feat.flatten()) for feat in chroma_vqt_demos]

# features_df['melspectrogram_kurtosis'] = [kurtosis(feat.flatten()) for feat in melspectrogram_demos]
# features_df['mfcc_kurtosis'] = [kurtosis(feat.flatten()) for feat in mfcc_demos]
# features_df['rms_kurtosis'] = [kurtosis(feat.flatten()) for feat in rms_demos]

# features_df['spectral_centroid_kurtosis'] = [kurtosis(feat.flatten()) for feat in spectral_centroid_demos]
# features_df['spectral_bandwidth_kurtosis'] = [kurtosis(feat.flatten()) for feat in spectral_bandwidth_demos]
# features_df['spectral_contrast_kurtosis'] = [kurtosis(feat, axis=1).mean() for feat in spectral_contrast_demos]  # Adjust if necessary for your analysis
# features_df['spectral_flatness_kurtosis'] = [kurtosis(feat.flatten()) for feat in spectral_flatness_demos]
# features_df['spectral_rolloff_kurtosis'] = [kurtosis(feat.flatten()) for feat in spectral_rolloff_demos]
# features_df['tonnetz_kurtosis'] = [kurtosis(feat.flatten()) for feat in tonnetz_demos]
# features_df['zero_crossing_rate_kurtosis'] = [kurtosis(feat.flatten()) for feat in zero_crossing_rate_demos]

# features_df['tempo_kurtosis'] = [kurtosis(feat.flatten()) for feat in tempo]
# features_df['tempogram_kurtosis'] = [kurtosis(feat.flatten()) for feat in tempogram]


# # save all the parameters to a file
# features_df.to_csv('/content/drive/MyDrive/EN.553.602DataMiningSpring2024/dataset/MER_audio_taffc_dataset/features.csv', index=False)
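
The commented-out cell above spells out one line per feature and statistic. A minimal sketch of the same idea as a loop over a dictionary of statistics is shown here. It is an illustration only: `STATS`, `summarize`, and the synthetic inputs are hypothetical, and every statistic is computed over the whole flattened array, which differs from the per-row-then-mean treatment the cell applies to `spectral_contrast`.

```python
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

STATS = {
    "mean": np.mean,
    "max": np.max,
    "min": np.min,
    "std": np.std,
    "skew": lambda a: skew(a.ravel()),
    "kurtosis": lambda a: kurtosis(a.ravel()),
}

def summarize(feature_lists):
    """Map {feature name: list of arrays, one per clip} to a DataFrame of scalars."""
    columns = {}
    for feat_name, arrays in feature_lists.items():
        for stat_name, fn in STATS.items():
            columns[f"{feat_name}_{stat_name}"] = [
                float(fn(np.asarray(a))) for a in arrays
            ]
    return pd.DataFrame(columns)

# Synthetic stand-ins for two features over three clips
rng = np.random.default_rng(0)
rms_demo = [rng.random((1, 40)) for _ in range(3)]
mfcc_demo = [rng.random((20, 40)) for _ in range(3)]

summary = summarize({"rms": rms_demo, "mfcc": mfcc_demo})
print(summary.shape)  # (3, 12)
```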

## Updates
2024.02.27 (Xiyah)
  - Uploaded all data to the group Google Drive.
  - Started the shared coding document.
  - Compared librosa's features with the top100_features we had and found that only a small number overlap, so we decided not to "replicate" Renato Panda's work. Instead, we will generate our own 19 features using Librosa, then compare our final predictions against the labels generated by Renato Panda's team.
  - Next, we need to convert all 900 30-sec audio clips into the 19 features. The 19 features are listed [here](https://librosa.org/doc/latest/feature.html): all features under the sections "Spectral features" and "Rhythm features". There is no need to use the features under "Feature manipulation" and "Feature inversion".

2024.02.29 (Xiangyu)
  - Created six statistical parameters for each of the 19 features.

2024.02.29 (Xiyah)
  - Changed the saved file format from pickle to .npz, which is easier to read.

2024.03.02 (Xiyah)
  - Added additional mathematical features for all features.
  - Saved new data to "all_features_q1_mod.csv", "all_features_q2_mod.csv", "all_features_q3_mod.csv", and "all_features_q4_mod.csv".
  - Featurization is done.

2024.03.03 (Xiyah)
  - Deleted the features "tempogram" and "fourier_tempogram" from the four DataFrames using df.drop(inplace=True).
  - Deleted the .csv files, since CSV stores the array data as strings, which is inconvenient for later use. The data is still stored in four .npy files.


## Mid-Checkpoint Report

1. **Situation**: We wanted to know whether we could replicate the work of our dataset's authors.  
   **Task**: We compared the top 100 features they used to train their models against librosa's feature library and found only a few overlaps.  
   **Action**: We decided to convert all audio clips into librosa features, 18 in total, and store the converted features in a single .npz file, since it is more flexible for later work than a .csv file and more practical than a pickle file.
2. With 900 clips and 18 features, we would like to add more features by deriving statistical features from the existing 18, since more data will support the neural network better.
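
As an illustration of the single-file `.npz` storage mentioned above, here is a minimal sketch of saving several named arrays with `np.savez` and reading them back by key. The feature names, shapes, and the file name `features_demo.npz` are assumptions made for the example; the data is synthetic.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)
features = {
    "rms": rng.random((4, 1, 20)),     # (clips, 1, n_frames)
    "mfcc": rng.random((4, 20, 20)),   # (clips, n_mfcc, n_frames)
    "labels": np.array(["Q1", "Q2", "Q3", "Q4"]),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    npz_path = os.path.join(tmp_dir, "features_demo.npz")
    np.savez(npz_path, **features)            # one file, one entry per key

    with np.load(npz_path) as data:
        loaded = {key: data[key] for key in data.files}

print(sorted(loaded))
```

Each array keeps its dtype and shape, which is the flexibility over CSV that the checkpoint notes refer to. Arrays with differing frame counts would first need padding or object arrays, which this sketch does not cover.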

## Later Issues

1. We can chunk the audio into 15-sec or 10-sec clips instead of 30-sec clips to see whether we still get similar results.

2.