## Convolutional neural network (CNN)<a name="convolutional"></a>
[[go back to the top]](#contents)

For the CNN classifier, we have two options:
- **1D CNN**, whose inputs are obtained by applying the **CNN directly to portions (windows)** of the original sound signal (after downsampling and normalization).
- **2D CNN**, which is based on a **time-frequency analysis of the sound**, such as the **Mel-frequency cepstral coefficients (MFCCs)**.
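
To make the 1D option concrete, here is a minimal sketch of cutting a normalized signal into fixed-size windows that a `Conv1D` layer could consume. The helper names (`frame_signal`, `peak_normalize`), the 8 kHz sample rate, and the window and hop sizes are assumptions chosen for illustration, not values taken from this project.

```python
import numpy as np

def frame_signal(signal, window_size, hop_size):
    """Split a 1-D signal into overlapping windows of length window_size."""
    n_windows = 1 + (len(signal) - window_size) // hop_size
    starts = hop_size * np.arange(n_windows)
    # Row i holds samples starts[i] .. starts[i] + window_size - 1
    return signal[starts[:, None] + np.arange(window_size)[None, :]]

def peak_normalize(signal):
    """Scale the signal so its largest absolute value is 1 (no-op for silence)."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

# Hypothetical example: one second of a 440 Hz tone sampled at 8 kHz
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = peak_normalize(0.3 * np.sin(2 * np.pi * 440 * t))

windows = frame_signal(signal, window_size=1024, hop_size=512)
# Conv1D expects (batch, steps, channels), so add a channel axis
cnn_input = windows[..., np.newaxis]
print(cnn_input.shape)
```

Each row of `windows` is one training example for the 1D CNN; overlapping windows (hop smaller than the window) give more examples per clip at the cost of correlated samples.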

With the features extracted above, we can group them by the CNN approach they are best suited to.
- **1D CNN:**
    - Spectral Centroid
    - Spectral Bandwidth
    - Spectral Flatness
    - Spectral Rolloff
- **2D CNN:**
    - Chromagram
    - Mel-Scaled Spectrogram
    - Short-time Fourier transform Tempogram
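
The split above follows the shape of each feature: a spectrogram-like feature is a matrix (time by frequency), while a feature such as the spectral centroid collapses every frame to a single number. The sketch below illustrates this with plain NumPy on a synthetic tone. It is not the extraction code used to produce the feature files, and the function names and parameters are assumptions.

```python
import numpy as np

def magnitude_spectrogram(signal, n_fft, hop):
    """Magnitude STFT with a periodic Hann window, shape (n_frames, n_fft // 2 + 1)."""
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n_fft) / n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames * window, axis=1))

def spectral_centroid(spectrogram, sample_rate, n_fft):
    """Magnitude-weighted mean frequency of each frame, shape (n_frames,)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return (spectrogram * freqs).sum(axis=1) / spectrogram.sum(axis=1)

sample_rate, n_fft, hop = 8000, 256, 128
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)  # hypothetical 1 kHz test tone

spec = magnitude_spectrogram(tone, n_fft, hop)          # 2-D: frames x frequency bins
centroid = spectral_centroid(spec, sample_rate, n_fft)  # 1-D: one value per frame
print(spec.shape, centroid.shape)
```

For the pure tone, the centroid of every frame sits at the tone frequency, which is a quick sanity check for the implementation.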

In [12]:
import pandas as pd
# data = pd.read_csv("/home/luskas_carneiro/Desktop/AC2/AC-II/datasets/urbansounds_features_fold1.csv")
# data


In [13]:
def load_fold_data(fold_number, files):
    data = pd.read_csv(files[fold_number])
    labels = data.pop('Label').values
    features = data.values
    return features, labels

In [14]:
configurations = {
    # NOTE: "num_conv_layers" is sampled but never read by build_cnn;
    # the network depth is set by the length of "filters_per_layer".
    "num_conv_layers": [3, 4, 5],  # Number of convolutional layers
    "filters_per_layer": [[32, 64], [64, 128], [128, 256]],  # Filters for each convolutional layer
    "kernel_size": [3, 5, 7],  # Kernel sizes
    "activation": ["relu", "tanh", "sigmoid"],  # Activation functions
    "dense_units": [64, 128, 256, 512],  # Number of dense layer units
    "dropout": [0.1, 0.2, 0.3],  # Dropout rates
}
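
Before searching this space, it helps to know how large it is. The sketch below repeats the dictionary so that it runs on its own and multiplies the number of options per key.

```python
import math

configurations = {
    "num_conv_layers": [3, 4, 5],
    "filters_per_layer": [[32, 64], [64, 128], [128, 256]],
    "kernel_size": [3, 5, 7],
    "activation": ["relu", "tanh", "sigmoid"],
    "dense_units": [64, 128, 256, 512],
    "dropout": [0.1, 0.2, 0.3],
}

# Number of options per hyperparameter
sizes = {key: len(values) for key, values in configurations.items()}
# Size of the full grid: the product of all option counts
n_configs = math.prod(sizes.values())

print(sizes)
print(n_configs)  # 3 * 3 * 3 * 3 * 4 * 3 = 972
```

With 10-fold cross-validation, an exhaustive pass over 972 configurations would mean 9720 training runs, which is why sampling only a handful of configurations is attractive here.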

In [15]:
import random

def sample_random_config(configurations):
    return {key: random.choice(value) for key, value in configurations.items()}

# Example of a sampled configuration
random_config = sample_random_config(configurations)

print(random_config)



{'num_conv_layers': 4, 'filters_per_layer': [32, 64], 'kernel_size': 5, 'activation': 'tanh', 'dense_units': 64, 'dropout': 0.1}
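
Because `random.choice` draws from the global random state, the sampled configuration changes on every run. If reproducible trials are wanted, one option is to pass an explicit `random.Random` instance. The function name `sample_random_config_seeded`, the reduced search space, and the seed below are assumptions for illustration.

```python
import random

def sample_random_config_seeded(configurations, rng):
    """Like sample_random_config, but draws from the given random.Random instance."""
    return {key: rng.choice(values) for key, values in configurations.items()}

# Hypothetical reduced search space
space = {"kernel_size": [3, 5, 7], "dropout": [0.1, 0.2, 0.3]}

first = sample_random_config_seeded(space, random.Random(42))
second = sample_random_config_seeded(space, random.Random(42))
print(first == second)  # same seed, same draw
```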


In [16]:
import itertools

def generate_configs(configurations):
    keys, values = zip(*configurations.items())
    return [dict(zip(keys, v)) for v in itertools.product(*values)]

# Named all_configs so it does not clash with the grid_search function defined later
all_configs = generate_configs(configurations)
print(all_configs)
print(len(all_configs))


[{'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 64, 'dropout': 0.1}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 64, 'dropout': 0.2}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 64, 'dropout': 0.3}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 128, 'dropout': 0.1}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 128, 'dropout': 0.2}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 128, 'dropout': 0.3}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation': 'relu', 'dense_units': 256, 'dropout': 0.1}, {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 3, 'activation':

In [None]:
# from tensorflow.keras import layers, models, optimizers

# def build_cnn(config, input_shape):
#     model = models.Sequential()
#     #first layer fixed because of input shape
#     model.add(layers.Conv2D(config["filters_per_layer"][0], config["kernel_size"], activation=config["activation"], padding="same", input_shape=input_shape))
#     model.add(layers.MaxPooling2D(pool_size=(2, 2)))
#     # Add convolutional layers
#     for filters in config["filters_per_layer"][1:]:
#         model.add(layers.Conv2D(filters, config["kernel_size"], activation=config["activation"], padding="same"))
#         model.add(layers.MaxPooling2D(pool_size=(2, 2)))
    
#     # Add dense layers
#     model.add(layers.Flatten())
#     model.add(layers.Dense(config["dense_units"], activation=config["activation"]))
#     model.add(layers.Dropout(config["dropout"]))
    
#     # Output layer for classification
#     model.add(layers.Dense(10, activation="softmax"))  # Adjust output units for your task
    
#     return model

In [34]:
from tensorflow.keras import layers, models, optimizers

def build_cnn(config, input_shape):
    """Build a 1D CNN for feature vectors with shape input_shape = (n_features,)."""
    model = models.Sequential()
    # Declare the input explicitly, then add a channel axis so that Conv1D
    # receives (n_features, 1). Recent Keras versions warn when input_shape
    # is passed to a layer instead.
    model.add(layers.Input(shape=input_shape))
    model.add(layers.Reshape((input_shape[0], 1)))
    # One Conv1D + MaxPooling1D block per entry in "filters_per_layer"
    # (config["num_conv_layers"] is not used here)
    for filters in config["filters_per_layer"]:
        model.add(layers.Conv1D(filters, config["kernel_size"], activation=config["activation"], padding="same"))
        model.add(layers.MaxPooling1D(2))

    # Add dense layers
    model.add(layers.Flatten())
    model.add(layers.Dense(config["dense_units"], activation=config["activation"]))
    model.add(layers.Dropout(config["dropout"]))

    # Output layer: one unit per class (10 classes)
    model.add(layers.Dense(10, activation="softmax"))

    return model
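
Since every convolution uses `padding="same"`, only the pooling layers change the sequence length: each `MaxPooling1D(2)` halves it, rounding down. The sketch below computes the size of the vector that reaches the first dense layer. The helper name `flattened_size` and the 40-feature input are assumptions for illustration.

```python
def flattened_size(n_features, filters_per_layer, pool_size=2):
    """Length of the Flatten output for the Conv1D/MaxPooling1D stack above."""
    length = n_features
    for _ in filters_per_layer:
        # "same" convolution keeps the length; pooling divides it (floor)
        length //= pool_size
    # Flatten concatenates all remaining steps across the last layer's filters
    return length * filters_per_layer[-1]

print(flattened_size(40, [32, 64]))  # 40 -> 20 -> 10 steps, 10 * 64 = 640
```

If the length reaches zero (too many pooling layers for a short feature vector), the model cannot be built, so this check is worth doing before adding more blocks.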

In [None]:
# import tensorflow as tf
# import numpy as np
# # Assuming you have training and validation data
# def train_evaluate_cnn(config, X_train, y_train, X_val, y_val):
#     model = build_cnn(config)
#     model.compile(
#         optimizer=optimizers.Adam(learning_rate=0.001),
#         loss="categorical_crossentropy",
#         metrics=["accuracy"]
#     )
#     history = model.fit(
#         X_train, y_train,
#         validation_data=(X_val, y_val),
#         epochs=20,  # Use fewer epochs for random search to save time
#         batch_size=64,
#         callbacks=[
#             tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
#             tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)
#         ],
#         verbose=0
#     )

#     return max(history.history["val_accuracy"])




# def cross_validate_model(config, files):
#     accuracies = []
    
#     for fold_number in range(len(files)):
#         # Validation data
#         X_val, y_val = load_fold_data(fold_number, files)
#         X_train, y_train = [], []
        
#         # Training data from other folds
#         for i in range(len(files)):
#             if i != fold_number:
#                 X_temp, y_temp = load_fold_data(i, files)
#                 X_train.append(X_temp)
#                 y_train.append(y_temp)

#         X_train = np.concatenate(X_train, axis=0)
#         y_train = np.concatenate(y_train, axis=0)
#         # Train and evaluate
#         accuracy = train_evaluate_cnn(config, X_train, y_train, X_val, y_val)
#         accuracies.append(accuracy)
#     print(accuracies)
#     # Return average accuracy
#     return np.mean(accuracies)

# def load_fold_data(fold_number, files):
#     data = pd.read_csv(files[fold_number])
#     labels = data.pop('Label').values
#     features = data.values
#     return features, labels

In [47]:
import tensorflow as tf
import numpy as np

def train_evaluate_cnn(config, X_train, y_train, X_val, y_val):
    """Train one model on the given split and return its best validation accuracy."""
    input_shape = X_train.shape[1:]
    model = build_cnn(config, input_shape=input_shape)
    model.compile(
        optimizer=optimizers.Adam(learning_rate=0.001),
        # Sparse loss: expects integer class ids, not one-hot vectors
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"]
    )
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=20,  # Use fewer epochs for random search to save time
        batch_size=32,
        callbacks=[
            tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
        ],
        verbose=0
    )

    # Best validation accuracy over all epochs. This is an optimistic estimate,
    # because the best epoch is picked on the validation fold itself.
    return max(history.history["val_accuracy"])




def cross_validate_model(config, files):
    accuracies = []
    
    for fold_number in range(len(files)):
        # Validation data
        X_val, y_val = load_fold_data(fold_number, files)
        X_train, y_train = [], []
        
        # Training data from other folds
        for i in range(len(files)):
            if i != fold_number:
                X_temp, y_temp = load_fold_data(i, files)
                X_train.append(X_temp)
                y_train.append(y_temp)

        X_train = np.concatenate(X_train, axis=0)
        y_train = np.concatenate(y_train, axis=0)
        # Train and evaluate
        accuracy = train_evaluate_cnn(config, X_train, y_train, X_val, y_val)
        accuracies.append(accuracy)
        print(f"Validation fold {fold_number}: accuracy {accuracy}")
    print(accuracies)
    # Return average accuracy
    return np.mean(accuracies)

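
The cross-validation loop above re-reads nine CSV files for every held-out fold. A possible alternative, sketched here under the assumption that all folds fit in memory, is to load each fold once and build the splits from the in-memory list. The generator name `leave_one_fold_out` and the synthetic data are hypothetical.

```python
import numpy as np

def leave_one_fold_out(folds):
    """Yield (held_out_index, X_train, y_train, X_val, y_val) for each fold."""
    for k, (X_val, y_val) in enumerate(folds):
        # Every fold except the k-th goes into the training set
        rest = [fold for i, fold in enumerate(folds) if i != k]
        X_train = np.concatenate([X for X, _ in rest], axis=0)
        y_train = np.concatenate([y for _, y in rest], axis=0)
        yield k, X_train, y_train, X_val, y_val

# Hypothetical data: 3 folds of 4 samples with 5 features each
rng = np.random.default_rng(0)
folds = [(rng.normal(size=(4, 5)), rng.integers(0, 10, size=4)) for _ in range(3)]

for k, X_train, y_train, X_val, y_val in leave_one_fold_out(folds):
    print(k, X_train.shape, X_val.shape)
```

With the real data, `folds` could be built once as `[load_fold_data(i, files) for i in range(len(files))]` and reused for every configuration.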

### Random Search Validation

In [None]:


def random_search(files):
    best_config_random = None
    best_score_random = 0
    
    for i in range(10):  # Number of random configurations to try
        config = sample_random_config(configurations)
        print(f"Testing configuration {i+1}: {config}")
        
        score = cross_validate_model(config, files)  # Mean validation accuracy over all folds
        print(f"Validation Accuracy: {score:.4f}")
        
        if score > best_score_random:
            best_score_random = score
            best_config_random = config

    print(f"Best Configuration: {best_config_random}")
    print(f"Best Validation Accuracy: {best_score_random:.4f}")


### Grid Search

In [37]:
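def grid_search(files, max_configs=20):
    # Keep the running best as locals: assigning to module-level names inside
    # the function would raise UnboundLocalError on the first comparison.
    best_config_grid = None
    best_score_grid = 0

    # Walk the grid in order instead of sampling at random. max_configs caps the
    # run (pass None for the full grid); note that the first entries of the grid
    # differ only in the last hyperparameters.
    for i, config in enumerate(generate_configs(configurations)[:max_configs]):
        print(f"Testing configuration {i+1}: {config}")

        score = cross_validate_model(config, files)
        print(f"Validation Accuracy: {score:.4f}")

        if score > best_score_grid:
            best_score_grid = score
            best_config_grid = config

    print(f"Best Configuration: {best_config_grid}")
    print(f"Best Validation Accuracy: {best_score_grid:.4f}")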
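
Random search and grid search differ only in which candidates they visit; the "keep the best score" loop is the same. The sketch below factors that loop into one helper and runs it on a tiny grid with a stand-in scoring function, so it executes without any training. The names `search` and `toy_score` and the reduced space are assumptions for illustration.

```python
import itertools

def search(candidates, score_fn):
    """Return (best_config, best_score) over an iterable of configurations."""
    best_config, best_score = None, float("-inf")
    for config in candidates:
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Hypothetical reduced grid, built the same way as generate_configs
space = {"kernel_size": [3, 5, 7], "dropout": [0.1, 0.2, 0.3]}
keys, values = zip(*space.items())
grid = [dict(zip(keys, v)) for v in itertools.product(*values)]

def toy_score(config):
    """Stand-in for cross-validation: peaks at kernel_size=5, dropout=0.2."""
    return -abs(config["kernel_size"] - 5) - abs(config["dropout"] - 0.2)

best_config, best_score = search(grid, toy_score)
print(best_config, best_score)
```

In the notebook setting, `score_fn` would be something like `lambda config: cross_validate_model(config, files)`, and `candidates` either the full grid or a list of sampled configurations.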


In [38]:
files = [f'datasets/urbansounds_features_fold{i}.csv' for i in range(1,11)] 

all_configs = generate_configs(configurations)

In [50]:
random_search(files)

Testing configuration 1: {'num_conv_layers': 3, 'filters_per_layer': [32, 64], 'kernel_size': 5, 'activation': 'sigmoid', 'dense_units': 512, 'dropout': 0.3}
Validation fold 0: accuracy 0.6036655306816101
Validation fold 1: accuracy 0.5788288116455078
Validation fold 2: accuracy 0.49405404925346375
Validation fold 3: accuracy 0.5848484635353088
Validation fold 4: accuracy 0.2094017118215561
Validation fold 5: accuracy 0.5115431547164917
Validation fold 6: accuracy 0.5668257474899292
Validation fold 7: accuracy 0.5111662745475769
Validation fold 8: accuracy 0.529411792755127
Validation fold 9: accuracy 0.6236559152603149
[0.6036655306816101, 0.5788288116455078, 0.49405404925346375, 0.5848484635353088, 0.2094017118215561, 0.5115431547164917, 0.5668257474899292, 0.5111662745475769, 0.529411792755127, 0.6236559152603149]
Validation Accuracy: 0.5213
Testing configuration 2: {'num_conv_layers': 4, 'filters_per_layer': [128, 256], 'kernel_size': 3, 'activation': 'tanh', 'dense_units': 128, 'dropout': 0.3}
Validation fold 0: accuracy 0.6540664434432983
Validation fold 1: accuracy 0.5259009003639221
Validation fold 2: accuracy 0.5427026748657227
Validation fold 3: accuracy 0.6080808043479919
Validation fold 4: accuracy 0.6474359035491943
Validation fold 5: accuracy 0.5795868635177612
Validation fold 6: accuracy 0.6014319658279419
Validation fold 7: accuracy 0.6588089466094971
Validation fold 8: accuracy 0.6519607901573181
Validation fold 9: accuracy 0.6714456677436829
[0.6540664434432983, 0.5259009003639221, 0.5427026748657227, 0.6080808043479919, 0.6474359035491943, 0.5795868635177612, 0.6014319658279419, 0.6588089466094971, 0.6519607901573181, 0.6714456677436829]
Validation Accuracy: 0.6141
Testing configuration 3: {'num_conv_layers': 4, 'filters_per_layer': [128, 256], 'kernel_size': 7, 'activation': 'tanh', 'dense_units': 64, 'dropout': 0.3}
Validation fold 0: accuracy 0.6254295706748962
Validation fold 1: accuracy 0.5720720887184143

KeyboardInterrupt: 
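
The mean alone hides how much the folds disagree. As a small sketch, the per-fold accuracies printed above for configuration 1 can be summarized with the mean, the sample standard deviation, and the weakest fold. The values are copied from that output; nothing else is assumed.

```python
import numpy as np

# Per-fold validation accuracies of configuration 1, copied from the output above
fold_accuracies = np.array([
    0.6036655306816101, 0.5788288116455078, 0.49405404925346375,
    0.5848484635353088, 0.2094017118215561, 0.5115431547164917,
    0.5668257474899292, 0.5111662745475769, 0.529411792755127,
    0.6236559152603149,
])

mean = fold_accuracies.mean()
std = fold_accuracies.std(ddof=1)  # sample standard deviation across folds
worst_fold = int(fold_accuracies.argmin())

print(f"mean={mean:.4f} std={std:.4f} worst fold={worst_fold}")
```

Here the fold with index 4 is far below the others, which pulls the mean down and suggests that training did not converge on that split for this configuration.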