# Homework: Sonar signal classification
In this workshop you will train a neural network for binary classification that determines whether a sonar signal bounced off a metal cylinder or a rock.

[Dataset information](https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks))

Goal: Reach an accuracy higher than 0.80 on the validation set.
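
As a reference point for the 0.80 target, it can help to know what accuracy a trivial classifier would reach by always predicting the most frequent class. The sketch below is only an illustration: the helper `majority_baseline_accuracy` and the `example_labels` list are hypothetical and do not come from the sonar data.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical labels for illustration only (not the sonar data):
# six samples of class 1 and four samples of class 0.
example_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
baseline = majority_baseline_accuracy(example_labels)
print(f"Majority-class baseline accuracy: {baseline:.2f}")  # prints 0.60
```

Once the labels are loaded, the same helper could be called on the validation labels to see how far above the trivial baseline a trained model lands.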

## Get the data from Google Drive

In [2]:
# Import libraries to interact with Google Drive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials



In [3]:
# Authenticate with your Google account to get access to the data
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

In [4]:
# Download data
download = drive.CreateFile({'id': '1rw5l3jCo2vlNc8NLrMk3KPZr6HsjNRCY'})
download.GetContentFile('sonar.csv')

In [5]:
ls

[0m[01;34msample_data[0m/  sonar.csv


## Do some magic below! ;)

In [17]:
import pandas as pd

# Load the data from the CSV file (the file has no header row)
data = pd.read_csv('sonar.csv', header=None)

# Assign new labels to the columns
nuevas_columnas = [f'Caracteristica_{i}' for i in range(1, 62)]  # 1 through 61; column 61 holds the class label
data.columns = nuevas_columnas

# Convert 'R' to 0 and 'M' to 1 in the label column using replace
data['Clase'] = data['Caracteristica_61'].replace({'R': 0, 'M': 1})

# Show the first rows of the DataFrame to check its contents
data.head()

| | Caracteristica_1 | Caracteristica_2 | Caracteristica_3 | Caracteristica_4 | Caracteristica_5 | Caracteristica_6 | Caracteristica_7 | Caracteristica_8 | Caracteristica_9 | Caracteristica_10 | ... | Caracteristica_53 | Caracteristica_54 | Caracteristica_55 | Caracteristica_56 | Caracteristica_57 | Caracteristica_58 | Caracteristica_59 | Caracteristica_60 | Caracteristica_61 | Clase |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.02 | 0.0371 | 0.0428 | 0.0207 | 0.0954 | 0.0986 | 0.1539 | 0.1601 | 0.3109 | 0.2111 | ... | 0.0065 | 0.0159 | 0.0072 | 0.0167 | 0.018 | 0.0084 | 0.009 | 0.0032 | R | 0 |
| 1 | 0.0453 | 0.0523 | 0.0843 | 0.0689 | 0.1183 | 0.2583 | 0.2156 | 0.3481 | 0.3337 | 0.2872 | ... | 0.0089 | 0.0048 | 0.0094 | 0.0191 | 0.014 | 0.0049 | 0.0052 | 0.0044 | R | 0 |
| 2 | 0.0262 | 0.0582 | 0.1099 | 0.1083 | 0.0974 | 0.228 | 0.2431 | 0.3771 | 0.5598 | 0.6194 | ... | 0.0166 | 0.0095 | 0.018 | 0.0244 | 0.0316 | 0.0164 | 0.0095 | 0.0078 | R | 0 |
| 3 | 0.01 | 0.0171 | 0.0623 | 0.0205 | 0.0205 | 0.0368 | 0.1098 | 0.1276 | 0.0598 | 0.1264 | ... | 0.0036 | 0.015 | 0.0085 | 0.0073 | 0.005 | 0.0044 | 0.004 | 0.0117 | R | 0 |
| 4 | 0.0762 | 0.0666 | 0.0481 | 0.0394 | 0.059 | 0.0649 | 0.1209 | 0.2467 | 0.3564 | 0.4459 | ... | 0.0054 | 0.0105 | 0.011 | 0.0015 | 0.0072 | 0.0048 | 0.0107 | 0.0094 | R | 0 |


In [18]:
# Drop the original label column ('Caracteristica_61')
data.drop('Caracteristica_61', axis=1, inplace=True)

# Split the data into features (X) and labels (y)
X = data.drop('Clase', axis=1).values
y = data['Clase'].values

In [19]:
print(X.shape)
print(y.shape)

(208, 60)
(208,)
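
With only 208 samples, the order of scaling and splitting matters: if the scaler is fitted on all rows before the split, the validation rows contribute to the mean and variance used for training. A minimal sketch of the alternative order (split first, then fit the scaler on the training rows only) is shown below. It runs on synthetic data with the same shape as `X`; the names `X_demo`, `y_demo`, and `scaler_demo` are assumptions made for illustration and are not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the sonar features (208 x 60).
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(208, 60))
y_demo = rng.integers(0, 2, size=208)

# Split first so the validation rows stay unseen during preprocessing.
X_train_demo, X_val_demo, y_train_demo, y_val_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then reuse it for validation.
scaler_demo = StandardScaler()
X_train_scaled = scaler_demo.fit_transform(X_train_demo)
X_val_scaled = scaler_demo.transform(X_val_demo)

print(X_train_scaled.shape, X_val_scaled.shape)  # (166, 60) (42, 60)
```

With this order the training features have zero mean and unit variance per column, while the validation features are only approximately standardized, which mirrors how the model would see genuinely new data.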


In [23]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Standardize the features so the network trains properly
# Note: the scaler is fitted on all 208 samples before the split,
# so validation rows influence the scaling statistics
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Split the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the neural network model
modelo = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

# Compile the model
modelo.compile(optimizer='adam',
               loss='binary_crossentropy',
               metrics=['accuracy'])

# Train the model
history = modelo.fit(X_train, y_train, epochs=30, batch_size=16, validation_data=(X_val, y_val), verbose=2)

# Evaluate the model on the validation set
loss, accuracy = modelo.evaluate(X_val, y_val)
print(f'Validation accuracy: {accuracy:.2f}')

Epoch 1/30
11/11 - 2s - loss: 0.7064 - accuracy: 0.5060 - val_loss: 0.5843 - val_accuracy: 0.7857 - 2s/epoch - 152ms/step
Epoch 2/30
11/11 - 0s - loss: 0.5509 - accuracy: 0.7892 - val_loss: 0.5050 - val_accuracy: 0.8333 - 80ms/epoch - 7ms/step
Epoch 3/30
11/11 - 0s - loss: 0.4733 - accuracy: 0.8554 - val_loss: 0.4547 - val_accuracy: 0.8333 - 91ms/epoch - 8ms/step
Epoch 4/30
11/11 - 0s - loss: 0.4019 - accuracy: 0.8916 - val_loss: 0.4167 - val_accuracy: 0.8571 - 81ms/epoch - 7ms/step
Epoch 5/30
11/11 - 0s - loss: 0.3437 - accuracy: 0.9277 - val_loss: 0.3837 - val_accuracy: 0.8095 - 77ms/epoch - 7ms/step
Epoch 6/30
11/11 - 0s - loss: 0.2940 - accuracy: 0.9398 - val_loss: 0.3466 - val_accuracy: 0.8333 - 89ms/epoch - 8ms/step
Epoch 7/30
11/11 - 0s - loss: 0.2501 - accuracy: 0.9578 - val_loss: 0.3283 - val_accuracy: 0.8333 - 86ms/epoch - 8ms/step
Epoch 8/30
11/11 - 0s - loss: 0.2143 - accuracy: 0.9819 - val_loss: 0.3201 - val_accuracy: 0.8333 - 90ms/epoch - 8ms/step
Epoch 9/30
11/11 - 0s - 