# Shells

This notebook follows the GeeksforGeeks tutorial at https://www.geeksforgeeks.org/how-can-tensorflow-be-used-with-abalone-dataset-to-build-a-sequential-model/.

## Setup

In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

## Loading the data

The Abalone dataset (`abalone.csv`) can be downloaded from https://drive.google.com/file/d/1zQ5ZGN8JC-2fQ-9WvO3RAX-OCbjaC0k7/view?usp=share_link.

In [None]:
abalone = pd.read_csv("abalone.csv")

abalone.head()

In [None]:
abalone.describe().T

In [None]:
features = pd.get_dummies(abalone.drop('Rings', axis=1), columns=['Sex'])
target = abalone['Rings']

x_train, x_test, y_train, y_test = train_test_split(
    features, target,
    test_size=0.2,
    random_state=22
)

x_train.shape, x_test.shape

In [None]:
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)

## Understanding X and Y

In [None]:
example_x = np.arange(16).reshape((8,2))
example_y = range(8)

ex_x_train, ex_x_test, ex_y_train, ex_y_test = train_test_split(
    example_x,
    example_y,
    train_size=0.8,
    random_state=42
)

print("Training set x: ", ex_x_train)
print("Training set y: ", ex_y_train)

In [None]:
print("Testing set x: ", ex_x_test)
print("Testing set y: ", ex_y_test)

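There is a bit to unpack here, but the most important question is: what are x and y?

x = the input (the features the model sees)
y = the desired outcome (the value the model should predict)

The function `train_test_split` splits the data into two parts: 80% for training the model and 20% for testing whether the training worked. Each row of x stays paired with its y value after the split.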
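
As a minimal sketch of the split described above, the following cell repeats the toy example and checks two things: how many rows land in each set, and that every x row stays paired with its y value. The variable names (`toy_x`, `toy_y`, and so on) are only for illustration and are not part of the tutorial.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: row i of toy_x is [2*i, 2*i + 1], and its label is i.
toy_x = np.arange(16).reshape((8, 2))
toy_y = np.arange(8)

toy_x_train, toy_x_test, toy_y_train, toy_y_test = train_test_split(
    toy_x, toy_y, train_size=0.8, random_state=42
)

# With only 8 rows, 80% is 6.4 rows, which scikit-learn rounds down to 6.
print("train rows:", len(toy_x_train), "test rows:", len(toy_x_test))

# Shuffling keeps each x row with its own y: the first entry of a row is 2 * label.
pairs_intact = all(
    row[0] == 2 * label for row, label in zip(toy_x_train, toy_y_train)
)
print("pairs intact:", pairs_intact)
```

On a dataset this small the split is 6 rows to 2 rather than an exact 80/20, because the row count has to be a whole number. On the full Abalone data the proportions come out much closer to 80/20.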

## Creating and training the model

In [None]:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=[10]),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1, activation='relu')
])

model.compile(
    loss='mae',
    optimizer='adam',
    metrics=['mape']
)

In [None]:
model.summary()

In [None]:
x_train.info(), y_train.info()

In [None]:
x_test.info(), y_test.info()

In [None]:
history = model.fit(x_train_scaled, y_train,
                    epochs=50,
                    verbose=1,
                    batch_size=64,
                    validation_data=(x_test_scaled, y_test))

In [None]:
hist_df = pd.DataFrame(history.history)
hist_df.head()
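
One way to read this table is to look for the epoch with the lowest validation loss. The sketch below uses made-up numbers in place of `history.history` (the real values depend on the training run) and assumes only the `loss` and `val_loss` columns, which Keras records when `validation_data` is passed to `fit`. The names `toy_history` and `toy_hist_df` are hypothetical.

```python
import pandas as pd

# Made-up numbers standing in for history.history; they are NOT real results.
toy_history = {
    "loss": [3.0, 2.0, 1.6, 1.5],
    "val_loss": [2.8, 2.1, 1.9, 2.0],
}
toy_hist_df = pd.DataFrame(toy_history)

# Each row is one epoch (0-based), so idxmin gives the epoch with the best validation loss.
best_epoch = int(toy_hist_df["val_loss"].idxmin())
print("best epoch:", best_epoch)
print("val_loss at best epoch:", toy_hist_df.loc[best_epoch, "val_loss"])

# A growing gap between validation and training loss can hint at overfitting.
toy_hist_df["gap"] = toy_hist_df["val_loss"] - toy_hist_df["loss"]
print(toy_hist_df)
```

On the real `hist_df`, calling `hist_df[['loss', 'val_loss']].plot()` would draw both curves, which may make such a gap easier to spot than reading the numbers.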