## Download and Clean Dataset


Let's start by importing the <em>pandas</em> and <em>NumPy</em> libraries, and then load the concrete dataset into a dataframe.


In [1]:
import pandas as pd
import numpy as np

import warnings
warnings.simplefilter('ignore', FutureWarning)

In [2]:
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
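
Before going further, it can help to confirm that the data loaded as expected and to count missing values per column. The sketch below uses a small hypothetical frame (`toy`) in place of `concrete_data` so that it runs without network access. The values are made up, and `summarize` is an illustrative helper name, not part of the lab.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for concrete_data; the values are made up for illustration.
toy = pd.DataFrame({
    "Cement": [540.0, 332.5, np.nan],
    "Water": [162.0, 228.0, 192.0],
    "Strength": [79.99, 40.27, 44.30],
})

def summarize(df):
    """Return the (rows, columns) shape and the number of missing values per column."""
    return df.shape, df.isnull().sum()

shape, missing = summarize(toy)
print(shape)
print(missing)
```

On the real data, the same call would be `summarize(concrete_data)`. Any nonzero count in the second result is worth resolving before training.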


#### Split data into predictors and target


The target variable in this problem is the concrete sample strength. Therefore, our predictors will be all the other columns.


In [3]:
concrete_data_columns = concrete_data.columns
predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # all columns except Strength
target = concrete_data['Strength'] # Strength column
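
The boolean-mask selection above can also be written with `DataFrame.drop`, which some readers find easier to scan. The sketch below checks on a small made-up frame (`toy`, hypothetical values) that the two forms select the same columns.

```python
import pandas as pd

# Hypothetical stand-in for concrete_data; the values are made up for illustration.
toy = pd.DataFrame({
    "Cement": [540.0, 332.5],
    "Age": [28, 270],
    "Strength": [79.99, 40.27],
})

# Boolean-mask selection, as in the cell above
cols = toy.columns
by_mask = toy[cols[cols != "Strength"]]

# A more direct way to express "every column except Strength"
by_drop = toy.drop(columns="Strength")

# Both selections hold the same columns, in the same order, with the same values
same = by_mask.equals(by_drop)
print(same)
```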

<a id="item2"></a>


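Next, let's normalize the predictors by subtracting each column's mean and dividing by its standard deviation. We also store the number of predictor columns in `n_cols`, which the network needs later for its input shape.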


In [4]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()
n_cols = predictors_norm.shape[1] # number of predictors
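
One edge case of this z-score normalization is worth knowing: a column whose values are all identical has a standard deviation of zero, so the division produces `NaN` for every entry of that column. The sketch below shows this on a made-up frame (`toy`, hypothetical values). Whether the concrete dataset contains such a column is not established here. Note that pandas computes the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical data: "a" varies, "const" is constant.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "const": [5.0, 5.0, 5.0],
})

# Same normalization as in the cell above
toy_norm = (toy - toy.mean()) / toy.std()

# "a" ends up with mean 0 and (sample) standard deviation 1
print(toy_norm["a"].tolist())

# "const" has zero standard deviation, so 0/0 yields NaN in every row
print(toy_norm["const"].isna().all())
```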



In [5]:
predictors_norm

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 0.862735 | -1.217079 | -0.279597 |
| 1 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 1.055651 | -1.217079 | -0.279597 |
| 2 | 0.491187 | 0.795140 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 3.551340 |
| 3 | 0.491187 | 0.795140 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 5.055221 |
| 4 | -0.790075 | 0.678079 | -0.846733 | 0.488555 | -1.038638 | 0.070492 | 0.647569 | 4.976069 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1025 | -0.045623 | 0.487998 | 0.564271 | -0.092126 | 0.451190 | -1.322363 | -0.065861 | -0.279597 |
| 1026 | 0.392628 | -0.856472 | 0.959602 | 0.675872 | 0.702285 | -1.993711 | 0.496651 | -0.279597 |
| 1027 | -1.269472 | 0.759210 | 0.850222 | 0.521336 | -0.017520 | -1.035561 | 0.080068 | -0.279597 |
| 1028 | -1.168042 | 1.307430 | -0.846733 | -0.279443 | 0.852942 | 0.214537 | 0.191074 | -0.279597 |


In [6]:
from sklearn.model_selection import train_test_split

# predictors_norm holds the normalized features; target holds the Strength values

# Split the data into 70% training and 30% testing
X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size=0.3, random_state=42)

# random_state=42 fixes the shuffle, so the same split is produced on every run
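
The cells above normalize the full dataset before splitting it, so the means and standard deviations also reflect the test rows. A common alternative, which is not what this lab does, is to split first and compute the statistics from the training rows only. The sketch below illustrates that ordering on synthetic data. The column names `f1` and `f2`, the random values, and the names `mu` and `sigma` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the predictors and target (values are random, not real data)
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(loc=10.0, scale=3.0, size=(100, 2)), columns=["f1", "f2"])
y = pd.Series(rng.normal(size=100), name="Strength")

# Split first, on the raw features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Compute the normalization statistics from the training rows only
mu = X_train.mean()
sigma = X_train.std()

# Apply the same training statistics to both splits
X_train_norm = (X_train - mu) / sigma
X_test_norm = (X_test - mu) / sigma

print(X_train_norm.shape, X_test_norm.shape)
```

With this ordering the training columns have mean 0 and standard deviation 1 exactly, while the test columns are only approximately standardized, which mirrors how unseen data would be treated.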


## Import Keras


Recall from the videos that Keras normally runs on top of a lower-level library such as TensorFlow. To use Keras, you therefore have to install TensorFlow first. When you import Keras, it reports which backend it is running on. In CC Labs, Keras was installed with TensorFlow as the backend, so the import should report TensorFlow.


#### Let's go ahead and import the Keras library


In [7]:
import keras

In [8]:
from keras.models import Sequential
from keras.layers import Dense

<a id='item33'></a>


## Build a Neural Network


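Let's define a function that builds and compiles our regression model, so that we can conveniently call it whenever we need a fresh model. The network has one hidden layer of 10 ReLU units and a single linear output unit, and it is trained with the Adam optimizer on the mean squared error.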


In [9]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
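
To get a feel for the size of this network, we can count its trainable parameters by hand. A dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes `n_cols = 8`, which matches the eight predictor columns shown earlier. `dense_params` is an illustrative helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units

n_cols = 8  # assumed: the eight predictor columns of the concrete dataset

hidden = dense_params(n_cols, 10)  # Dense(10) fed by n_cols inputs
output = dense_params(10, 1)       # Dense(1) fed by the 10 hidden units
total = hidden + output

print(hidden, output, total)
```

Under that assumption, the total should agree with the "Total params" line that `model.summary()` reports for a model built by `regression_model()`.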

<a id="item4"></a>


<a id='item34'></a>


## Train and Test the Network


Let's call the function now to create our model.


In [10]:
# build the model
model = regression_model()

In [11]:
# fit the model
model.fit(X_train, y_train, epochs=50, verbose=2)

Epoch 1/50
23/23 - 0s - loss: nan
Epoch 2/50
23/23 - 0s - loss: nan
Epoch 3/50
23/23 - 0s - loss: nan
Epoch 4/50
23/23 - 0s - loss: nan
Epoch 5/50
23/23 - 0s - loss: nan
Epoch 6/50
23/23 - 0s - loss: nan
Epoch 7/50
23/23 - 0s - loss: nan
Epoch 8/50
23/23 - 0s - loss: nan
Epoch 9/50
23/23 - 0s - loss: nan
Epoch 10/50
23/23 - 0s - loss: nan
Epoch 11/50
23/23 - 0s - loss: nan
Epoch 12/50
23/23 - 0s - loss: nan
Epoch 13/50
23/23 - 0s - loss: nan
Epoch 14/50
23/23 - 0s - loss: nan
Epoch 15/50
23/23 - 0s - loss: nan
Epoch 16/50
23/23 - 0s - loss: nan
Epoch 17/50
23/23 - 0s - loss: nan
Epoch 18/50
23/23 - 0s - loss: nan
Epoch 19/50
23/23 - 0s - loss: nan
Epoch 20/50
23/23 - 0s - loss: nan
Epoch 21/50
23/23 - 0s - loss: nan
Epoch 22/50
23/23 - 0s - loss: nan
Epoch 23/50
23/23 - 0s - loss: nan
Epoch 24/50
23/23 - 0s - loss: nan
Epoch 25/50
23/23 - 0s - loss: nan
Epoch 26/50
23/23 - 0s - loss: nan
Epoch 27/50
23/23 - 0s - loss: nan
Epoch 28/50
23/23 - 0s - loss: nan
Epoch 29/50
23/23 - 0s - loss

<tensorflow.python.keras.callbacks.History at 0x16fb335f250>
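
The `23/23` in each log line is the number of mini-batches processed per epoch. It can be reproduced with a little arithmetic, assuming the dataset has 1030 rows (an assumption; the dataframe output earlier only shows indices up to 1028), the 70/30 split above, and the default `batch_size` of 32 that `fit` uses when none is given. `steps_per_epoch` below is an illustrative helper name.

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of mini-batches needed to cover n_samples once (last batch may be partial)."""
    return math.ceil(n_samples / batch_size)

n_rows = 1030                        # assumed dataset size
n_test = math.ceil(n_rows * 3 / 10)  # 30% held out for testing, rounded up
n_train = n_rows - n_test            # remaining rows used for training

print(n_train, steps_per_epoch(n_train))
```

Note also that the loss is `nan` from the very first epoch, which means the model never learned anything useful. That usually points to a problem in the data fed to `fit` rather than to the number of epochs.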

In [12]:
from sklearn.metrics import mean_squared_error

# Make predictions on the test data
y_pred = model.predict(X_test)

# Compute Mean Squared Error
mse = mean_squared_error(y_test, y_pred)

print(f"Mean Squared Error (MSE) on Test Data: {mse}")

ValueError: Input contains NaN.
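
This error is consistent with the training log: once the loss is `nan`, the weights become `nan` too, so the predictions passed to `mean_squared_error` are most likely all `nan`. A reasonable first step is to check whether the inputs already contain missing or infinite values. The sketch below does this on a made-up frame (`toy`, hypothetical values). `report_non_finite` is an illustrative helper name, and dropping the affected rows is only one possible remedy, not necessarily the right one for this dataset.

```python
import numpy as np
import pandas as pd

def report_non_finite(df):
    """Count NaN or infinite entries in each column of a numeric DataFrame."""
    values = df.to_numpy(dtype=float)
    counts = (~np.isfinite(values)).sum(axis=0)
    return pd.Series(counts, index=df.columns)

# Hypothetical data with one NaN and one infinite value
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [1.0, 2.0, np.inf],
    "c": [0.0, 0.0, 0.0],
})

bad = report_non_finite(toy)
print(bad[bad > 0])  # columns that contain at least one non-finite value

# One possible remedy: treat infinities as missing, then drop incomplete rows
clean = toy.replace([np.inf, -np.inf], np.nan).dropna()
print(clean.shape)
```

On the real data, the same check would be applied to `predictors_norm` and to `target` (for example via `target.isna().sum()`) before calling `fit`. If rows are dropped from the predictors, the matching rows must be dropped from the target as well.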