In Keras (a high-level neural networks API for Python), Sequential() creates a model that is a linear stack of layers. You add one layer at a time, and each layer has exactly one input tensor and one output tensor.



In [None]:
# Creating a Model
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()

Dense: This layer type is a fully connected (dense) layer, where each neuron receives input from all neurons of the previous layer.

units=32: This parameter sets the number of neurons in the layer to 32.

activation='relu': The Rectified Linear Unit (ReLU) activation function is applied to the output of each neuron, which introduces non-linearity and helps the network learn more complex functions. ReLU returns 0 for any input less than 0 and returns the input for any input greater than or equal to 0.

input_dim=30: This parameter specifies the dimension of the input data, so the model expects input vectors with 30 features. It is needed only on the first layer; later layers infer their input size from the output of the layer before them.
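To make the ReLU rule described above concrete, here is a minimal sketch in plain Python. The function name relu and the sample inputs are illustrative choices, not part of the Keras API; Keras applies the same rule element-wise inside the layer.

```python
def relu(x):
    """Return x for non-negative inputs and 0.0 otherwise."""
    return max(0.0, x)

# Hypothetical pre-activation values for five neurons
inputs = [-2.0, -0.5, 0.0, 1.5, 3.0]
outputs = [relu(x) for x in inputs]
print(outputs)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```

Negative values are clipped to zero while non-negative values pass through unchanged, which is what makes the function non-linear.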


units=16: The second (hidden) layer has 16 neurons and also uses the ReLU activation.

units=1: The output layer has a single neuron, making it suitable for binary classification tasks where the output is either 0 or 1.

activation='sigmoid': The Sigmoid activation function maps the output to a value between 0 and 1. It’s typically used in the output layer for binary classification problems because it can be interpreted as a probability.
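As a rough illustration of how the sigmoid squashes any real number into the range between 0 and 1, the sketch below implements it in plain Python. The function name sigmoid, the sample inputs, and the 0.5 decision threshold are assumptions made for this example; the threshold is a common convention rather than something fixed by the layer itself.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative inputs approach 0, large positive inputs approach 1
for z in [-4.0, 0.0, 4.0]:
    print(z, round(sigmoid(z), 4))

# Turning a probability into a class label with an assumed 0.5 threshold
probability = sigmoid(1.2)
predicted_class = 1 if probability >= 0.5 else 0
print(predicted_class)  # 1
```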

In [None]:
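# First hidden layer: 32 neurons, ReLU, expects 30 input features
layer1 = Dense(units=32, activation='relu', input_dim=30)
model.add(layer1)
# Second hidden layer: 16 neurons, ReLU
model.add(Dense(units=16, activation='relu'))
# Output layer: 1 neuron with sigmoid for binary classification
model.add(Dense(units=1, activation='sigmoid'))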
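Because every neuron in a Dense layer is connected to every output of the previous layer, the number of trainable parameters follows directly from the layer sizes. The sketch below works through that arithmetic for the three layers added above, assuming each layer keeps its bias term (the Keras default). The helper name dense_params is hypothetical.

```python
def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per neuron."""
    return n_inputs * units + units

# (inputs, units) for each layer in the model above
layer_sizes = [(30, 32), (32, 16), (16, 1)]
counts = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]
print(counts)       # [992, 528, 17]
print(sum(counts))  # 1537
```

These figures can be compared against the parameter counts that model.summary() reports.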

1. Optimizer: optimizer='adam'
Purpose: The optimizer is used to update the weights of the neural network to minimize the loss during training.

'adam': Adam (Adaptive Moment Estimation) is an advanced optimization algorithm that adjusts the learning rate for each parameter. It combines the advantages of two other popular optimizers:

AdaGrad (Adaptive Gradient Algorithm): It adapts the learning rate per parameter, performing larger updates for infrequently updated parameters and smaller updates for frequently updated ones.

RMSProp (Root Mean Square Propagation): It divides the learning rate by an exponentially decaying average of squared gradients.

Adam keeps running estimates of the first moment (mean) and second moment (uncentered variance) of the gradients and uses them to scale each parameter's step, which often gives faster convergence and robust performance. Its default settings work well on many problems, so it usually needs little tuning.
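The moment estimates described above can be sketched for a single parameter in plain Python. This is a simplified illustration, not the Keras implementation: the function name adam_minimize, the toy objective f(w) = (w - 3)^2, the step count, and the learning rate of 0.01 are all assumptions chosen to keep the example small.

```python
import math

def adam_minimize(grad_fn, w, steps=2000, lr=0.01,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-parameter function with the Adam update rule."""
    m = 0.0  # running estimate of the first moment (mean of gradients)
    v = 0.0  # running estimate of the second moment (mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # The step is scaled by the root of the second-moment estimate
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); the minimum is at w = 3
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
print(round(w_final, 2))
```

Dividing by the square root of the second-moment estimate is the part Adam shares with RMSProp, while the first-moment average plays the role of momentum.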


2. Loss Function: loss='binary_crossentropy'
Purpose: The loss function measures how well the model's predictions match the true labels. It is a key component used by the optimizer to guide the training process.

'binary_crossentropy': This loss function is used for binary classification tasks, where the target variable has only two possible outcomes (e.g., 0 or 1). It calculates the cross-entropy loss between the true labels and the predicted probabilities. 
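To show what this loss rewards and penalizes, the sketch below computes the mean binary cross-entropy by hand in plain Python. The function name, the sample labels and probabilities, and the small clipping constant eps (used here only to avoid taking log(0)) are assumptions for illustration.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]
good_preds = [0.9, 0.1, 0.8, 0.2]  # confident and correct
bad_preds = [0.1, 0.9, 0.2, 0.8]   # confident and wrong

loss_good = binary_crossentropy(y_true, good_preds)
loss_bad = binary_crossentropy(y_true, bad_preds)
print(round(loss_good, 4), round(loss_bad, 4))
```

Confident correct predictions give a loss close to 0, while confident wrong predictions are penalized heavily, which is the signal the optimizer uses to adjust the weights.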


3. Metrics: metrics=['accuracy']
Purpose: Metrics are used to evaluate the performance of the model during training and testing. Unlike the loss, metrics are not used to train the model but to monitor its performance.

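'accuracy': Accuracy measures the proportion of correctly classified samples out of the total samples. For binary classification, it is calculated as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives.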
Example: If the model correctly classifies 90 out of 100 samples, the accuracy is 0.90 or 90%.
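The same 90-out-of-100 example can be reproduced with a few lines of plain Python. The helper name accuracy and the made-up label lists are assumptions for illustration; Keras computes this metric internally during fit and evaluate.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# 100 hypothetical samples: 50 positives and 50 negatives
y_true = [1] * 50 + [0] * 50
# 45 positives and 45 negatives are predicted correctly, 10 samples are wrong
y_pred = [1] * 45 + [0] * 5 + [0] * 45 + [1] * 5
print(accuracy(y_true, y_pred))  # 0.9
```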

In [None]:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

random_state=0:

Purpose: Controls the shuffling applied to the data before applying the split.
Value: 0 ensures that the split is reproducible. Using the same random_state value allows you to get the same split every time you run the code.
Example: This is important for debugging and comparison, as it ensures that your training and test sets are the same across different runs.
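A quick way to see the effect of random_state is to split the same data twice with the same seed and compare the results. The toy lists below are hypothetical stand-ins for the real dataset; the sketch assumes scikit-learn is installed, as it is for the rest of this notebook.

```python
from sklearn.model_selection import train_test_split

# Toy data: 20 samples with alternating binary labels
data = list(range(20))
labels = [i % 2 for i in data]

# Two independent calls with the same seed
split_a = train_test_split(data, labels, test_size=0.2, random_state=0)
split_b = train_test_split(data, labels, test_size=0.2, random_state=0)

# Each result is [x_train, x_test, y_train, y_test]
print(split_a[1] == split_b[1])          # True: identical test samples
print(len(split_a[0]), len(split_a[1]))  # 16 4
```

Leaving random_state unset would generally produce a different split on each run.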

In [None]:
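from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the breast cancer dataset (569 samples, 30 features, binary target)
cancer = datasets.load_breast_cancer()

# Hold out 20% of the samples; random_state=0 makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(cancer.data, cancer.target, test_size=0.2, random_state=0)

# Fit the scaler on the training set only, then apply the same scaling to the test set
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)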

epochs=20:

Purpose: Specifies the number of times the model will iterate over the entire training dataset.
Value: 20 means the model will train for 20 full passes over the training data.
Example: If your dataset has 1000 samples and you set epochs=20, the model will make 20 passes through all 1000 samples during training.

batch_size=50:

Purpose: Defines the number of samples per gradient update (batch).
Value: 50 means the training data is divided into mini-batches of 50 samples, and the model updates its weights once after each batch.
Example: If x_train has 1000 samples and batch_size is 50, the model will divide the data into 20 batches and update weights 20 times per epoch.

validation_data=(x_test, y_test):

Purpose: Provides the validation dataset to evaluate the model’s performance after each epoch, without using it for training.
Content:
x_test: Input features for the validation set.
y_test: Labels for the validation set.

Example: After each epoch, the model is evaluated on this validation data. Comparing the validation loss and metrics with the training values helps detect overfitting. Note that this notebook reuses the test set as validation data, so the final evaluation on x_test is not a fully independent estimate.
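The epoch and batch arithmetic described above can be checked with a short calculation. The helper name updates_per_epoch is hypothetical; the sketch assumes the last, smaller batch is still used when the sample count is not a multiple of the batch size, which is how fit behaves by default with in-memory arrays.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches (weight updates) in one pass over the data."""
    return math.ceil(n_samples / batch_size)

# The 1000-sample example from the text
print(updates_per_epoch(1000, 50))  # 20

# The 455 training samples used in this notebook: nine full batches plus one of 5
per_epoch = updates_per_epoch(455, 50)
print(per_epoch)       # 10
print(20 * per_epoch)  # 200 weight updates over 20 epochs
```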

In [None]:
model.fit(x_train, y_train, epochs=20, batch_size = 50, validation_data=(x_test, y_test))

Train on 455 samples, validate on 114 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1a1e4c0e48>

In [None]:
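# Predicted probabilities of class 1, one per test sample
predictions = model.predict(x_test)
# evaluate returns [loss, accuracy] because metrics=['accuracy'] was set in compile
score = model.evaluate(x_test, y_test)
score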



[0.060566798113940057, 0.96491227651897227]