Assignment # 1 Deep Learning: Syed_Muhammad_Ovais

## Gradient descent

Gradient descent is an optimization algorithm used to minimize a function. In machine learning, it is widely used to adjust model parameters iteratively so that the error between predicted and actual values shrinks. Imagine a hiker trying to reach the lowest point in a valley: gradient descent does the same thing, repeatedly taking a step downhill in the direction of steepest descent (the negative of the gradient).
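To make the "steps downhill" idea concrete, here is a minimal sketch in plain Python. The objective `f(w) = (w - 3)^2`, the starting point, the learning rate, and the function name `gradient_descent` are all illustrative assumptions, not part of the assignment code.

```python
def gradient_descent(grad, w0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(n_steps):
        # Move opposite to the gradient, scaled by the learning rate.
        w = w - learning_rate * grad(w)
    return w

# Illustrative objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)  # very close to 3.0, the minimizer of f
```

A learning rate that is too large would overshoot the valley floor, while one that is too small would need many more steps to get there.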

### Types of Gradient Descent

1. Batch Gradient Descent
   Uses the entire training dataset to compute the gradient in each iteration.

2. Stochastic Gradient Descent (SGD)
   Uses only one training example to compute the gradient in each iteration.

3. Mini-Batch Gradient Descent
   Uses a small subset (mini-batch) of the training data to compute the gradient in each iteration.
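The three variants differ only in how many examples feed each gradient estimate. The sketch below shows this with a single `batch_size` parameter on a small linear regression. The function `fit_linear`, the synthetic data, and the hyperparameter values are assumptions made for illustration.

```python
import numpy as np

def fit_linear(X, y, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """Fit y = X @ w + b by gradient descent on mean squared error.

    batch_size == len(X)     -> batch gradient descent
    batch_size == 1          -> stochastic gradient descent
    1 < batch_size < len(X)  -> mini-batch gradient descent
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        order = rng.permutation(n)              # reshuffle the data every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w + b - y[idx]       # residuals on this batch only
            w -= learning_rate * 2 * X[idx].T @ err / len(idx)
            b -= learning_rate * 2 * err.mean()
    return w, b

# Synthetic, noise-free data generated from y = 2x + 1.
X = np.linspace(-1, 1, 64).reshape(-1, 1)
y = 2 * X[:, 0] + 1

for name, bs in [("batch", 64), ("mini-batch", 16), ("stochastic", 1)]:
    w, b = fit_linear(X, y, batch_size=bs)
    print(f"{name:>10}: w={w[0]:.3f}, b={b:.3f}")
```

On this toy problem all three settings should recover a slope near 2 and an intercept near 1. Smaller batches take more, noisier steps per epoch, and larger batches take fewer, smoother ones.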

## Validation Set and Validation Loss

Validation Set: A validation set is a portion of your dataset that is set aside and not used during the training process. It's crucial for evaluating the performance of a machine learning model. Think of it as a trial run before the final test.

### Purpose:
- To assess how well the model generalizes to unseen data.
- To detect overfitting, where the model performs exceptionally well on the training data but poorly on new data.
- To tune hyperparameters (like learning rate, number of layers, etc.).

### Characteristics:
- Should be representative of the overall dataset.
- Typically, it's a smaller subset of the data compared to the training set.
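One way to carve out such a set is to shuffle the row indices once and cut them into three disjoint parts. The helper `train_val_test_split` and the 70/15/15 proportions below are assumptions for illustration. Calling scikit-learn's `train_test_split` twice would achieve a similar three-way split.

```python
import numpy as np

def train_val_test_split(n_samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Return three disjoint index arrays that together cover range(n_samples)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)   # shuffle so each part is representative
    n_val = int(n_samples * val_frac)
    n_test = int(n_samples * test_frac)
    val_idx = order[:n_val]
    test_idx = order[n_val:n_val + n_test]
    train_idx = order[n_val + n_test:]
    return train_idx, val_idx, test_idx

# The seaborn "tips" dataset has 244 rows.
train_idx, val_idx, test_idx = train_val_test_split(244)
print(len(train_idx), len(val_idx), len(test_idx))
```

The resulting index arrays can then be used to select rows, for example with `DataFrame.iloc`, so that the validation rows never take part in fitting.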

## Validation Loss
Validation loss is a metric that measures how well your model performs on the validation set. It's calculated in the same way as training loss but using the validation data instead.
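As a small sketch of "same calculation, different data", the snippet below applies one mean squared error function to a training batch and to a validation batch. The helper `mse` and all the numbers are made up for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up targets and predictions, for illustration only.
train_loss = mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])   # data the model was fit on
val_loss = mse([2.5, 4.0], [2.0, 5.0])               # held-out data
print(train_loss, val_loss)
```

The only difference between the two numbers is which rows go in. The validation rows are never used to update the weights.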

### Purpose:

- To monitor the model's performance on unseen data during training.
- To make decisions about when to stop training (early stopping) to avoid overfitting.
- To compare different models or hyperparameter settings.

### Interpretation:

- A decreasing validation loss indicates that the model is improving its ability to generalize.
- An increasing validation loss, while training loss continues to decrease, is a strong indicator of overfitting.
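The early-stopping rule mentioned above can be sketched in a few lines of plain Python. The function `early_stopping_epoch`, the `patience` value, and the loss history are hypothetical and only illustrate the logic.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the index of the epoch at which training would stop.

    Training stops once validation loss has failed to improve on its best
    value for `patience` consecutive epochs. If that never happens, the
    index of the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1

# Made-up validation losses: improving at first, then rising (overfitting).
history = [1.00, 0.80, 0.65, 0.60, 0.62, 0.66, 0.71, 0.80]
print(early_stopping_epoch(history, patience=3))
```

In Keras, comparable behaviour is available through the `keras.callbacks.EarlyStopping` callback with `monitor="val_loss"`, provided `model.fit` is given validation data (for example via `validation_split`).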

In [None]:
import tensorflow as tf
import numpy as np
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
tips = sns.load_dataset("tips")
tips.head()


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


In [None]:
tips = pd.get_dummies(tips, drop_first=True)
tips.head()

Unnamed: 0,total_bill,tip,size,sex_Female,smoker_No,day_Fri,day_Sat,day_Sun,time_Dinner
0,16.99,1.01,2,True,True,False,False,True,True
1,10.34,1.66,3,False,True,False,False,True,True
2,21.01,3.5,3,False,True,False,False,True,True
3,23.68,3.31,2,False,True,False,False,True,True
4,24.59,3.61,4,True,True,False,False,True,True


In [None]:
x = tips.drop("tip", axis=1)
y = tips["tip"]
x.head()

Unnamed: 0,total_bill,size,sex_Female,smoker_No,day_Fri,day_Sat,day_Sun,time_Dinner
0,16.99,2,True,True,False,False,True,True
1,10.34,3,False,True,False,False,True,True
2,21.01,3,False,True,False,False,True,True
3,23.68,2,False,True,False,False,True,True
4,24.59,4,True,True,False,False,True,True


In [None]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

In [None]:
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(x_train.shape[1],)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer="adam", loss="mean_squared_error", metrics =['mae'])
model.fit(x_train, y_train, epochs=500, batch_size=32, verbose=1)
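# evaluate() returns the loss followed by the compiled metrics; here that metric is MAE, not accuracy.
loss, mae = model.evaluate(x_test, y_test)
print('Mean Squared Error: ', loss)
print('Mean Absolute Error: ', mae)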

Epoch 1/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - loss: 10.4530 - mae: 2.9656
Epoch 2/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 9.3951 - mae: 2.7077
Epoch 3/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 6.0541 - mae: 2.1472
Epoch 4/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 4.6129 - mae: 1.7845
Epoch 5/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 3.1624 - mae: 1.4107
Epoch 6/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 2.1373 - mae: 1.0976
Epoch 7/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 1.4717 - mae: 0.9026
Epoch 8/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 1.1849 - mae: 0.8065
Epoch 9/500
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 1.2319 - ma