### __Gradient Descent Optimization in TensorFlow__
Date: 20 Oct 2024

___`What is Gradient Descent?`___

Gradient Descent is an iterative optimization method that updates model parameters to minimize a cost function. The cost function represents the difference between predicted and actual outcomes in a model, such as Mean Squared Error (MSE) in regression tasks.

At each iteration, the parameters are updated by moving in the direction opposite to the gradient of the cost function. The size of each step is determined by the learning rate. The goal is to reach a local or global minimum where the gradient is zero.
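
The update described above can be written compactly. The notation is an assumption introduced here for illustration: \( \theta \) stands for the parameters, \( J \) for the cost function, and \( \eta \) for the learning rate.

\[
\theta_{t+1} = \theta_t - \eta \, \nabla J(\theta_t)
\]

At a minimum \( \nabla J(\theta) = 0 \), so the update leaves \( \theta \) unchanged.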

___`How Gradient Descent Works`___

The key steps of gradient descent are:

1. **Initialize Parameters**: Start with random parameter values.
2. **Calculate Gradient**: Compute the gradient of the cost function with respect to the parameters.
3. **Update Parameters**: Update the parameters by subtracting a fraction of the gradient (scaled by the learning rate).
4. **Repeat**: Continue until convergence, i.e., until the gradient is close to zero or the cost stops decreasing meaningfully.
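
The four steps above can be sketched in plain Python for a function of one variable. This is a minimal illustration, not TensorFlow's implementation: the helper name `gradient_descent`, its parameters, and the tolerance-based stopping rule are all assumptions made for this example, and the starting value is passed in rather than drawn at random.

```python
def gradient_descent(grad, x0, learning_rate=0.1, max_iters=1000, tol=1e-8):
    """Minimize a 1-D function given its derivative `grad` (hypothetical helper)."""
    x = x0                                  # 1. initialize the parameter
    for step in range(max_iters):
        g = grad(x)                         # 2. compute the gradient
        if abs(g) < tol:                    # 4. stop once the gradient is close to zero
            break
        x = x - learning_rate * g           # 3. step against the gradient
    return x, step


# Example: f(x) = (x - 2)^2 has derivative 2 * (x - 2) and its minimum at x = 2.
x_min, steps_used = gradient_descent(lambda x: 2 * (x - 2), x0=10.0)
print(x_min, steps_used)
```

The loop stops well before `max_iters` here because the gradient shrinks geometrically as `x` approaches 2.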

__`Example:`__ Minimizing \( f(x) = x^2 \)
1. Start with \( x = 3 \).
2. The gradient (derivative) at \( x = 3 \) is \( 2x = 6 \).
3. Update \( x \) using a learning rate of 0.1:  
   \( x = 3 - 0.1 \times 6 = 2.4 \).
4. Repeat this process. Each step multiplies \( x \) by \( 1 - 0.1 \times 2 = 0.8 \), so \( x \) moves steadily toward the minimum at \( x = 0 \).
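
The same example can be run as a short loop to see the iterates. This is a plain-Python sketch; the variable names and the choice of 100 iterations are assumptions for illustration.

```python
def f_prime(x):
    """Derivative of f(x) = x^2."""
    return 2 * x


x = 3.0
learning_rate = 0.1
history = [x]                      # record every iterate, starting point included

for _ in range(100):
    x = x - learning_rate * f_prime(x)
    history.append(x)

print(history[:4])   # first iterates: roughly 3, 2.4, 1.92, 1.536
print(history[-1])   # very close to 0 after 100 steps
```

The iterates shrink toward 0 but never land on it exactly, which is why practical implementations stop at a tolerance rather than waiting for the gradient to be exactly zero.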


__`Implementing Gradient Descent in TensorFlow`__

Let’s create a linear regression model in TensorFlow and use gradient descent to optimize it.



__`Step 1:`__ Setup and Placeholders

This walkthrough uses the TensorFlow 1.x graph API, accessed through `tensorflow.compat.v1`. We first need placeholders for the input and output data:

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Placeholders for input (x) and output (y)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
```

__`Step 2:`__ Define Model Parameters

Define a single parameter for the slope (w) of the best-fit line:

```python
# Initialize model parameter (slope)
w = tf.Variable(0.5, name="weights")
```

__`Step 3:`__ Define the Linear Regression Model

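Use `tf.add()` and `tf.multiply()` to build the model. Note that the intercept here is the fixed constant 0.5, not a trained variable, so only `w` is learned:

```python
# Linear regression model: y = w * x + 0.5 (intercept fixed, not trained)
model = tf.add(tf.multiply(x, w), 0.5)
```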

__`Step 4:`__ Define the Cost Function

Use MSE as the cost function:

```python
# Mean Squared Error (MSE) cost function
cost = tf.reduce_mean(tf.square(model - y))
```
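
To make the cost concrete, here is the same MSE computed by hand in plain Python, using the toy dataset from the training step and the initial slope of 0.5. The variable names are assumptions for this sketch; `intercept` stands for the constant 0.5 added in Step 3.

```python
x_train = [1, 2, 3, 4]
y_train = [2, 4, 6, 8]
w = 0.5          # initial slope from Step 2
intercept = 0.5  # constant added to the model in Step 3

# Model predictions: 1.0, 1.5, 2.0, 2.5
predictions = [w * x + intercept for x in x_train]

# Squared differences between predictions and targets, then their mean
squared_errors = [(p - y) ** 2 for p, y in zip(predictions, y_train)]
mse = sum(squared_errors) / len(squared_errors)

print(mse)  # 13.375
```

This is the value the `cost` tensor would evaluate to before any training step has run.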

__`Step 5:`__ Gradient Descent Optimizer

Create the optimizer with a learning rate of 0.01:

```python
# Gradient Descent Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
```
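
Each time the `train` op runs, it computes the gradient of `cost` with respect to `w` and applies one update. A plain-Python sketch of that single step, with the MSE gradient derived by hand, looks like this (the variable names are assumptions for illustration, and the data is the toy dataset from the training step):

```python
x_train = [1, 2, 3, 4]
y_train = [2, 4, 6, 8]
w = 0.5
intercept = 0.5
learning_rate = 0.01

# d(cost)/dw for cost = mean((w * x + intercept - y)^2)
#            = mean(2 * (w * x + intercept - y) * x)
grad_w = sum(
    2 * (w * x + intercept - y) * x for x, y in zip(x_train, y_train)
) / len(x_train)

# One gradient descent update
w_new = w - learning_rate * grad_w

print(grad_w, w_new)  # gradient is -20, new weight is about 0.7
```

The gradient is negative because the initial slope is too small, so subtracting it increases `w`.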

__`Step 6:`__ Training the Model

Now, train the model on a toy dataset:

```python
# Toy dataset
x_train = [1, 2, 3, 4]
y_train = [2, 4, 6, 8]

# Start TensorFlow session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Training loop
    for i in range(1000):
        sess.run(train, feed_dict={x: x_train, y: y_train})
    
    # Get trained weight value
    w_val = sess.run(w)

print("Trained weight:", w_val)
```
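
As a sanity check on the printed weight: the data follows \( y = 2x \), but the model adds a fixed 0.5 rather than a trained intercept, so the best slope is not 2. The sketch below computes the least-squares slope in closed form and replays the same 1000 updates in plain Python. The variable names are assumptions for illustration.

```python
x_train = [1, 2, 3, 4]
y_train = [2, 4, 6, 8]
intercept = 0.5   # fixed constant from Step 3, not trained

# Closed-form minimizer of mean((w * x + intercept - y)^2) over w alone
w_best = sum(x * (y - intercept) for x, y in zip(x_train, y_train)) / sum(
    x * x for x in x_train
)

# Plain-Python replay of the 1000 gradient descent steps
w = 0.5
for _ in range(1000):
    grad_w = sum(
        2 * (w * x + intercept - y) * x for x, y in zip(x_train, y_train)
    ) / len(x_train)
    w -= 0.01 * grad_w

print(w_best, w)  # both close to 55/30, i.e. about 1.833
```

A trained weight near 1.83 from the TensorFlow session is therefore the expected result for this model, not a sign that training failed.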

__`Step 7:`__ Visualizing Gradient Descent

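We can also visualize the loss convergence by plotting the loss over iterations. This version uses TensorFlow 2's eager execution with `tf.GradientTape` and trains both a slope `w` and an intercept `b`. Run it in a fresh Python session, because `tf.disable_v2_behavior()` from the earlier steps turns eager execution off: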

```python
import matplotlib.pyplot as plt
import tensorflow as tf

# Setup data and initial parameters
X = tf.constant([[1.], [2.], [3.], [4.]])
y = tf.constant([[2.], [4.], [6.], [8.]])
w = tf.Variable(0.)
b = tf.Variable(0.)

# Define model and loss function
def model(x):
    return w * x + b

def loss(predicted_y, true_y):
    return tf.reduce_mean(tf.square(predicted_y - true_y))

# Set learning rate and initialize training loop
learning_rate = 0.001
losses = []

for i in range(250):
    with tf.GradientTape() as tape:
        predicted_y = model(X)
        current_loss = loss(predicted_y, y)
    gradients = tape.gradient(current_loss, [w, b])
    w.assign_sub(learning_rate * gradients[0])
    b.assign_sub(learning_rate * gradients[1])
    
    losses.append(current_loss.numpy())

# Plot loss over iterations
plt.plot(losses)
plt.xlabel("Iteration")
plt.ylabel("Loss")
plt.show()
```

The plot shows the loss falling from its initial value of 30 as `w` and `b` are updated, which indicates that the parameters are moving toward the minimum.
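
Whether the curve falls at all depends on the learning rate. The sketch below replays the loop above in plain Python, with the MSE gradients derived by hand instead of taken from `tf.GradientTape`, and compares the original rate of 0.001 with a much larger one. The helper name `train` and the value 0.2 are assumptions chosen for illustration.

```python
def train(learning_rate, steps):
    """Replay of the training loop with hand-derived MSE gradients (illustrative)."""
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)          # MSE before the update
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return losses


slow = train(learning_rate=0.001, steps=250)
too_big = train(learning_rate=0.2, steps=20)

print(slow[0], slow[-1])        # loss shrinks steadily
print(too_big[0], too_big[-1])  # loss grows: each step overshoots the minimum
```

With the larger rate every update overshoots by more than it corrects, so the loss grows instead of shrinking.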



__`Conclusion`__

Gradient Descent is a fundamental algorithm in machine learning, and TensorFlow provides efficient implementations for optimizing models. By understanding the steps and using TensorFlow’s API, you can apply gradient descent to various problems, such as linear regression and deep learning models.