# Implementing a cost function in a model with real-world data

### Problem Definition
We have data on various vehicles in a town, all with the same mileage. For each vehicle, the fuel consumed and the distance covered were measured. Let's use this data to predict the distance traveled by a vehicle from the amount of fuel it consumed.

First, let's import the necessary modules.

In [78]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Now that we have the necessary modules, let's load the data.

In [None]:
data = pd.read_csv("./../data/slReg/fuleconsumedxdistance.csv")
data.head()

In [None]:
X = data["Fuel_Consumed_Liters"]
y = data["Distance_Traveled_km"]
X.shape

### Plotting the data

In [None]:
data.plot(kind="scatter", x="Fuel_Consumed_Liters",y="Distance_Traveled_km", color = "blue")
plt.show()

### Cost function

The equation for cost with one variable is:
  $$J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{1}$$ 
 
where 
  $$f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{2}$$

***Pseudocode:***
```
m = number of training examples
total_cost = 0
for i = 0 to m-1 {
  f_wb = w * x_i + b
  cost = (f_wb - y_i)**2
  total_cost += cost
}
return total_cost / (2 * m)
```


In [82]:
def compute_cost(X,y,w,b):
    m = X.shape[0]
    total_cost = 0
    # your code starts here: compute J(w, b) from equation (1)

    # your code ends here
    return total_cost
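
One way the exercise above could be filled in is sketched below. It is only an illustration that follows the pseudocode step by step; the name `compute_cost_sketch` and the three-point demo data are made up for this example and are not part of the dataset.

```python
import numpy as np

def compute_cost_sketch(X, y, w, b):
    """Squared-error cost J(w, b) from equation (1); one possible implementation."""
    # Convert to plain arrays so X[i] is positional. A pandas Series that has
    # been shuffled by a train/test split would otherwise be indexed by label.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m = X.shape[0]
    total_cost = 0.0
    for i in range(m):
        f_wb = w * X[i] + b               # prediction for example i, equation (2)
        total_cost += (f_wb - y[i]) ** 2  # squared error for example i
    return total_cost / (2 * m)

# Made-up demo data lying exactly on the line y = 2x
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 4.0, 6.0])

cost_perfect = compute_cost_sketch(x_demo, y_demo, w=2.0, b=0.0)  # exact fit
cost_zero = compute_cost_sketch(x_demo, y_demo, w=0.0, b=0.0)     # all-zero model
print(cost_perfect, cost_zero)
```

With the exact-fit parameters the cost is zero; with `w = 0, b = 0` it is (4 + 16 + 36) / 6, roughly 9.33.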

Next, we minimize the cost using gradient descent.

In [83]:

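def gradient_descent(X, y, w, b, learning_rate, iterations, cost_function):
    m = X.shape[0]  # number of training examples
    cost_history = []  # cost after each update, for plotting

    for i in range(iterations):
        # predictions of the current model f_wb(x) = w * x + b
        predictions = w * X + b

        # gradients of the cost J(w, b) with respect to w and b
        dw = (1 / m) * np.sum((predictions - y) * X)
        db = (1 / m) * np.sum(predictions - y)

        # step both parameters against their gradients
        w -= learning_rate * dw
        b -= learning_rate * db

        # record the cost at the updated parameters
        cost = cost_function(X, y, w, b)
        cost_history.append(cost)

        # report progress every 500 iterations
        if i % 500 == 0:
            print(f"Iteration {i}: Cost {cost:.4f}, w {w:.4f}, b {b:.4f}")

    return w, b, cost_history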
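
The `dw` and `db` lines in `gradient_descent` are the partial derivatives of the cost in equation (1) with respect to $w$ and $b$. Differentiating the squared term cancels the factor of 2 in $\frac{1}{2m}$:

  $$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})\,x^{(i)} \tag{3}$$

  $$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)}) \tag{4}$$

Each iteration then moves both parameters against their gradients, where $\alpha$ is the `learning_rate` argument:

  $$w := w - \alpha \frac{\partial J(w,b)}{\partial w}, \qquad b := b - \alpha \frac{\partial J(w,b)}{\partial b}$$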

Split the dataset into a training set and a test set.

In [84]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=45)
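
To make the idea of an 80/20 shuffled split concrete, here is a rough NumPy-only sketch. The helper `simple_split` is hypothetical and is not how scikit-learn implements `train_test_split`; it only illustrates shuffling the row positions and holding out a fraction of them.

```python
import numpy as np

def simple_split(X, y, test_size=0.2, seed=45):
    """Hypothetical helper: shuffle positions, then hold out a fraction for testing."""
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))            # shuffled row positions
    n_test = int(round(test_size * len(X)))  # number of held-out rows
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Made-up data: 10 points on the line y = 2x
x_all = np.arange(10.0)
y_all = 2.0 * x_all

x_tr, x_te, y_tr, y_te = simple_split(x_all, y_all)
print(len(x_tr), len(x_te))
```

Each `x` value stays paired with its own `y` value because the same positions index both arrays.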

Train the model.

In [None]:
# initialize parameters
w_init = 0
b_init = 0
# some gradient descent settings
iterations = 20000
tmp_alpha = 0.001

w_final, b_final, J_hist = gradient_descent(x_train ,y_train, w_init, b_init, tmp_alpha, iterations, compute_cost)
print(f"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})")
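
As an optional sanity check, the result of gradient descent can be compared with a closed-form least-squares fit. The sketch below uses synthetic data (slope 3, intercept 5, small noise) rather than the fuel dataset, and a learning rate of 0.01 chosen for that synthetic data; all names in it are assumptions made for illustration.

```python
import numpy as np

# Synthetic data, not the fuel dataset: y = 3x + 5 plus small Gaussian noise
rng = np.random.default_rng(0)
x_syn = rng.uniform(0.0, 10.0, size=200)
y_syn = 3.0 * x_syn + 5.0 + rng.normal(0.0, 0.5, size=200)

# Plain gradient descent with the same update rule as gradient_descent
w_gd, b_gd = 0.0, 0.0
alpha = 0.01
for _ in range(20000):
    err = w_gd * x_syn + b_gd - y_syn   # prediction error per example
    dw = np.mean(err * x_syn)           # gradient with respect to w
    db = np.mean(err)                   # gradient with respect to b
    w_gd -= alpha * dw
    b_gd -= alpha * db

# Closed-form least-squares line; np.polyfit returns [slope, intercept] for deg=1
w_ls, b_ls = np.polyfit(x_syn, y_syn, deg=1)

print(f"gradient descent: w={w_gd:.4f}, b={b_gd:.4f}")
print(f"least squares:    w={w_ls:.4f}, b={b_ls:.4f}")
```

If the two pairs agree closely, the learning rate and iteration count were sufficient for convergence; a large gap usually suggests more iterations or a different learning rate are needed.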

Plot the cost against the iteration number.

In [None]:
# one x value per recorded cost; np.linspace(1, n) would return only 50 points
x_ = np.arange(1, len(J_hist) + 1)
plt.plot(x_,J_hist)
plt.title("Cost over iterations")
plt.show()

Plot the data and the fitted regression line for the training set and the test set.

In [None]:
x_line = np.linspace(min(X),max(X),100)
y_line = w_final * x_line + b_final
plt.scatter(x_train,y_train,color = "blue")
plt.plot(x_line,y_line,color = "red")
plt.title("Regression line on the training set")
plt.xlabel("Fuel consumed (liters)")
plt.ylabel("Distance traveled (km)")
plt.show()

In [None]:
# scatter the held-out test points, not the training points
plt.scatter(x_test,y_test,color = "orange")
plt.plot(x_line,y_line,color = "green")
plt.title("Regression line on the test set")
plt.show()

In [64]:
from sklearn.metrics import mean_absolute_error,mean_squared_error,r2_score

Evaluate the model on the test set.

In [None]:
y_pred = w_final * x_test + b_final
mae = mean_absolute_error(y_test,y_pred)
mse = mean_squared_error(y_test,y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)
# Print results
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"R^2 Score: {r2:.2f}")
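
To show what these four metrics measure, the sketch below recomputes them by hand with NumPy on four made-up values. The arrays `y_true` and `y_hat` are illustrative only and are not results from the fuel dataset.

```python
import numpy as np

# Made-up targets and predictions, for illustration only
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_hat = np.array([12.0, 18.0, 33.0, 39.0])

residuals = y_true - y_hat                       # -2, 2, -3, 1

mae_manual = np.mean(np.abs(residuals))          # mean absolute error
mse_manual = np.mean(residuals ** 2)             # mean squared error
rmse_manual = np.sqrt(mse_manual)                # same units as the target

ss_res = np.sum(residuals ** 2)                  # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot                # share of variance explained

print(f"MAE {mae_manual:.2f}, MSE {mse_manual:.2f}, "
      f"RMSE {rmse_manual:.2f}, R^2 {r2_manual:.3f}")
```

Note that MSE here is the plain mean of squared errors, whereas the training cost in equation (1) carries an extra factor of 1/2.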
