## Assignment
#### Linear Regression
  - Implement Linear Regression using Gradient Descent from scratch on the dataset provided in the given folder.
  - You need to predict the performance index of students based on the other attributes.
  - Build the model from scratch using only numpy; you can use any other library for data manipulation, preprocessing, and visualization.
  - Use the first 90% of the data for training and the remaining 10% for testing.
  - Report RMSE on the test data for model evaluation (you can use any library for this).
  - Your goal is to get the RMSE as low as possible, so try to experiment with all the concepts you have learned till now.
  - Implement the code in a .ipynb file and submit it.
#### Logistic Regression
  - Implement Logistic Regression using Gradient Descent from scratch on the dataset provided in the given folder.
  - You need to predict whether the last column is 1 or 0 based on the other attributes.
  - Build the model from scratch using only numpy; you can use any other library for data manipulation, preprocessing, and visualization.
  - Use the first 90% of the data for training and the remaining 10% for testing.
  - Report the accuracy of your model (percentage of data points classified correctly) along with the confusion matrix.
  - Implement the code in a .ipynb file and submit it.

In [1]:
import numpy as np
import pandas as pd
import copy

In [2]:
df=pd.read_csv('framingham.csv')

In [3]:
df = df.replace({'NA': 0})
df=df.fillna(0)
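
Filling missing values with 0 puts physically implausible values (for example a cholesterol or BMI of 0) into the training data. One common alternative, shown here only as a sketch on a made-up two-column frame, is to fill each column with its own mean. The frame `demo` and its values are illustrative and are not taken from `framingham.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the column names mirror two columns printed in this notebook.
demo = pd.DataFrame({
    "totChol": [195.0, np.nan, 250.0],
    "BMI": [26.97, 28.73, np.nan],
})

# Replace each missing entry with the mean of its own column.
filled = demo.fillna(demo.mean(numeric_only=True))
print(filled)
```

Whether mean imputation, median imputation, or dropping incomplete rows works best here is something to check experimentally.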

In [4]:
print(df)

      male  age  education  currentSmoker  cigsPerDay  BPMeds  \
0        1   39        4.0              0         0.0     0.0   
1        0   46        2.0              0         0.0     0.0   
2        1   48        1.0              1        20.0     0.0   
3        0   61        3.0              1        30.0     0.0   
4        0   46        3.0              1        23.0     0.0   
...    ...  ...        ...            ...         ...     ...   
4233     1   50        1.0              1         1.0     0.0   
4234     1   51        3.0              1        43.0     0.0   
4235     0   48        2.0              1        20.0     0.0   
4236     0   44        1.0              1        15.0     0.0   
4237     0   52        2.0              0         0.0     0.0   

      prevalentStroke  prevalentHyp  diabetes  totChol  sysBP  diaBP    BMI  \
0                   0             0         0    195.0  106.0   70.0  26.97   
1                   0             0         0    250.0  121.0

In [5]:
df=df.to_numpy()

In [6]:
print(df)

[[  1.  39.   4. ...  80.  77.   0.]
 [  0.  46.   2. ...  95.  76.   0.]
 [  1.  48.   1. ...  75.  70.   0.]
 ...
 [  0.  48.   2. ...  84.  86.   0.]
 [  0.  44.   1. ...  86.   0.   0.]
 [  0.  52.   2. ...  80. 107.   0.]]


In [7]:
train=df[:3814,:]
X_train=train[:,:-1]
y_train=train[:,-1]
test=df[3814:,:]
print(y_train)

[0. 0. 0. ... 0. 0. 0.]
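
The hard-coded 3814 is 90% of the 4238 rows, rounded down. A small helper can compute the split point from the array length, so the same code also works for the linear regression dataset. The name `split_first_fraction` is made up for this sketch.

```python
import numpy as np

def split_first_fraction(data, train_fraction=0.9):
    """Use the first train_fraction of the rows for training and the rest for testing.

    The last column is treated as the label, as in the cell above.
    """
    n_train = int(len(data) * train_fraction)
    train, test = data[:n_train], data[n_train:]
    return train[:, :-1], train[:, -1], test[:, :-1], test[:, -1]

# Toy array with 10 rows: 3 feature columns plus 1 label column.
demo = np.arange(40, dtype=float).reshape(10, 4)
X_tr, y_tr, X_te, y_te = split_first_fraction(demo)
print(X_tr.shape, y_tr.shape, X_te.shape, y_te.shape)
```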


In [8]:
def sigmoid(x):
  return 1 / (1 + np.exp(-x))

In [9]:
def gradient_descent(X, y, w_in, b_in, gradient_function, alpha, num_iters): 
    w = copy.deepcopy(w_in)
    b = b_in   
    for i in range(num_iters):
        dj_db,dj_dw = gradient_function(X, y, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db     
    return w, b

In [10]:
def compute_gradient(X, y, w, b): 
    m,n = X.shape
    dj_dw = np.zeros((n,))
    dj_db = 0.
    for i in range(m):
        f_wb_i = sigmoid(np.dot(X[i],w) + b)
        err_i  = f_wb_i  - y[i]
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err_i * X[i,j]
        dj_db = dj_db + err_i
    dj_dw = dj_dw/m
    dj_db = dj_db/m        
    return dj_db, dj_dw
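
The two nested loops above can be written as matrix operations, which is usually much faster on a few thousand rows. The sketch below should return the same values in the same order (`dj_db`, `dj_dw`). The name `compute_gradient_vectorized` is made up here, and `w` is assumed to be a numpy array rather than a list.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_gradient_vectorized(X, y, w, b):
    """Gradient of the logistic loss with respect to b and w, without Python loops."""
    m = X.shape[0]
    err = sigmoid(X @ w + b) - y   # prediction error per row, shape (m,)
    dj_dw = X.T @ err / m          # average of err_i * X[i, j] over rows, shape (n,)
    dj_db = err.mean()             # average error
    return dj_db, dj_dw

# Tiny check: with w = 0 and b = 0 every prediction is 0.5.
X_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
y_demo = np.array([0.0, 1.0])
dj_db_demo, dj_dw_demo = compute_gradient_vectorized(X_demo, y_demo, np.zeros(2), 0.0)
print(dj_db_demo, dj_dw_demo)
```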

In [11]:
w_final, b_final = gradient_descent(X_train, y_train, list(np.zeros(15)+10),10.0, compute_gradient, 5.0e-7, 1000)
print(f"b,w found by gradient descent: {b_final},{w_final} ")

b,w found by gradient descent: 9.99957538017803,[9.99982708 9.97928789 9.99917252 9.99979012 9.99633954 9.9999903
 9.99999843 9.99988385 9.9999924  9.90116898 9.94470333 9.965113
 9.98913211 9.96787979 9.96913018] 


In [12]:
X_test=test[:,:-1]
y_test=test[:,-1]
print(y_test)

[0. 1. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0.
 0. 1. 0. 0. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0.
 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 1. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 0. 1.
 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 1. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 1. 0.
 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 1.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.

In [13]:
result=[]
for i in X_test:
    # Threshold the predicted probability, not the raw linear score
    if sigmoid(np.dot(w_final,i)+b_final)>=0.5:
        result.append(1)
    else:
        result.append(0)
result=np.array(result)

In [14]:
print(result)

[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]


In [15]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, result)
print("Confusion Matrix:\n", cm)

Confusion Matrix:
 [[  0 355]
 [  0  69]]
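
The assignment also asks for accuracy. In sklearn's confusion matrix the rows are true labels and the columns are predicted labels, so this matrix says all 424 test rows were predicted as 1 and only the 69 true positives are correct. The sketch below rebuilds arrays with those counts to show the calculation; in the notebook you would pass `y_test` and `result` instead.

```python
import numpy as np

def accuracy_percent(y_true, y_pred):
    """Percentage of data points classified correctly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return 100.0 * np.mean(y_true == y_pred)

# Stand-in arrays with the counts from the confusion matrix above:
# 355 true 0s and 69 true 1s, all predicted as 1.
y_true_demo = np.array([0] * 355 + [1] * 69)
y_pred_demo = np.ones(424, dtype=int)
print(f"Accuracy: {accuracy_percent(y_true_demo, y_pred_demo):.2f}%")  # Accuracy: 16.27%
```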


## Getting very cooked values of w and b (they barely move from the initial value of 10), and I don't know why
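
A likely explanation, judging from the printed outputs: the weights and bias start at 10 and the features are unscaled (ages around 40 to 60, cholesterol around 200), so `w·x + b` is in the thousands and `sigmoid` returns 1 for every row. The gradient is then roughly the mean of `(1 - y) * x`, and with `alpha = 5.0e-7` over 1000 iterations the weights can only move by a small fraction of a unit, which matches the values of about 9.9 to 10 printed above. The usual remedies are to standardize the features using training-set statistics, start from zero weights, and use a larger learning rate. The sketch below tries this on synthetic data, not on `framingham.csv`. The helper names and the hyperparameters (`alpha=0.1`, `num_iters=1000`) are assumptions that would need tuning on the real data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def standardize(X_train, X_test):
    """Scale both sets using the mean and std of the training set only."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X_train - mu) / sigma, (X_test - mu) / sigma

def fit_logistic(X, y, alpha=0.1, num_iters=1000):
    """Batch gradient descent for logistic regression, starting from zeros."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(num_iters):
        err = sigmoid(X @ w + b) - y
        w -= alpha * (X.T @ err) / m
        b -= alpha * err.mean()
    return w, b

def predict(X, w, b, threshold=0.5):
    return (sigmoid(X @ w + b) >= threshold).astype(int)

# Synthetic stand-in: two features on very different scales.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[50.0, 200.0], scale=[10.0, 40.0], size=(500, 2))
y_demo = (X_demo[:, 0] + 0.25 * X_demo[:, 1] > 100.0).astype(float)

X_tr, X_te = standardize(X_demo[:450], X_demo[450:])
w_demo, b_demo = fit_logistic(X_tr, y_demo[:450])
pred_demo = predict(X_te, w_demo, b_demo)
demo_accuracy = np.mean(pred_demo == y_demo[450:])
print("test accuracy on synthetic data:", demo_accuracy)
```

On the real data only about one row in six is positive (69 of 424 in the test set), so accuracy should still be read next to the confusion matrix, because a model that predicts 0 everywhere would already score above 80%.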