# **Introduction**
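This is my first machine learning kernel. It implements logistic regression from scratch with NumPy on the UCI heart disease dataset and then compares the result with scikit-learn's `LogisticRegression`.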
<font color = blue>
## **Content:**
1. [Load and check data](#1)<br>
1. [Variable Description](#2)<br>
1. [Normalization](#3)<br>
1. [Train Test Split](#4)<br>
1. [Parameter Initialization and Sigmoid Function](#5)<br>
1. [Forward and Backward Propagation](#6)<br>
1. [Updating (Learning) Parameters](#7)<br>
1. [Prediction](#8)<br>
1. [Logistic Regression](#9)<br>
1. [Sklearn with Logistic Regression](#10)<br>

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
hd=pd.read_csv("/kaggle/input/heart-disease-uci/heart.csv")

<a id="1"></a><br>
## **1.Load and Check Data**

In [None]:
hd.head(20)

In [None]:
hd.columns

<a id="2"></a><br>
## **2.Variable Description**
* age: age
* sex: sex
* cp: chest pain type  
* trestbps: resting blood pressure (in mm Hg on admission to the hospital)
* chol: serum cholesterol in mg/dl
* fbs: fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
* restecg: resting electrocardiographic results
* thalach: maximum heart rate achieved 
* exang: exercise induced angina
* oldpeak: ST depression induced by exercise relative to rest

In [None]:
hd.info()

In [None]:
y= hd.sex.values
y

In [None]:
x_data=hd.drop(["sex"],axis=1)
x_data

<a id="3"></a><br>
## **3.Normalization**
I rescaled every feature to the range [0, 1].<br>
Normalization formula: (x - min(x)) / (max(x) - min(x))
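
As a small illustration of the formula, here is a sketch of min-max scaling applied column by column to a made-up array. The helper name `min_max_scale` and the sample numbers are assumptions for this example only; the guard for constant columns is an extra that the notebook itself does not use.

```python
import numpy as np

def min_max_scale(a):
    """Rescale each column of a 2-D array to the range [0, 1]."""
    a = np.asarray(a, dtype=float)
    col_min = a.min(axis=0)
    col_range = a.max(axis=0) - col_min
    # Guard against constant columns, which would otherwise divide by zero.
    col_range[col_range == 0] = 1.0
    return (a - col_min) / col_range

# Made-up values standing in for two features with different scales.
demo = np.array([[29.0, 126.0],
                 [54.0, 240.0],
                 [77.0, 564.0]])
scaled = min_max_scale(demo)
print(scaled)
```

After scaling, the smallest value in each column becomes 0 and the largest becomes 1, so features with large raw values no longer dominate the weighted sum.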

In [None]:
# Use the per-column min and max so that each feature is scaled independently
x=(x_data-x_data.min())/(x_data.max()-x_data.min())
x

<a id="4"></a><br>
## **4.Train Test Split**
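The data is split into 80% training and 20% test sets. Each array is then transposed so that rows are features and columns are samples.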
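
To make the shapes concrete, here is a rough NumPy-only sketch of a shuffled 80/20 split followed by a transpose. It is not how `train_test_split` is implemented; the data and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
features = rng.random((10, 3))      # made-up data: 10 samples, 3 features
labels = rng.integers(0, 2, 10)     # made-up binary labels

# Shuffle the row indices and hold out the first 20% of them as the test set.
order = rng.permutation(len(features))
n_test = int(len(features) * 0.2)
test_idx, train_idx = order[:n_test], order[n_test:]

# Transpose so that each column is one sample: shape (features, samples).
x_tr, x_te = features[train_idx].T, features[test_idx].T
y_tr, y_te = labels[train_idx], labels[test_idx]

print(x_tr.shape, x_te.shape, y_tr.shape, y_te.shape)
```

The (features, samples) layout is what lets `np.dot(w.T, x)` produce one score per sample when `w` has shape (features, 1).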

In [None]:
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=42)
x_train=x_train.T
x_test=x_test.T
y_train=y_train.T
y_test=y_test.T

print("x train: ",x_train.shape)
print("x test: ",x_test.shape)
print("y train: ",y_train.shape)
print("y test: ",y_test.shape)

<a id="5"></a><br>
## **5.Parameter Initialization and Sigmoid Function**
* w = weights
* b = bias
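
The sigmoid maps any real number z to a value between 0 and 1 via 1 / (1 + exp(-z)). The plain formula can trigger overflow warnings for very negative z, so here is a sketch of one possible numerically stable variant. The name `stable_sigmoid` is an assumption for this example; the notebook itself uses the plain formula.

```python
import numpy as np

def stable_sigmoid(z):
    # sigmoid(z) = exp(-log(1 + exp(-z))); np.logaddexp(0, -z) evaluates
    # log(1 + exp(-z)) without overflowing for large |z|.
    z = np.asarray(z, dtype=float)
    return np.exp(-np.logaddexp(0, -z))

vals = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(vals)
```

For moderate inputs this should agree with the plain formula, and at z = 0 it gives 0.5, which is the decision threshold used later for prediction.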

In [None]:
def initialize_w_and_b(dimension):
    w=np.full((dimension,1),0.01)
    b=0.0
    return w,b
def sigmoid(z):
    y_head=1/(1+np.exp(-z))
    return y_head

print(sigmoid(0))

<a id="6"></a><br>
## **6.Forward and Backward Propagation**
* der: derivative
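
One way to gain confidence in the backward step is to compare the analytic derivatives with finite-difference estimates of the cost. The sketch below does this on made-up data; `cost_and_grads` is a hypothetical stand-in that mirrors the formulas used in this section, and the step size `eps` is an arbitrary choice.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_grads(w, b, x, y):
    """Cross-entropy cost and its gradients; x has shape (features, samples)."""
    m = x.shape[1]
    y_head = sigmoid(np.dot(w.T, x) + b)
    cost = np.sum(-y * np.log(y_head) - (1 - y) * np.log(1 - y_head)) / m
    der_w = np.dot(x, (y_head - y).T) / m
    der_b = np.sum(y_head - y) / m
    return cost, der_w, der_b

rng = np.random.default_rng(0)
x = rng.random((4, 20))              # made-up data: 4 features, 20 samples
y = rng.integers(0, 2, 20)
w = np.full((4, 1), 0.01)
b = 0.0

_, der_w, der_b = cost_and_grads(w, b, x, y)

# Central finite differences: perturb one parameter at a time.
eps = 1e-6
numeric_w = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j, 0] = eps
    numeric_w[j, 0] = (cost_and_grads(w + step, b, x, y)[0]
                       - cost_and_grads(w - step, b, x, y)[0]) / (2 * eps)
numeric_b = (cost_and_grads(w, b + eps, x, y)[0]
             - cost_and_grads(w, b - eps, x, y)[0]) / (2 * eps)

max_diff = max(np.max(np.abs(numeric_w - der_w)), abs(numeric_b - der_b))
print("largest analytic/numeric difference:", max_diff)
```

If the two sets of derivatives differ by more than a tiny amount, the backward formulas and the cost are probably out of sync.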

In [None]:
def forward_backward_propagation(w,b,x_train,y_train):
    # forward
    z=np.dot(w.T,x_train)+b
    y_head=sigmoid(z)
    loss=-y_train*np.log(y_head)-(1-y_train)*np.log(1-y_head)
    cost=(np.sum(loss))/x_train.shape[1]
    
    #backward
    der_w=(np.dot(x_train,((y_head-y_train).T)))/x_train.shape[1]
    der_b=np.sum(y_head-y_train)/x_train.shape[1]
    gradients={"der_w":der_w,"der_b":der_b}
    return cost,gradients 

<a id="7"></a><br>
## **7.Updating (Learning) Parameters**
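
Each iteration moves the weights and bias a small step against the gradient of the cost. As a compact sketch of that loop, the hypothetical `gradient_descent` below takes the data as explicit arguments and leaves out the plotting; the data, learning rate, and iteration count are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent(w, b, x, y, learning_rate, n_iter):
    """Hypothetical variant of the update step that takes x and y as arguments."""
    m = x.shape[1]
    costs = []
    for _ in range(n_iter):
        y_head = sigmoid(np.dot(w.T, x) + b)
        costs.append(np.sum(-y * np.log(y_head)
                            - (1 - y) * np.log(1 - y_head)) / m)
        # Step against the gradient of the cost.
        w = w - learning_rate * np.dot(x, (y_head - y).T) / m
        b = b - learning_rate * np.sum(y_head - y) / m
    return w, b, costs

rng = np.random.default_rng(1)
x = rng.random((3, 50))                    # made-up data: 3 features, 50 samples
y = (x[0] > 0.5).astype(int)               # label depends on the first feature
w, b, costs = gradient_descent(np.full((3, 1), 0.01), 0.0, x, y, 0.1, 200)
print(costs[0], costs[-1])
```

With a small enough learning rate the recorded cost should fall from the first iteration to the last, which is the same trend the cost plot in this section is meant to show.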

In [None]:
def update(w,b,learning_rate,number_of_iteration):
    cost_list1=[]
    cost_list2=[]
    index=[]
    
    for i in range(number_of_iteration):
        # x_train and y_train are read from the notebook's global scope
        cost,gradients=forward_backward_propagation(w,b,x_train,y_train)
        cost_list1.append(cost)
        w=w-learning_rate*gradients["der_w"]
        b=b-learning_rate*gradients["der_b"]
        if i % 10 == 0:
            cost_list2.append(cost)
            index.append(i)
            print("Cost after iteration %i: %f"%(i,cost))
            
    parameters={"weigh":w,"bias":b}
    plt.plot(index,cost_list2)
    plt.xticks(index,rotation="vertical")
    plt.xlabel("Number of Iteration")
    plt.ylabel("Cost")
    plt.show()
    
    return parameters,gradients,cost_list1

<a id="8"></a><br>
## **8.Prediction**
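
Prediction turns each sigmoid output into a class label: values above 0.5 become 1, everything else becomes 0. The same thresholding can be written without an explicit loop, as in this sketch. `predict_vectorized` and the toy weights are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict_vectorized(w, b, x):
    """Return a (1, samples) array of 0/1 labels, thresholding at 0.5."""
    probs = sigmoid(np.dot(w.T, x) + b)
    # Probabilities strictly above 0.5 map to 1, everything else to 0.
    return (probs > 0.5).astype(float)

w = np.array([[2.0], [-1.0]])              # made-up weights for 2 features
x = np.array([[1.0, 0.0, 0.5],
              [0.0, 3.0, 1.0]])            # 3 made-up samples as columns
preds = predict_vectorized(w, 0.0, x)
print(preds)
```

The third sample has a score of exactly 0, so its probability is 0.5 and it is assigned to class 0, matching the `<=0.5` rule used in the loop version.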

In [None]:
def predict(w,b,x_test):
    z=sigmoid(np.dot(w.T,x_test)+b)
    Y_prediction=np.zeros((1,x_test.shape[1]))
    
    for i in range(z.shape[1]):
        if z[0,i]<=0.5:
            Y_prediction[0,i]=0
        else:
            Y_prediction[0,i]=1
            
    return Y_prediction

<a id="9"></a><br>
## **9.Logistic Regression**
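
The test accuracy in this section is computed as 100 minus the mean absolute difference between predictions and labels, in percent. For 0/1 labels this is the same as the share of matching predictions, as this small sketch on made-up values suggests.

```python
import numpy as np

y_pred = np.array([[1.0, 0.0, 1.0, 1.0, 0.0]])   # made-up predictions
y_true = np.array([1, 0, 0, 1, 0])               # made-up labels

# For 0/1 labels, |prediction - label| is 1 exactly when they disagree,
# so its mean is the error rate.
accuracy_from_error = 100 - np.mean(np.abs(y_pred - y_true)) * 100
accuracy_from_matches = np.mean(y_pred == y_true) * 100
print(accuracy_from_error, accuracy_from_matches)
```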

In [None]:
def logistic_regression(x_train,y_train,x_test,y_test,learning_rate,num_iterations):
    #initialize
    dimension=x_train.shape[0]
    w,b=initialize_w_and_b(dimension)
    parameters,gradients,cost_list1=update(w,b,learning_rate,num_iterations)
    
    y_prediction_test=predict(parameters["weigh"],parameters["bias"],x_test)
    
    #print test errors
    print("test accuracy: {}".format(100-np.mean(np.abs(y_prediction_test-y_test))*100))

logistic_regression(x_train,y_train,x_test,y_test,learning_rate=0.01,num_iterations=500)

<a id="10"></a><br>
## **10.Sklearn with Logistic Regression**

In [None]:
lr=LogisticRegression()
lr.fit(x_train.T,y_train.T)
print("test accuracy {}".format(lr.score(x_test.T,y_test.T)))