## Logistic Regression Function

$\sigma(t) = \frac{e^t}{e^t + 1} = \frac{1}{1+e^{-t}}$

$t = \beta_0 + \beta_1 x \qquad t = A + Bx$
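The logistic function above can be evaluated directly. The following is a minimal sketch, not part of the original notebook: the helper names `sigmoid` and `probability` and the coefficient values are hypothetical, chosen only to illustrate how $t = \beta_0 + \beta_1 x$ is mapped to a value between 0 and 1.

```python
import math

def sigmoid(t: float) -> float:
    """Logistic function: sigma(t) = e^t / (e^t + 1) = 1 / (1 + e^(-t))."""
    # Branch on the sign of t so exp() never receives a large positive argument.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (e + 1.0)

# Hypothetical coefficients, for illustration only.
beta_0, beta_1 = -1.0, 0.5

def probability(x: float) -> float:
    """Apply the logistic function to the linear term t = beta_0 + beta_1 * x."""
    return sigmoid(beta_0 + beta_1 * x)

for x in (-4.0, 2.0, 8.0):
    print(x, round(probability(x), 4))
```

With these assumed coefficients the linear term is zero at $x = 2$, so the output there is exactly 0.5; larger $x$ pushes the output toward 1 and smaller $x$ toward 0.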

## Multiple Logistic Regression

$p(x) = \frac{1}{1+e^{-(\beta_0+\beta_1 x)}}$

$\beta_0+\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_m x_m = \beta_0 + \sum\limits_{i=1}^{m} \beta_i x_i$
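With several features, the linear term becomes a dot product between the coefficient vector and the feature vector. The sketch below assumes NumPy and uses a hypothetical helper `multi_probability` with made-up coefficients; it is an illustration of the formula, not the model fitted later in this notebook.

```python
import numpy as np

def multi_probability(X, beta_0, beta):
    """Return p(x) = 1 / (1 + exp(-(beta_0 + sum_i beta_i * x_i))) for each row of X."""
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    t = beta_0 + X @ beta          # one linear term per sample
    return 1.0 / (1.0 + np.exp(-t))

# Hypothetical intercept and coefficients for m = 3 features.
beta_0 = 0.0
beta = [1.0, -1.0, 0.5]

# Three made-up samples; their linear terms are 0, 0 and 3.
X = [[0.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [2.0, 0.0, 2.0]]

p = multi_probability(X, beta_0, beta)
print(p)
```

The first two samples have a linear term of zero and therefore a probability of 0.5, while the third has a linear term of 3 and a probability well above 0.5.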

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

veriler = pd.read_csv("veriler.csv")

In [11]:
# Features: columns 1 to 3 of the CSV
x = veriler.iloc[:,1:4].values
# Target: column 4 onward (kept as a 2-D column array)
y = veriler.iloc[:,4:].values

x

array([[130,  30,  10],
       [125,  36,  11],
       [135,  34,  10],
       [133,  30,   9],
       [129,  38,  12],
       [180,  90,  30],
       [190,  80,  25],
       [175,  90,  35],
       [177,  60,  22],
       [185, 105,  33],
       [165,  55,  27],
       [155,  50,  44],
       [160,  58,  39],
       [162,  59,  41],
       [167,  62,  55],
       [174,  70,  47],
       [193,  90,  23],
       [187,  80,  27],
       [183,  88,  28],
       [159,  40,  29],
       [164,  66,  32],
       [166,  56,  42]], dtype=int64)

In [12]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.33,random_state=0)

x_test

array([[164,  66,  32],
       [165,  55,  27],
       [167,  62,  55],
       [162,  59,  41],
       [125,  36,  11],
       [166,  56,  42],
       [155,  50,  44],
       [159,  40,  29]], dtype=int64)

In [14]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()

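# Fit the scaler on the training set only
x_train = sc.fit_transform(x_train)
# Reuse the training-set mean and standard deviation on the test set
x_test = sc.transform(x_test)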

In [16]:
from sklearn.linear_model import LogisticRegression

logr = LogisticRegression(random_state=0)

In [18]:
import warnings
warnings.filterwarnings("ignore")

In [19]:
logr.fit(x_train,y_train)

In [21]:
y_pred = logr.predict(x_test)
print(y_pred, y_test)

['e' 'e' 'e' 'e' 'k' 'e' 'e' 'e'] [['k']
 ['k']
 ['k']
 ['k']
 ['e']
 ['k']
 ['k']
 ['k']]


In [26]:
predDF = pd.DataFrame(data=y_pred, index=range(len(y_pred)), columns=["Tahmin Edilen"])
realDF = pd.DataFrame(data=y_test, index=range(len(y_test)), columns=["Gerçek Değerler"])

sonucDF = pd.concat([predDF,realDF],axis=1)

sonucDF

| |Tahmin Edilen|Gerçek Değerler|
|:--------:|:--------:|:--------:|
|0|e|k|
|1|e|k|
|2|e|k|
|3|e|k|
|4|k|e|
|5|e|k|
|6|e|k|
|7|e|k|


## Confusion Matrix

| | $C_1$ | $C_2$ |
|:--------:|:--------:|:--------:|
|$C_1$| True positive | False negative|
|$C_2$| False positive | True negative|

|classes|buy_computer=yes|buy_computer=no|total|recognition(%)|
|:--------:|:--------:|:--------:|:--------:|:--------:|
|buy_computer=yes|6954|46|7000|99.34|
|buy_computer=no|412|2588|3000|86.27|
|total|7366|2634|10000|95.42|

- Accuracy of model M, acc(M): the percentage of samples that model M classifies correctly.
    - Error rate (misclassification rate) = 1 - acc(M)
    - Alternative measures (e.g., for cancer diagnosis)
        - sensitivity = t-pos/(t-pos+f-neg)  ___true positive recognition rate___
        - specificity = t-neg/(t-neg+f-pos)   ___true negative recognition rate___
        - precision = t-pos/(t-pos+f-pos)
        - accuracy = sensitivity * pos/(pos+neg) + specificity * neg/(pos+neg)
        
![Confusion Matrix Evaluation](cm.png)
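The measures listed above can be computed from the four cells of a confusion matrix. The sketch below uses a hypothetical helper, `confusion_metrics`, and plugs in the four cell counts from the buy_computer table (6954, 46, 412, 2588), treating buy_computer=yes as the positive class. It also checks that the weighted form of accuracy agrees with the direct count of correct predictions.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute the measures listed above from the four confusion-matrix cells."""
    pos = tp + fn            # actual positives
    neg = fp + tn            # actual negatives
    accuracy = (tp + tn) / (pos + neg)
    return {
        "sensitivity": tp / pos,         # true positive recognition rate
        "specificity": tn / neg,         # true negative recognition rate
        "precision": tp / (tp + fp),
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
    }

# Cell counts taken from the buy_computer table (positive class: buy_computer=yes).
tp, fn, fp, tn = 6954, 46, 412, 2588
m = confusion_metrics(tp, fn, fp, tn)

# Accuracy as the class-weighted sum of sensitivity and specificity.
pos, neg = tp + fn, fp + tn
weighted = (m["sensitivity"] * pos / (pos + neg)
            + m["specificity"] * neg / (pos + neg))

for name, value in m.items():
    print(f"{name}: {value:.4f}")
print(f"weighted accuracy: {weighted:.4f}")
```

Sensitivity and specificity here correspond to the per-class recognition percentages in the table, which is why a model can score well overall while still recognizing one class noticeably worse than the other.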


In [27]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test,y_pred)
cm

array([[0, 1],
       [7, 0]], dtype=int64)
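To read this result, the matrix can be reduced to an accuracy and a per-class recognition rate. The sketch below copies the array printed above and assumes the rows are true labels and the columns are predicted labels, in the sorted label order `['e', 'k']`, which is scikit-learn's default when no `labels` argument is passed.

```python
import numpy as np

# Matrix copied from the output above.
cm = np.array([[0, 1],
               [7, 0]])
labels = ["e", "k"]  # assumed order: sorted labels, the scikit-learn default

correct = np.trace(cm)   # diagonal cells are the correct predictions
total = cm.sum()
accuracy = correct / total
print(f"correct: {correct} of {total}, accuracy: {accuracy:.2f}")

# Per-class recognition rate: diagonal cell divided by its row total.
for i, label in enumerate(labels):
    rate = cm[i, i] / cm[i].sum()
    print(f"class {label}: {cm[i, i]} of {cm[i].sum()} recognized ({rate:.2f})")
```

Both diagonal cells are zero, so none of the 8 test samples is classified correctly: the single true `e` is predicted as `k`, and all seven true `k` samples are predicted as `e`, which matches the prediction table printed earlier.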