# Logistic Regression
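The LogisticRegression class in scikit-learn (sklearn.linear_model) implements regularized logistic regression for binary and multiclass classification. It supports several optimization solvers (lbfgs, liblinear, newton-cg, sag, saga) and penalty terms (L1, L2, ElasticNet). The appropriate solver depends on the dataset size, the number of classes, and the chosen penalty; regularization strength is set through the C parameter, which is the inverse of that strength. Before turning to the scikit-learn class, this notebook implements a minimal gradient-descent version from scratch and evaluates it on the data in diabetes.csv.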

In [1]:
import warnings
# Silence all warnings (this also hides NumPy overflow and solver convergence warnings)
warnings.filterwarnings("ignore")

In [2]:
import numpy as np
import pandas as pd

In [105]:
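class Logistic_Regression:
    """Binary logistic regression trained with full-batch gradient descent."""

    def __init__(self, lr=0.001, n_iters=1000):
        self.lr = lr            # learning rate (step size)
        self.n_iters = n_iters  # number of gradient descent steps
        self.weights = None
        self.bias = None

    @staticmethod
    def _sigmoid(z):
        # Map raw scores to probabilities in (0, 1)
        return 1 / (1 + np.exp(-z))

    def fit(self, X, y):
        # Accept DataFrames/Series as well as arrays
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        # Run n_iters updates over the full batch (not one pass per sample)
        for _ in range(self.n_iters):
            y_pred = self._sigmoid(X @ self.weights + self.bias)
            # Gradients of the mean cross-entropy loss
            dw = X.T @ (y_pred - y) / n_samples
            db = np.sum(y_pred - y) / n_samples
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        probs = self._sigmoid(np.asarray(X, dtype=float) @ self.weights + self.bias)
        # Threshold at 0.5: probabilities above it map to class 1
        return [1 if p > 0.5 else 0 for p in probs]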
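
The updates in fit can be read as gradient descent on the mean binary cross-entropy loss. The class does not name its loss, so this is an interpretation: it is the loss whose gradients match dw and db. The symbols in the sketch below are introduced here for illustration: $w$ and $b$ stand for weights and bias, $\eta$ for lr, $n$ for n_samples, and $\hat{y}$ for y_pred.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \hat{y}_i = \sigma(w^\top x_i + b)

L(w, b) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]

\frac{\partial L}{\partial w} = \frac{1}{n} X^\top (\hat{y} - y), \qquad
\frac{\partial L}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)

w \leftarrow w - \eta \, \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
```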

In [128]:
df=pd.read_csv("diabetes.csv")
X=df.drop(columns=["Outcome"])
y=df["Outcome"]

In [152]:
df

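| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 763 | 10 | 101 | 76 | 48 | 180 | 32.9 | 0.171 | 63 | 0 |
| 764 | 2 | 122 | 70 | 27 | 0 | 36.8 | 0.340 | 27 | 0 |
| 765 | 5 | 121 | 72 | 23 | 112 | 26.2 | 0.245 | 30 | 0 |
| 766 | 1 | 126 | 60 | 0 | 0 | 30.1 | 0.349 | 47 | 1 |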


In [129]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=1234)

In [130]:
lr=Logistic_Regression()
lr.fit(X_train,y_train)
predictions=lr.predict(X_test)

In [131]:
lr.weights

array([ 0.1087906 ,  0.06818568, -0.0817432 , -0.02725345,  0.02334304,
        0.00343248,  0.00438369,  0.00429227])

In [132]:
lr.bias

-0.022507759904298274

In [139]:
predictions=np.array(predictions)
predictions

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1,
       1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1,
       1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [146]:
y_test=np.array(y_test)

In [147]:
y_test

array([0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1,
       0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1,
       0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0,
       1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0],
      dtype=int64)

In [148]:
# Accuracy: fraction of test samples whose prediction matches the true label
print(np.mean(y_test == predictions))

0.38311688311688313
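
The fraction computed above is the classification accuracy. As a small sketch, the same quantity can be obtained with accuracy_score from sklearn.metrics; the arrays y_true_demo and y_pred_demo below are hypothetical values made up for illustration and are not taken from the diabetes data.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels and predictions, for illustration only.
y_true_demo = np.array([0, 1, 1, 0, 1])
y_pred_demo = np.array([0, 1, 0, 0, 1])

# Fraction of positions where the prediction equals the true label.
acc_manual = np.mean(y_true_demo == y_pred_demo)

# The same metric computed by scikit-learn.
acc_sklearn = accuracy_score(y_true_demo, y_pred_demo)

print(acc_manual, acc_sklearn)
```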


### Key Parameters:
- penalty ({'l1', 'l2', 'elasticnet', None}, default='l2'): Specifies the norm of the penalty (regularization term).<br>
    None: No penalty.<br>
    L1: Sparsity-inducing penalty on the absolute values of the coefficients.<br>
    L2: Default option, penalizes the squared magnitude of the coefficients.<br>
    ElasticNet: Combination of L1 and L2, mixed through the l1_ratio parameter.

- dual (bool, default=False): Whether to use the dual formulation, which is implemented only for the l2 penalty with the liblinear solver.

- tol (float, default=1e-4): Tolerance for stopping criteria.

- C (float, default=1.0): Inverse of regularization strength; must be a positive float. Smaller values specify stronger regularization.

- fit_intercept (bool, default=True): Specifies whether a constant (bias or intercept) is added to the decision function.

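- class_weight (dict or 'balanced', default=None): Weights associated with the classes. A dict maps each class label to its weight, 'balanced' sets weights inversely proportional to class frequencies, and None gives every class a weight of one.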

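- solver ({'lbfgs', 'liblinear', 'newton-cg', 'newton-cholesky', 'sag', 'saga'}, default='lbfgs'): Optimization algorithm to use.<br>
    'liblinear': Suitable for small datasets and binary classification; multiclass problems need a one-vs-rest scheme.<br>
    'lbfgs', 'newton-cg', 'sag', 'saga': Handle multiclass problems with the multinomial loss; 'sag' and 'saga' tend to be faster on large datasets.

- max_iter (int, default=100): Maximum number of iterations allowed for the solver to converge.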

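- multi_class ({'auto', 'ovr', 'multinomial'}, default='auto'): Defines how to handle multiclass problems. This parameter is deprecated as of scikit-learn 1.5.<br>
    ovr: One-vs-Rest scheme, fitting one binary problem per class.<br>
    multinomial: Minimizes the multinomial (cross-entropy) loss jointly across all classes.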
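
To make the role of C concrete, here is a minimal sketch that fits the default L2-penalized model at three values of C and records the norm of the learned coefficients. The data come from make_classification and are synthetic; the names X_demo, y_demo, and coef_norms are introduced only for this illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem, for illustration only.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

coef_norms = {}
for C in (0.01, 1.0, 100.0):
    # Default penalty is L2; only the regularization strength changes.
    model = LogisticRegression(C=C, max_iter=1000).fit(X_demo, y_demo)
    # L2 norm of the learned weight vector.
    coef_norms[C] = float(np.linalg.norm(model.coef_))

print(coef_norms)
```

Smaller C means a stronger penalty, so the coefficient norm at C=0.01 should come out smaller than at the larger values.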


In [140]:
from sklearn.linear_model import LogisticRegression
lr1=LogisticRegression()
lr1.fit(X_train,y_train)
predictions1=lr1.predict(X_test)

In [141]:
predictions1

array([0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1,
       0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1,
       0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
      dtype=int64)

In [149]:
# Accuracy: fraction of test samples whose prediction matches the true label
print(np.mean(y_test == predictions1))

0.7727272727272727


### Example (Python):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Load sample data
    X, y = load_iris(return_X_y=True)

    # Create LogisticRegression model and fit
    clf = LogisticRegression(random_state=0).fit(X, y)

    # Predict
    predictions = clf.predict(X[:2, :])
    proba = clf.predict_proba(X[:2, :])
    print(predictions)
    print(proba)

### Important Notes:
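- Solver Compatibility: Not all solvers support all penalties. For instance, lbfgs, newton-cg, and sag support only the l2 penalty or no penalty, liblinear supports l1 and l2, and saga supports l1, l2, and elasticnet. liblinear is also geared to binary classification; multiclass problems need a one-vs-rest scheme.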

- Regularization: The default is penalty='l2' with C=1.0. Stronger regularization (lower C values) helps prevent overfitting, although a C that is too small can underfit.
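
The compatibility notes above can be sketched with two penalty and solver pairings that are supported: l1 with liblinear, and elasticnet with saga (which also needs l1_ratio). This assumes a scikit-learn version that accepts the penalty argument as listed under Key Parameters. The data are synthetic, and the variable names and the values C=0.05 and l1_ratio=0.5 are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic binary problem, for illustration only.
X_raw, y_bin = make_classification(
    n_samples=300, n_features=10, n_informative=3, random_state=0
)
# Standardizing the features helps saga converge.
X_std = StandardScaler().fit_transform(X_raw)

# L1 penalty paired with the liblinear solver.
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
l1_model.fit(X_std, y_bin)

# Elastic net penalty requires the saga solver and an l1_ratio.
enet_model = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=0.05, max_iter=5000
)
enet_model.fit(X_std, y_bin)

# Count coefficients driven exactly to zero by each penalty.
print("zeros with l1:", int((l1_model.coef_ == 0).sum()))
print("zeros with elasticnet:", int((enet_model.coef_ == 0).sum()))
```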

### Attributes:
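- coef_: Coefficients (weights) of the features in the decision function, with shape (1, n_features) for binary problems and (n_classes, n_features) for multiclass problems.
- intercept_: Bias term added to the decision function.
- n_iter_: Actual number of iterations performed by the solver.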
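
As a quick sketch, the fitted attributes can be inspected on the three-class iris data. The names X_iris, y_iris, and clf_iris are introduced here for illustration, and max_iter=1000 is an arbitrary choice to give the solver room to converge.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X_iris, y_iris = load_iris(return_X_y=True)
clf_iris = LogisticRegression(max_iter=1000).fit(X_iris, y_iris)

# One row of weights per class, one column per feature.
print(clf_iris.coef_.shape)
# One bias term per class.
print(clf_iris.intercept_.shape)
# Iterations the solver actually used.
print(clf_iris.n_iter_)
```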

This class is particularly flexible due to its various solvers and regularization options, making it useful for both small and large datasets.