# Secure Logistic Regression

Logistic "regression" is a classification method that learns a separating function between two classes.

It is often deployed in multi-party computation (MPC) use cases as a lightweight machine learning model, typically after the data has been preprocessed with private set intersection to create a secret-shared dataset of features and labels.

Use cases include medical prediction, fraud detection, and promotion models.

In this notebook we show how to train a logistic regression model in plaintext, as an orientation for the MPC version.

<span style="color:red">Look at fixed-point logistic regression for MOTION, which only provides an integer implementation.</span>
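Since the note above concerns an integer-only backend, a minimal sketch of fixed-point arithmetic may help as orientation. The names `FRAC_BITS`, `encode`, `decode` and `fx_mul` are hypothetical and chosen for illustration only; they are not part of MOTION or any other library, and the choice of 16 fractional bits is an assumption.

```python
# Hypothetical fixed-point helpers (illustrative names, not a library API).
FRAC_BITS = 16            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS    # real value v is stored as the integer round(v * SCALE)

def encode(x: float) -> int:
    """Map a real number to its fixed-point integer representation."""
    return int(round(x * SCALE))

def decode(v: int) -> float:
    """Map a fixed-point integer back to a real number."""
    return v / SCALE

def fx_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values.

    The raw product carries 2 * FRAC_BITS fractional bits, so it is
    shifted right by FRAC_BITS (rounding toward negative infinity).
    """
    return (a * b) >> FRAC_BITS

a, b = encode(1.5), encode(-0.25)
print(decode(a + b))         # addition needs no rescaling: 1.25
print(decode(fx_mul(a, b)))  # multiplication needs one rescale: -0.375
```

Addition works directly on the encoded integers, while every multiplication needs a truncation step, which is why the dot products and the sigmoid are the parts of the training loop that need the most care in an integer setting.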



In [12]:
import numpy as np

### Import the data

- Two variants are possible:
    - multiple parties provide horizontally split data for x and y
    - one or more parties provide the features and one party provides the labels

Here we omit the dataset join step and assume that the data is already joined.
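The two variants above can be sketched in plaintext as follows. This is only an illustration of the data layout, assuming a small random dataset; the names `X_demo`, `y_demo` and the party suffixes are hypothetical, and the second variant is referred to here as a vertical split.

```python
import numpy as np

# Small illustrative dataset: 6 samples, 3 features, binary labels
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 3))
y_demo = rng.integers(0, 2, size=6)

# Variant 1 (horizontal split): each party holds complete rows,
# i.e. all features and the label, for a disjoint subset of the samples.
X_a, y_a = X_demo[:3], y_demo[:3]
X_b, y_b = X_demo[3:], y_demo[3:]
X_h = np.vstack([X_a, X_b])            # joining stacks the rows
y_h = np.concatenate([y_a, y_b])

# Variant 2 (vertical split): parties hold different feature columns
# for the same samples, and a single party holds the labels.
X_p1 = X_demo[:, :2]                   # party 1: features 0 and 1
X_p2 = X_demo[:, 2:]                   # party 2: feature 2
y_p3 = y_demo                          # party 3: labels
X_v = np.hstack([X_p1, X_p2])          # joining concatenates the columns

print(X_h.shape, X_v.shape)
```

In the second variant the rows of all parties must refer to the same samples in the same order, which is the alignment that the private set intersection step provides.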


In [8]:
# Generate a synthetic binary classification dataset
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=3, n_redundant=0, random_state=42)

# Train/test split (default: 75% train, 25% test)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)


### Helper Function
The sigmoid function maps the output of the linear model (the weighted sum of the features plus the bias) to a probability between 0 and 1.

In [14]:
def sigmoid(x):
    return 1/(1+np.exp(-x))
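The exponential in the sigmoid is expensive to evaluate on secret-shared integers, so MPC implementations often replace it with a cheaper approximation. The sketch below shows one simple piecewise-linear variant as an illustration; `sigmoid_piecewise` is a hypothetical name, and whether this particular approximation suits a given framework is an assumption that would need to be checked.

```python
import numpy as np

def sigmoid(x):
    # Exact sigmoid, as defined in the notebook
    return 1 / (1 + np.exp(-x))

def sigmoid_piecewise(x):
    # Coarse piecewise-linear stand-in (illustrative assumption):
    #   0        for x < -0.5
    #   x + 0.5  for -0.5 <= x <= 0.5
    #   1        for x > 0.5
    return np.clip(np.asarray(x, dtype=float) + 0.5, 0.0, 1.0)

xs = np.linspace(-4, 4, 9)
# Largest absolute deviation from the exact sigmoid on this grid
print(np.max(np.abs(sigmoid(xs) - sigmoid_piecewise(xs))))
```

The approximation only needs comparisons and additions, at the cost of a noticeable error near the kinks, so it is worth comparing the resulting accuracy against the plaintext model.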

#### Logistic Regression Class

In [15]:
class LogisticRegression():

    def __init__(self, lr=0.001, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        # Start from an all-zero model
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Batch gradient descent over the full training set
        for _ in range(self.n_iters):
            linear_pred = np.dot(X, self.weights) + self.bias
            predictions = sigmoid(linear_pred)

            # Gradients of the mean cross-entropy loss w.r.t. weights and bias
            dw = (1/n_samples) * np.dot(X.T, (predictions - y))
            db = (1/n_samples) * np.sum(predictions-y)

            self.weights = self.weights - self.lr*dw
            self.bias = self.bias - self.lr*db


    def predict(self, X):
        linear_pred = np.dot(X, self.weights) + self.bias
        y_pred = sigmoid(linear_pred)
        # Threshold the predicted probabilities at 0.5
        class_pred = [0 if p <= 0.5 else 1 for p in y_pred]
        return class_pred
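For reference, the quantities `dw` and `db` in `fit` correspond to the gradients of the mean binary cross-entropy loss. The notebook does not state the loss explicitly, so the derivation below is a reconstruction; the symbols $\hat{y}_i$, $L$ and $\eta$ (for `lr`) are introduced here for illustration.

$$
\hat{y}_i = \sigma(w^\top x_i + b), \qquad
L(w, b) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]
$$

$$
\frac{\partial L}{\partial w} = \frac{1}{n} X^\top (\hat{y} - y), \qquad
\frac{\partial L}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)
$$

$$
w \leftarrow w - \eta \, \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
$$

Apart from the sigmoid, each update uses only additions and multiplications, which is what makes the model a convenient candidate for an MPC implementation.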

#### Perform Training

In [16]:
regressor = LogisticRegression(lr=0.0001, n_iters=1000)
regressor.fit(X_train, y_train)


#### Perform prediction

In [17]:
predictions = regressor.predict(X_test)
# Calculate accuracy
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, predictions))



0.876
