## Encrypted Linear Regression Tutorial
This Jupyter notebook provides an introduction to using EncryptedLinearRegression from the venumML library, built on top of venumpy.

Note: This is a basic example and might require additional libraries for data manipulation and visualization depending on your specific needs.

In [None]:
from venumML.venumpy import small_glwe as vp
import numpy as np

### Simple Plaintext Linear Regression with scikit-learn
Before diving into EncryptedLinearRegression, let's explore unencrypted linear regression using scikit-learn (sklearn). Sklearn provides a widely used implementation of linear regression with the LinearRegression class.

In [2]:
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Step 1: Generate Sample Data
X, y = make_regression(n_samples=10, n_features=2, noise=0.1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

sk_lr = LinearRegression()
sk_lr.fit(X_train, y_train)

print("Scikit-Learn Coefficients:", sk_lr.coef_)
print("Scikit-Learn Intercept:", sk_lr.intercept_)

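# Predict on the held-out test set
sk_lr_predictions = sk_lr.predict(X_test)

# Note: this times only the lookup of the variable name, not inference;
# use `%timeit sk_lr.predict(X_test)` to time the prediction itself.
%timeit sk_lr_predictions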
print("Scikit-Learn Predictions:", sk_lr_predictions)

Scikit-Learn Coefficients: [97.34470865 53.97517178]
Scikit-Learn Intercept: 0.01304980049037141
13.2 ns ± 0.338 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
Scikit-Learn Predictions: [-123.36744417  191.42716919]


The steps for plaintext linear regression are:

- We import LinearRegression, make_regression, and train_test_split from scikit-learn.
- We generate a small synthetic dataset with make_regression and split it into training and test sets.
- We create an instance of LinearRegression and train it on the training data using fit(X_train, y_train).
- We print the fitted coefficients and intercept.
- We use the trained model's predict(X_test) method to get the predicted target values for the held-out test data, and print them.

This section demonstrates how to use scikit-learn's LinearRegression for prediction, similar to how we'll use EncryptedLinearRegression in the next sections.
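
To make the link to the encrypted model concrete, the sketch below checks that a fitted LinearRegression prediction is simply a dot product with coef_ plus intercept_. The random_state values are assumptions added here for reproducibility and are not taken from the cell above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Illustrative data; the seeds are assumptions added for reproducibility.
X, y = make_regression(n_samples=10, n_features=2, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# predict() evaluates the linear model y = X @ w + b.
manual_predictions = X_test @ model.coef_ + model.intercept_

print("Same as predict():", np.allclose(model.predict(X_test), manual_predictions))
```

The same expression, evaluated on ciphertexts instead of floats, is what an encrypted prediction has to compute.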

### VenumML EncryptedLinearRegression Class
This class implements a linear regression model with support for encrypted data. It allows you to perform encrypted predictions without revealing the underlying model parameters or the data itself.

#### Encryption Approach:

This class uses Fully Homomorphic Encryption (FHE), which allows arithmetic to be carried out directly on ciphertexts. The model can therefore evaluate a linear regression on encrypted inputs without ever decrypting them.
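
As a rough illustration of what "computing on ciphertexts" means, the sketch below evaluates a linear prediction on encrypted features using a toy Paillier scheme written in pure Python. This is not the scheme venumpy uses and it is not fully homomorphic: Paillier only supports adding ciphertexts and multiplying them by plaintext scalars, so here the weights stay in plaintext. The primes, the fixed-point scale, the helper names, and the feature values are all assumptions chosen for the example, and the key size is far too small for real use.

```python
import math
import random

# Toy Paillier keypair (illustrative only; these primes are far too small to be secure).
P, Q = 2**31 - 1, 2**61 - 1            # two known Mersenne primes
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)                   # valid because the generator is N + 1

SCALE = 10**6                          # fixed-point scale: six decimal digits


def _encrypt_int(m):
    """Encrypt an integer; negatives are represented modulo N."""
    r = random.randrange(1, N)         # coprime to N with overwhelming probability
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def encrypt(value):
    """Encrypt a float as a fixed-point integer."""
    return _encrypt_int(round(value * SCALE))


def decrypt(cipher, scale=SCALE):
    """Decrypt and undo the fixed-point scaling."""
    m = (pow(cipher, LAM, N2) - 1) // N * MU % N
    if m > N // 2:                     # map the upper half back to negatives
        m -= N
    return m / scale


def encrypted_linear_predict(enc_x, weights, intercept):
    """Evaluate w.x + b on encrypted x with plaintext w and b."""
    # The intercept is encoded at SCALE**2 to match the scale of each product.
    acc = _encrypt_int(round(intercept * SCALE * SCALE))
    for c, w in zip(enc_x, weights):
        k = round(w * SCALE) % N
        # Ciphertext ** k multiplies the hidden value by k;
        # ciphertext * ciphertext adds the hidden values.
        acc = acc * pow(c, k, N2) % N2
    return acc


weights = [97.34470865, 53.97517178]   # coefficients printed earlier in this notebook
intercept = 0.01304980049
x = [1.5, -0.25]                       # hypothetical feature vector

enc_x = [encrypt(v) for v in x]
enc_pred = encrypted_linear_predict(enc_x, weights, intercept)

decrypted = decrypt(enc_pred, scale=SCALE * SCALE)
plain = sum(w * v for w, v in zip(weights, x)) + intercept
print("Decrypted:", decrypted, "Plaintext:", plain)
```

An FHE scheme additionally supports multiplying two ciphertexts, which is what makes it possible to keep both the coefficients and the inputs encrypted, as EncryptedLinearRegression does.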

#### Class Attributes:

context: Venumpy context object used for encryption and decryption. This can be defined as an argument in the constructor or as a class attribute depending on how you want to manage the encryption context for your models.


### Encrypted Linear Regression with venumpy

In [None]:
from venumML.linear_models.regression.linear_regression import EncryptedLinearRegression
from venumML.venum_tools import encrypt_array

# Create venumpy context with 128 bits of security
ctx = vp.SecretContext()
ctx.precision = 6

In [4]:
X = np.array([[1, 2], [2, 3], [3, 5], [4, 7], [5, 11]])  # Features
y = np.array([2, 4, 5, 4, 5])  # Targets (1D array)

model = EncryptedLinearRegression(ctx)
model.fit(X, y)
model.encrypt_coefficients(ctx)

In [5]:


class EncryptedLinearRegression:
    """
    A linear regression model that supports encrypted training and prediction.

    Attributes
    ----------
    _context : EncryptionContext
        The encryption context that provides encryption and decryption methods.
    _coef_ : array-like, shape (n_features,)
        Coefficients of the linear model after fitting (in plaintext).
    _intercept_ : float
        Intercept of the linear model after fitting (in plaintext).
    _encrypted_intercept_ : encrypted float
        Encrypted intercept of the model, used in encrypted prediction.
    _encrypted_coef_ : list of encrypted floats
        Encrypted coefficients of the model, used in encrypted prediction.
    """
    
    def __init__(self, ctx):
        """
        Initialises the EncryptedLinearRegression model with a given encryption context.

        Parameters
        ----------
        ctx : EncryptionContext
            The encryption context used to encrypt values.
        """

        self._context = ctx
        self._coef_ = None
        self._intercept_ = None
        self._encrypted_intercept_ = ctx.encrypt(0)
        self._encrypted_coef_ = ctx.encrypt(0)

    
    def encrypted_fit(self, ctx, x, y, lr=0.3, gamma=0.9, epochs=10):
        """
        Fits the linear regression model on encrypted data using Nesterov's accelerated gradient descent.

        Parameters
        ----------
        ctx : EncryptionContext
            The encryption context used to encrypt and decrypt values.
        x : encrypted array-like, shape (n_samples, n_features)
            Encrypted input data.
        y : encrypted array-like, shape (n_samples,)
            Encrypted target values.
        lr : float, optional, default=0.3
            Learning rate for the optimizer.
        gamma : float, optional, default=0.9
            Momentum parameter for Nesterov's accelerated gradient descent.
        epochs : int, optional, default=10
            Number of epochs to run for optimization.
        """

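        # NOTE: Nesterov is not imported anywhere in this notebook; it must be
        # available in scope before this method is called.
        # lr, gamma and epochs are documented above but are not forwarded in
        # the call below, so they have no effect here.
        optimizer = Nesterov(ctx)
        encrypted_intercept, encrypted_coef, losses = optimizer.venum_nesterov_agd(ctx, x, y)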
        
        self._encrypted_intercept_ = encrypted_intercept
        self._encrypted_coef_ = encrypted_coef

    
    def fit(self, X, y):
        """
        Fits the linear regression model using ordinary least squares.

        Parameters
        ----------
        X : array-like, shape (n_samples, n_features)
            Plaintext input data.
        y : array-like, shape (n_samples,)
            Plaintext target values.
        """

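        # Prepend a column of ones so that the first parameter is the intercept
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        # Normal equation: theta = (X_b^T X_b)^-1 X_b^T y
        theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
        self._intercept_ = theta_best[0]
        self._coef_ = theta_best[1:]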

    def encrypt_coefficients(self, ctx):
        """
        Encrypts the model's coefficients and intercept after fitting.

        Parameters
        ----------
        ctx : EncryptionContext
            The encryption context used to encrypt plaintexts.
        """

        self._encrypted_intercept_ = ctx.encrypt(self._intercept_)
        self._encrypted_coef_ = [ctx.encrypt(v) for v in self._coef_]

    def predict(self, encrypted_X, ctx):
        """
        Predicts outcomes using encrypted input data and the model's encrypted coefficients.

        Parameters
        ----------
        encrypted_X : encrypted array-like, shape (n_samples, n_features)
            Encrypted input data for making predictions.
        ctx : EncryptionContext
            The encryption context used to encrypt and decrypt values.

        Returns
        -------
        encrypted_prediction : encrypted array-like, shape (n_samples,)
            The encrypted predictions based on the encrypted model coefficients and intercept.
        """
        
        # Linear model evaluated on ciphertexts: X @ w + b
        encrypted_prediction = encrypted_X @ self._encrypted_coef_ + self._encrypted_intercept_
        return encrypted_prediction
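
The fit method above solves the normal equation by explicitly inverting X_b.T @ X_b. As a sanity check, the sketch below reproduces that computation on the small dataset from the earlier cell and compares it with np.linalg.lstsq, which solves the same least-squares problem without forming the inverse. This is an alternative shown for comparison only, not a description of how venumML is implemented.

```python
import numpy as np

# Small dataset from the earlier cell
X = np.array([[1, 2], [2, 3], [3, 5], [4, 7], [5, 11]], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Prepend a column of ones so the first parameter is the intercept
X_b = np.c_[np.ones((X.shape[0], 1)), X]

# Normal equation with an explicit inverse, as in fit()
theta_normal = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

# Same least-squares problem solved without forming the inverse
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print("Normal equation:", theta_normal)
print("lstsq:          ", theta_lstsq)
print("Agree:", np.allclose(theta_normal, theta_lstsq))
```

Explicit inversion can lose accuracy when features are nearly collinear, which is why a least-squares solver is often preferred for the plaintext step.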


In [7]:
# Step 1: Reuse the train/test split from the scikit-learn cell above

# Step 2: Train EncryptedLinearRegression on the plaintext training data
my_lr = EncryptedLinearRegression(ctx)
my_lr.fit(X_train, y_train)

# Compare the Coefficients and Intercept
print("VENum Linear Regression Coefficients:", my_lr._coef_)
print("VENum Linear Regression Intercept:", my_lr._intercept_)

# Test Inference
my_lr.encrypt_coefficients(ctx)

cipher_X = encrypt_array(X_test, ctx)

# cipher_X now holds the encrypted test features
my_lr_predictions = my_lr.predict(cipher_X, ctx)

# Decrypt predictions
decrypted_predictions = [pred.decrypt() for pred in my_lr_predictions]

# Compare with Scikit-Learn predictions
sk_lr_predictions = sk_lr.predict(X_test)

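# Note: this times only the lookup of the variable name, not encrypted inference;
# use `%timeit my_lr.predict(cipher_X, ctx)` to time the prediction itself.
%timeit my_lr_predictions

# Output comparison: decrypted_predictions is a plain Python list, while
# sk_lr_predictions is a NumPy array; convert with np.array(...) before comparing.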
print("Decrypted VENum Predictions:", decrypted_predictions)

VENum Linear Regression Coefficients: [97.34470865 53.97517178]
VENum Linear Regression Intercept: 0.013049800490385621
10.6 ns ± 0.649 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
Decrypted VENum Predictions: [-123.367444044113, 191.427116046652]
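
To quantify how closely the decrypted predictions track the plaintext ones, the sketch below compares the two sets of values printed above. The numbers are copied from the cell outputs; the 1e-4 tolerance is an assumption chosen for illustration.

```python
import numpy as np

# Values copied from the printed outputs above
sk_lr_predictions = np.array([-123.36744417, 191.42716919])
decrypted_predictions = [-123.367444044113, 191.427116046652]

# Convert the list to an array so the two can be compared element-wise
abs_error = np.abs(np.array(decrypted_predictions) - sk_lr_predictions)

print("Max absolute error:", abs_error.max())
print(
    "Match within 1e-4:",
    np.allclose(decrypted_predictions, sk_lr_predictions, rtol=0, atol=1e-4),
)
```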
