# Creating the Environment
Create an environment in which we can do meaningful profiling of the state evolution logistic regression integrals.
To do so, we load a dataset and compute some artificial overlaps. We can then time a few error metrics to get a sense of the speedup achievable with numba.
At a later stage we will do the same for ERM.

In [35]:
import numpy as np
from scipy.special import owens_t
import matplotlib.pyplot as plt
# get a logger
import logging
logger = logging.getLogger(__name__)
from ERM import fair_adversarial_error_erm
from state_evolution import OverlapSet, fair_adversarial_error_overlaps
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [36]:
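# draw a random unit-norm teacher direction in 2D
helper = np.random.uniform(0, 2*np.pi)
theta = np.array([np.cos(helper), np.sin(helper)])
# draw a student direction within 90 degrees of the teacher
helper = np.random.uniform(helper-np.pi/2, helper+np.pi/2)
w = np.array([np.cos(helper), np.sin(helper)])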
print("theta: ", theta)
print("w: ", w)


theta:  [ 0.96313197 -0.26902939]
w:  [0.66106533 0.75032835]


In [37]:
from data_model import VanillaGaussianDataModel
data_model = VanillaGaussianDataModel(2,logger,source_pickle_path="")
data_set = data_model.generate_data(10000,0)
X = data_set.X
X_original = X
d = 2
# We have our own teacher and must create the labels ourselves
y_teacher = np.sign(X.dot(theta)/np.sqrt(d))
y_student = np.sign(X.dot(w)/np.sqrt(d))

In [38]:
# let us fix a gamma
gamma = 0.3

# gamma describes two lines having margin gamma away from the teacher

# let us fix an epsilon
epsilon = 0.5

# compute the overlaps
q = np.dot(w,w) / d
m = np.dot(theta,w) / d
rho = np.dot(theta,theta) / d


# epsilon*m/sqrt(d*q) describes another line parallel to the teacher

# Let us define a Sigma_upsilon
Sigma_upsilon = np.array([[1, 0], [0, 1]])

F = theta.dot(Sigma_upsilon@w) / d
A = w.dot(Sigma_upsilon@w) / d

# now epsilon*F/sqrt(2*q) describes another line parallel to the teacher
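
The overlaps computed above can be sanity-checked against the labels. The sketch below assumes standard Gaussian inputs (a stand-in for `VanillaGaussianDataModel`, whose exact covariance is not shown here) and uses the standard fact that two sign classifiers on isotropic Gaussian data disagree with probability angle/pi. With unit-norm `theta` and `w` in `d = 2`, `q` and `rho` are both 1/2 and `m` is half the cosine of the angle between them.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2

# Same construction as in the notebook: unit-norm teacher,
# student direction within 90 degrees of it.
angle_teacher = rng.uniform(0, 2 * np.pi)
angle_student = rng.uniform(angle_teacher - np.pi / 2, angle_teacher + np.pi / 2)
theta = np.array([np.cos(angle_teacher), np.sin(angle_teacher)])
w = np.array([np.cos(angle_student), np.sin(angle_student)])

q = w @ w / d
m = theta @ w / d
rho = theta @ theta / d

# Assumption: inputs are standard Gaussian (not the notebook's data model).
X = rng.standard_normal((200_000, d))
y_teacher = np.sign(X @ theta / np.sqrt(d))
y_student = np.sign(X @ w / np.sqrt(d))

empirical = np.mean(y_teacher != y_student)
# clip guards against rounding pushing the cosine slightly above 1
cosine = np.clip(m / np.sqrt(q * rho), -1.0, 1.0)
predicted = np.arccos(cosine) / np.pi

print(f"q = {q:.3f}, rho = {rho:.3f}, m = {m:.3f}")
print(f"empirical disagreement:        {empirical:.4f}")
print(f"arccos(m / sqrt(q * rho)) / pi: {predicted:.4f}")
```

If the two numbers agree up to sampling noise, the overlaps and the label construction are consistent with each other.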



In [39]:
# create an object with a property Sigma_upsilon
class DataModel:
    def __init__(self, Sigma_upsilon, rho):
        self.Sigma_upsilon = Sigma_upsilon
        self.rho = rho 
        self.Sigma_w = np.eye(2)
        self.Sigma_delta = np.eye(2)

data_model = DataModel(Sigma_upsilon, rho)


overlaps = OverlapSet()
overlaps.A = A
overlaps.F = F 
overlaps.q = q
overlaps.m = m
overlaps.N = q
overlaps.sigma = 1

from helpers import ProblemType

class Task:
    def __init__(self, epsilon, gamma, overlaps):
        self.epsilon = epsilon
        self.gamma = gamma
        self.overlaps = overlaps
        self.tau = 1
        self.problem_type = ProblemType.Logistic
        self.lam = 0.01
        self.d = 2



task = Task(epsilon, gamma, overlaps)

We want to evaluate the test loss and the training error. The former does not use the proximal operator; the latter does. We start with the test loss.

In [40]:
from state_evolution import LogisticObservables
logistic_problem = LogisticObservables()

In [41]:
# time a single call with %time
%time logistic_problem.test_loss(task,overlaps, data_model,epsilon,20)

CPU times: user 656 µs, sys: 974 µs, total: 1.63 ms
Wall time: 273 µs


0.8610207984092665

In [42]:
# %time logistic_problem.numba_test_loss(task,overlaps, data_model,epsilon,20)

Okay, so a speedup is possible, at least once the function has been compiled. Fair enough. Now how about a measure based on proximals?
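
To separate compilation cost from run time, one can time the first and second call of a jitted function. The sketch below uses a hypothetical stand-in, `mean_logistic_loss`, not the notebook's `test_loss`; it falls back to plain Python if numba is not installed, in which case both timings are simply uncompiled runs.

```python
import math
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without numba (no speedup then).
    def njit(func):
        return func


@njit
def mean_logistic_loss(z):
    # Hypothetical stand-in for an observable: mean of log(1 + exp(-z)).
    total = 0.0
    for i in range(z.shape[0]):
        total += math.log1p(math.exp(-z[i]))
    return total / z.shape[0]


z = np.random.default_rng(0).normal(size=100_000)

t0 = time.perf_counter()
first = mean_logistic_loss(z)   # includes JIT compilation when numba is present
t1 = time.perf_counter()
second = mean_logistic_loss(z)  # runs the already compiled function
t2 = time.perf_counter()

print(f"first call:  {t1 - t0:.4f} s")
print(f"second call: {t2 - t1:.4f} s")
```

A single `%time` on a freshly defined jitted function measures the first call, so it is worth calling the function once before timing it.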

In [43]:
%time logistic_problem.training_error(task,overlaps, data_model,20)

CPU times: user 12.4 ms, sys: 5.33 ms, total: 17.7 ms
Wall time: 4.6 ms


0.17063038980874703

In [44]:
# %time logistic_problem.numba_training_error(task,overlaps, data_model,20)

The proximal really is a bit harder: root_scalar is a Python function, so it cannot be called directly from numba-compiled code.

One way around this: take the scipy brentq source code and adapt it closely to the problem, compile the C source, and call it through the ctypes support in numba. Quite the speedup though :D
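
As a simpler illustration of the same idea, the root finder can also be written with scalar arithmetic only, which is the kind of code numba can typically compile. The sketch below uses plain bisection instead of Brent's method, and it assumes the standard (non-adversarial) logistic loss; `logistic_prox` and `stationarity` are hypothetical names, and the notebook's actual proximal may contain additional terms.

```python
import math


def stationarity(z, omega, V, y):
    # Derivative of log(1 + exp(-y*z)) + (z - omega)**2 / (2*V) with respect to z.
    # 1 / (1 + exp(t)) is written as 0.5 * (1 - tanh(t / 2)) to avoid overflow.
    return (z - omega) / V - y * 0.5 * (1.0 - math.tanh(0.5 * y * z))


def logistic_prox(omega, V, y, tol=1e-12, max_iter=200):
    # The derivative is increasing in z and changes sign on [omega - V, omega + V]
    # for y in {-1, +1}, so bisection on that bracket finds the minimiser.
    lo, hi = omega - V, omega + V
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if stationarity(mid, omega, V, y) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


z_plus = logistic_prox(omega=0.3, V=2.0, y=1.0)
z_minus = logistic_prox(omega=0.3, V=2.0, y=-1.0)
print(z_plus, z_minus)
```

Bisection needs more iterations than Brent's method, so this trades some speed for a much shorter routine.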

# ERM

Now the next question is how to optimize ERM.

In [45]:
from ERM import run_optimizer

In [46]:
task.problem_type = ProblemType.Logistic

In [47]:
%time run_optimizer(task, data_model, data_set, logger)

CPU times: user 539 ms, sys: 154 ms, total: 693 ms
Wall time: 124 ms


(array([ 1.86170695, -1.84278529]), <ERM.LogisticProblem at 0x160000410>)

In [48]:
task.problem_type = ProblemType.NumbaLogistic

In [49]:
%time run_optimizer(task, data_model, data_set, logger)

CPU times: user 847 ms, sys: 121 ms, total: 969 ms
Wall time: 161 ms


(array([ 0.12748115, -0.4571405 ]), <ERM.LogisticProblem at 0x160062950>)

Sometimes numba is even slower: in this run the NumbaLogistic variant took 161 ms of wall time versus 124 ms for the plain Logistic one.

Can we do better by moving the entire gradient expression into numba?
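
A minimal sketch of what "the entire gradient expression" in one jitted function could look like. It assumes a plain L2-regularised logistic loss with the `1/sqrt(d)` scaling used above; the loss in `ERM.py` may include adversarial terms that are not reproduced here. `logistic_loss_and_grad` is a hypothetical name, and the code falls back to plain Python when numba is missing.

```python
import math

import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without numba (no speedup then).
    def njit(func):
        return func


@njit
def logistic_loss_and_grad(w, X, y, lam):
    # Loss: sum_i log(1 + exp(-y_i * x_i.w / sqrt(d))) + lam/2 * |w|^2
    n, d = X.shape
    scale = 1.0 / math.sqrt(d)
    loss = 0.0
    grad = np.zeros(d)
    for i in range(n):
        margin = 0.0
        for j in range(d):
            margin += X[i, j] * w[j]
        margin *= y[i] * scale
        # numerically stable log(1 + exp(-margin))
        if margin > 0:
            loss += math.log1p(math.exp(-margin))
        else:
            loss += -margin + math.log1p(math.exp(margin))
        # -y_i / sqrt(d) * sigmoid(-margin), with the sigmoid written via tanh
        coeff = -y[i] * scale * 0.5 * (1.0 - math.tanh(0.5 * margin))
        for j in range(d):
            grad[j] += coeff * X[i, j]
    for j in range(d):
        loss += 0.5 * lam * w[j] * w[j]
        grad[j] += lam * w[j]
    return loss, grad


rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))
y = np.sign(X @ np.array([1.0, 0.0]))  # labels from a hypothetical teacher
w = np.array([0.1, -0.2])
loss, grad = logistic_loss_and_grad(w, X, y, 0.01)
print(loss, grad)
```

Returning loss and gradient together avoids recomputing the margins; `scipy.optimize.minimize` accepts such a function when called with `jac=True`.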

# Scalings
This part is not for profiling but for testing some scalings.

In [50]:
from data_model import VanillaGaussianDataModel
d = 100
data_model = VanillaGaussianDataModel(d,logger,source_pickle_path="")


In [51]:
# n = []

# for i in range(1000):
#     data_set = data_model.generate_data(100,0)
#     n.append(np.mean((data_set.X @ data_set.theta/np.sqrt(d))**2))

# print(np.mean(n), np.std(n))
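
A self-contained variant of the check in the commented-out cell, under the assumption that both the inputs and the teacher are standard Gaussian (the actual distributions used by `VanillaGaussianDataModel` are not shown here). In that case the mean of `(X @ theta / sqrt(d))**2` should be close to `theta @ theta / d`, which is of order one.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100

# Assumption: a standard Gaussian teacher, standing in for data_set.theta.
theta = rng.standard_normal(d)

values = []
for _ in range(1000):
    # Assumption: standard Gaussian inputs, 100 samples per draw.
    X = rng.standard_normal((100, d))
    values.append(np.mean((X @ theta / np.sqrt(d)) ** 2))

print("mean, std over draws:", np.mean(values), np.std(values))
print("theta @ theta / d:   ", theta @ theta / d)
```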