In [2]:
"""
This notebook explores which loss function should be used for the flows.
We use an optimization model to find the optimal solution of a candidate loss function.
In particular, we want to know whether we can design a loss function that, when perfectly
optimized and disregarding the continuity enforced by the flow, reproduces the training data
distribution but gives more weight to errors in the tails than to errors in the bulk.
We still haven't found an easy way of doing this.
"""


In [3]:
"""
For now, the loss is defined as the negative log probability of the set of PID features
under the flow, conditioned on the auxiliary features (the "context") and averaged over
all training data:
loss = - log_prob(inputs=x, context=y).mean()
Overall, flows already do a much better job than GANs at accounting for the tails. If an
event in the tail occurs in the training data and the model assigns it zero probability,
the loss is infinite. GANs, on the other hand, are not punished for failing to generate
events in the tails, if I understand correctly. The caveat is that more advanced GANs like
the ones we are using encourage the generated samples to be "diverse", which effectively
pushes some of the generated samples into some part of the tails.

One way to make the generation of tail events more likely is to take each distinct observed
x value and add one (or two, or any real-valued number of) extra copies of it to the dataset
used for the loss calculation.

On the other hand, I've been thinking about how to keep the optimal distribution produced
by the flow close to the actual distribution of the training data, while punishing
differences between the empirical training distribution and the flow distribution more
severely in the tails than in the bulk. One solution would be to bin the training and flow
distributions and punish the squared difference in each bin with a weight that depends on
whether the bin is in the bulk or the tail. Of course, this solution struggles with the
curse of dimensionality.
"""
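
As a rough illustration of the binned idea described above, the sketch below histograms one-dimensional training and generated samples on shared bins and sums the squared per-bin differences, with a larger weight for bins in the tails. The function name `binned_tail_weighted_loss`, the tail definition (bin centres outside a central quantile range of the training data), and the weight values are all assumptions made for illustration; the notebook itself does not define them.

```python
import numpy as np


def binned_tail_weighted_loss(train, generated, n_bins=20, tail_quantile=0.05,
                              tail_weight=10.0, bulk_weight=1.0):
    """Weighted squared difference between two binned 1D distributions."""
    train = np.asarray(train, dtype=float)
    generated = np.asarray(generated, dtype=float)

    # Shared bin edges spanning both samples.
    lo = min(train.min(), generated.min())
    hi = max(train.max(), generated.max())
    edges = np.linspace(lo, hi, n_bins + 1)

    # Per-bin probability mass (each set of fractions sums to 1).
    p_train = np.histogram(train, bins=edges)[0] / train.size
    p_gen = np.histogram(generated, bins=edges)[0] / generated.size

    # Hypothetical tail definition: bins whose centre lies outside the
    # central quantile range of the training data.
    q_lo, q_hi = np.quantile(train, [tail_quantile, 1.0 - tail_quantile])
    centres = 0.5 * (edges[:-1] + edges[1:])
    weights = np.where((centres < q_lo) | (centres > q_hi), tail_weight, bulk_weight)

    return float(np.sum(weights * (p_train - p_gen) ** 2))


rng = np.random.default_rng(0)
train = rng.normal(size=5000)
same = binned_tail_weighted_loss(train, train)
narrow = binned_tail_weighted_loss(train, rng.normal(scale=0.5, size=5000))
print(same, narrow)
```

Because every weight is positive, this loss is zero only when the binned distributions agree, so its optimum still matches the binned training distribution while tail mismatches cost more. The sketch is one-dimensional only; the number of bins grows exponentially with the number of features, which is the curse of dimensionality noted above.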


In [1]:
import math

import matplotlib.pyplot as plt
import numpy as np
import pyomo.environ as pyo
from numpy import exp, linspace, loadtxt, pi, sqrt
from pyomo.environ import *  # provides bare names such as `maximize`
from pyomo.opt import SolverFactory
from scipy.optimize import minimize

# lmfit is not installed in this environment (ModuleNotFoundError: No module named 'lmfit')
# and is not used in the cell below, so the import is disabled.
# from lmfit import Model

In [None]:
D = 4 # number of samples
U = 3 # number of unique samples


model = pyo.ConcreteModel()

model.x = pyo.Var(range(U), bounds=(0,1), domain=pyo.NonNegativeReals)


def obj_log(model): #objective
    #return  1*model.x[0] + 2*model.x[1]+ 1*model.x[2]
    return 1*pyo.log(model.x[0]) + 2*pyo.log(model.x[1])+ 1*pyo.log(model.x[2])

#model.OBJ = pyo.Objective(sense=maximize, rule= obj_log)
#model.OBJ = pyo.Objective(sense=maximize, expr = sum(model.x[i] for i in range(U)  ))
#model.OBJ = pyo.Objective(sense=maximize, expr = 1*model.x[0] + 2*model.x[1]+ 1*model.x[2]  ) #with this objective, generate only the mode 100% of the time
# Maximizing the count-weighted log-likelihood generates the exact fractions we want.
model.OBJ = pyo.Objective(
    sense=pyo.maximize,
    expr=1*pyo.log(model.x[0]) + 2*pyo.log(model.x[1]) + 1*pyo.log(model.x[2]),
)

model.Constraint1 = pyo.Constraint(expr = sum(model.x[i] for i in range(U)  ) == 1)
#model.Constraint1 = pyo.Constraint(expr = 3*model.x[0] + 4*model.x[1] >= 1)

np.log(1)

solver = SolverFactory('ipopt') # only installed on my personal machine
#solver = SolverFactory('gurobi') # cannot do nonlinear optimization
#solver = SolverFactory('cplex') # couldn't get to work
solver.solve(model)

print([model.x[i].value for i in range(U)])
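
The ipopt result can be cross-checked without a nonlinear solver installed. Maximizing sum_i n_i log(x_i) subject to sum_i x_i = 1 gives x_i = n_i / sum_j n_j by a Lagrange multiplier argument, so the counts (1, 2, 1) used above should yield (0.25, 0.5, 0.25). The sketch below checks this numerically with `scipy.optimize.minimize`. The names `counts`, `neg_log_lik`, and `augmented`, and the choice of the SLSQP method, are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

counts = np.array([1.0, 2.0, 1.0])  # multiplicity of each unique sample (D = 4, U = 3)


def neg_log_lik(x):
    # Negative of the objective maximized in the Pyomo model.
    return -np.sum(counts * np.log(x))


result = minimize(
    neg_log_lik,
    x0=np.full(counts.size, 1.0 / counts.size),  # start from the uniform distribution
    method="SLSQP",
    bounds=[(1e-9, 1.0)] * counts.size,          # keep log(x) finite
    constraints=[{"type": "eq", "fun": lambda x: np.sum(x) - 1.0}],
)

closed_form = counts / counts.sum()
print(result.x, closed_form)

# Adding one extra copy of every unique value changes the optimum.
augmented = counts + 1.0
print(augmented / augmented.sum())
```

The last lines apply the same closed form to the "extra copies" idea from the notes: with one added copy of each unique value the optimum becomes (2/7, 3/7, 2/7), which raises the probability of the rarer values but no longer matches the training fractions.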