AB Testing with elfi, bernoulli+lognorm #255

Jakedismo · 2018-02-12T12:06:10Z

Summary:

I'm trying to replicate a Stan model I've been using with ELFI. I've run into some trouble with ELFI and probably Python.

Description:

The lognorm and Bernoulli distances (using scipy.stats functions) are implemented in Python, a prior is defined (a single lognorm with parameters 2, 10), a Simulator object is defined (the function has been tested and it outputs data similar to that in the ELFI examples), and an ABC rejection sampler is defined, but sampling from the rejection sampler throws an error.

Reproducible Steps:

I have defined the simulator like this:

def legacy_updated(data, batch_size=1, random_state=None):
    data= np.atleast_1d(data)
    global result_array 
    rows = len(data)
    cols = len(data[0])
    result_array = np.array([])
    for row in range(rows):
        if(1 > row):
            for col in range(cols):
                x = data[row][col]*bernoulli_pmf(bernoulli_theta(data[None][row]))
                y = lognormal_pdf(frozen_lognormalmean(data[row][None],lognormal_sigma(data[row][None])))
                result_array = np.append(result_array,[x+y, x]) 
            return result_array.reshape(1,-1)
        else:
            for col in range(cols):
                y = data[row][col]*bernoulli_pmf((bernoulli_theta(data[None][row])))
                result_array = np.append(result_array,[y,])
            return result_array.reshape(1,-1)

def lognormal_pdf(y):
    s=0.945
    x = np.linspace(ss.lognorm.ppf(0.01, s),ss.lognorm.ppf(0.99, s))
    return ss.lognorm.pdf(x, s)

def bernoulli_pmf(y):
    p = 0.3 
    x = bernoulli_theta(y)
    return ss.bernoulli.pmf(x,p)

def log_mean(y):
    return np.mean(np.log(y),axis=0)

def log_std(y):
    return np.std(np.log(y),axis=0)

def frozen_lognormalmean(y,s):
    rv=ss.lognorm(s)
    return rv.pdf(np.mean(y))

def lognormal_sigma(y):
    s=0.945
    x = np.linspace(ss.lognorm.ppf(0.01, s),ss.lognorm.ppf(0.99, s))
    return x
    
def bernoulli_theta(y):
    p = 0.3 
    x = np.arange(ss.bernoulli.ppf(0.01, p), ss.bernoulli.ppf(9.99,p))
    return ss.bernoulli.rvs(1,p)

Current Output:

The simulator function outputs data in the same format as in the ELFI example, but when I try to generate data with the simulator node I get the following error from within the loop.
`TypeError Traceback (most recent call last)
C:\Anaconda\envs\DataScienceEnv\lib\site-packages\elfi\executor.py in execute(cls, G)
69 try:
---> 70 G.node[node] = cls._run(op, node, G)
71 except Exception as exc:

C:\Anaconda\envs\DataScienceEnv\lib\site-packages\elfi\executor.py in _run(fn, node, G)
153
--> 154 output_dict = {'output': fn(*args, **kwargs)}
155 return output_dict

in legacy_updated(data, batch_size, random_state)
9 rows = len(data)
---> 10 cols = len(data[0])
11 result_array = np.array([])

TypeError: object of type 'numpy.float64' has no len()`

Expected Output:

I didn't expect the error, since my function outputs a NumPy array. My Python skills probably come into play here too; this really doesn't seem like a big error.

ELFI Version:

0.3.1

Python Version:

3.6.5

Operating System:

windows 10

vuolleko · 2018-02-12T12:44:05Z

Hi,

Without complete code I'm unable to reproduce this. Most crucially your handling of batch_size remains unclear.

However, based on the error message the problem is that the data argument given to legacy_updated is 1D, but you're trying to use it as 2D. Assuming data is an elfi.Prior, this is the default behaviour, and data.shape[0] = batch_size. You can use the size keyword, e.g. elfi.Prior('lognorm', 2, 10, size=3), to change this.
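A small NumPy-only sketch of the shape issue described above. It assumes the prior draw reaches the simulator as a 1D array of shape (batch_size,), as stated; the arrays are hand-built stand-ins rather than real elfi.Prior outputs, and `first_row_length` is a hypothetical helper that only mirrors the failing line.

```python
import numpy as np

batch_size = 4

# Stand-in for a scalar prior: one value per batch member, shape (batch_size,)
theta_1d = np.full(batch_size, 2.5)

# Stand-in for a prior declared with size=3: shape (batch_size, 3)
theta_2d = np.full((batch_size, 3), 2.5)


def first_row_length(data):
    """Mirror the failing line `cols = len(data[0])` from the simulator."""
    data = np.atleast_1d(data)
    return len(data[0])


# With 1D input, data[0] is a numpy.float64 scalar, so len() raises TypeError.
try:
    first_row_length(theta_1d)
    error_1d = None
except TypeError as exc:
    error_1d = str(exc)

# With 2D input, data[0] is a row of length 3, so the same line works.
cols_2d = first_row_length(theta_2d)

print(error_1d)
print(cols_2d)
```

Checking `data.shape` at the top of the simulator is a quick way to see which of the two cases applies.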

Also, if you intend to use var et al. as elfi.Summary, note that ELFI uses the first dimension for internal batches, so you probably should replace axis=0 with axis=1 everywhere.
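To illustrate the axis point, here is a sketch that assumes the simulator returns an array of shape (batch_size, n_obs); the sizes and the lognormal test data are made up for the example. Reducing over axis=1 leaves one summary value per batch member, while axis=0 mixes the batch members together.

```python
import numpy as np

rng = np.random.RandomState(0)
batch_size, n_obs = 5, 100

# Hypothetical simulator output: one row of observations per batch member
sim_output = rng.lognormal(mean=0.0, sigma=0.945, size=(batch_size, n_obs))


def log_mean(y):
    # Reduce over the observation axis, keep the batch axis
    return np.mean(np.log(y), axis=1)


def log_std(y):
    return np.std(np.log(y), axis=1)


per_batch_mean = log_mean(sim_output)
per_batch_std = log_std(sim_output)

# For comparison: axis=0 averages across batch members instead
across_batch = np.mean(np.log(sim_output), axis=0)

print(per_batch_mean.shape)
print(per_batch_std.shape)
print(across_batch.shape)
```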

That said, all this assumes that you make use of ELFI's internal batching. It is certainly possible to circumvent this (by essentially forcing batch_size to 1), but I do not recommend it. :)

Jakedismo · 2018-02-12T13:03:49Z

Thanks for the help. I got the simulator node working but ran into trouble with the distance node.
I'm not really using batch_size at the moment, since I just want to get the basic concept working first.
I think this is due to the summaries being 1D arrays instead of 2D:
In executing node 'd': all the input array dimensions except for the concatenation axis must match exactly.
That being said, I think I can get this working (most of the issues I'm facing stem from Stan and C++, and I feel like I'm climbing a tree ass first when doing this in Python...)

vuolleko · 2018-02-12T13:18:19Z

ELFI always uses batches, and batch_size is always the length of the first dimension (unless you hard-code stuff otherwise, which I do not recommend). This is true even if you use batch_size=1, in which case the first dimension of all arrays has length 1.

The summaries are typically 1D arrays of length batch_size (but can be 2D as well).
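The distance-node message quoted earlier in the thread matches what NumPy raises when arrays of mismatched length are concatenated. The sketch below reproduces that message with plain np.column_stack; whether ELFI combines the summaries in exactly this way is an assumption, and the arrays are invented for illustration.

```python
import numpy as np

batch_size = 3

# Two summaries with the expected shape (batch_size,)
s1 = np.zeros(batch_size)
s2 = np.ones(batch_size)

# A summary that lost the batch axis, e.g. by reducing over axis=0
s_bad = np.zeros(1)

# Matching first dimensions stack into shape (batch_size, n_summaries)
stacked = np.column_stack([s1, s2])

# Mismatched first dimensions raise a ValueError
try:
    np.column_stack([s1, s_bad])
    message = None
except ValueError as exc:
    message = str(exc)

print(stacked.shape)
print(message)
```

Printing the shape of each summary output next to batch_size is usually enough to spot which node dropped the first dimension.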

Jakedismo · 2018-02-12T13:37:42Z

I'll probably just rework the whole model into a more usable and robust syntax. I'll be in touch once I get something worthwhile going :D

vuolleko closed this as completed Feb 13, 2018
