# Build the 2D universal estimator
Input: a function $f$ of two (or one) parameters that gives the PMF (or, for a continuous distribution, the PDF) of a univariate distribution.

> $x = f(d_1, d_2)$

In Python, this is a plain function from $\mathbb{R} \times \mathbb{R}$ to $[0, 1]$; a case in point is the log-normal distribution.

``` python

from math import tan

def universal(f, data):
    def g(d1, d2):
        """g is only defined in the cube (-pi/2, pi/2)^2."""
        return f(tan(d1), tan(d2))

    # Learn d1 and d2 of g from the data:
    # -1. Set U = the cube (-pi/2, pi/2)^2.
    #  0. Pick the number of sample points; it should not be too large (256, maybe).
    #  1. Generate a dense coverage of the cube.
    #  2. Generate the data for each point in the sample of the parameter space.
    #  3. Apply a DNN to find d1 and d2 for the input data; this may be the final result.
    #  4. Set the cube to a cube of half the volume.
    #  5. Forget everything learned so far.
    #  6. Repeat from step 1 if the volume of the cube is greater than 1/128 of the original cube.
    # Return the estimate found in the last execution of step 3.
```
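The shrinking schedule in steps 4 to 6 can be sketched on its own, without the DNN. The sketch below is only an illustration under stated assumptions: `shrink_box`, `volume`, `refine` and `estimate_fn` are hypothetical names, `estimate_fn` stands in for steps 1 to 3, and the new cube is assumed to be centred on the current estimate (the outline above does not say where the smaller cube should sit). A small tolerance guards the floating-point comparison at exactly 1/128.

```python
import math

def volume(box):
    """Volume of an axis-aligned box given as [(lo, hi), ...]."""
    return math.prod(hi - lo for lo, hi in box)

def shrink_box(box, center, volume_factor=0.5):
    """Return a box with volume_factor times the volume, centred on
    `center` where possible and always kept inside the old box."""
    scale = volume_factor ** (1.0 / len(box))   # per-side scaling
    new_box = []
    for (lo, hi), c in zip(box, center):
        width = (hi - lo) * scale
        new_lo = min(max(c - width / 2, lo), hi - width)  # clip to old box
        new_box.append((new_lo, new_lo + width))
    return new_box

def refine(box, estimate_fn, min_ratio=1 / 128):
    """Estimate, halve the volume, and repeat until the box has shrunk
    to min_ratio of its original volume. Returns the last estimate."""
    v0 = volume(box)
    rounds = 0
    while True:
        estimate = estimate_fn(box)        # stand-in for steps 1 to 3
        rounds += 1
        box = shrink_box(box, estimate)    # step 4
        if volume(box) <= min_ratio * v0 * (1 + 1e-9):   # step 6
            return estimate, box, rounds
```

Because each pass halves the volume and the loop stops once the volume reaches 1/128 of the original, the estimator is trained seven times in total.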

Algorithm:

1. Transform the parameter space to the unit cube of dimension 1 or 2, e.g., with $\tfrac{1}{2} + \arctan(t)/\pi$ or the sigmoid.
2. Learn Sigmoid(t1) and Sigmoid(t2) (optionally, learn t1 and t2 directly rather than their sigmoids) using the usual statistical method of generating synthetic data.
3. Compute the variance of the learning error. It is typically a function of t1 and t2, i.e., the error is not the same everywhere.
4. Compute the Fisher information: take the log of the density, differentiate it with respect to the parameter, square the result (analytically), then integrate it against the density (numerically) over x, ranging over all reals.
5. Multiply by n (the number of sample points).
6. Compute the inverse, which gives the Cramér-Rao lower bound on the variance of an unbiased estimator.
7. Compare it to the actual error from step 3.
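Steps 4 to 6 can be sketched for the log-normal case as follows. This is a minimal illustration, assuming a log-normal with zero log-mean and shape parameter `s` (the parametrization of `scipy.stats.lognorm` with default `loc` and `scale`); the function names are hypothetical. The score is derived by hand, $\partial_s \log f(x; s) = -1/s + (\ln x)^2 / s^3$, and the integral runs over the support $x > 0$. For this case the closed form is $2/s^2$, which gives a check on the numerical integral.

```python
import numpy as np
from scipy import integrate
from scipy.stats import lognorm

def score_s(x, s):
    """Derivative of log f(x; s) with respect to s (derived analytically)."""
    return -1.0 / s + np.log(x) ** 2 / s ** 3

def fisher_information(s):
    """Numerically integrate score^2 * density over the support x > 0."""
    integrand = lambda x: score_s(x, s) ** 2 * lognorm.pdf(x, s)
    value, _abs_err = integrate.quad(integrand, 0, np.inf)
    return value

def cramer_rao_bound(s, n):
    """Inverse of n times the per-observation Fisher information."""
    return 1.0 / (n * fisher_information(s))

# Compare the numerical integral with the closed form 2 / s^2.
s = 0.5
print(fisher_information(s), 2 / s ** 2)
```

The value returned by `cramer_rao_bound(s, n)` is a variance, so its square root is the quantity to set against a root-mean-square error from step 3.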

In [1]:
# import library
%run lib.ipynb

## Universal estimator

In [2]:
def universal_estimator(dist_name, sample_generator, next_config, param_space, N=10000, M=256):
    # The helpers used below (generate_data, dnn_fit, dnn_predict, train_test_split,
    # time, RANDOM_STATE) are expected to come from lib.ipynb, loaded above.

    # generate data
    print(f'generating {dist_name} data (M={M}, N={N}) param-space: {param_space} ... ', end='')
    raw, H, params = generate_data(N=N,
                                   M=M,
                                   sample=sample_generator,
                                   nextConfig=lambda: next_config(param_space),
                                   density=False,
                                   dense_histogram=True,
                                   apply_log_scale=False)

    # hold out 25% of the histograms for testing
    H_train, H_test, params_train, params_test = train_test_split(H,
                                                                    params,
                                                                    test_size=0.25,
                                                                    random_state=RANDOM_STATE)
    print(f'histogram shape: {H_train.shape}')

    # fit model to train data
    print('fitting dnn model ... ', end='')
    start_time = time.time()
    dnn_model, history = dnn_fit(X_train=H_train, y_train=params_train)
    train_time = round(time.time() - start_time)
    print(f'duration: {train_time} sec.')

    # predict params on test data
    print('predicting distribution params ... ', end='')
    params_pred, sqrt_mse = dnn_predict(dnn_model, H_test, params_test)
    print(f'sqrt_mse: {sqrt_mse:.6f}')
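The helpers `generate_data`, `dnn_fit` and `dnn_predict` live in `lib.ipynb` and are not shown here. To make the shape of the pipeline concrete, here is a self-contained stand-in that uses only NumPy. Everything in it is an assumption for illustration: the function names, the fixed histogram bins, the parameter range, and above all the estimator, where ordinary least squares is swapped in for the DNN.

```python
import numpy as np

def make_histograms(sample, next_config, n_configs, sample_size, bins, rng):
    """One normalised histogram row per randomly drawn parameter value."""
    params = np.array([next_config(rng) for _ in range(n_configs)])
    H = np.empty((n_configs, len(bins) - 1))
    for i, p in enumerate(params):
        counts, _ = np.histogram(sample(p, sample_size, rng), bins=bins)
        H[i] = counts / sample_size
    return H, params

def fit_linear(H_train, params_train):
    """Ordinary least squares with an intercept (a stand-in for the DNN)."""
    X = np.column_stack([H_train, np.ones(len(H_train))])
    coef, *_ = np.linalg.lstsq(X, params_train, rcond=None)
    return coef

def predict_linear(coef, H):
    return np.column_stack([H, np.ones(len(H))]) @ coef

rng = np.random.default_rng(0)
H, params = make_histograms(
    sample=lambda s, size, rng: rng.lognormal(mean=0.0, sigma=s, size=size),
    next_config=lambda rng: rng.uniform(0.05, 1.0),
    n_configs=800, sample_size=256,
    bins=np.linspace(0.0, 4.0, 21), rng=rng)

split = 600  # 75% train, 25% test, mirroring test_size=0.25
coef = fit_linear(H[:split], params[:split])
pred = predict_linear(coef, H[split:])
rmse = float(np.sqrt(np.mean((pred - params[split:]) ** 2)))
print(f'rmse: {rmse:.4f}')
```

A useful baseline for any such number is the standard deviation of the test parameters, which is the error of always predicting their mean.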


## Fit (lognormal)

In [None]:
from scipy import stats
from scipy.stats import lognorm

# Generate data (lognormal)
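def lognormal_sample(config, size):
    # config is the shape parameter s of the log-normal.
    # Note: if RANDOM_STATE is an integer seed, every call re-seeds the generator,
    # so all samples are built from the same underlying random draws.
    return lognorm.rvs(s=config, size=size, random_state=RANDOM_STATE)

def lognormal_next_config(param_space):
    # Draw the shape parameter uniformly from the first (and only) parameter range.
    min_s, max_s = param_space[0]
    if RS is not None:
        return RS.uniform(low=min_s, high=max_s, size=1)[0]
    return np.random.uniform(low=min_s, high=max_s, size=1)[0]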

# Fit (lognormal)
universal_estimator(dist_name='lognormal', 
                    sample_generator=lognormal_sample, 
                    next_config=lognormal_next_config,
                    param_space=[(0.0, 1.0)], 
                    N=10000, 
                    M=256)

generating lognormal data (M=256, N=10000) param-space: [(0.0, 1.0)] ... histogram shape: (7500, 111)
fitting dnn model ... 

generating lognormal data (M=256, N=10) param-space: [(0.0, 1.0)] ... histogram shape: (7, 12)  
fitting dnn model ... duration: 7 sec.  
predicting distribution params ... sqrt_mse: 0.397776
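A note on `lognormal_sample`: when `random_state` is an integer, SciPy builds a fresh generator from that seed on every call, so repeated calls restart the same random stream. The check below illustrates the effect, assuming an integer seed of 0 (the actual value and type of `RANDOM_STATE` are set in `lib.ipynb` and not shown here). Dividing the log of each sample by its shape parameter recovers the underlying standard-normal draws, which makes the comparison direct.

```python
import numpy as np
from scipy.stats import lognorm

SEED = 0  # assumed integer seed; the real RANDOM_STATE is defined in lib.ipynb

# Integer seed: each call restarts the same random stream.
a = lognorm.rvs(s=0.5, size=5, random_state=SEED)
b = lognorm.rvs(s=1.0, size=5, random_state=SEED)
same_draws = np.allclose(np.log(a) / 0.5, np.log(b) / 1.0)

# Shared Generator: the stream advances between calls.
rng = np.random.default_rng(SEED)
c = lognorm.rvs(s=0.5, size=5, random_state=rng)
d = lognorm.rvs(s=1.0, size=5, random_state=rng)
fresh_draws = not np.allclose(np.log(c) / 0.5, np.log(d) / 1.0)

print(same_draws, fresh_draws)
```

If independent samples per configuration are wanted, passing one shared generator object instead of an integer seed keeps the run reproducible while letting the draws differ between calls.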