# universal estimator-exp-1e (compare to prev test-set)

Let $f(d)$ be a one-dimensional function that returns samples drawn from a univariate distribution (e.g., log-normal) with parameter $d$.

**Research question**: Let $e = (\vec{pred\_params} - \vec{test\_params})$.  
Does $STD(e)$ decrease when the parameter search space gets smaller?

1. Generate a sample (256 observations) using $f$, e.g., $sample = f(d=0.92, size=256)$
2. $estimator(f, sample)$ is a function which learns the parameter $d$ of $f$ from the sample.

> - Init: search_space = [0, 1]
>
> - Iterate:
>   1. Generate synthetic data-set ( train / test ) using $f(d)$ with two settings (exp-1a, exp-1b):
>          d ~ uniform(search_space)
>          d ~ norm(search_space)
>   2. Fit a DNN model to the training set
>   3. Compare the current model error on the test-set to the error on the previous test-set:
>          test_params = [current test params]
>          test_params_prev = [test params from previous iteration]
>          e = (pred_params - test_params)
>          sigma = STD(e)
>          pred_params_prev = [current model's predictions on the previous test-set]
>          e_prev = (pred_params_prev - test_params_prev)
>          sigma_prev = STD(e_prev)
>   4. Predict the parameter $d\_pred$ on the input sample using the DNN model
>          d_pred = model.predict(sample)
>   5. Narrow the search space:
>          pivot = d_pred
>          scale = 3 * std(e)
>          next_search_space = [ pivot - scale, pivot + scale ], clipped to the current search_space
>          if next_search_space == search_space: narrow the search_space by epsilon = 0.003
>          (here I'm interested merely in narrowing the search_space, not in "correct" focusing)
>   6. Let: *threshold* = 0.2.  
>      Stop if abs( *sigma* - *sigma_prev* ) > *threshold*

> - Return:
>   - d_pred_array: array of d_pred predicted at each iteration (step 4)
>   - search_spaces: array of search-space at each iteration (step 5)
>   - (pred-set, test-set) at each iteration

3. Plot a graph:

>   - $x$: iteration #
>   - $y1$: sigma_1 = STD(pred_params - test_params)
>   - $y2$: abs(d_pred - d_true)
>
> Plot (shaded) intervals around sigma_1:
>
> - sigma_2:
>          mu = mean(e)
>          var_2 = 1/n * sum( ( (e - mu) ^ 2 - (sigma_1) ^ 2 ) ^ 2 )
>          sigma_2 = sqrt(var_2)
>
> - 3 * sigma_1
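
The interval definitions above can be sketched in plain Python. This is a minimal illustration, not the notebook's plotting code: the helper name `error_spread` is hypothetical, and it assumes the population (1/n) normalization, which is what `np.std` uses by default.

```python
import math

def error_spread(e):
    """Return (sigma_1, sigma_2) for a sequence of errors e.

    sigma_1 is the standard deviation of e.
    sigma_2 is the standard deviation of the squared deviations (e - mu)^2
    around sigma_1^2, following the var_2 formula above.
    """
    n = len(e)
    mu = sum(e) / n
    sq_dev = [(x - mu) ** 2 for x in e]

    var_1 = sum(sq_dev) / n              # population variance (1/n)
    sigma_1 = math.sqrt(var_1)

    var_2 = sum((s - var_1) ** 2 for s in sq_dev) / n
    sigma_2 = math.sqrt(var_2)
    return sigma_1, sigma_2

# Example: errors collected at one iteration (made-up values)
sigma_1, sigma_2 = error_spread([0.0, 0.0, 2.0, -2.0])
print(sigma_1, sigma_2)
```

In the plot, `sigma_1` would be the curve and `sigma_2` and `3 * sigma_1` the two shaded bands, one value of each per iteration.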

In [1]:
# import library
%run lib.ipynb
np.set_printoptions(precision=4)

In [2]:
from scipy import stats
from scipy.stats import lognorm

# sample from lognormal
def sample_lognormal(config, size):
    return lognorm.rvs(s=config, size=size, random_state=RANDOM_STATE)

def next_config(search_space):

    """
    return a (uniform) random parameter within search_space
    """
    low = search_space[0]
    high = search_space[1]
    return np.random.uniform(low, high, size=1)[0]
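
The algorithm description lists a second sampling setting, `d ~ norm(search_space)`, which `next_config` does not cover. A minimal sketch of one possible reading follows. The notebook does not say how the normal is parameterized, so centering it on the midpoint of the interval, using one sixth of the width as its standard deviation, and rejecting draws that fall outside the interval are all assumptions, as is the name `next_config_norm`.

```python
import random

def next_config_norm(search_space, rng=random):
    """Draw a parameter from a normal centered in search_space (assumed reading).

    Assumptions (not from the notebook): mean = midpoint of the interval,
    std = width / 6, and draws outside [low, high] are rejected and redrawn.
    """
    low, high = search_space
    mu = 0.5 * (low + high)
    sigma = (high - low) / 6.0
    if sigma <= 0:
        # Degenerate interval: only one admissible value
        return mu
    while True:
        d = rng.gauss(mu, sigma)
        if low <= d <= high:
            return d

# Example: draw a few parameters inside [0.7, 1.0]
print([round(next_config_norm((0.7, 1.0)), 4) for _ in range(3)])
```

With this choice the interval covers three standard deviations on each side of the mean, so very few draws are rejected.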


In [3]:
NUM_BINS = 346

def exp_1e(f, sample, d_true, initial_search_space):
    
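    """
    Learn the parameter of f from the sample.
    Arguments:
        - f: sampler f(config, size) that draws `size` observations from a univariate distribution with parameter `config`.
        - sample: observations generated using f.
        - d_true: true parameter value (used only to report the final error).
        - initial_search_space: initial search-space, [low, high].
    """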

    # number of observations in sample
    M = len(sample)
    N = 1000
   
    # Generate a histogram for the input *sample*
    nbins = 256
    H_sample = np.histogram(sample, bins=nbins, range=(0, nbins), density=False)[0]
    H_sample = np.reshape(H_sample, (1, -1))
    
    search_space = initial_search_space
    first_iteration = True
    H_test_prev = None
    test_params_prev = None
    threshold = 0.2
    
    while True:

        # 1. Generate synthetic data-sets (train/test) using f (within search_space)
        # -----------------------------------------------------------------------------------------
        print()
        print(f'*** search_space: {search_space}')
        print(f'generating data (M={M}, N={N})', end=', ')

        raw_train, H_train, train_params = generate_data(N=N, 
                                       M=M, 
                                       sample=f, 
                                       nextConfig=lambda: next_config(search_space),
                                       nbins=nbins)
        raw_test, H_test, test_params = generate_data(N=N, 
                                       M=M, 
                                       sample=f, 
                                       nextConfig=lambda: next_config(search_space),
                                       nbins=nbins)
        if first_iteration:
            # first iteration only: keep the first test-set as the reference;
            # it is not updated in later iterations
            H_test_prev = H_test
            test_params_prev = test_params
            first_iteration = False

        # 2. Fit a DNN model to train-set and predict on test-set
        # -----------------------------------------------------------------------------------------

        # train
        print(f'training ...', end=' ')
        dnn_model, history = dnn_fit(X_train=H_train, y_train=train_params)

        # 3. Compare the current model error on the test-set to the error on the previous test-set
        # -----------------------------------------------------------------------------------------
        
        # predict test
        pred_params = dnn_model.predict(H_test).flatten()

        # error
        e = pred_params - test_params
        std = np.std(e)

        # predict test_prev
        pred_params_prev = dnn_model.predict(H_test_prev).flatten()

        # error
        e_prev = pred_params_prev - test_params_prev
        std_prev = np.std(e_prev)
        
        std_diff = np.abs(std - std_prev)
        print(f'*** std: {std:.4f}, std_prev: {std_prev:.4f}, std_diff: {std_diff:.4f}')
        if std_diff > threshold:
            break

        # 4. Predict the parameter (d_pred) on the input sample
        # ---------------------------------------------------------------------------

        d_pred = dnn_model.predict(H_sample).flatten()[0]
            
        # 5. Narrow the search-space
        # ---------------------------------------------------------------------------
        search_pivot = d_pred
        search_STD = np.std(pred_params - test_params)

#         std_factor = 1
#         std_factor = 2
        std_factor = 3

        search_scale = std_factor * search_STD
        print(f'search_pivot: {search_pivot:.4f}, search_scale ({std_factor}*STD) = {search_scale:.4f}')
        
        next_search_space = np.array([ 
            max(search_space[0], search_pivot - search_scale), 
            min(search_space[1], search_pivot + search_scale)])
        
        # if no change in search_space, narrow by epsilon
        if np.array_equal(search_space, next_search_space):
            epsilon = min(0.003, 0.1 * (search_space[1] - search_space[0]))
            print(f'no change in search_space. narrowing by epsilon: {epsilon}')
            next_search_space = np.array([ search_space[0] + epsilon, search_space[1] - epsilon ])
        
        search_space = next_search_space
        
    # d_pred is the prediction from the last iteration that passed the stopping check
    print(f'd_pred: {d_pred:.4f}, abs(d_pred - d_true): {abs(d_pred - d_true):.4f}')
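
The narrowing step inside `exp_1e` can be isolated as a small pure function, which makes it easier to check by hand. The sketch below mirrors the logic of the loop body using plain tuples instead of NumPy arrays; the function name `narrow_search_space` and its keyword arguments are hypothetical.

```python
def narrow_search_space(search_space, pivot, sigma, std_factor=3, max_epsilon=0.003):
    """Return the next (low, high) search space.

    The new interval is [pivot - scale, pivot + scale] with
    scale = std_factor * sigma, clipped to the current interval.
    If clipping leaves the interval unchanged, shrink both ends by epsilon.
    """
    low, high = search_space
    scale = std_factor * sigma

    new_low = max(low, pivot - scale)
    new_high = min(high, pivot + scale)

    if (new_low, new_high) == (low, high):
        # No change: force progress by a small step (at most 10% of the width)
        epsilon = min(max_epsilon, 0.1 * (high - low))
        new_low, new_high = low + epsilon, high - epsilon

    return new_low, new_high

# Example with rounded values taken from the run log (third iteration)
print(narrow_search_space((0.7336, 1.0), pivot=0.8630, sigma=0.0556))
```

With these rounded inputs the pivot interval is wider than the current one on both sides, so the epsilon branch fires and the result is approximately (0.7366, 0.997), which matches the next `search_space` line in the output.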


## Fit (lognormal)

In [4]:
# d_true  = 0.92
for d_true in [0.92]:
    print()
    print(f'param true value: {d_true}')
    f = sample_lognormal
    sample = f(config=d_true, size=256)
    initial_search_space=np.array([0.0, 1.0])
    exp_1e(f=f, sample=sample, d_true=d_true, initial_search_space=initial_search_space)



param true value: 0.92

*** search_space: [0. 1.]
generating data (M=256, N=1000), training ... *** std: 0.0625, std_prev: 0.0625, std_diff: 0.0000
search_pivot: 0.8684, search_scale (3*STD) = 0.1876

*** search_space: [0.6808 1.    ]
generating data (M=256, N=1000), training ... *** std: 0.0572, std_prev: 0.2143, std_diff: 0.1571
search_pivot: 0.9052, search_scale (3*STD) = 0.1716

*** search_space: [0.7336 1.    ]
generating data (M=256, N=1000), training ... *** std: 0.0556, std_prev: 0.2197, std_diff: 0.1640
search_pivot: 0.8630, search_scale (3*STD) = 0.1669
no change in search_space. narrowing by epsilon: 0.003

*** search_space: [0.7366 0.997 ]
generating data (M=256, N=1000), training ... *** std: 0.0563, std_prev: 0.2340, std_diff: 0.1777
search_pivot: 0.8528, search_scale (3*STD) = 0.1689
no change in search_space. narrowing by epsilon: 0.003

*** search_space: [0.7396 0.994 ]
generating data (M=256, N=1000), training ... *** std: 0.0565, std_prev: 0.2585, std_diff: 0.2020
d