In [1]:
# import library
%run lib.ipynb

## Experiment 3.1: Compute prediction error
So far, we measured the error (MSE / MAE) of λ by comparing the learned value with the actual one.  
In this experiment, we measure the error in prediction instead.

### Part I
- M=256 sample points, from which we generate a histogram H of size K; H[k] is the number of sample points that fall in bin k. K is therefore derived from the data, but it will be small for typical values of λ.
- λ = 0.5 ... 1.5
- N = 10,000

Let V = (H[k] - M * Poisson(λ,k))

Then V is parameterized by:

- k: the current bin
- i = 1,...,N: the sample number

All computations must therefore include all K*N values of V.

**Objective**: compute MAE(V), MSE(V), MEAN_BIAS(V), with respect to all these values.
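
The definition of V and the three metrics can be sketched without the notebook's helpers. The block below is a minimal, dependency-free illustration and not the code from `lib.ipynb`: the names `poisson_pmf`, `sample_poisson_knuth`, `histogram`, and `residual_metrics` are hypothetical, Knuth's method is swapped in for whatever sampler the library uses, and N is reduced to 200 to keep it fast.

```python
import math
import random


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def sample_poisson_knuth(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def histogram(samples, n_bins):
    """H[k] = number of samples equal to k, for k = 0 .. n_bins - 1."""
    H = [0] * n_bins
    for s in samples:
        H[s] += 1
    return H


def residual_metrics(histograms, lambdas, M):
    """MAE, MSE and mean bias over all values V[i][k] = H_i[k] - M * Poisson(lambda_i, k)."""
    V = [h[k] - M * poisson_pmf(lam, k)
         for h, lam in zip(histograms, lambdas)
         for k in range(len(h))]
    n = len(V)
    mae = sum(abs(v) for v in V) / n
    mse = sum(v * v for v in V) / n
    mean_bias = sum(V) / n
    return mae, mse, mean_bias


rng = random.Random(17)          # assumed seed, for repeatability only
M, N = 256, 200                  # N is reduced here (assumption) to keep the sketch fast
lambdas = [rng.uniform(0.5, 1.5) for _ in range(N)]
samples = [[sample_poisson_knuth(lam, rng) for _ in range(M)] for lam in lambdas]
K = max(max(row) for row in samples) + 1   # the number of bins is derived from the data
histograms = [histogram(row, K) for row in samples]

mae, mse, mean_bias = residual_metrics(histograms, lambdas, M)
print(f"K={K}  MAE(V)={mae:.4f}  MSE(V)={mse:.4f}  MEAN_BIAS(V)={mean_bias:.6f}")
```

Here the true λ of each row is used; replacing `lambdas` with predicted or estimated values gives the corresponding metrics for Part I and Part II.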

### Part II
Repeat with λ computed by the expectation of the sample, i.e., $\sum_{k=0}^{K}\; \frac{k*H[k]}{M}$
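
As a small worked example of this estimator, the sketch below computes it for a toy histogram. The function name `expected_lambda` is hypothetical and the histogram is made up for illustration.

```python
def expected_lambda(H, M):
    """Sample-expectation estimate of lambda: sum over k of k * H[k] / M."""
    return sum(k * count for k, count in enumerate(H)) / M


H = [2, 1, 1]      # toy histogram (assumption): two samples equal 0, one equals 1, one equals 2
M = sum(H)         # 4 sample points in total
lam_hat = expected_lambda(H, M)
print(lam_hat)     # (0*2 + 1*1 + 2*1) / 4 = 0.75
```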

### Part III
- Combine Part I and Part II to find out which is better; we expect ours to be. We need a single table/graph/plot.


- Show that the expectation of the sample gives a biased estimate, and that ours does not.

  To do so: use the same data as in Part I, but with NO LEARNING. Use N=10,000 and compute λ by its expectation;
  
  compare it with the λ you selected at random, and show that there is a bias between the two.
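
The no-learning bias check can be sketched as follows. This is an illustrative, dependency-free version under stated assumptions: `sample_poisson_knuth` is a hypothetical stand-in for the library's sampler, N is reduced to 500, and the standard error is an addition that is not part of the original task description. It is reported so that an observed mean bias can be judged against sampling noise.

```python
import math
import random


def sample_poisson_knuth(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(17)   # assumed seed, for repeatability only
M, N = 256, 500           # N is reduced here (assumption) to keep the sketch fast

errors = []
for _ in range(N):
    lam = rng.uniform(0.5, 1.5)                 # the λ selected at random
    samples = [sample_poisson_knuth(lam, rng) for _ in range(M)]
    lam_hat = sum(samples) / M                  # same value as sum_k k * H[k] / M
    errors.append(lam_hat - lam)

mean_bias = sum(errors) / N
# standard error of the mean bias: sample standard deviation divided by sqrt(N)
std_err = math.sqrt(sum((e - mean_bias) ** 2 for e in errors) / (N - 1) / N)
print(f"MEAN_BIAS(λ) = {mean_bias:.6f} ± {std_err:.6f}")
```

The same two lines that compute `mean_bias` and `std_err` can be applied to the errors of the learned λ to put both estimators in one table.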

In [2]:
def generate_poisson_data(N, M, min_lambda=0.5, max_lambda=1.5):
    
    # define λ generator for min_lambda <= λ <= max_lambda
    lambda_gen = lambda: next_lambda(min_lambda=min_lambda, max_lambda=max_lambda)

    raw, H, lambdas = generate_data(N=N, 
                                    M=M, 
                                    nextConfig=lambda_gen,
                                    sample=sample_poisson, 
                                    density=False, 
                                    dense_histogram=True, 
                                    apply_log_scale=False)
    return H, lambdas

def compute_V(H, M, lambdas):
    """Compute V[i, k] = H[i, k] - M * Poisson(lambdas[i], k) for every row i and bin k (0 <= k < K)."""
    i,j = np.ix_(np.arange(H.shape[0]), np.arange(H.shape[1]))
    V = H - M * poisson.pmf(k=j, mu=lambdas[i])
    return V

def calc_expected_lambdas(H, M):
    """Return the expected lambda (sample mean) for each row of H."""
    k = np.arange(H.shape[1])  # bin indices 0 .. K-1, broadcast across the rows of H
    return np.sum(k * H / M, axis=1)

def experiment_3_1():

    # for reproducible results
    reset_random_state(17)

    # generate data (histogram and lambdas)
    N=10000
    # M=256  (alternative setting; the second output block below was produced with it)
    M=1024
    min_lambda=0.5
    max_lambda=1.5
    print(f'generating data (M={M}, N={N}) ... ', end='')
    H, lambdas = generate_poisson_data(N, M, min_lambda=min_lambda, max_lambda=max_lambda)
    H_train, H_test, lambdas_train, lambdas_test = train_test_split(H, lambdas, test_size=0.25, 
                                                                    random_state=RANDOM_STATE)
    print(f'histogram shape: {H_train.shape}')

    ##################################################
    # PART-I (predicted λ)
    #
    #   Let V = (H[k] - M * Poisson(predicted_λ,k))
    #   Compute MAE(V), MSE(V), MEAN_BIAS(V)
    ##################################################
    print()
    print(f'experiment_3_1_part_I (prediced λ)')

    # fit model to train data
    print(f'fitting dnn model ... ', end='')
    start_time = time.time()
    dnn_model, history = dnn_fit(X_train=H_train, y_train=lambdas_train)
    train_time = round(time.time() - start_time)
    print(f'duration: {round(train_time)} sec.')

    # predict lambdas on test data
    print(f'predicting lambdas ... ', end='')
    lambdas_pred, sqrt_mse = dnn_predict(dnn_model, H_test, lambdas_test)
    print(f'sqrt_mse(λ): {sqrt_mse:.6f}')
    
    # compute V(predicted_λ): V[i,j] = H_test[i,j] - M * Poisson(predicted_λ[i], j)
    V_predicted = compute_V(H=H_test, M=M, lambdas=lambdas_pred)
    
    # MAE(V)
    MAE_V_predicted = np.mean(np.abs(V_predicted))
    print(f'MAE_V_predicted: {MAE_V_predicted:.6f}')
    
    # SQRT_MSE(V)
    SQRT_MSE_V_predicted = np.sqrt(np.mean(np.square(V_predicted)))
    print(f'SQRT_MSE_V_predicted: {SQRT_MSE_V_predicted:.6f}')
    
    # MEAN_BIAS(V)
    MEAN_BIAS_V_predicted = np.mean(V_predicted)
    print(f'MEAN_BIAS_V_predicted: {MEAN_BIAS_V_predicted:.6f}')

    ##################################################
    # PART-II (expected λ)
    #
    #   Let V = (H[k] - M * Poisson(expected_λ,k))
    #   Compute MAE(V), MSE(V), MEAN_BIAS(V)
    ##################################################
    print()
    print(f'experiment_3_1_part_II (expected λ)')

    lambdas_expected = calc_expected_lambdas(H_test, M)
    
    # compute V(expected_λ): V[i,j] = H_test[i,j] - M * Poisson(expected_λ[i], j)
    V_expected = compute_V(H=H_test, M=M, lambdas=lambdas_expected)
    
    # MAE(V)
    MAE_V_expected = np.mean(np.abs(V_expected))
    print(f'MAE_V_expected: {MAE_V_expected:.6f}')
    
    # SQRT_MSE(V)
    SQRT_MSE_V_expected = np.sqrt(np.mean(np.square(V_expected)))
    print(f'SQRT_MSE_V_expected: {SQRT_MSE_V_expected:.6f}')
    
    # MEAN_BIAS(V)
    MEAN_BIAS_V_expected = np.mean(V_expected)
    print(f'MEAN_BIAS_V_expected: {MEAN_BIAS_V_expected:.6f}')

    ##################################################
    # PART-III (λ bias comparison)
    #
    #   Compare bias of predicted_λ vs expected_λ
    ##################################################
    print()
    print(f'experiment_3_1_part_III (λ bias comparision)')
    MEAN_BIAS_lambdas_pred = np.mean(lambdas_pred - lambdas_test)
    print(f'MEAN_BIAS_lambdas_pred: {MEAN_BIAS_lambdas_pred:.6f}')
    MEAN_BIAS_lambdas_expected = np.mean(lambdas_expected - lambdas_test)
    print(f'MEAN_BIAS_lambdas_expected: {MEAN_BIAS_lambdas_expected:.6f}')
 
    df = pd.DataFrame({
        'MAE(V)': [MAE_V_expected, MAE_V_predicted],
        'SQRT_MSE(V)': [SQRT_MSE_V_expected, SQRT_MSE_V_predicted],
        'MEAN_BIAS(V)': [MEAN_BIAS_V_expected, MEAN_BIAS_V_predicted],
        'MEAN_BIAS(λ)': [MEAN_BIAS_lambdas_expected, MEAN_BIAS_lambdas_pred],
    }, index=['Expected', 'Prediced'])
    
    return df

df = experiment_3_1()
df

generating data (M=1024, N=10000) ... histogram shape: (7500, 10)

experiment_3_1_part_I (prediced λ)
fitting dnn model ... duration: 21 sec.
predicting lambdas ... sqrt_mse(λ): 0.044206
MAE_V_predicted: 4.963334
SQRT_MSE_V_predicted: 8.927807
MEAN_BIAS_V_predicted: 0.000071

experiment_3_1_part_II (expected λ)
MAE_V_expected: 3.839313
SQRT_MSE_V_expected: 7.056411
MEAN_BIAS_V_expected: 0.000068

experiment_3_1_part_III (λ bias comparision)
MEAN_BIAS_lambdas_pred: 0.027761
MEAN_BIAS_lambdas_expected: 0.000452


|          | MAE(V)   | SQRT_MSE(V) | MEAN_BIAS(V) | MEAN_BIAS(λ) |
|----------|----------|-------------|--------------|--------------|
| Expected | 3.839313 | 7.056411    | 0.000068     | 0.000452     |
| Prediced | 4.963334 | 8.927807    | 0.000071     | 0.027761     |


Output of the same experiment with M=256:

generating data (M=256, N=10000) ... histogram shape: (7500, 11)

experiment_3_1_part_I (prediced λ)
fitting dnn model ... duration: 23 sec.
predicting lambdas ... sqrt_mse(λ): 0.060201
MAE_V_predicted: 1.782004
SQRT_MSE_V_predicted: 3.399430
MEAN_BIAS_V_predicted: 0.000002

experiment_3_1_part_II (expected λ)
MAE_V_expected: 1.754229
SQRT_MSE_V_expected: 3.370083
MEAN_BIAS_V_expected: 0.000002

experiment_3_1_part_III (λ bias comparision)
MEAN_BIAS_lambdas_pred: 0.005243
MEAN_BIAS_lambdas_expected: 0.000082
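
|          | MAE(V)   | SQRT_MSE(V) | MEAN_BIAS(V) | MEAN_BIAS(λ) |
|----------|----------|-------------|--------------|--------------|
| Expected | 1.754229 | 3.370083    | 0.000002     | 0.000082     |
| Prediced | 1.782004 | 3.399430    | 0.000002     | 0.005243     |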

## Experiment 3.2: Can we help the learning of Poisson by teaching the underlying mathematics?

Learning the computed λ by using the expectation is **extremely easy** for a neural network. 
Based on #7, we hope that the DNN does better than this. So, to force it to do better, we give it the computed λ as input and only ask it to learn the difference between the *real* λ (as used to synthesize the data) and the *estimator* λ. 

Our question is: _can we help the neural network learn the data if we give it mathematical hints about it?_

To do so, define ζ to be the estimator of λ, as computed by the expectation:

> $ζ = \sum_{k=0}^{K} \frac{k * H[k]}{M}$

Define a new histogram Z, computed from H using ζ. The new histogram shows the difference between the expected value and the real value, i.e., run the following loop for k=0, ... , K

> $Z[k] = H[k] - M * Poisson(ζ,k)$

Now apply the DNN. The new data includes all the information the previous DNN received, but in a slightly more convenient form. Hopefully, the results are better.

Design and implement an experiment to give a conclusive answer to this question. 
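
As a starting point, the construction of ζ and Z can be sketched as below. This is a minimal illustration rather than the experiment itself: `poisson_pmf` and `residual_histogram` are hypothetical names, the toy histogram is made up, and appending ζ to Z as one feature vector is only one possible input layout (an assumption).

```python
import math


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def residual_histogram(H, M):
    """Return (zeta, Z) with zeta = sum_k k * H[k] / M and Z[k] = H[k] - M * Poisson(zeta, k)."""
    zeta = sum(k * count for k, count in enumerate(H)) / M
    Z = [count - M * poisson_pmf(zeta, k) for k, count in enumerate(H)]
    return zeta, Z


H = [2, 1, 1]               # toy histogram (assumption)
M = sum(H)
zeta, Z = residual_histogram(H, M)

# One possible network input (assumption): the residual histogram followed by ζ.
features = Z + [zeta]
print(zeta, [round(z, 4) for z in Z])
```

With this layout, the training target for each row would be the difference between the real λ and ζ, as described above, and the final prediction would be ζ plus the network output.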


In [3]:
# TODO