# New $\chi^2$ formulation

I have considered a simpler $\chi^2$ formulation, which is essentially the statistic used in the classical $\chi^2$ test. The goal of the test is to determine whether a sample comes from a given distribution. The assumptions are that the distribution is discrete and that the outcomes are mutually exclusive; the observed counts per category are then compared with the theoretical ones. Here, the categories are the magnitude bins. I have considered two alternative formulations that treat bins with zero counts differently (the zero counts in question are those of the union forecast, not those of a single synthetic catalogue or of the observations). The first formulation excludes these bins, and the second adds one to all counts. I refer to the former simply as $\chi^2$ and to the latter as $\chi^2_{+1}$. The ingredients are the counts per magnitude bin provided by the union forecast, $\Lambda_U(k)$, by the resampled synthetic catalogues, $\tilde\Lambda_j(k)$, and by the observations, $\Omega(k)$.

## $\chi^2$ formulation

The statistic in this case is 

$$
D_j = \sum_{k: c_k \neq 0} \frac{ (\tilde\Lambda_j(k) - c_k)^2}{c_k} 
$$
$$
D_o = \sum_{k: c_k \neq 0} \frac{ (\Omega(k) - c_k)^2}{c_k}
$$
where 
$$
c_k = \frac{\Lambda_U(k)}{\sum_{k'} \Lambda_U(k')} N_o
$$

The values $c_k$ represent the expected counts per magnitude bin under the forecasting model (represented by $\Lambda_U$) if we had observed $N_o$ events. I use $N_o$ in the expression for $c_k$ because the catalogues are resampled. The sum runs over the bins where $c_k$ is nonzero.
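As a minimal sketch of the definitions above, the statistic can be computed with NumPy as follows. The function names (`expected_counts`, `chi2_stat`) and the toy counts are illustrative assumptions, not part of the actual experiment code.

```python
import numpy as np

def expected_counts(union_counts, n_obs):
    """c_k: union-forecast bin proportions rescaled to n_obs events."""
    union_counts = np.asarray(union_counts, dtype=float)
    return union_counts / union_counts.sum() * n_obs

def chi2_stat(counts, union_counts, n_obs):
    """Sum over bins with c_k != 0 of (counts_k - c_k)^2 / c_k."""
    counts = np.asarray(counts, dtype=float)
    c = expected_counts(union_counts, n_obs)
    keep = c != 0  # bins with zero expected count are excluded
    return float(np.sum((counts[keep] - c[keep]) ** 2 / c[keep]))

# Toy example (made-up numbers): 4 magnitude bins, the last one empty
# in the union forecast.
union = np.array([60, 30, 10, 0])   # Lambda_U(k)
obs = np.array([5, 3, 1, 1])        # Omega(k)
n_obs = int(obs.sum())              # N_o

d_o = chi2_stat(obs, union, n_obs)
print(d_o)
```

In this toy example the observed event in the last bin does not contribute to $D_o$, because that bin has $c_k = 0$ and is dropped from the sum.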

## $\chi^2_{+1}$ formulation

The statistic in this case is 

$$
D_j = \sum_{k} \frac{ (\tilde\Lambda_j(k) + 1 - c_k)^2}{c_k} 
$$
$$
D_o = \sum_{k} \frac{ (\Omega(k) + 1 - c_k)^2}{c_k} 
$$
where 
$$
c_k = \frac{\Lambda_U(k) + 1}{\sum_{k'} (\Lambda_U(k') + 1)} (N_o + K)
$$

The values $c_k$ have the same interpretation as before. The only difference is that I add 1 to every bin before summing, so both the per-bin probability expected under the forecasting model and the resulting expected counts are computed from the shifted counts. The number of observations is now not $N_o$ but $N_o + K$, where $K$ is the number of magnitude bins.
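The $+1$ variant can be sketched in the same way. Again, the function names and the toy counts are assumptions made for illustration only.

```python
import numpy as np

def expected_counts_plus1(union_counts, n_obs):
    """c_k for the +1 variant: add 1 to every bin, rescale to n_obs + K."""
    union_counts = np.asarray(union_counts, dtype=float)
    k_bins = union_counts.size          # K, the number of magnitude bins
    shifted = union_counts + 1.0
    return shifted / shifted.sum() * (n_obs + k_bins)

def chi2_plus1_stat(counts, union_counts, n_obs):
    """Sum over all bins of (counts_k + 1 - c_k)^2 / c_k."""
    counts = np.asarray(counts, dtype=float)
    c = expected_counts_plus1(union_counts, n_obs)
    return float(np.sum((counts + 1.0 - c) ** 2 / c))

# Toy example (made-up numbers): same bins as a union forecast with one
# empty bin; every c_k is now strictly positive, so no bin is dropped.
union = np.array([60, 30, 10, 0])   # Lambda_U(k)
obs = np.array([5, 3, 1, 1])        # Omega(k)
n_obs = int(obs.sum())              # N_o

c_plus1 = expected_counts_plus1(union, n_obs)
d_o_plus1 = chi2_plus1_stat(obs, union, n_obs)
print(c_plus1.sum(), d_o_plus1)
```

Since every bin receives a positive $c_k$, the sum runs over all $K$ bins, and the expected counts add up to $N_o + K$, matching the total of the shifted observed counts.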

## Expectation vs reality

Looking at the formulation of $D_j$ and $D_o$, given that both depend on the squared differences between expected and observed counts, I would expect $D_o$ to be an increasing function of the difference $|b_f - b_o|$, where $b_f$ and $b_o$ are the $b$-values of the forecast and of the observations, and to be insensitive to whether $b_f$ is smaller or greater than $b_o$. In other words, I expect $\Pr(D_j \leq D_o)$ to go to 1 both for $b_f > b_o$ and for $b_f < b_o$. I would also expect $D_o$ to be minimised when $b_o = b_f$. However, none of these things happens when actually running the experiment.


![ciao](Do_Dj_bo125.png)


## Results - Overestimation

For this example, we consider forecasts coming from a standard Gutenberg-Richter (GR) law with corner magnitude $m_c = \infty$, and observations coming from a tapered GR (Tap-GR) with varying $m_c$. In the panels, $b_f$ is the $b$-value of the forecast and $b_o$ that of the observations. The experiment considers 1000 forecasting periods, with $N_o = 500$ observations in each. The $\gamma_p$ are the proportions of synthetic catalogues with $D_j$ smaller than or equal to $D_o$, the statistic calculated on the observations. If the model is consistent with the observations, $\gamma_p$ should look uniformly distributed. Here, I plot the difference between the empirical cumulative distribution function (ECDF) of $\gamma_p$ and the uniform CDF. The shaded region represents the $95\%$ confidence interval, so lines outside the shaded region indicate inconsistency between model and observations.
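The quantities plotted in the panels can be sketched as follows, assuming the $D_j$ values of one forecasting period are stored in an array. The names `gamma_score` and `ecdf_minus_uniform` are hypothetical, and the confidence band is left out because its construction is not specified in these notes.

```python
import numpy as np

def gamma_score(d_obs, d_synth):
    """Proportion of synthetic catalogues with D_j <= D_o (one period)."""
    return float(np.mean(np.asarray(d_synth, dtype=float) <= d_obs))

def ecdf_minus_uniform(gammas, grid):
    """ECDF of the gamma scores on `grid`, minus the uniform CDF on [0, 1]."""
    gammas = np.sort(np.asarray(gammas, dtype=float))
    grid = np.asarray(grid, dtype=float)
    # side="right" counts the scores that are <= each grid value
    ecdf = np.searchsorted(gammas, grid, side="right") / gammas.size
    return ecdf - grid

# Toy example (made-up numbers): one gamma score, then the ECDF difference
# for four scores evaluated on a small grid.
g = gamma_score(2.0, [1.0, 2.0, 3.0, 4.0])
diff = ecdf_minus_uniform([0.25, 0.5, 0.75, 1.0], [0.25, 0.5, 0.75, 1.0])
print(g, diff)
```

In the experiment there would be one $\gamma_p$ per forecasting period, and the curve shown is the ECDF difference computed over all periods.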

![Overestimation results](tapgr_over.png)




## Results - Underestimation

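We repeat the same experiment, swapping the distributions used for the forecast and for the observations in the overestimation case: the forecasts now come from a Tap-GR with varying $m_c$, and the observations from a standard GR law.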

![Underestimation results](tapgr_under.png)