pygsti/report/templates/report_notebook/goodness.txt

@@markdown
# Model Violation Analysis
Metrics indicating how well the estimated gate set can be trusted -- i.e., how well it fits the data.
### ProgressTable:
Comparison between the computed and expected maximum $\log(\mathcal{L})$ for different values of $L$. 
$N_s$ and $N_p$ are the numbers of gate strings and model parameters, respectively.  
The quantity $2\Delta\log(\mathcal{L})$ measures the goodness of fit of the GST model (small is better) and is expected to lie within $[k-\sqrt{2k},k+\sqrt{2k}]$ where $k = N_s-N_p$. 
$N_\sigma = (2\Delta\log(\mathcal{L})-k)/\sqrt{2k}$ is the number of standard deviations from the mean (a $p$-value can be straightforwardly derived from $N_\sigma$).  
The rating from 1 to 5 stars gives a very crude indication of goodness of fit.
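
As a rough illustration of the quantities defined above, the sketch below computes $k$, $N_\sigma$, and an approximate one-sided $p$-value using only the Python standard library. The function names `n_sigma` and `approx_p_value` are hypothetical and are not part of pyGSTi. The $p$-value treats $N_\sigma$ as a standard normal deviate, which is an assumption that is only reasonable when $k$ is large.

```python
import math

def n_sigma(two_delta_logl, n_strings, n_params):
    """Number of standard deviations of 2*Delta*log(L) from its expected mean k.

    Hypothetical helper: k = N_s - N_p, with mean k and standard deviation sqrt(2k).
    """
    k = n_strings - n_params
    if k <= 0:
        raise ValueError("need more gate strings than parameters (k > 0)")
    return (two_delta_logl - k) / math.sqrt(2.0 * k)

def approx_p_value(nsig):
    """One-sided p-value under a Gaussian approximation (assumes large k)."""
    return 0.5 * math.erfc(nsig / math.sqrt(2.0))

# Made-up example numbers: 150 gate strings, 50 parameters, 2*Delta*log(L) = 120.
nsig = n_sigma(120.0, 150, 50)
print("N_sigma =", nsig, " approx. p-value =", approx_p_value(nsig))
```

A small $p$-value (large positive $N_\sigma$) suggests the model fits worse than statistical fluctuations alone would explain.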

@@code
ws.FitComparisonTable(Ls, circuits_per_iter, mdl_per_iter,
                      effective_ds, objfn_builder, 'L')
@@markdown
### Per-sequence Model Violation
Each point displays the goodness of fit for a single gate sequence.
@@code
k = -1  # iteration index (-1 selects the final iteration)
colorScatterPlot = ws.ColorBoxPlot(
    objfn_builder, circuits_per_iter[k], effective_ds, mdl_per_iter[k],
    linlg_pcntle=0.95, typ='scatter')

@@markdown
### $2\Delta\log(\mathcal{L})$ contributions for every individual experiment in the dataset.
Each pixel represents a single experiment (gate sequence), and its color indicates whether GST was able to fit the corresponding frequency well.  Shades of white/gray are typical. Red squares represent statistically significant evidence for model violation (non-Markovianity), and should appear with probability at most ($1-$*linlg_pcntle*) if the data really are Markovian. Square blocks of pixels correspond to base sequences (arranged vertically by germ and horizontally by length); each pixel within a block corresponds to a specific choice of pre- and post-fiducial sequences.  See text for further details.
@@code
k = -1  # iteration index (-1 selects the final iteration)
colorBoxPlot = ws.ColorBoxPlot(
    objfn_builder, circuits_per_iter[k], effective_ds, mdl_per_iter[k],
    linlg_pcntle=0.95)


@@markdown
### Data scaling factor for every individual experiment in the dataset.
Each pixel represents a single experiment (gate sequence), and its color indicates the amount of scaling that was applied to the original data counts when computing the log-likelihood or $\chi^2$ for this estimate.  
Values of 1.0 indicate that all of the original data were used, whereas values between 0 and 1 indicate that the data counts for the experiment were artificially decreased (usually to improve the fit).
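
To see why decreasing the counts improves the apparent fit, the sketch below evaluates a single experiment's $2\Delta\log(\mathcal{L})$ contribution with and without a scaling factor. It assumes the standard multinomial form $2\sum_i n_i \log(f_i/p_i)$ with $f_i = n_i/N$. The helper `two_delta_logl` is hypothetical and is not pyGSTi's implementation; the counts and probabilities are made up.

```python
import math

def two_delta_logl(counts, probs, scale=1.0):
    """2*Delta*log(L) for one experiment after multiplying all counts by `scale`.

    Hypothetical helper assuming the multinomial form 2 * sum_i n_i * log(f_i / p_i).
    """
    scaled = [scale * n for n in counts]
    total = sum(scaled)
    result = 0.0
    for n, p in zip(scaled, probs):
        if n > 0:  # outcomes with zero counts contribute nothing
            result += 2.0 * n * math.log((n / total) / p)
    return result

counts = [70, 30]   # observed outcome counts (made up)
probs = [0.5, 0.5]  # model-predicted outcome probabilities (made up)

full = two_delta_logl(counts, probs)                 # scaling factor 1.0
reduced = two_delta_logl(counts, probs, scale=0.25)  # counts reduced to 25%
print(full, reduced)
```

Because the observed frequencies are unchanged by scaling, the contribution in this form shrinks in direct proportion to the scaling factor.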
@@code
ws.ColorBoxPlot("scaling", circuits_final, effective_ds, mdl_final,
        submatrices={'scaling': scale_submxs, 'scaling.colormap': "revseq"})