Treat yadism matching data as level-2 closure test data #56
This accounts for the fact that, unlike experimental data, matching data doesn't contain sampling fluctuations. With this implemented, the spread of the matching data and that of the experimental data are of a similar order.
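As a rough illustration of the idea (not the code in this PR), the sketch below adds a single Gaussian level-1 shift, drawn from the covariance matrix, to noise-free "matching" values and compares the chi2 per point before and after. The function names `add_level1_noise` and `chi2_per_point`, the toy central values, and the diagonal 5% covmat are all assumptions made for the example.

```python
import numpy as np


def add_level1_noise(central, covmat, seed=0):
    """Shift the central values by one Gaussian draw from covmat (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    shift = rng.multivariate_normal(np.zeros(len(central)), covmat)
    return central + shift


def chi2_per_point(data, theory, covmat):
    """chi2 / N of data against theory for a given covariance matrix."""
    diff = data - theory
    return float(diff @ np.linalg.solve(covmat, diff)) / len(data)


# Toy stand-ins for the Yadism predictions and their covmat (assumed values).
theory = np.linspace(1.0, 2.0, 200)
covmat = np.diag((0.05 * theory) ** 2)

level0 = theory.copy()                              # matching data without fluctuations
level1 = add_level1_noise(level0, covmat, seed=1)   # matching data with level-1 noise

chi2_l0 = chi2_per_point(level0, theory, covmat)    # exactly zero: no sampling fluctuation
chi2_l1 = chi2_per_point(level1, theory, covmat)    # of order one, like real data
```

In this toy setup the unfluctuated data sit exactly on the underlying values, while the level-1 data scatter around them on the scale set by the covmat, which is the behaviour expected of real measurements.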
@RoyStegeman Thanks for this! I will run a fit a bit later to check how much this would change the fit.
The fit results are available here https://data.nnpdf.science/NNUSF/reports/addnoise-matching-1c526f9-221010/output/ PS: There are now new entries in the summary table expr:
Thanks @Radonirinaunimi. I get the impression that we're underfitting a bit (at least in some regions). 1) because the In particular point 2 is not a super strong argument since I don't know what we can reasonably obtain, but it may still be interesting to see if a more aggressive optimization could improve things. Do you maybe have plots of the SF predictions? Those might give us some intuition as to whether we are indeed underfitting.
My guess as to what might be happening is that the fit is more biased towards the Yadism datasets, that is, if one were to add weights to the real data the We do indeed have plots of the SF predictions; I will post them on Slack, but the plots look very similar to before.
This could also be the case, though while the exp chi2 of the matching data is much better than that of the real data, this is not the case (at least not to the same extent) for the training chi2. I think that to answer the question of whether the fit is biased towards the Yadism data, the training chi2 is more relevant than the exp chi2.
Okay perfect. Curious to see what they look like. P.S. do you think we should start including SF replica plots as well?
True! This then indicates that the fluctuation is somehow the main problem here.
Exactly, that's why I would be interested to know whether we are underfitting or not. As said before, in principle underfitting could explain both the poor chi2 of the experimental data (for obvious reasons) and, on the Yadism side, the low chi2 of the central data, through a lack of fluctuations in the fits to the pseudodata replicas. Just a hypothesis at this stage of course, but I think it's plausible.
I will explore this and run various fits; so far we don't have any other plausible causes or solutions.
Previously the level-1 shift was changed replica-by-replica. This should not be done because it corresponds to a covmat different from the experimental one. While freezing the level-1 fluctuations produces a central value that differs from the experimental value, this is not a problem, as our methodology accounts for this (since it also occurs in experimental measurements).
nnusf/src/nnusf/sffit/load_data.py Lines 59 to 66 in 5db6400
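The following is not the content of the referenced lines, only a toy sketch of one way to read the statement above. It assumes each pseudodata replica is built as central value + level-1 shift + level-2 shift, with both shifts drawn from the same covmat; `make_replicas` and its arguments are hypothetical names. With the level-1 shift frozen, the replicas scatter with the covmat itself around a shifted central value; redrawing it per replica roughly doubles the variance.

```python
import numpy as np


def make_replicas(central, covmat, n_rep, freeze_level1=True, seed=0):
    """Build pseudodata replicas (hypothetical helper, toy model only)."""
    rng = np.random.default_rng(seed)
    zero = np.zeros(len(central))
    # Level-1 shift drawn once and reused for every replica when frozen.
    frozen_shift = rng.multivariate_normal(zero, covmat)
    replicas = np.empty((n_rep, len(central)))
    for k in range(n_rep):
        if freeze_level1:
            level1 = frozen_shift
        else:
            # Old behaviour discussed above: a new level-1 shift per replica.
            level1 = rng.multivariate_normal(zero, covmat)
        # Level-2 shift: always independent from replica to replica.
        level2 = rng.multivariate_normal(zero, covmat)
        replicas[k] = central + level1 + level2
    return replicas


# Toy inputs (assumed values): 5 points with variance 0.04 each.
central = np.ones(5)
covmat = 0.04 * np.eye(5)

frozen = make_replicas(central, covmat, n_rep=5000, freeze_level1=True)
redrawn = make_replicas(central, covmat, n_rep=5000, freeze_level1=False)

# Average per-point variance across replicas in each scheme.
var_frozen = np.cov(frozen, rowvar=False).diagonal().mean()
var_redrawn = np.cov(redrawn, rowvar=False).diagonal().mean()
```

Under these assumptions `var_frozen` comes out close to the input variance, while `var_redrawn` is close to twice that, i.e. the replicas would no longer be distributed according to the intended covmat.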
Thanks! This should now treat the pseudodata on exactly the same footing as the real datasets. I will run a fit and check (unless our cluster is again polluted by the ATLAS guy...).
In 5b3bd29, I just moved the addition of the level-1 noise to the data.loader module for ease of computing the
If we're confident this is what we want to do, I guess this can be merged?
Co-authored-by: Roy Stegeman <r.stgmn@gmail.com>
With the couple of replicas already done (the fit is not fully complete yet), the results are what we'd expect.
Okay good. Well, my reason for merging is that, from a purely methodological point of view, this is what we agreed to do, whether we get chi2~1 or not. If it turns out that there are other problems and this does not produce the desired results, we can address those problems in a separate PR (and keep the discussions organized).
Finally, here is the result of the fit: https://data.nnpdf.science/NNUSF/reports/addl1noise-matching-e774de7-221018/output/ . The results are now as we'd expect. On the data comparison plots, the Yadism datapoints are now the fluctuated ones. Yes, I agree that we should merge this now; any minor changes can be added elsewhere.
Resolves #55