Theory covmats with matched cuts #305

RosalynLP · 2018-10-16T09:44:50Z

I am trying to work on creating theory covmats with cuts that match those of the shift matrices here: https://vp.nnpdf.science/NlltmlyWRRqCtSeJbi1xIQ==/.

Currently the theory covmats take in


each_dataset_results_bytheory = collect('results_bytheoryids',
                                        ('experiments', 'experiment'))

so I have tried to alter them to take in

each_dataset_results_matched = collect('results_bytheoryids',
                                       ['dataspecs_with_matched_cuts'])

However, I am getting the error


[ERROR]: Bad configuration encountered:
A parameter is required: dataset_input.
This is needed to process:
 - dataset
trough:
 - (('default_theory', 0),)
trough:
 - report
trough:
 - template_text
trough:
 - plot_thcorrmat_heatmap_custom
trough:
 - theory_corrmat_custom
trough:
 - theory_covmat_custom
trough:
 - covs_pt_prescrip
trough:
 - combine_by_type
trough:
 - each_dataset_results_matched

and I am not sure why. Is this the way I should be trying to do this?


Zaharid · 2018-10-16T10:43:33Z

ISTM that what we want is to first collect results over dataspecs instead of:

/n/nnpdf (prescrip2 %) $ validphys --help results_bytheoryids
results_bytheoryids

Defined in: reportengine.resourcebuilder

results_bytheoryids()

The result of `results` for each in ('theoryids',).

and then everything else should follow (either with a fair amount of duplicated functions that only call the old functions or with NNPDF/reportengine#63 ).

Zaharid · 2018-10-16T10:44:14Z

Note that we already have dataspecs_results.

Zaharid · 2018-10-16T10:51:11Z

I think the runcard should look something like this. Please note there is this annoying bug at the moment NNPDF/reportengine#16

fit: XXX
use_cuts: "fromfit"
pdf: YYY
dataspecs:
  - theoryid: ZZZ
    experiments: ... # Probably has to go here for now. Sorry!
  - theoryid: XYXY
    experiments: ...
  ...

and then the actions would collect over:

matched_datasets_from_dataspecs::dataspecs_with_matched_cuts::dataspecs_results

or something like that. Note that this is the same as the shift matrix business.

Zaharid · 2018-10-16T13:56:56Z

Any progress with this? Any problems I could help with?

RosalynLP · 2018-10-16T14:04:33Z

Yes I'm still having problems with the runcards. Currently I have:



meta:
   author: Rosalyn Pearson
   keywords: [test, theory uncertainties, matched cuts]
   title: Testing theory covariance matrix with matched cuts
default_theory:
   - theoryid: 163

fivetheories: nobar

theoryids:
   - 163
   - 177
   - 176
   - 179
   - 174
#   - 180
#   - 173
#   - 175
#   - 178

dataspecs:
        - theoryid: 163
          speclabel: $(\xi_F,\xi_R)=(1,1)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 177
          speclabel: $(\xi_F,\xi_R)=(2,1)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 176
          speclabel: $(\xi_F,\xi_R)=(0.5,1)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 179
          speclabel: $(\xi_F,\xi_R)=(1,2)$ 
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 174
          speclabel: $(\xi_F,\xi_R)=(1,0.5)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
#        - theoryid: 180
#          speclabel: $(\xi_F,\xi_R)=(2,2)$ 
#        - theoryid: 173
#          speclabel: $(\xi_F,\xi_R)=(0.5,0.5)$
#        - theoryid: 175
#          speclabel: $(\xi_F,\xi_R)=(2,0.5)$   
#        - theoryid: 178
#          speclabel: $(\xi_F,\xi_R)=(0.5,2)$

normalize_to: 1

use_cuts: 'fromfit'
fit: NNPDF31_nlo_as_0118_1000

pdf:
  from_: fit

#template_text: |
#
#   {@with default_theory@}
#
#   {@plot_thcorrmat_heatmap_custom@}
#
#   {@endwith@}

actions_:
#  - report(main=true)
   - matched_datasets_from_dataspecs::dataspecs_with_matched_cuts::dataspecs_results plot_thcorrmat_heatmap_custom

and I am getting the error


[ERROR]: Bad configuration encountered:
A parameter is required: theoryid.
This is needed to process:
 - experiments
trough:
 - dataspecs
trough:
 - matched_datasets_from_dataspecs
trough:
 - ()
trough:
 - plot_thcorrmat_heatmap_custom
Maybe you mistyped theoryid in one of the following keys?
 - theoryids
 - fivetheories

RosalynLP · 2018-10-16T14:05:33Z

I also don't really understand whether this action I am doing is the right thing - I haven't yet altered anything in the code either as I am not really able to debug without getting a basic runcard to work.

RosalynLP · 2018-10-16T14:05:47Z

sorry that was an accident

Zaharid · 2018-10-16T15:59:34Z

Note that the runcard above is wrong in that it has the structure:

dataspecs: [
{theoryid: ...},
{experiments: ...},
{theoryid: ...},
{experiments: ...},
...
]

rather than:

dataspecs: [
{experiments: ..., theoryid: ...},
{experiments: ..., theoryid: ...},
...
]

which is what the error message is telling you.
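
To make the difference concrete, here is a minimal sketch in plain Python of what the two layouts look like once parsed into lists of mappings. The helper `missing_keys` is hypothetical (it is not part of validphys or reportengine), and the experiment lists are shortened placeholders; it only reports which entries lack `theoryid` or `experiments`.

```python
# Hypothetical helper for illustration; not part of validphys or reportengine.
REQUIRED = ("theoryid", "experiments")

def missing_keys(dataspecs):
    """Return {entry_index: [missing keys]} for entries lacking a required key."""
    problems = {}
    for i, spec in enumerate(dataspecs):
        missing = [key for key in REQUIRED if key not in spec]
        if missing:
            problems[i] = missing
    return problems

# What the runcard above parses to: theoryid and experiments in separate entries.
wrong = [
    {"theoryid": 163, "speclabel": "(1,1)"},
    {"experiments": ["NMC", "SLAC"]},
    {"theoryid": 177, "speclabel": "(2,1)"},
    {"experiments": ["NMC", "SLAC"]},
]

# Intended layout: one entry per theory, holding both keys.
right = [
    {"theoryid": 163, "speclabel": "(1,1)", "experiments": ["NMC", "SLAC"]},
    {"theoryid": 177, "speclabel": "(2,1)", "experiments": ["NMC", "SLAC"]},
]

print(missing_keys(wrong))  # every entry is missing one of the two keys
print(missing_keys(right))  # {}
```

In the runcard itself this corresponds to dropping the dash in front of every `experiments:` so that it sits in the same list item as the preceding `theoryid`.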

RosalynLP · 2018-10-16T16:04:34Z

I don't understand sorry, I tried taking the '-' away from the start of "experiments" but this didn't help

Zaharid · 2018-10-16T16:06:03Z

What does "didn't help" mean? I don't think it can give the same error.

RosalynLP · 2018-10-16T16:08:05Z

Ah, initially I left one with a dash by accident but now it says


[ERROR]: Bad configuration encountered:
A parameter is required: dataspecs_results.
This is needed to process:
 - (('matched_datasets_from_dataspecs', 0), ('dataspecs_with_matched_cuts', 0))
trough:
 - plot_thcorrmat_heatmap_custom
Maybe you mistyped dataspecs_results in one of the following keys?
 - dataspecs

Zaharid · 2018-10-16T16:23:00Z

This is because dataspecs_results is not something you are supposed to expand namespaces over, but rather something you are supposed to collect over (my earlier message wasn't all that clear in that regard). However, it should be easy enough to look at how matched_datasets_shift_matrix works and do the equivalent thing. Note that pretty much the only change is to call make_scale_var_covmat instead of computing the shifts.
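
A toy model of the distinction, in plain Python. This is not reportengine's implementation: `toy_collect`, the `results` stand-in and the namespace dictionaries are all made up for illustration. The only point is that a collected quantity is a single list, built by evaluating one provider once per namespace, so it lives outside those namespaces rather than inside each of them.

```python
# Toy model only: reportengine's real collect operates on its resource graph.
def toy_collect(provider, namespaces):
    """Evaluate `provider` once per namespace and return the results as a list."""
    return [provider(**ns) for ns in namespaces]

def results(theoryid, dataset):
    # Stand-in for a provider whose value depends on the namespace contents.
    return f"{dataset}@{theoryid}"

dataspecs = [
    {"theoryid": 163, "dataset": "NMC"},
    {"theoryid": 177, "dataset": "NMC"},
]

# One list covering all dataspecs; an action consumes it as a single input.
dataspecs_results = toy_collect(results, dataspecs)
print(dataspecs_results)  # ['NMC@163', 'NMC@177']
```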

RosalynLP · 2018-10-16T16:24:23Z

I'm really confused, in that case what do I put in the runcard? What is wrong with the current runcard?

Zaharid · 2018-10-16T16:27:51Z

Have a look at this runcard:

https://vp.nnpdf.science/NlltmlyWRRqCtSeJbi1xIQ==/input/runcard.yaml

and the corresponding code and try to work out how things get passed around (maybe run it with --debug). Btw it is quite likely that it is affected by the reportengine bug and all the differences are due to changing the pdf...

RosalynLP · 2018-10-16T16:28:06Z

Also we don't want to call make_scale_var_covmat right? Because that won't correlate between process types in the correct way. Ultimately we want to call theory_covmat_custom but this is not easily equatable with matched_datasets_shift_matrix, or at least I don't see how to write an equivalent (this is what I was trying to do earlier).

RosalynLP · 2018-10-16T16:29:21Z

Sorry Zahari, this is the runcard I have been looking at most of the day and I just really don't understand it and can't get it to work properly for some reason

RosalynLP · 2018-10-16T16:30:05Z

I just don't understand how to extend it to the point prescription case, I don't think it is an obvious extension

Zaharid · 2018-10-16T16:35:08Z

Incidentally ISTM that the runcard works well, which makes the bug in re even more confusing.

RosalynLP · 2018-10-16T16:39:08Z

Wait what, the runcard I pasted above?

Zaharid · 2018-10-16T16:39:54Z

The one with the shift plots.

RosalynLP · 2018-10-16T16:55:14Z

Ah no I mean I think I understand how the shift plots work but I just am having difficulty doing an equivalent because

a) I don't understand how to adjust the runcard
b) I am not sure what to feed in to which new functions. What I am attempting is

matched_dataspecs_dataspecs_results = collect('dataspecs_results', ['dataspecs_with_matched_cuts'])

matched_datasets_matched_dataspecs_dataspecs_results = collect('matched_dataspecs_dataspecs_results', ['matched_datasets_from_dataspecs'])

Then writing a new combine_by_type which takes matched_datasets_matched_dataspecs_dataspecs_results rather than each_dataset_results_bytheory but has no other changes.

Is this correct? Is there any part of this which is wrong?

Zaharid · 2018-10-16T17:01:20Z

ISTM that everything could be adapted more or less easily (but not trivially) by changing the namespaces the various actions collect over. E.g. this

results_bytheoryids = collect(results,('theoryids',))
each_dataset_results_bytheory = collect('results_bytheoryids', ('experiments', 'experiment'))

could become:

results_bytheoryids = collect(results,('dataspecs_with_matched_cuts',))
each_dataset_results_bytheory = collect('results_bytheoryids', ('matched_datasets_from_dataspecs',))

and then maybe you'll need some function to get the right dataframe index (I wrote the functionality inside some other provider).

Zaharid · 2018-10-16T17:09:14Z

@RosalynLP Yes, what you are doing seems like what I said.

RosalynLP · 2018-10-17T10:40:15Z

OK great but I keep getting this problem:


[ERROR]: Bad configuration encountered:
A parameter is required: dataset_input.
This is needed to process:
 - commondata
trough:
 - report
trough:
 - template_text
trough:
 - plot_thcorrmat_heatmap_custom
trough:
 - theory_corrmat_custom
trough:
 - theory_covmat_custom
trough:
 - covs_pt_prescrip
trough:
 - combine_by_type
trough:
 - process_lookup
trough:
 - commondata_experiments

Initially I had

#commondata_experiments = collect('commondata', ['experiments', 'experiment'])

and I tried changing it to

commondata_experiments = collect('commondata',
                                 ('matched_datasets_from_dataspecs',))

but I still get the issue because of commondata itself.

RosalynLP · 2018-10-17T10:42:35Z

I only need the names of the experiments for this so I could take it from any dataspec but I am not sure how to do the syntax for this

RosalynLP · 2018-10-17T11:04:23Z

OK I did this instead


commondata_experiments_sub = collect('commondata', ['dataspecs_with_matched_cuts'])
commondata_experiments = collect('commondata_experiments_sub',['matched_datasets_from_dataspecs'])
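
Since only the experiment and dataset names are needed, one entry per dataset is enough. A minimal sketch, assuming the nested collect above yields a list (one item per matched dataset) of lists (one item per dataspec); `FakeCommonData` and `first_per_dataset` are hypothetical stand-ins, not validphys objects:

```python
from collections import namedtuple

# Hypothetical stand-in for a commondata object; only the name matters here.
FakeCommonData = namedtuple("FakeCommonData", ["name", "theoryid"])

def first_per_dataset(nested):
    """Keep the first dataspec's entry for each matched dataset."""
    return [per_dataspec[0] for per_dataspec in nested]

# Assumed shape: outer list over matched datasets, inner list over dataspecs.
commondata_experiments = [
    [FakeCommonData("NMC", 163), FakeCommonData("NMC", 177)],
    [FakeCommonData("SLACP", 163), FakeCommonData("SLACP", 177)],
]

names = [cd.name for cd in first_per_dataset(commondata_experiments)]
print(names)  # ['NMC', 'SLACP']
```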

RosalynLP · 2018-10-17T11:11:21Z

@Zaharid when you say "and then maybe you'll need some function to get the right dataframe index (I wrote the functionality inside some other provider)." I presume what you mean is the fact experiments_index doesn't work and gives the error


[ERROR]: Bad configuration encountered:
A parameter is required: experiments.
This is needed to process:
 - report
trough:
 - template_text
trough:
 - plot_thcorrmat_heatmap_custom
trough:
 - theory_corrmat_custom
trough:
 - theory_covmat_custom
trough:
 - experiments_index

but I don't understand what the statement "I wrote the functionality inside some other provider" means - what are the different providers? So the issue is experiments_index loads in the experiments before the cuts have been matched or something? What is the correct input rather than experiments to this kind of function?

RosalynLP · 2018-10-17T11:24:53Z

So we want to take in only the datasets which are mutual, i.e. those in matched_datasets_from_dataspecs?

RosalynLP · 2018-10-17T15:24:23Z

@Zaharid this whole thing makes no sense to me. Even if I calculate the theory covmats using this runcard and matched_datasets_from_dataspecs, it is taking the dataspecs to be the different scale-varied dataspecs, not the two dataspecs for NLO and NNLO with NNPDF3.1. So I somehow want to have two kinds of groupings in the runcard, one for the scale-varied dataspecs, and one for the shift dataspecs. We then need the shift dataspecs to do as your functions already do and compute the shift matrix, and we need the other dataspecs to do the theory covmat stuff which previously existed. But we want to do all that just for the points belonging to the matched datasets from the OTHER (shift) dataspecs. Do you know how to separate these two things?

Zaharid · 2018-10-17T15:51:44Z


This can be solved with various kinds of namespaces:

shiftconfig:
    dataspecs:
        - ... # nlo vs nnlo

thcovconfig:
    dataspecs:
        - ... # bazillion theories

# TODO: find better names
shift_mat_for_comparison = collect('shift_matrix_whatever_was_called', ['shiftconfig'])
th_mat_for_comparison = collect('thcovmat_custom_whatever', ['thcovconfig'])

def do_some_comparison(shift_mat_for_comparison, th_mat_for_comparison):
    # because collect always returns a list
    shift_mat = shift_mat_for_comparison[0]
    th_mat = th_mat_for_comparison[0]
    ...


Zaharid · 2018-10-17T15:56:15Z

Mmm, probably the fact that collect always returns a list is annoying enough to justify a collect_one or somesuch. Anyhow, let's get the thcovmat done first!
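
In the meantime the unwrapping can be done explicitly with a small helper. This is a sketch of the idea only: `collect_one` is not an existing reportengine function, and `unwrap_single` is a made-up name with made-up example data.

```python
def unwrap_single(collected):
    """Return the only element of a one-element list; raise if the length differs."""
    if len(collected) != 1:
        raise ValueError(f"expected exactly one element, got {len(collected)}")
    return collected[0]

# Shape assumed for a collect over a single namespace: a list holding one matrix.
shift_mat_for_comparison = [[[1.0, 0.5], [0.5, 2.0]]]
shift_mat = unwrap_single(shift_mat_for_comparison)
print(shift_mat[0][1])  # 0.5
```

Failing loudly on any other length guards against silently picking the first of several matrices if the namespace list ever grows.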


RosalynLP · 2018-10-18T10:14:24Z

OK so when are the matched cuts being applied? Before this collect function presumably? In which case how do they know which dataspecs to pick? Or does the fact you collect over a certain namespace have an effect? I basically still don't see that this will make the theory covmat have the matched cuts for the shift comparison.

RosalynLP · 2018-10-18T10:23:21Z

#309

RosalynLP · 2018-10-18T10:54:10Z

I don't understand how the collect function is working here, for the shift matrix the workflow is essentially

matched_dataspecs_dataset_prediction_shift = collect(
    'dataspecs_dataset_prediction_shift', ['matched_datasets_from_dataspecs'])

def matched_datasets_shift_matrix(matched_dataspecs_dataset_prediction_shift):
    """Produce a matrix out of the outer product of
    ``dataspecs_dataset_prediction_shift``. The matrix will be a
    pandas DataFrame, indexed similarly to ``experiments_index``."""
    all_shifts = np.concatenate(
        [val.shifts for val in matched_dataspecs_dataset_prediction_shift])
    mat = np.outer(all_shifts, all_shifts)
    #build index
    expnames = np.concatenate([
        np.full(len(val.shifts), val.experiment_name, dtype=object)
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    dsnames = np.concatenate([
        np.full(len(val.shifts), val.dataset_name, dtype=object)
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    point_indexes = np.concatenate([
        np.arange(len(val.shifts))
        for val in matched_dataspecs_dataset_prediction_shift
    ])

    index = pd.MultiIndex.from_arrays(
        [expnames, dsnames, point_indexes],
        names=["Experiment name", "Dataset name", "Point"])

    return pd.DataFrame(mat, columns=index, index=index)

shift_mat_for_comparison = collect('matched_datasets_shift_matrix', ['shiftconfig'])
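
Reduced to plain Python with made-up numbers, the function above amounts to the following. `ShiftInfo` is a hypothetical stand-in for the per-dataset objects (the real ones come from `dataspecs_dataset_prediction_shift`); the collected list has one item per matched dataset, and the result is a single square matrix over all points.

```python
from collections import namedtuple

# Hypothetical stand-in for the per-dataset shift objects.
ShiftInfo = namedtuple("ShiftInfo", ["shifts", "experiment_name", "dataset_name"])

collected = [
    ShiftInfo([0.1, -0.2], "NMC", "NMCPD"),
    ShiftInfo([0.3], "SLAC", "SLACP"),
]

# Concatenate the shifts of every matched dataset into one vector.
all_shifts = [s for val in collected for s in val.shifts]
# Outer product: entry (i, j) is shift_i * shift_j.
mat = [[a * b for b in all_shifts] for a in all_shifts]
# One (experiment, dataset, point) label per row and column.
index = [
    (val.experiment_name, val.dataset_name, i)
    for val in collected
    for i in range(len(val.shifts))
]

print(len(mat), len(mat[0]))  # 3 3
print(index[2])  # ('SLAC', 'SLACP', 0)
```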

So I don't see how this works. First, matched_datasets_from_dataspecs won't know which dataspecs to use, right? Then, even if that works, you should end up with a list of matrices collected over the two theories (NLO and NNLO) or something, which makes no sense to me.

As for the theories, it won't know which dataspecs to use to evaluate the theory covmat. At best it will use the mutual cuts from the scale-varied theories, which aren't the same as the NLO/NNLO mutual cuts, and then you will collect over all the theories and so end up with a list of matrices. But as far as I can see it will fail before this stage.

Regardless, I am having problems just getting the formatting on the runcard to work as it doesn't like all the different blocks:

Failed to parse yaml file: while parsing a block mapping
  in "matched_test_notab.yaml", line 23, column 9
expected <block end>, but found '-'
  in "matched_test_notab.yaml", line 100, column 9

meta:
   author: Rosalyn Pearson
   keywords: [test, theory uncertainties, matched cuts]
   title: Testing theory covariance matrix with matched cuts
default_theory:
   - theoryid: 163

fivetheories: nobar
   
theoryids:
   - 163
   - 177
   - 176
   - 179
   - 174
#   - 180
#   - 173
#   - 175
#   - 178

thcovconfig:
   dataspecs:
      - theoryid: 163
        speclabel: $(\xi_F,\xi_R)=(1,1)$
        experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
                 - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 177
          speclabel: $(\xi_F,\xi_R)=(2,1)$
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
                 - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 176
          speclabel: $(\xi_F,\xi_R)=(0.5,1)$
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
                 - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 179
          speclabel: $(\xi_F,\xi_R)=(1,2)$ 
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
                 - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 174
          speclabel: $(\xi_F,\xi_R)=(1,0.5)$
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
                 - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
#        - theoryid: 180
#          speclabel: $(\xi_F,\xi_R)=(2,2)$ 
#        - theoryid: 173
#          speclabel: $(\xi_F,\xi_R)=(0.5,0.5)$
#        - theoryid: 175
#          speclabel: $(\xi_F,\xi_R)=(2,0.5)$   
#        - theoryid: 178
#          speclabel: $(\xi_F,\xi_R)=(0.5,2)$

shiftconfig:
   dataspecs:
      - theoryid: 52
        pdf: NNPDF31_nlo_as_0118_hessian
        speclabel: "NLO"
        fit: NNPDF31_nlo_as_0118_1000

      - theoryid: 53
        pdf: NNPDF31_nnlo_as_0118_hessian
        speclabel: "NNLO"
        fit: NNPDF31_nnlo_as_0118_1000

normalize_to: 1

use_cuts: 'fromfit'
fit: NNPDF31_nlo_as_0118_1000

pdf:
  from_: fit

template_text: |

   {@with default_theory@}

   {@plot_thcorrmat_heatmap_custom@}

   {@endwith@}

   {@with shiftconfig@}

   {@plot_matched_datasets_shift_matrix@}
   {@plot_matched_datasets_shift_matrix_correlations@}

   {@endwith@}

actions_:
  - report(main=true)
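The positions in the parse error appear to match the runcard as pasted: line 23 is the first `- theoryid: 163` entry, indented by 6 spaces, while line 100 is `- theoryid: 177`, indented by 8 spaces, so the second entry is not read as a sibling of the first. The `- experiment: SLAC` entries also look to be nested under the NMC `datasets` list rather than alongside the other experiments. The check below is a rough sketch for spotting this kind of inconsistency; `list_item_indents` is a hypothetical helper written for this illustration and uses only the standard library, not a YAML parser.

```python
import re


def list_item_indents(text, key):
    """Return (line_number, indent) for every '- key:' list item in text."""
    pattern = re.compile(r"^( *)- " + re.escape(key) + r":")
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = pattern.match(line)
        if match:
            found.append((lineno, len(match.group(1))))
    return found


# Cut-down fragment with the same indentation as the runcard above.
fragment = """\
thcovconfig:
   dataspecs:
      - theoryid: 163
        speclabel: a
        - theoryid: 177
          speclabel: b
"""

items = list_item_indents(fragment, "theoryid")
# Sibling list items must share one indentation level.
consistent = len({indent for _, indent in items}) == 1
```

Here `items` records indents of 6 and 8 for the two entries, so `consistent` is false. Commented-out entries are ignored because their lines start with `#`.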

RosalynLP assigned Zaharid and voisey Oct 16, 2018

RosalynLP closed this as completed Oct 16, 2018

RosalynLP reopened this Oct 16, 2018

lucarottoli mentioned this issue Nov 2, 2018

[WIP] NNPDF31 common cuts #318

Merged

RosalynLP closed this as completed Nov 26, 2018
