Theory covmats with matched cuts #305

Closed
RosalynLP opened this issue Oct 16, 2018 · 34 comments

@RosalynLP
Contributor

RosalynLP commented Oct 16, 2018

I am trying to work on creating theory covmats with cuts that match those of the shift matrices here: https://vp.nnpdf.science/NlltmlyWRRqCtSeJbi1xIQ==/.

Currently the theory covmats take in


each_dataset_results_bytheory = collect('results_bytheoryids',
                                        ('experiments', 'experiment'))

so I have tried to alter them to take in
each_dataset_results_matched = collect('results_bytheoryids', ['dataspecs_with_matched_cuts']).

However, I am getting the error


[ERROR]: Bad configuration encountered:
A parameter is required: dataset_input.
This is needed to process:
 - dataset
trough:
 - (('default_theory', 0),)
trough:
 - report
trough:
 - template_text
trough:
 - plot_thcorrmat_heatmap_custom
trough:
 - theory_corrmat_custom
trough:
 - theory_covmat_custom
trough:
 - covs_pt_prescrip
trough:
 - combine_by_type
trough:
 - each_dataset_results_matched

and I am not sure why. Is this the way I should be trying to do this?

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

ISTM that what we want is to first collect results over dataspecs instead of:

/n/nnpdf (prescrip2 %) $ validphys --help results_bytheoryids
results_bytheoryids

Defined in: reportengine.resourcebuilder

results_bytheoryids()

The result of `results` for each in ('theoryids',).

and then everything else should follow (either with a fair amount of duplicated functions that only call the old functions or with NNPDF/reportengine#63 ).
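
As a rough illustration of what collecting a provider over a list of namespaces means, here is a toy sketch in plain Python. It is not reportengine's implementation: `toy_collect` and `toy_results` are hypothetical names, and each namespace is modelled as a plain dictionary of parameters.

```python
# Toy model of "collect"; this is NOT how reportengine implements it.
def toy_collect(provider, namespaces):
    """Call ``provider`` once per namespace and return the results as a list."""
    return [provider(**ns) for ns in namespaces]


def toy_results(theoryid, dataset):
    # Hypothetical stand-in for the real ``results`` provider.
    return f"results({dataset}, theory {theoryid})"


# One namespace per theory, for a fixed dataset (illustrative values only).
theoryid_namespaces = [
    {"theoryid": 163, "dataset": "NMC"},
    {"theoryid": 177, "dataset": "NMC"},
]

results_bytheoryids = toy_collect(toy_results, theoryid_namespaces)
print(results_bytheoryids)
```

Collecting over dataspecs instead would, in this picture, only change which list of namespaces is passed in.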

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

Note that we already have dataspecs_results

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

I think the runcard should look something like this. Please note there is this annoying bug at the moment NNPDF/reportengine#16

fit: XXX
use_cuts: "fromfit"
pdf: YYY
dataspecs:
  - theoryid: ZZZ
    experiments: ... # Probably has to go here for now. Sorry!
  - theoryid: XYXY
    experiments: ...
  ...

and then the actions would collect over:

matched_datasets_from_dataspecs::dataspecs_with_matched_cuts::dataspecs_results

or something like that. Note that this is the same as the shift matrix business.

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

Any progress with this? Any problems I could help with?

@RosalynLP
Contributor Author

Yes I'm still having problems with the runcards. Currently I have:



meta:
   author: Rosalyn Pearson
   keywords: [test, theory uncertainties, matched cuts]
   title: Testing theory covariance matrix with matched cuts
default_theory:
   - theoryid: 163

fivetheories: nobar

theoryids:
   - 163
   - 177
   - 176
   - 179
   - 174
#   - 180
#   - 173
#   - 175
#   - 178

dataspecs:
        - theoryid: 163
          speclabel: $(\xi_F,\xi_R)=(1,1)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 177
          speclabel: $(\xi_F,\xi_R)=(2,1)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 176
          speclabel: $(\xi_F,\xi_R)=(0.5,1)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 179
          speclabel: $(\xi_F,\xi_R)=(1,2)$ 
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
        - theoryid: 174
          speclabel: $(\xi_F,\xi_R)=(1,0.5)$
        - experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                - dataset: NMCPD
                - dataset: NMC
            - experiment: SLAC
              datasets:
                - dataset: SLACP
                - dataset: SLACD
            - experiment: BCDMS
              datasets:
                - dataset: BCDMSP
                - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                - dataset: NTVNUDMN
                - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                - dataset: CHORUSNU
                - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                - dataset: HERACOMBNCEM 
                - dataset: HERACOMBNCEP460
                - dataset: HERACOMBNCEP575
                - dataset: HERACOMBNCEP820
                - dataset: HERACOMBNCEP920
                - dataset: HERACOMBCCEM 
                - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                - dataset: H1HERAF2B
                - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                - dataset: ATLASWZRAP36PB
                - dataset: ATLASZHIGHMASS49FB
                - dataset: ATLASLOMASSDY11EXT
                - dataset: ATLASWZRAP11
                - dataset: ATLAS1JET11
                - dataset: ATLASZPT8TEVMDIST
                - dataset: ATLASZPT8TEVYDIST
                - dataset: ATLASTTBARTOT
                - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                - dataset: CMSWEASY840PB
                - dataset: CMSWMASY47FB
                - dataset: CMSWCHARMRAT
                - dataset: CMSDY2D11
                - dataset: CMSWMU8TEV
                - dataset: CMSJETS11
                - dataset: CMSTTBARTOT
                - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                - dataset: LHCBZ940PB
                - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                - dataset: CDFZRAP
                - dataset: CDFR2KT
            - experiment: D0
              datasets:
                - dataset: D0ZRAP
                - dataset: D0WEASY
                - dataset: D0WMASY
#        - theoryid: 180
#          speclabel: $(\xi_F,\xi_R)=(2,2)$ 
#        - theoryid: 173
#          speclabel: $(\xi_F,\xi_R)=(0.5,0.5)$
#        - theoryid: 175
#          speclabel: $(\xi_F,\xi_R)=(2,0.5)$   
#        - theoryid: 178
#          speclabel: $(\xi_F,\xi_R)=(0.5,2)$

normalize_to: 1

use_cuts: 'fromfit'
fit: NNPDF31_nlo_as_0118_1000

pdf:
  from_: fit

#template_text: |
#
#   {@with default_theory@}
#
#   {@plot_thcorrmat_heatmap_custom@}
#
#   {@endwith@}

actions_:
#  - report(main=true)
   - matched_datasets_from_dataspecs::dataspecs_with_matched_cuts::dataspecs_results plot_thcorrmat_heatmap_custom


and I am getting the error


[ERROR]: Bad configuration encountered:
A parameter is required: theoryid.
This is needed to process:
 - experiments
trough:
 - dataspecs
trough:
 - matched_datasets_from_dataspecs
trough:
 - ()
trough:
 - plot_thcorrmat_heatmap_custom
Maybe you mistyped theoryid in one of the following keys?
 - theoryids
 - fivetheories

@RosalynLP
Contributor Author

I also don't really understand whether this action I am doing is the right thing - I haven't yet altered anything in the code either as I am not really able to debug without getting a basic runcard to work.

@RosalynLP RosalynLP reopened this Oct 16, 2018
@RosalynLP
Contributor Author

sorry that was an accident

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

Note that the runcard above is wrong in that it has the structure:

dataspecs: [
{theoryid: ...},
{experiments: ...},
{theoryid: ...},
{experiments: ...},
...
]

rather than:

dataspecs: [
{experiments: ..., theoryid: ...},
{experiments: ..., theoryid: ...},
...
]

which is what the error message is telling you.
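
The difference between the two layouts can be checked mechanically once the runcard is loaded. The sketch below assumes the `dataspecs` entry has already been parsed into a list of dictionaries. `check_dataspecs` is a hypothetical helper, not part of validphys, and the example entries are shortened placeholders.

```python
def check_dataspecs(dataspecs, required=("theoryid", "experiments")):
    """Return (position, missing_keys) for every entry lacking a required key."""
    problems = []
    for position, spec in enumerate(dataspecs):
        missing = [key for key in required if key not in spec]
        if missing:
            problems.append((position, missing))
    return problems


# Layout of the runcard above: theoryid and experiments in separate list items.
wrong = [
    {"theoryid": 163},
    {"experiments": ["NMC"]},
    {"theoryid": 177},
    {"experiments": ["NMC"]},
]

# Intended layout: each list item carries both keys.
right = [
    {"theoryid": 163, "experiments": ["NMC"]},
    {"theoryid": 177, "experiments": ["NMC"]},
]

print(check_dataspecs(wrong))
print(check_dataspecs(right))
```

In the first layout every second entry has no `theoryid`, which is consistent with the "A parameter is required: theoryid" message.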

@RosalynLP
Contributor Author

I don't understand sorry, I tried taking the '-' away from the start of "experiments" but this didn't help

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

What does "didn't help" mean? I don't think it can give the same error.

@RosalynLP
Contributor Author

Ah, initially I left one with a dash by accident but now it says


[ERROR]: Bad configuration encountered:
A parameter is required: dataspecs_results.
This is needed to process:
 - (('matched_datasets_from_dataspecs', 0), ('dataspecs_with_matched_cuts', 0))
trough:
 - plot_thcorrmat_heatmap_custom
Maybe you mistyped dataspecs_results in one of the following keys?
 - dataspecs

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

This is because dataspecs_results is not something you are supposed to expand namespaces over, but rather something you are supposed to collect over (my earlier message wasn't all that clear in that regard). However, it should be easy enough to look at how matched_datasets_shift_matrix works and do the equivalent thing. Note that pretty much the only change is to call make_scale_var_covmat instead of computing the shifts.

@RosalynLP
Contributor Author

I'm really confused, in that case what do I put in the runcard? What is wrong with the current runcard?

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

Have a look at this runcard:

https://vp.nnpdf.science/NlltmlyWRRqCtSeJbi1xIQ==/input/runcard.yaml

and the corresponding code, and try to work out how things get passed around (maybe run it with --debug). Btw it is quite likely that it is affected by the reportengine bug and all the differences are due to changing the pdf...

@RosalynLP
Contributor Author

Also we don't want to call make_scale_var_covmat right? Because that won't correlate between process types in the correct way. Ultimately we want to call theory_covmat_custom but this is not easily equatable with matched_datasets_shift_matrix, or at least I don't see how to write an equivalent (this is what I was trying to do earlier).

@RosalynLP
Contributor Author

Sorry Zahari, this is the runcard I have been looking at most of the day and I just really don't understand it and can't get it to work properly for some reason

@RosalynLP
Contributor Author

I just don't understand how to extend it to the point prescription case, I don't think it is an obvious extension

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

Incidentally ISTM that the runcard works well, which makes the bug in reportengine even more confusing.

@RosalynLP
Contributor Author

Wait what, the runcard I pasted above?

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

The one with the shift plots.

@RosalynLP
Contributor Author

RosalynLP commented Oct 16, 2018

Ah no I mean I think I understand how the shift plots work but I just am having difficulty doing an equivalent because

a) I don't understand how to adjust the runcard
b) I am not sure what to feed in to which new functions. What I am attempting is

matched_dataspecs_dataspecs_results = collect('dataspecs_results', ['dataspecs_with_matched_cuts'])

matched_datasets_matched_dataspecs_dataspecs_results = collect('matched_dataspecs_dataspecs_results', ['matched_datasets_from_dataspecs'])

Then writing a new combine_by_type which takes matched_datasets_matched_dataspecs_dataspecs_results rather than each_dataset_results_bytheory but has no other changes.

Is this correct? Is there any part of this which is wrong?

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

ISTM that everything could be adapted more or less easily (but not trivially) by changing the namespaces the various actions collect over. E.g. this

results_bytheoryids = collect(results,('theoryids',))
each_dataset_results_bytheory = collect('results_bytheoryids', ('experiments', 'experiment'))

could become:

results_bytheoryids = collect(results,('dataspecs_with_matched_cuts',))
each_dataset_results_bytheory = collect('results_bytheoryids', ('matched_datasets_from_dataspecs',))

and then maybe you'll need some function to get the right dataframe index (I wrote the functionality inside some other provider).
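
To picture what two nested collects of this kind return, here is a toy sketch in plain Python. It is only a model of the resulting shape, not reportengine code: `toy_collect`, `toy_results` and `results_bydataspec` are hypothetical names, and namespaces are modelled as dictionaries with made-up contents.

```python
def toy_collect(provider, namespaces):
    """Call ``provider`` on every namespace and return the results as a list."""
    return [provider(ns) for ns in namespaces]


# Illustrative namespaces: two dataspecs (theories) and two matched datasets.
dataspecs = [{"theoryid": 163}, {"theoryid": 177}]
matched_datasets = [{"dataset": "NMC"}, {"dataset": "SLACP"}]


def toy_results(dataset_ns, spec_ns):
    # Hypothetical stand-in for ``results``: just records what it was asked for.
    return (dataset_ns["dataset"], spec_ns["theoryid"])


def results_bydataspec(dataset_ns):
    # Inner collect: one result per dataspec, for a fixed dataset.
    return toy_collect(lambda spec_ns: toy_results(dataset_ns, spec_ns), dataspecs)


# Outer collect: one inner list per matched dataset.
each_dataset_results = toy_collect(results_bydataspec, matched_datasets)
print(each_dataset_results)
```

The outer list runs over datasets and each inner list runs over theories, which is the shape a function written for `each_dataset_results_bytheory` would presumably still expect.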

@Zaharid
Contributor

Zaharid commented Oct 16, 2018

@RosalynLP Yes, what you are doing seems like what I said.

@RosalynLP
Contributor Author

OK great but I keep getting this problem:


[ERROR]: Bad configuration encountered:
A parameter is required: dataset_input.
This is needed to process:
 - commondata
trough:
 - report
trough:
 - template_text
trough:
 - plot_thcorrmat_heatmap_custom
trough:
 - theory_corrmat_custom
trough:
 - theory_covmat_custom
trough:
 - covs_pt_prescrip
trough:
 - combine_by_type
trough:
 - process_lookup
trough:
 - commondata_experiments

Initially I had

#commondata_experiments = collect('commondata', ['experiments', 'experiment'])

and I tried changing it to

commondata_experiments = collect('commondata',
                                 ('matched_datasets_from_dataspecs',))

but I still get the issue because of commondata itself.

@RosalynLP
Contributor Author

I only need the names of the experiments for this so I could take it from any dataspec but I am not sure how to do the syntax for this

@RosalynLP
Contributor Author

OK I did this instead


commondata_experiments_sub = collect('commondata', ['dataspecs_with_matched_cuts'])
commondata_experiments = collect('commondata_experiments_sub',['matched_datasets_from_dataspecs'])

@RosalynLP
Contributor Author

@Zaharid when you say "and then maybe you'll need some function to get the right dataframe index (I wrote the functionality inside some other provider)." I presume what you mean is the fact experiments_index doesn't work and gives the error


[ERROR]: Bad configuration encountered:
A parameter is required: experiments.
This is needed to process:
 - report
trough:
 - template_text
trough:
 - plot_thcorrmat_heatmap_custom
trough:
 - theory_corrmat_custom
trough:
 - theory_covmat_custom
trough:
 - experiments_index

but I don't understand what the statement "I wrote the functionality inside some other provider" means - what are the different providers? So the issue is experiments_index loads in the experiments before the cuts have been matched or something? What is the correct input rather than experiments to this kind of function?

@RosalynLP
Contributor Author

So we want to take in only the datasets which are mutual, i.e. those in matched_datasets_from_dataspecs?
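
If "mutual" is read as "present in every dataspec", the selection amounts to a set intersection. The sketch below shows only that reading. It is an assumption about the intent, not a description of what `matched_datasets_from_dataspecs` does internally, and `mutual_datasets` is a hypothetical helper with made-up input.

```python
def mutual_datasets(dataspecs):
    """Return dataset names present in every dataspec, in the first one's order."""
    if not dataspecs:
        return []
    common = set(dataspecs[0]["datasets"])
    for spec in dataspecs[1:]:
        common &= set(spec["datasets"])
    return [name for name in dataspecs[0]["datasets"] if name in common]


# Illustrative input: the second dataspec lacks SLACP and has an extra dataset.
toy_dataspecs = [
    {"theoryid": 163, "datasets": ["NMCPD", "NMC", "SLACP"]},
    {"theoryid": 177, "datasets": ["NMC", "NMCPD", "BCDMSP"]},
]

print(mutual_datasets(toy_dataspecs))
```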

@RosalynLP
Contributor Author

@Zaharid this whole thing makes no sense to me. Even if I calculate the theory covmats using this runcard and matched_datasets_from_dataspecs, it takes the dataspecs to be the different scale-varied dataspecs, not the two dataspecs for NLO and NNLO with NNPDF3.1. So I somehow want two kinds of grouping in the runcard: one for the scale-varied dataspecs and one for the shift dataspecs. The shift dataspecs should do what your functions already do and compute the shift matrix, and the other dataspecs should do the theory covmat calculation that existed previously. But we want to do all of that only for the points belonging to the matched datasets from the OTHER (shift) dataspecs. Do you know how to separate these two things?

@Zaharid
Contributor

Zaharid commented Oct 17, 2018 via email

@Zaharid
Contributor

Zaharid commented Oct 17, 2018 via email

@RosalynLP
Contributor Author

OK so when are the matched cuts being applied? Before this collect function presumably? In which case how do they know which dataspecs to pick? Or does the fact you collect over a certain namespace have an effect? I basically still don't see that this will make the theory covmat have the matched cuts for the shift comparison.

@RosalynLP
Contributor Author

#309

@RosalynLP
Contributor Author

I don't understand how the collect function is working here, for the shift matrix the workflow is essentially

matched_dataspecs_dataset_prediction_shift = collect(
    'dataspecs_dataset_prediction_shift', ['matched_datasets_from_dataspecs'])

def matched_datasets_shift_matrix(matched_dataspecs_dataset_prediction_shift):
    """Produce a matrix out of the outer product of
    ``dataspecs_dataset_prediction_shift``. The matrix will be a
    pandas DataFrame, indexed similarly to ``experiments_index``."""
    all_shifts = np.concatenate(
        [val.shifts for val in matched_dataspecs_dataset_prediction_shift])
    mat = np.outer(all_shifts, all_shifts)
    #build index
    expnames = np.concatenate([
        np.full(len(val.shifts), val.experiment_name, dtype=object)
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    dsnames = np.concatenate([
        np.full(len(val.shifts), val.dataset_name, dtype=object)
        for val in matched_dataspecs_dataset_prediction_shift
    ])
    point_indexes = np.concatenate([
        np.arange(len(val.shifts))
        for val in matched_dataspecs_dataset_prediction_shift
    ])

    index = pd.MultiIndex.from_arrays(
        [expnames, dsnames, point_indexes],
        names=["Experiment name", "Dataset name", "Point"])

    return pd.DataFrame(mat, columns=index, index=index)

shift_mat_for_comparison = collect('matched_datasets_shift_matrix', ['shiftconfig'])

So I don't see how this works: first the matched_datasets_from_dataspecs won't know which dataspecs to use, right, then even if that works you should end up with a list of matrices collected over the two theories NLO and NNLO or something, which makes no sense to me.

And then as for theories it won't know which dataspecs to use to evaluate the theory covmat, you'll end up with it using at best the mutual cuts from the scale varied theories, which aren't the same as for the NLO/NNLO mutual cuts, and then you will collect over all the theories, so end up with a list of matrices. But as far as I can see it will fail before this stage.
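
To make the quoted `matched_datasets_shift_matrix` more concrete, here is a dependency-free sketch of the same two steps: concatenating the per-dataset shifts into one vector and taking its outer product, then labelling each point by (experiment, dataset, point number). It uses plain lists instead of NumPy and pandas so that it runs anywhere. The `Shift` container, `toy_shift_matrix` and the numbers are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical container mirroring the attributes used in the quoted function.
Shift = namedtuple("Shift", ["shifts", "experiment_name", "dataset_name"])


def toy_shift_matrix(per_dataset_shifts):
    """Return (matrix, index) built from a list of per-dataset shifts."""
    # Concatenate all shifts into a single flat vector.
    all_shifts = [s for val in per_dataset_shifts for s in val.shifts]
    # Outer product of the vector with itself.
    matrix = [[a * b for b in all_shifts] for a in all_shifts]
    # One label per data point: (experiment, dataset, point within dataset).
    index = [
        (val.experiment_name, val.dataset_name, point)
        for val in per_dataset_shifts
        for point in range(len(val.shifts))
    ]
    return matrix, index


# Made-up shifts for two datasets, with two points and one point.
toy = [Shift([1.0, 2.0], "NMC", "NMCPD"), Shift([3.0], "SLAC", "SLACP")]
matrix, index = toy_shift_matrix(toy)
print(matrix)
print(index)
```

Here the input list runs over matched datasets and the output is a single matrix. How many such matrices a further collect produces depends on the namespace list it is collected over.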

Regardless, I am having problems just getting the formatting on the runcard to work as it doesn't like all the different blocks:

Failed to parse yaml file: while parsing a block mapping
  in "matched_test_notab.yaml", line 23, column 9
expected <block end>, but found '-'
  in "matched_test_notab.yaml", line 100, column 9
meta:
   author: Rosalyn Pearson
   keywords: [test, theory uncertainties, matched cuts]
   title: Testing theory covariance matrix with matched cuts
default_theory:
   - theoryid: 163

fivetheories: nobar
   
theoryids:
   - 163
   - 177
   - 176
   - 179
   - 174
#   - 180
#   - 173
#   - 175
#   - 178

thcovconfig:
   dataspecs:
      - theoryid: 163
        speclabel: $(\xi_F,\xi_R)=(1,1)$
        experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
                 - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 177
          speclabel: $(\xi_F,\xi_R)=(2,1)$
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
            - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 176
          speclabel: $(\xi_F,\xi_R)=(0.5,1)$
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
            - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 179
          speclabel: $(\xi_F,\xi_R)=(1,2)$ 
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
            - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
        - theoryid: 174
          speclabel: $(\xi_F,\xi_R)=(1,0.5)$
          experiments:
        # Fixed target DIS
            - experiment: NMC
              datasets:
                 - dataset: NMCPD
                 - dataset: NMC
            - experiment: SLAC
              datasets:
                 - dataset: SLACP
                 - dataset: SLACD
            - experiment: BCDMS
              datasets:
                 - dataset: BCDMSP
                 - dataset: BCDMSD
            - experiment: NTVDMN
              datasets:
                 - dataset: NTVNUDMN
                 - dataset: NTVNBDMN
            - experiment: CHORUS
              datasets:
                 - dataset: CHORUSNU
                 - dataset: CHORUSNB
          # Combined HERA charm production cross-sections
            - experiment: HERAF2CHARM
              datasets:
                 - dataset: HERAF2CHARM
          # HERA data
            - experiment: HERACOMB
              datasets:
                 - dataset: HERACOMBNCEM 
                 - dataset: HERACOMBNCEP460
                 - dataset: HERACOMBNCEP575
                 - dataset: HERACOMBNCEP820
                 - dataset: HERACOMBNCEP920
                 - dataset: HERACOMBCCEM 
                 - dataset: HERACOMBCCEP 
          # F2bottom data
            - experiment: F2BOTTOM
              datasets: 
                 - dataset: H1HERAF2B
                 - dataset: ZEUSHERAF2B
            - experiment: ATLAS
              datasets:
                 - dataset: ATLASWZRAP36PB
                 - dataset: ATLASZHIGHMASS49FB
                 - dataset: ATLASLOMASSDY11EXT
                 - dataset: ATLASWZRAP11
                 - dataset: ATLAS1JET11
                 - dataset: ATLASZPT8TEVMDIST
                 - dataset: ATLASZPT8TEVYDIST
                 - dataset: ATLASTTBARTOT
                 - dataset: ATLASTOPDIFF8TEVTRAPNORM
            - experiment: CMS
              datasets:
                 - dataset: CMSWEASY840PB
                 - dataset: CMSWMASY47FB
                 - dataset: CMSWCHARMRAT
                 - dataset: CMSDY2D11
                 - dataset: CMSWMU8TEV
                 - dataset: CMSJETS11
                 - dataset: CMSTTBARTOT
                 - dataset: CMSTOPDIFF8TEVTTRAPNORM
            - experiment: LHCb
              datasets:
                 - dataset: LHCBZ940PB
                 - dataset: LHCBZEE2FB
            - experiment: CDF
              datasets:
                 - dataset: CDFZRAP
                 - dataset: CDFR2KT
            - experiment: D0
              datasets:
                 - dataset: D0ZRAP
                 - dataset: D0WEASY
                 - dataset: D0WMASY
#        - theoryid: 180
#          speclabel: $(\xi_F,\xi_R)=(2,2)$ 
#        - theoryid: 173
#          speclabel: $(\xi_F,\xi_R)=(0.5,0.5)$
#        - theoryid: 175
#          speclabel: $(\xi_F,\xi_R)=(2,0.5)$   
#        - theoryid: 178
#          speclabel: $(\xi_F,\xi_R)=(0.5,2)$

shiftconfig:
   dataspecs:
      - theoryid: 52
        pdf: NNPDF31_nlo_as_0118_hessian
        speclabel: "NLO"
        fit: NNPDF31_nlo_as_0118_1000

      - theoryid: 53
        pdf: NNPDF31_nnlo_as_0118_hessian
        speclabel: "NNLO"
        fit: NNPDF31_nnlo_as_0118_1000

normalize_to: 1

use_cuts: 'fromfit'
fit: NNPDF31_nlo_as_0118_1000

pdf:
  from_: fit

template_text: |

   {@with default_theory@}

   {@plot_thcorrmat_heatmap_custom@}

   {@endwith@}

   {@with shiftconfig@}

   {@plot_matched_datasets_shift_matrix@}
   {@plot_matched_datasets_shift_matrix_correlations@}

   {@endwith@}

actions_:
  - report(main=true)
