Lichens as preselections, delayed lichen evaluation #229

Merged: 5 commits, Aug 5, 2018

@JelleAalbers JelleAalbers commented Jun 2, 2018

This improves support for lichen cuts from lax:

[1] Apply lichens as preselections. Since preselections are applied per dataset as they are loaded, this should make loading datasets with heavy lichen-based cuts RAM-friendlier.

For example:

hax.minitrees.load(dsets, preselection=['cs1 < 200', 'FiducialCylinder1T'])

will apply the fiducial volume cut as well as the usual cs1 < 200 preselection while loading.
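To illustrate why filtering per dataset helps with memory, here is a minimal sketch in plain Python. It is not hax's loading code: `load_with_preselection`, the list-of-dicts datasets, and the toy radial cut are all hypothetical stand-ins, and the fiducial cut shown is not the real FiducialCylinder1T.

```python
# Sketch only: plain-Python stand-ins, not hax's actual loading code.

def load_with_preselection(loaders, cuts):
    """Load datasets one at a time, filtering each before combining.

    loaders: callables that each return one dataset as a list of event dicts
             (hypothetical stand-in for reading one dataset's minitree).
    cuts:    callables event -> bool (stand-ins for selection strings
             and lichens).
    """
    kept = []
    for load in loaders:
        events = load()  # only this one dataset is held unfiltered
        kept.extend(e for e in events if all(cut(e) for cut in cuts))
        # `events` is rebound on the next iteration, so the unfiltered
        # data of earlier datasets does not accumulate
    return kept


# Two toy "datasets" with a corrected S1 and a radial position
datasets = [
    lambda: [{"cs1": 50, "r": 10}, {"cs1": 500, "r": 10}],
    lambda: [{"cs1": 150, "r": 60}, {"cs1": 20, "r": 5}],
]
cuts = [
    lambda e: e["cs1"] < 200,  # plays the role of the 'cs1 < 200' string
    lambda e: e["r"] < 40,     # toy fiducial cut, NOT the real lichen
]
result = load_with_preselection(datasets, cuts)
print(result)
```

The point of the design is that a heavy cut removes most events before datasets are concatenated, so peak memory scales with one unfiltered dataset plus the surviving events.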

By default the lichens are drawn from the default lichen file. To change this, specify the lichen file followed by a colon and the lichen name, e.g. sciencerun0:FiducialCylinder1T.
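The file:name convention could be parsed roughly as follows. This is a sketch, not the code in this PR: `split_lichen_spec` and `DEFAULT_LICHEN_FILE` are hypothetical names, and the default value is set to postsr1 only because that is the default this PR switches to.

```python
# Hypothetical helper; hax's real parsing may differ.
DEFAULT_LICHEN_FILE = "postsr1"  # assumption: the default named in this PR


def split_lichen_spec(spec, default_file=DEFAULT_LICHEN_FILE):
    """Return (lichen_file, lichen_name) for 'file:Name' or plain 'Name'."""
    if ":" in spec:
        # Split on the first colon only
        lichen_file, name = spec.split(":", 1)
        return lichen_file, name
    return default_file, spec


print(split_lichen_spec("sciencerun0:FiducialCylinder1T"))
print(split_lichen_spec("FiducialCylinder1T"))
```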

[2] Apply lichens to a delayed (dask) dataframe. This was triggered by a question from @DCichon. Nothing special is needed for this: just use hax.cuts.apply_lichen on the delayed dataframe.

These changes also work together, so you can load a delayed dataset with lichen preselections.

[3] Switch the default lichen file to postsr1, see XENON1T/lax#149. This will not impact frozen analyses for the last paper as long as these are run in their appropriate frozen hax+lax environment.

@JelleAalbers JelleAalbers left a comment

Hm, maybe using the presence/absence of a space to distinguish selections from lichen names is not a good idea. If someone uses 'cs1<70' instead of 'cs1 < 70' it would not work. Perhaps checking for the presence of any non-identifier characters (anything except letters and _) is better.
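The difference between the two checks can be sketched like this. Both function names are hypothetical, and the identifier pattern also allows digits (an assumption beyond "letters and _"), since a lichen name such as FiducialCylinder1T contains them.

```python
import re


def looks_like_lichen_by_space(s):
    """Space-based check: anything without a space counts as a lichen name."""
    return " " not in s


def looks_like_lichen_by_identifier(s):
    """Identifier-based check: only letters, digits and underscores allowed.

    Digits are permitted here (assumption) so that names like
    FiducialCylinder1T still qualify.
    """
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", s) is not None


for s in ["cs1 < 70", "cs1<70", "FiducialCylinder1T"]:
    print(s, looks_like_lichen_by_space(s), looks_like_lichen_by_identifier(s))
```

With the space-based check, 'cs1<70' is wrongly treated as a lichen name; the identifier-based check rejects it because of the '<'. A file-prefixed name containing a colon would need to be split before applying the identifier check.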

@JelleAalbers

OK, I now use both the presence of a capitalized first character AND the absence of spaces to distinguish a lichen name from a preselection.
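As a rough sketch of that rule (the function name `is_lichen_name` is hypothetical, and this version ignores the file: prefix form):

```python
def is_lichen_name(s):
    """True if s starts with a capital letter and contains no spaces.

    Sketch of the rule described in the comment above, not the PR's code.
    """
    return bool(s) and s[0].isupper() and " " not in s


for s in ["FiducialCylinder1T", "cs1 < 70", "cs1<70", "Cs1<70"]:
    print(s, is_lichen_name(s))
```

A selection string such as 'cs1<70' is now classified correctly because it starts with a lowercase letter. The remaining edge case is a selection string that both starts with a capital letter and has no spaces, such as 'Cs1<70', which this rule would treat as a lichen name.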

Anyone willing to review this?


tunnell commented Aug 3, 2018

Why not just add the word Cut? I think that's how it's stored in DataFrame anyways.

@JelleAalbers

Thanks! Hax actually passes a copy of the dataframe to lax to avoid getting those (and other) columns back. They're great for studying cuts, but when you're applying them you don't really want twelve columns that just have all False.
Using Cut or some other identifier added to the lichen name would certainly work, though it would be inconsistent with how apply_lichen works at the moment. I'm guessing nobody will write a selection string that starts with a capital letter and has no spaces, and if they do they will just get an error.
