New reconstruction and dependent variables #174

Merged: 40 commits, merged into master on Dec 25, 2017

Conversation

@coderdj (Contributor) commented Dec 7, 2017

This adds the s1_pattern_fit using the 3D FDC and neural network posrec; the other version uses the 2D FDC with TPF. This also adds the s1 area fraction top probability variable, computed at the 3D-NN position.

Diagnostic plots incoming. Putting this here as a placeholder.

hax.minitrees.TreeMaker.__init__(self)

# We need to pull some stuff from the pax config
self.pax_config = load_configuration("XENON1T")
@pdeperio (Contributor) Dec 11, 2017
do we also need SR-dependent gains from configs in processing repo?

@coderdj (Author)

If I understand your question correctly, then no: gains are already applied at this stage. This is just to pull some defaults. Unfortunately, we wouldn't get the override values in case the defaults are overridden in the run doc. I checked that they are not overridden for the values I took, but I'm not sure of a better way to generalize.

weights_file = utils.data_file_name('tensorflow_nn_pos_weights_XENON1T_20171211.h5')
loaded_nn_model.load_weights(weights_file)
self.nn_tensorflow = loaded_nn_model
self.list_bad_pmts = [1, 2, 12, 26, 34, 62, 65, 79, 86, 88, 102, 118, 130, 134, 135, 139, 148, 150, 152, 162, 178,
Contributor

can this be taken from the PMT gain list above?

@coderdj (Author)

Why is this needed? Bad PMTs are blinded in pax (gain zero), so they have no signal in the hit pattern.

Contributor

@feigaodm says: this defines the number of layers in the NN.

Hardcoded for SR0 and SR1 now in #184. @weiyuehuan is working on including this list with the NN models to replace this.

Contributor

fixed in c994b08 and c1849fd

self.nn_tensorflow = loaded_nn_model
self.list_bad_pmts = [1, 2, 12, 26, 34, 62, 65, 79, 86, 88, 102, 118, 130, 134, 135, 139, 148, 150, 152, 162, 178,
                      183, 190, 198, 206, 213, 214, 234, 239, 244]
self.ntop_pmts = 127  # How to get this automatically?
Contributor

Indeed, this must be automated for SR dependency (following the comments above).

@coderdj (Author)

Same comment as above. You should just be able to get this from the XENON1T config because XENON1T will always have the same number of PMTs in the top array. Sure, some will be blinded, but they always have zero signal then.
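For reference, a minimal sketch of pulling these numbers from the pax config instead of hardcoding them; the DEFAULT-section key names (channels_top, channels_in_detector) are assumptions based on pax's XENON1T configuration:

from pax.configuration import load_configuration

pax_config = load_configuration("XENON1T")
# Key names assumed from pax's XENON1T config:
tpc_channels = pax_config['DEFAULT']['channels_in_detector']['tpc']
ntop_pmts = len(pax_config['DEFAULT']['channels_top'])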

Contributor

Fixed in d61afe0

# Don't yell at me for hardcoding the filename into hax because it was
# also hardcoded in pax. Just kicking the can.
#aftmap_filename = utils.data_file_name('s1_aft_xyz_XENON1T_06Mar2017.json')
aftmap_filename = utils.data_file_name('s1_aft_xyz_XENON1T_20170808.json')
Contributor

move to hax.ini like other maps in corrections.py?

Contributor

Fixed in ef6fd7d

self.low_pe_threshold = 10
#hax.minitrees.Treemaker.__init__(self)
# load trained NN models
nn_model_json = utils.data_file_name('tensorflow_nn_pos_XENON1T_20171211.json')
Contributor

move to hax.ini like other maps in corrections.py

@pdeperio (Contributor) Dec 17, 2017

Fixed in ef6fd7d

for i, s2_t in enumerate(s2apc):
    if i not in self.list_bad_pmts and i < self.ntop_pmts:
        s2apc_clean.append(s2_t)
s2acp_clean = np.asarray(s2acp_clean)
Contributor

acp -> apc?

Contributor

Fixed in c8d44aa

@pdeperio changed the title from Newpatternlikelihood to New reconstruction and dependent variables on Dec 12, 2017
@@ -131,7 +130,9 @@ def extract_data(self, event):
aft = self.aft_map.get_value(self.x[i], self.y[i], self.z[i])
event_data['s1_area_fraction_top_probability_hax'] = binom_test(
size_top, size_tot, aft)

event_data['s1_area_fraction_top_probability_hax2'] = binom_test_pax(
Contributor

@darrylmasson @coderdj what's the difference with this one? (better name than hax2?)

Contributor

The binom_test from scipy is only valid for integer inputs in PE. The version I put into pax accepts floating-point values and is more accurate. I don't think there's a particularly good reason to keep the scipy-based calculation (unless we want to compare with the original SR0 code).
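For illustration, a minimal sketch of such a continuous generalization: the factorials in the binomial pmf are replaced with gamma functions so fractional PE values are accepted, and the two-sided sum becomes an integral. This is only the idea, not the actual binom_test_pax implementation:

import numpy as np
from scipy.special import gammaln
from scipy.integrate import quad

def binom_pmf_continuous(k, n, p):
    # Binomial pmf with factorials replaced by gamma functions,
    # so k and n may be non-integer (e.g. S1 areas in PE); 0 < p < 1.
    return np.exp(gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
                  + k * np.log(p) + (n - k) * np.log(1 - p))

def binom_test_continuous(k_obs, n, p):
    # Two-sided test: total probability of outcomes no more likely
    # than the observed one, integrated over [0, n] instead of summed.
    f_obs = binom_pmf_continuous(k_obs, n, p)
    def integrand(x):
        f = binom_pmf_continuous(x, n, p)
        return f if f <= f_obs else 0.0
    pval, _ = quad(integrand, 0.0, n, limit=200)
    return min(pval, 1.0)

Unlike scipy.stats.binom_test, this accepts calls like binom_test_continuous(3.7, 52.4, 0.12).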

@coderdj (Author)

Yuck. OK, I had them both for testing. The scipy one fails miserably below ~10 PE. The plan is to revert to just one before merging; that will be the pax one, I guess.

Contributor

Fixed in #180

confused_s1_channels),
statistic=self.statistic)

event_data['s1_pattern_fit_hits_hax'] = self.pattern_fitter.compute_gof(
Contributor

this (hits) is using the same MC map as for area?

@coderdj (Author)

In pax, yes, but we don't actually use hits, so I don't know why I included it.

@pdeperio (Contributor) Dec 14, 2017

Might be good to keep, since the optical maps use detected photons, I think, not PE.

hpc[self.tpc_channels],
pmt_selection=pmts_bottom,
statistic=self.statistic)

Contributor

Also add a variable for hits, as above?

Contributor

fixed in #180

@pdeperio (Contributor) commented Dec 14, 2017

@pelssers have you seen this new treemaker yet? wondering if the treemaker you were working on is in a more advanced state (like how you made corrections.py more general for all reconstruction algorithms).

* Fix S1 bottom pattern, docs, and lint

* Fix oops

* Increment version

s2apc = np.array(list(s2.area_per_channel))
if(len(s2apc)!=self.ntop_pmts):
    return event_data
s2apc_clean = []
Contributor

@feigaodm @weiyuehuan: as @coderdj mentioned above, is this needed if s2apc is already 0 from pax for bad PMTs? or does nn_tensorflow.predict actually need to remove these channels from the list?

Contributor

See response above and #184


# Position reconstruction based on NN from TensorFlow
s2apc = np.array(list(s2.area_per_channel))
if(len(s2apc)!=self.ntop_pmts):
Contributor

@feigaodm @weiyuehuan: in what case does this actually happen? i.e. is it ok to just ignore these events?

@coderdj (Author) commented Dec 15, 2017

I'm starting some cleanup of this, but it seems the NN stuff doesn't work at all. Did it ever work?

First, it won't even import on midway login nodes due to some glibc error, but we can live with this for now since you can run it on batch nodes fine. Then the line https://github.com/XENON1T/hax/blob/newpatternlikelihood/hax/treemakers/posrec.py#L114 causes, by definition, a return before the neural net is ever invoked. After removing that clause I start getting weird tensorflow-related errors.

Can someone say if this is working for them in some setup? I don't mind helping debug, but we should be sure to test before committing.

@pdeperio (Contributor)

Worked for me before. But @feigaodm and I were thinking that the posrec stuff should be moved to another treemaker, agree?

@coderdj (Author) commented Dec 15, 2017

Oh, I just got it working. We were clearing the keras session in the init, which wiped the network we had just loaded. I'll commit the fix.

Your mass production all failed, unfortunately.
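For the record, a minimal sketch of that failure mode (filename and input shape are hypothetical):

import numpy as np
import keras.backend as K
from keras.models import load_model

K.clear_session()                    # harmless here, before loading
model = load_model('posrec_nn.h5')   # hypothetical model file
x = np.zeros((1, 127))               # dummy top-array hit pattern
print(model.predict(x))              # works

# Calling K.clear_session() AFTER load_model instead (as the __init__
# was doing) wipes the TF graph the model lives in, and the same
# predict() call then fails with opaque TensorFlow errors.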

@coderdj (Author) commented Dec 15, 2017

As for splitting into another minitree: it may be useful for testing, but unpacking the peak objects and hit patterns is computationally slow, so it might make sense to have everything that needs to do that in a single minitree.

@skazama (Contributor) commented Dec 16, 2017

The list of excluded PMTs is different between SR0 and SR1.
For SR0, it is [1, 12, 26, 27, 34, 51, 62, 65, 73, 79, 86, 88, 91, 102, 118, 130, 134, 135, 137, 139, 148, 150, 152, 162, 167, 178, 183, 198, 203, 206, 213, 214, 234, 239, 244].

@tunnell (Member) left a comment

This is nice functionality but it's almost certainly going to break down quickly as it introduces a lot of technical debt (that nobody will ever go back for). More generally, it seems like a rewrite of certain pax plugins. Is it not possible just to initialize those plugins then feed the event in? At very least, there are a few places that redefine things already in pax or messily define the keras models.


# We need to pull some stuff from the pax config
self.pax_config = load_configuration("XENON1T")
self.tpc_channels = list(range(0, 247 + 1))
Member

You can fetch this from the pax config loaded on the previous line.

Contributor

Fixed in d61afe0


# load trained NN models
nn_model_json = utils.data_file_name(hax.config["neural_network_model"])
json_file_nn = open(nn_model_json, 'r')
Member

This is the bad way to load a model. Please do model.save('blah.hdf5'), which will save the model and weights together. There's no reason to keep these apart, and it can just lead to problems.

Member

Ah, there's also a keras load_model command (keras.models.load_model) to reload it.

loaded_nn_model = model_from_json(loaded_model_json)
weights_file = utils.data_file_name(hax.config['neural_network_weights'])
loaded_nn_model.load_weights(weights_file)
loaded_nn_model.compile(loss='mean_squared_error', optimizer='adam')
Member

If you load the model as mentioned before, you don't need to recompile. You shouldn't need to recompile anyway, since you're not retraining it, right?
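A minimal sketch of the single-file round trip being suggested, with a tiny stand-in model and a hypothetical filename:

import numpy as np
from keras.models import Sequential, load_model
from keras.layers import Dense

# Training side: one file stores architecture, weights and compile state.
model = Sequential([Dense(2, input_dim=127)])  # stand-in for the posrec NN
model.compile(loss='mean_squared_error', optimizer='adam')
model.save('posrec_nn.h5')

# Analysis side: a single call restores everything; no model_from_json,
# no load_weights, and no compile step.
nn = load_model('posrec_nn.h5')
print(nn.predict(np.zeros((1, 127))))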

130, 134, 135, 139, 148, 150, 152, 162, 178, 183,
190, 198, 206, 213, 214, 234, 239, 244, 27, 73,
91, 137, 167, 203]
self.ntop_pmts = 127 # How to get this automatically?

Contributor

Fixed in d61afe0

self.ntop_pmts = 127 # How to get this automatically?

def get_data(self, dataset, event_list=None):

Member

Docstring? What is this used for?

"y_observed_nn_tf": None,
"s1_area_upper_injection_fraction": None,
"s1_area_lower_injection_fraction": None,
}
Member

I sort of feel that each category of variable should be its own subfunction. This is super hard to follow.
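One way to do that, sketched with hypothetical helper names (not from this PR):

def _nn_position_variables(event):
    # hypothetical helper: everything touching the TensorFlow posrec
    return {"x_observed_nn_tf": None, "y_observed_nn_tf": None}

def _injection_fraction_variables(event):
    # hypothetical helper: Rn220 injection-point area fractions
    return {"s1_area_upper_injection_fraction": None,
            "s1_area_lower_injection_fraction": None}

def extract_data(event):
    # Each family of variables gets its own small function, so the
    # main extractor reduces to a dispatch loop.
    event_data = {}
    for extract in (_nn_position_variables, _injection_fraction_variables):
        event_data.update(extract(event))
    return event_data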

confused_s1_channels = []
for a, c in enumerate(s1.n_saturated_per_channel):
    if c > 0:
        confused_s1_channels.append(a)

statistic=self.statistic)

pmts_bottom = np.setdiff1d(self.tpc_channels, confused_s1_channels)
pmts_bottom[0:127] = 0
Member

Hardcoded, though you already define this as 127 above.

Member

Comment on how to get this list of PMTs above

Contributor

Fixed in d61afe0

event_data['s1_area_fraction_top_probability_hax'] = binom_test_pax(
size_top, size_tot, aft)

# Now do s1_pattern_fit
Member

apc? hpc?

s2apc_clean_norm = s2apc_clean_norm.reshape(1, len(s2apc_clean_norm))
predicted_xy_tensorflow = self.nn_tensorflow.predict(s2apc_clean_norm)
event_data['x_observed_nn_tf'] = predicted_xy_tensorflow[0, 0] / 10.
event_data['y_observed_nn_tf'] = predicted_xy_tensorflow[0, 1] / 10.
Member

Can this preprocessing be broken into its own function? If I am to add another model, then it's copy-pasting 10 more lines... though the preprocessing is slightly different.

Also, 'clean' is the wrong word. It's 'preprocessed' or 'normalized'.
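A hypothetical helper along these lines would do it (names and normalization assumed, not taken from this PR):

import numpy as np

def preprocess_s2_pattern(s2apc, bad_pmts, ntop_pmts):
    # Keep only good top-array channels, normalize, and reshape the
    # array the way nn_tensorflow.predict() expects.
    top = np.asarray(s2apc[:ntop_pmts], dtype=float)
    good = np.array([i not in bad_pmts for i in range(len(top))])
    clean = top[good]
    if clean.sum() > 0:
        clean = clean / clean.sum()
    return clean.reshape(1, -1)

Each additional model then costs one line, e.g. predicted_xy = self.nn_tensorflow.predict(preprocess_s2_pattern(s2apc, self.list_bad_pmts, self.ntop_pmts)).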

Member

@tunnell It looks like you have some nice ideas on how to clean up the code here. Can you modify this PR and test it on a dataset? Maybe you can also fix the issue that TF sometimes fails to run.

pdeperio and others added 12 commits December 19, 2017 10:00
* Fix bugs in event indexing and PMT selection for S1 likelihood calculations
* Increment version
* Fix version oops (0.2 is actually less than 0.16)
* Added new corrections handler class
* Made it work
* Made posrec treemaker work with new class too
* Revert version bump
* Oops. Moved instead of git moved.
* Implement in CorrectedDoubleS1Scatter and lint fixes
* Forgot again...
* Name should be 'event_data' instead of 'result'...
* adding additional s1 width variables for AC events rejection
@@ -38,11 +38,15 @@ class PositionReconstruction(TreeMaker):

- s1_area_upper_injection_fraction: s1 area fraction near Rn220 injection points (near PMT 131)
- s1_area_lower_injection_fraction: s1 area fraction near Rn220 injection points (near PMT 243)

- s1_range_90p_area: The width of the s1 (ns), duration of region that contains 90% of the area of the peak
Contributor

@skazama Should we put this in Extended treemaker instead (where s1_range_80p_area already is)?

Contributor

Fixed in #190

@pdeperio mentioned this pull request Dec 21, 2017
hax/hax.ini Outdated
@@ -135,6 +135,12 @@ corrections_definitions = {
{"run_min": 10223, "run_max": 12089, "correction": "FDC_SR1_data_driven_time_dependent_3d_correction_part3_v1.json.gz"},
{"run_min": 12090, "correction": "FDC_SR1_data_driven_time_dependent_3d_correction_part4_v1.json.gz"}
],
"fdc_3d_tfnn": [
Contributor

@jingqiangye Can you add SR0 as well? (Otherwise, I think this will crash on SR0 files.)

@/all When pushing code/analyses from now on, for completeness, we should ensure both SR0 and SR1 are included.

Contributor

Fixed in f60ce10

@pdeperio dismissed tunnell's stale review December 25, 2017 18:15

Getting a production going over the holiday. I've done my best to beautify while maintaining the desired functionality. Can improve on the next iteration.

@pdeperio merged commit 284a300 into master on Dec 25, 2017
@pdeperio deleted the newpatternlikelihood branch December 25, 2017 18:16
hax/hax.ini Outdated
"fdc_3d_tfnn": [
{"run_min": 6386, "run_max": 8648, "correction": "FDC_SR1_data_driven_time_dependent_3d_correction_tf_nn_part1_v1.json.gz"},
{"run_min": 8649, "run_max": 10976, "correction": "FDC_SR1_data_driven_time_dependent_3d_correction_tf_nn_part2_v1.json.gz"},
{"run_min": 10977, "run_max": 13195, "correction": "FDC_SR1_data_driven_time_dependent_3d_correction_tf_nn_part3_v1.json.gz"},
Contributor

@jingqiangye Why does this have different run binning than FANN 3D FDC?

Contributor

Because the TFNN used more Kr data for the FDC: the FANN 3D FDC used Kr data up to Oct 2017, while the TFNN goes to Nov 2017. The FANN 3D FDC will be updated with the latest Kr data today (Jan. 2, 2018).
