-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an option to be able to ignore N-sample while LST-binning #932
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #932 +/- ##
==========================================
- Coverage 97.18% 97.18% -0.01%
==========================================
Files 30 30
Lines 10733 10727 -6
==========================================
- Hits 10431 10425 -6
Misses 302 302
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a few comments on the code here, and it should do what you're saying it should do. However, I'm not quite sure of the motivation here. Is the assumption that in-painted data coming in will have Nsamples=0 but be un-flagged? So this would give you a way to distinguish between "true" and "inpainted" data?
hera_cal/tests/test_lstbin.py
Outdated
# test weighted_by_nsamples, nsamples are propagated but data is not weighted by nsamples if set to False | ||
output1 = lstbin.lst_bin(self.data_list, self.lst_list, dlst=dlst, | ||
flags_list=self.flgs_list, nsamples_list=self.nsmp_list, | ||
weight_by_nsamples=True) | ||
|
||
nsmps1 = copy.deepcopy(self.nsmps1) | ||
nsmps1[(24, 25, 'ee')][:, 32] = 0 | ||
nsmps2 = copy.deepcopy(self.nsmps2) | ||
nsmps2[(24, 25, 'ee')][:, 32] = 0 | ||
nsmps3 = copy.deepcopy(self.nsmps3) | ||
nsmps3[(24, 25, 'ee')][:, 32] = 0 | ||
nsmps_list = [nsmps1, nsmps2, nsmps3] | ||
output = lstbin.lst_bin(self.data_list, self.lst_list, dlst=dlst, | ||
flags_list=self.flgs_list, nsamples_list=nsmps_list, | ||
weight_by_nsamples=True) | ||
# Check Nsamples are all 0 | ||
assert np.allclose(output[-1][(24, 25, 'ee')].real[:, 32], 0) | ||
# Check data got weighted sum to 0 | ||
assert np.allclose(output[1][(24, 25, 'ee')].real[100, 32], 0) | ||
output = lstbin.lst_bin(self.data_list, self.lst_list, dlst=dlst, | ||
flags_list=self.flgs_list, nsamples_list=nsmps_list, | ||
weight_by_nsamples=False) | ||
# Check Nsamples are all 0 | ||
assert np.allclose(output[-1][(24, 25, 'ee')].real[:, 32], 0) | ||
# Check data is the same as before | ||
assert np.allclose(output[1][(24, 25, 'ee')].real, output1[1][(24, 25, 'ee')].real) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please extract this into its own test?
hera_cal/tests/test_lstbin.py
Outdated
# test weight_by_nsamples | ||
lstbin.lst_bin_files(self.data_files, ntimes_per_file=250, outdir="./", overwrite=True, | ||
verbose=False, rephase=True, weight_by_nsamples=False, file_ext=file_ext) | ||
output_lst_file = "./zen.ee.LST.0.20124.uvh5" | ||
output_std_file = "./zen.ee.STD.0.20124.uvh5" | ||
assert os.path.exists(output_lst_file) | ||
assert os.path.exists(output_std_file) | ||
os.remove(output_lst_file) | ||
os.remove(output_std_file) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also extract this into its own test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually so for here I did not do anything I just tested there is no error running this new option so it kind of falls into the "# basic execution" catalogue, similar to all the tests above like testing the rephase
option. Does this still need to be its separate test?
hera_cal/lstbin.py
Outdated
@@ -542,6 +549,7 @@ def lst_bin_arg_parser(): | |||
a.add_argument("--outdir", default=None, type=str, help="directory for writing output") | |||
a.add_argument("--overwrite", default=False, action='store_true', help="overwrite output files") | |||
a.add_argument("--lst_start", type=float, default=None, help="starting LST for binner as it sweeps across 2pi LST. Default is first LST of first file.") | |||
a.add_argument("--weight_by_nsamples", default=True, action='store_true', help="Weight by nsamples during LST binning. If set to False, weight by flags only. Default True.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will work. As far as I know, there's no way to set the flag to False on the command line. So I think you'll need to use something like --weight-by-flags-only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, thanks! I will switch all the options to weight_by_flags_only
.
Yes so the idea is because we now keep track of N-samples, channels that are originally flagged and inpainted will still have zero |
After a quick discussion with @jsdillon, I removed the tester As a reminder, this PR is for the H4C re-run so that we are able to properly propagate nsample (i.e., treat inpainted channels as having nsample 0) but still use inpainted data during lst binning (so weight data by flagging pattern instead of nsample). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks OK to me
Add an option
weight_by_nsamples
(defaultTrue
). When turning off, will weight only by flagging patterns but still propagate the nsamples.(Minor concern: Is it a bad practice to set something with default
True
? When modifying thearg_parser
it feels a bit weird to have an argument that hasaction="store_true"
but also default to beTrue
... But in my defence,weight_by_nsamples
seems more straightforward thannot_weight_by_nsamples
)