Add an option to be able to ignore N-sample while LST-binning #932

Merged
merged 6 commits into from
May 27, 2024

Kai-FengChen
Contributor

Add an option weight_by_nsamples (default True). When turned off, LST binning weights only by flagging patterns but still propagates the nsamples.

(Minor concern: is it bad practice to give a flag a default of True? When modifying the arg_parser, it feels a bit weird to have an argument that has action="store_true" but also defaults to True... But in my defence, weight_by_nsamples seems more straightforward than not_weight_by_nsamples.)

codecov bot commented Jan 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.18%. Comparing base (cc0a13d) to head (b1288c5).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #932      +/-   ##
==========================================
- Coverage   97.18%   97.18%   -0.01%     
==========================================
  Files          30       30              
  Lines       10733    10727       -6     
==========================================
- Hits        10431    10425       -6     
  Misses        302      302              
Flag Coverage Δ
unittests 97.18% <ø> (-0.01%) ⬇️

Contributor

@steven-murray steven-murray left a comment

Only a few comments on the code here, and it should do what you're saying it should do. However, I'm not quite sure of the motivation here. Is the assumption that in-painted data coming in will have Nsamples=0 but be un-flagged? So this would give you a way to distinguish between "true" and "inpainted" data?

Comment on lines 173 to 199
# test weight_by_nsamples: nsamples are propagated, but data is not weighted by nsamples if set to False
output1 = lstbin.lst_bin(self.data_list, self.lst_list, dlst=dlst,
flags_list=self.flgs_list, nsamples_list=self.nsmp_list,
weight_by_nsamples=True)

nsmps1 = copy.deepcopy(self.nsmps1)
nsmps1[(24, 25, 'ee')][:, 32] = 0
nsmps2 = copy.deepcopy(self.nsmps2)
nsmps2[(24, 25, 'ee')][:, 32] = 0
nsmps3 = copy.deepcopy(self.nsmps3)
nsmps3[(24, 25, 'ee')][:, 32] = 0
nsmps_list = [nsmps1, nsmps2, nsmps3]
output = lstbin.lst_bin(self.data_list, self.lst_list, dlst=dlst,
flags_list=self.flgs_list, nsamples_list=nsmps_list,
weight_by_nsamples=True)
# Check Nsamples are all 0
assert np.allclose(output[-1][(24, 25, 'ee')].real[:, 32], 0)
# Check the nsamples-weighted data sum is 0
assert np.allclose(output[1][(24, 25, 'ee')].real[100, 32], 0)
output = lstbin.lst_bin(self.data_list, self.lst_list, dlst=dlst,
flags_list=self.flgs_list, nsamples_list=nsmps_list,
weight_by_nsamples=False)
# Check Nsamples are all 0
assert np.allclose(output[-1][(24, 25, 'ee')].real[:, 32], 0)
# Check data is the same as before
assert np.allclose(output[1][(24, 25, 'ee')].real, output1[1][(24, 25, 'ee')].real)

Contributor

Can we please extract this into its own test?

Comment on lines 276 to 285
# test weight_by_nsamples
lstbin.lst_bin_files(self.data_files, ntimes_per_file=250, outdir="./", overwrite=True,
verbose=False, rephase=True, weight_by_nsamples=False, file_ext=file_ext)
output_lst_file = "./zen.ee.LST.0.20124.uvh5"
output_std_file = "./zen.ee.STD.0.20124.uvh5"
assert os.path.exists(output_lst_file)
assert os.path.exists(output_std_file)
os.remove(output_lst_file)
os.remove(output_std_file)

Contributor

Please also extract this into its own test

Contributor Author

Actually, here I only tested that running with the new option raises no error, so it falls into the "# basic execution" category, similar to the tests above such as the one for the rephase option. Does this still need to be a separate test?

@@ -542,6 +549,7 @@ def lst_bin_arg_parser():
a.add_argument("--outdir", default=None, type=str, help="directory for writing output")
a.add_argument("--overwrite", default=False, action='store_true', help="overwrite output files")
a.add_argument("--lst_start", type=float, default=None, help="starting LST for binner as it sweeps across 2pi LST. Default is first LST of first file.")
a.add_argument("--weight_by_nsamples", default=True, action='store_true', help="Weight by nsamples during LST binning. If set to False, weight by flags only. Default True.")
Contributor

I don't think this will work. As far as I know, there's no way to set the flag to False on the command line. So I think you'll need to use something like --weight-by-flags-only
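
To illustrate the concern, here is a minimal sketch using only the standard-library argparse module. The two parsers below are illustrative stand-ins, not the project's actual lst_bin_arg_parser, and the flag name --weight_by_flags_only is just the spelling suggested in this thread (an assumption about the final name).

```python
import argparse

# Illustrative parser only; NOT the project's lst_bin_arg_parser.
# With action="store_true" and default=True, the value is True whether or
# not the flag is given, so the option cannot be switched off from the CLI.
broken = argparse.ArgumentParser()
broken.add_argument("--weight_by_nsamples", default=True, action="store_true")

without_flag = broken.parse_args([]).weight_by_nsamples
with_flag = broken.parse_args(["--weight_by_nsamples"]).weight_by_nsamples
print(without_flag, with_flag)  # both are True

# Alternative: an opt-in flag that defaults to False (hypothetical name),
# which is then inverted to recover the boolean the Python API expects.
fixed = argparse.ArgumentParser()
fixed.add_argument("--weight_by_flags_only", default=False, action="store_true")

args = fixed.parse_args(["--weight_by_flags_only"])
weight_by_nsamples = not args.weight_by_flags_only
print(weight_by_nsamples)  # False
```

The Python-level keyword can still default to True; only the command-line spelling needs to be the negated, opt-in form.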

Contributor Author

Gotcha, thanks! I will switch all the options to weight_by_flags_only.

@Kai-FengChen
Contributor Author

Yes, the idea is that because we now keep track of N-samples, channels that were originally flagged and then inpainted will still have zero nsamples. With the original lstbin routine, an inpainted channel would be weighted by 0 during LST binning. By switching to this only_weighted_by_flag option, we can average inpainted data with real data during LST binning.
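
The difference between the two weighting modes can be sketched with plain NumPy. This is a toy model under stated assumptions, not hera_cal's implementation: lst_average is a hypothetical helper, the arrays have shape (nights, channels), and the output nsamples are assumed to be the sum of the unflagged input nsamples.

```python
import numpy as np

def lst_average(data, flags, nsamples, weight_by_nsamples=True):
    """Average over nights (axis 0). Hypothetical helper, not hera_cal API."""
    unflagged = (~flags).astype(float)
    # Either weight by nsamples (flagged samples get zero weight),
    # or weight by the flagging pattern alone.
    weights = unflagged * nsamples if weight_by_nsamples else unflagged
    norm = weights.sum(axis=0)
    safe_norm = np.where(norm > 0, norm, 1.0)  # avoid division by zero
    avg = (data * weights).sum(axis=0) / safe_norm
    # Assumption: nsamples are propagated the same way in both modes.
    nsamples_out = (unflagged * nsamples).sum(axis=0)
    return avg, nsamples_out

# Three nights, two channels. Channel 1 is inpainted on every night:
# it is unflagged but carries nsamples == 0.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
flags = np.zeros((3, 2), dtype=bool)
nsamples = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])

avg_ns, nsamp_ns = lst_average(data, flags, nsamples, weight_by_nsamples=True)
avg_flags, nsamp_flags = lst_average(data, flags, nsamples, weight_by_nsamples=False)

print(avg_ns, nsamp_ns)        # inpainted channel averages to 0
print(avg_flags, nsamp_flags)  # inpainted channel is averaged; nsamples stay 0
```

In this toy case the nsamples-weighted average of the inpainted channel collapses to zero, while the flags-only average keeps the inpainted values and still reports zero nsamples for that channel.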

@Kai-FengChen
Contributor Author

After a quick discussion with @jsdillon, I removed the test file test_lstbin.py for the old LST binner and resolved some conflicts so that this PR is ready to be merged.

As a reminder, this PR is for the H4C re-run so that we are able to properly propagate nsample (i.e., treat inpainted channels as having nsample 0) but still use inpainted data during lst binning (so weight data by flagging pattern instead of nsample).

Member

@jsdillon jsdillon left a comment

looks OK to me

@jsdillon jsdillon merged commit f79a4b9 into main May 27, 2024
9 of 11 checks passed
@jsdillon jsdillon deleted the lstbin_ignore_nsample branch May 27, 2024 19:10
3 participants