Add extract autos script #393

Closed · wants to merge 3 commits
7 changes: 7 additions & 0 deletions hera_pspec/tests/test_utils.py
@@ -384,6 +384,13 @@ def test_uvp_noise_error_parser():
    assert args.groups == ["dset0_dset1"]
    assert args.spectra is None

def test_extract_autos_post_lstbin_parser():
    parser = utils.extract_autos_post_lstbin_parser()
    args = parser.parse_args(["sum", "foo.bar", "--flist", "foo", "bar", "baz"])
    assert args.sumdiff == "sum"
    assert args.label == "foo.bar"
    assert args.flist == ["foo", "bar", "baz"]

def test_job_monitor():
    # open empty files
    datafiles = ["./{}".format(i) for i in ['a', 'b', 'c', 'd']]
20 changes: 20 additions & 0 deletions hera_pspec/utils.py
@@ -1513,6 +1513,26 @@ def uvp_noise_error_parser():
                        "to compute, 'P_N' or 'P_SN'")
    return a

def extract_autos_post_lstbin_parser():
    """
    Get the argparser for the extract_autos script.

    Args:
        N/A
    Returns:
        parser (ArgumentParser):
            The desired parser.
    """
    parser = argparse.ArgumentParser(description="Argument parser for "
                                     "extracting autos from the chunked "
                                     "files into a waterfall file.")
    parser.add_argument("sumdiff", type=str, help="A string identifying whether"
                        " the files are sum or diff files.")
    parser.add_argument("label", type=str, help="The file label.")
    parser.add_argument("--flist", type=str, nargs="*",
                        help="The list of chunked files.")
    return parser

def apply_P_SN_correction(uvp, P_SN='P_SN', P_N='P_N'):
    """
    Apply correction factor to P_SN errorbar in stats_array to account
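For reference, the behavior of the parser added above can be sketched standalone with `argparse` alone (a reconstruction for illustration; the real function is `hera_pspec.utils.extract_autos_post_lstbin_parser`):

```python
import argparse

# Standalone reconstruction of the parser above, for illustration only.
parser = argparse.ArgumentParser(description="Extract autos from chunked "
                                             "files into a waterfall file.")
parser.add_argument("sumdiff", type=str,
                    help="Whether the files are 'sum' or 'diff' files.")
parser.add_argument("label", type=str, help="The file label.")
parser.add_argument("--flist", type=str, nargs="*",
                    help="The list of chunked files.")

args = parser.parse_args(["sum", "foo.bar", "--flist", "foo", "bar", "baz"])
print(args.sumdiff, args.label, args.flist)
# → sum foo.bar ['foo', 'bar', 'baz']
```

This mirrors exactly the invocation exercised by `test_extract_autos_post_lstbin_parser` above.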
52 changes: 52 additions & 0 deletions scripts/extract_autos_post_lstbin.py
@@ -0,0 +1,52 @@
#!/usr/bin/env python
"""
Pipeline script to extract autocorrelations from chunked files into waterfall.
"""
from hera_pspec import utils
from pyuvdata import UVData
from hera_cal._cli_tools import parse_args, run_with_profiling
import warnings

def main(args):
    def check_for_sumdiff(file):
        if args.sumdiff not in file:
            raise ValueError(f"Supposedly processing {args.sumdiff} files but "
                             f"{args.sumdiff} not in the filename.")
        return

    # In case there are files without autos
    found_autos = False
    for file_ind, file in enumerate(args.flist):
        check_for_sumdiff(file)
        try:
            main_uvd = UVData()
            main_uvd.read(file, ant_str="auto")
            found_autos = True
            break
        except ValueError:  # There were no autos in that file
            continue
Comment on lines +18 to +27

I found this block of code a little difficult to parse at first; it took me a few reads to wrap my head around it. I would maybe instead add a comment to the effect of "Find the first file that has autos."
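The reviewer's reading of the block can be made concrete with a small sketch. Here `read_autos` is a hypothetical stand-in for `UVData.read(file, ant_str="auto")`, assumed (as in the script) to raise `ValueError` when a file has no autocorrelations:

```python
def find_first_with_autos(flist, read_autos):
    """Find the first file that has autos; return (index, data) or (None, None).

    read_autos(file) stands in for UVData.read(file, ant_str="auto") and
    raises ValueError when the file contains no autocorrelations.
    """
    for file_ind, file in enumerate(flist):
        try:
            return file_ind, read_autos(file)
        except ValueError:
            continue
    return None, None

# Toy stand-in: only files listed in HAS_AUTOS "contain" autos.
HAS_AUTOS = {"b.uvh5", "c.uvh5"}

def fake_read(f):
    if f not in HAS_AUTOS:
        raise ValueError("no autos")
    return f"autos:{f}"

print(find_first_with_autos(["a.uvh5", "b.uvh5", "c.uvh5"], fake_read))
# → (1, 'autos:b.uvh5')
```

Factoring the search into a named helper like this would also make the "find the first file that has autos" intent self-documenting.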


    if found_autos:
        start_ind = file_ind + 1
        if start_ind < len(args.flist):
            for file in args.flist[file_ind + 1:]:
Comment on lines +30 to +32


I'm not sure how important it is to track whether only one file had autocorrelations in it, but if it's not that important to do, then you can replace this block with:
for file in args.flist[file_ind + 1:]:
If file_ind + 1 is greater than or equal to len(args.flist), the slice is empty and the loop just doesn't do any iterations.
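The empty-slice behavior the reviewer relies on can be checked directly:

```python
flist = ["f0", "f1", "f2"]
file_ind = 2  # autos were first found in the last file

# Slicing past the end of a list yields an empty list,
# so a follow-up loop simply performs no iterations.
tail = flist[file_ind + 1:]
print(tail)  # → []
```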

                check_for_sumdiff(file)
                try:
                    new_uvd = UVData()
                    new_uvd.read(file, ant_str="auto")
                    main_uvd.__add__(new_uvd, inplace=True)

I'd suggest adding an axis argument to the parser, which defaults to None, that will let you do fast concatenation here by specifying axis=args.axis. Our typical use case should let us use axis='blt' to speed up the concatenation.
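One way to wire up this suggestion on the parser side (a sketch, not the merged change; the option name and default are assumptions, and the commented branch assumes pyuvdata's `fast_concat(other, axis, inplace=...)` interface):

```python
import argparse

# Hypothetical --axis option; 'blt' mirrors the reviewer's typical use case.
parser = argparse.ArgumentParser()
parser.add_argument("--axis", type=str, default=None,
                    help="If given (e.g. 'blt'), concatenate files along "
                         "this axis for speed; otherwise fall back to the "
                         "slower generic combine.")

print(parser.parse_args([]).axis)                 # → None
print(parser.parse_args(["--axis", "blt"]).axis)  # → blt

# In the script body one would then branch, e.g.:
# if args.axis is not None:
#     main_uvd.fast_concat(new_uvd, args.axis, inplace=True)
# else:
#     main_uvd.__add__(new_uvd, inplace=True)
```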

                except ValueError:
                    continue
Comment on lines +38 to +39

This worries me. Are you certain the only way a ValueError can be raised is by a file not having autos in it? What if the autos in one file differ from the autos in a different file? I'd think that pyuvdata would also have an issue with that. At any rate, I think something like warnings.warn("Some files missing autos") should be emitted before continuing.
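A sketch of the suggested warning, using a hypothetical `read_autos` stand-in for `UVData.read(file, ant_str="auto")` (the broad `except ValueError` is kept from the original; narrowing it would require knowing exactly which exceptions pyuvdata raises here):

```python
import warnings

def try_add_autos(collected, file, read_autos):
    """Read autos from file and append to collected; warn instead of
    silently skipping when the read fails."""
    try:
        collected.append(read_autos(file))
    except ValueError as err:
        warnings.warn(f"Skipping {file}, possibly missing autos: {err}")

collected = []
try_add_autos(collected, "good.uvh5", lambda f: f"autos:{f}")

def bad_read(f):
    raise ValueError("no autos")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    try_add_autos(collected, "bad.uvh5", bad_read)

print(collected)    # → ['autos:good.uvh5']
print(len(caught))  # → 1
```

The warning makes a partially-failed run visible in the logs without aborting the whole extraction.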

        else:
            warnings.warn("Only one file had autocorrelations. Inputs are almost "
                          "certainly incorrect.")

        outfile = f"zen.LST.0.00000.{args.sumdiff}.{args.label}.foreground_filled.xtalk_filtered.chunked.waterfall.autos.uvh5"
        main_uvd.write_uvh5(outfile, clobber=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for best practice, clobber should be an argument passed to the script.
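A hedged sketch of exposing clobber on the parser (the flag name is an assumption, not the merged change):

```python
import argparse

# Hypothetical --clobber flag; the script currently hard-codes clobber=True.
parser = argparse.ArgumentParser()
parser.add_argument("--clobber", action="store_true",
                    help="Overwrite the output file if it already exists.")

print(parser.parse_args([]).clobber)             # → False
print(parser.parse_args(["--clobber"]).clobber)  # → True

# The write call would then become:
# main_uvd.write_uvh5(outfile, clobber=args.clobber)
```

With `action="store_true"` the default is not to overwrite, which is the safer behavior for a pipeline script.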

    else:
        raise ValueError("No autocorrelations found in any files. Check inputs.")

parser = utils.extract_autos_post_lstbin_parser()
args = parse_args(parser)
run_with_profiling(main, args, args)
