power spectrum pipeline v1 #151

nkern · 2018-07-08T08:32:07Z

This adds the
• pipelines/pspec_pipeline/pspec_pipe.py
• pipelines/pspec_pipeline/pspec_pipe.yaml
• pipelines/pspec_pipeline/pspec_batch.sh
scripts as version 1 of the power spectrum pipeline, which goes through the analysis blocks in the following order:

1. Visibility data difference (e.g. for jackknives) [optional]
2. OQE pipeline
3. Bootstrap error pipeline
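The block ordering above can be sketched as follows. This is a minimal illustration, not the code in pspec_pipe.py: the `run_pipeline` helper, the block names, and the lambda stand-ins are all hypothetical, and the only thing taken from the PR is that the blocks run in a fixed order and that the difference block is optional.

```python
def run_pipeline(blocks, enabled):
    """Run the enabled blocks in their fixed order; return the names that ran."""
    ran = []
    for name, func in blocks:
        if enabled.get(name, False):
            func()
            ran.append(name)
    return ran


# Hypothetical stand-ins for the three analysis blocks.
log = []
blocks = [
    ("diff", lambda: log.append("visibility difference")),
    ("pspec", lambda: log.append("OQE power spectra")),
    ("bootstrap", lambda: log.append("bootstrap errors")),
]

# The difference block is optional, so it is switched off here.
ran = run_pipeline(blocks, {"diff": False, "pspec": True, "bootstrap": True})
```

Keeping the order in one list means a skipped block cannot change the order of the blocks that follow it.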

This branch should be merged after the stats_array branch is merged in.

A statistical evaluation step seems appropriate for this script (after 3.), but due to a circular dependency with hera_stats it's not possible to add it in. We should make stats_pipe.py and stats_pipe.yaml scripts in hera_stats/pipelines/stats_pipe, of similar format, to perform this last step of the full power spectrum pipeline.

[UPDATE]
The #145 PR and #148 PR were merged into this PR because they were all inter-dependent.

coveralls · 2018-07-08T08:42:51Z

Coverage decreased (-0.6%) to 96.552% when pulling ccff2e4 on pspec_pipe into b97441c on master.

philbull

Looks good! Just a few docstring changes needed, and a couple of minor questions. Nothing that should get in the way of doing an initial end-to-end run.

philbull · 2018-07-12T19:12:48Z

hera_pspec/container.py

+def merge_spectra(psc, groups=None, dset_split_str='_x_', ext_split_str='_', verbose=True):
+    """
+    Iterate through a PSpecContainer and, within each specified group,
+    merge spectra of similar name but different psname extension.


Does merge mean average them together, or just combine them into a single object? Should be a bit clearer in the docstring.

Also worth mentioning that this is a destructive operation, i.e. it removes the old unmerged spectra.

Okay, I changed the name to combine_psc_spectra to be more suggestive of what it's actually doing, which is just setting up a combine_uvpspec call.
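To make the "combine, not average" and "destructive" points concrete, here is a rough sketch using a plain dict in place of a PSpecContainer. Everything in it is an assumption for illustration: the `combine_by_extension` helper, the example names, and the choice to split on the last occurrence of `ext_split_str` are not taken from hera_pspec.

```python
from collections import defaultdict


def combine_by_extension(spectra, ext_split_str="_"):
    """Combine entries whose names differ only by a trailing extension.

    `spectra` is modified in place: the per-extension entries are removed
    and replaced by one combined entry per base name.
    """
    grouped = defaultdict(list)
    for name in sorted(spectra):
        # Assumption: the extension is whatever follows the last separator.
        base, sep, _ext = name.rpartition(ext_split_str)
        if sep:
            grouped[base].append(name)
    for base, names in grouped.items():
        if len(names) < 2:
            continue  # nothing to combine for this base name
        combined = []
        for name in names:
            # Destructive step: the unmerged entry is popped from the dict.
            combined.extend(spectra.pop(name))
        spectra[base] = combined  # concatenated, not averaged
    return spectra


spectra = {"even_x_odd_0": [1.0, 2.0], "even_x_odd_1": [3.0], "solo": [9.0]}
combine_by_extension(spectra)
```

After the call, only the combined entry and the untouched `solo` entry remain, which is the behaviour the docstring would need to spell out.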

philbull · 2018-07-12T19:13:47Z

hera_pspec/container.py

+
+    Parameters
+    ----------
+    groups : list


Missing docstring for psc and verbose kwargs.

Also worth mentioning that this is an in-place operation?

yup, thanks

philbull · 2018-07-12T19:14:22Z

hera_pspec/container.py

+    ext_split_str : str
+        The pattern used to split the dset name from its extension in the psname.
+    """
+    from hera_pspec import uvpspec


Why is this import statement here?

Ah, I must have thought it was a circular dependency, but obviously it's not. Thanks.

philbull · 2018-07-12T19:14:52Z

hera_pspec/container.py

+    from hera_pspec import uvpspec
+    # load container
+    if isinstance(psc, (str, np.str)):
+        psc = PSpecContainer(psc, mode='rw')


This should maybe pass an overwrite kwarg as well.

philbull · 2018-07-12T21:36:14Z

hera_pspec/grouping.py

+    seed : int
+        Random seed to use in bootstrap resampling.
+
+    normal_std : bool


Can you mention the name of the keys in the stats_array of the output UVPSpec where these error estimates can be found?

yeah, good point
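For context on what such an error estimate is, here is a small standard-library sketch of a seeded bootstrap standard error. It is not the grouping.py implementation: the `bootstrap_std` helper and the `"bootstrap_std"` key are hypothetical, and the real key names in stats_array are exactly what the comment above asks to have documented.

```python
import random
import statistics


def bootstrap_std(samples, n_boots=200, seed=0):
    """Standard deviation of the mean over bootstrap resamples of `samples`."""
    rng = random.Random(seed)  # a fixed seed makes the resampling reproducible
    n = len(samples)
    means = []
    for _ in range(n_boots):
        # Draw n samples with replacement and record their mean.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.mean(resample))
    return statistics.pstdev(means)


samples = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]
stats_array = {"bootstrap_std": bootstrap_std(samples, seed=42)}  # hypothetical key
repeat = bootstrap_std(samples, seed=42)
```

Running it twice with the same seed gives the same estimate, which is the reason for exposing a seed parameter at all.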

philbull · 2018-07-12T22:19:59Z

hera_pspec/utils.py

+    To form cross spectra between these two files, one would feed a group_pair
+    of: group_pairs = [('even', 'odd'), ...].
+
+    A baseline-pair is formed by self-matching unique-files in the


I'm a bit confused by this first line. Does it mean it only allows baseline pairs to be formed between files that have the same identifier, e.g. for group_pairs = [('even', 'odd'),] it would do even.1234 x odd.1234, but not even.1234 x odd.1235?

So it will do the first but not the second, based on how you had it; i.e. it doesn't permute the keys for you. If you wanted to do both, you should feed:
group_pairs = [('even', 'odd'), ('odd', 'even')]
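One way to read this exchange is sketched below. The `pair_files` helper and the file layout are hypothetical, and matching on a shared identifier is an assumption drawn from the question above rather than from the utils.py code.

```python
def pair_files(files, group_pairs):
    """Pair files across groups, matching only on identical identifiers.

    `files` maps a group name to a dict of {identifier: filename}.
    """
    pairs = []
    for g1, g2 in group_pairs:
        # Only identifiers present in both groups are paired with each other.
        common = sorted(set(files[g1]) & set(files[g2]))
        for ident in common:
            pairs.append((files[g1][ident], files[g2][ident]))
    return pairs


files = {
    "even": {"1234": "even.1234", "1235": "even.1235"},
    "odd": {"1234": "odd.1234", "1235": "odd.1235"},
}
one_way = pair_files(files, [("even", "odd")])
both_ways = pair_files(files, [("even", "odd"), ("odd", "even")])
```

Under this reading, `one_way` never contains even.1234 x odd.1235, and the reversed ordering only appears when ('odd', 'even') is listed explicitly.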

philbull · 2018-07-12T22:20:41Z

hera_pspec/utils.py

+
+    action_name : str
+        The name of the block in the pipeline
+


M kwarg is not documented.

philbull · 2018-07-12T22:20:59Z

hera_pspec/utils.py

+
+def job_monitor(run_func, iterator, action_name, M=map, lf=None, maxiter=1, verbose=True):
+    """
+    Job monitoring function.


Needs a bit more explanation.
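As a guess at what a fuller docstring might describe, here is a minimal retry loop with the same general shape. It is a sketch, not the hera_pspec function: `job_monitor_sketch` is a hypothetical name, the `lf` log-file argument is left out, and it assumes that `run_func` returns 0 on success and nonzero on failure.

```python
def job_monitor_sketch(run_func, iterator, action_name, M=map, maxiter=1, verbose=True):
    """Run `run_func` over `iterator` using the map-like callable `M`.

    Jobs whose exit code is nonzero are retried, up to `maxiter` attempts
    in total. Returns the list of jobs that still failed at the end.
    """
    failures = list(iterator)
    for attempt in range(1, maxiter + 1):
        exit_codes = list(M(run_func, failures))
        # Keep only the jobs that reported a nonzero exit code.
        failures = [job for job, code in zip(failures, exit_codes) if code != 0]
        if verbose:
            print("{}: attempt {} left {} failed job(s)".format(action_name, attempt, len(failures)))
        if not failures:
            break
    return failures


attempts = {}


def flaky(i):
    """Hypothetical job: job 2 fails on its first attempt only."""
    attempts[i] = attempts.get(i, 0) + 1
    return 1 if (i == 2 and attempts[i] == 1) else 0


failed_once = job_monitor_sketch(flaky, range(4), "DEMO", maxiter=1, verbose=False)
attempts.clear()
failed_retry = job_monitor_sketch(flaky, range(4), "DEMO", maxiter=2, verbose=False)
```

Passing the mapping function as `M` lets the same loop run serially with the built-in `map` or in parallel with a pool's `map` method.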

philbull · 2018-07-12T23:10:42Z

pipelines/pspec_pipeline/pspec_pipe.py

+#-------------------------------------------------------------------------------
+if run_diff:
+    # get algorithm parameters
+    globals().update(cf['algorithm']['diff'])


I'm unsure about this strategy of updating globals. It seems a bit opaque, and I can imagine things going wrong. Perhaps we can discuss this, but probably fine to leave it for now.

Okay, we can change to a more explicit call to the input parameter attributes
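A sketch of the more explicit alternative, assuming the config has already been parsed into a nested dict. The parameter names `n_groups` and `suffix` and the `output_name` helper are made up for illustration; they are not keys from pspec_pipe.yaml.

```python
from types import SimpleNamespace

# Hypothetical parsed config; the real keys live in pspec_pipe.yaml.
cf = {"algorithm": {"diff": {"n_groups": 2, "suffix": "D"}}}

# Instead of globals().update(...), keep the block's parameters in one object.
diff_params = SimpleNamespace(**cf["algorithm"]["diff"])


def output_name(base, params):
    """Build an output name from an explicitly passed parameter object."""
    return "{}.{}".format(base, params.suffix)


name = output_name("zen", diff_params)
```

With this approach a misspelled or missing parameter raises an AttributeError at the point of use, and parameters from one block cannot silently shadow those of another.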

philbull · 2018-07-12T23:40:58Z

pipelines/pspec_pipeline/pspec_pipe.py

+        return 0
+
+    # launch pspec jobs
+    failures = hp.utils.job_monitor(pspec, range(len(jobs)), "PSPEC", lf=lf, maxiter=maxiter, verbose=verbose)


Does this do any timing? If not, it might be neat to print some timing info so we can check on progress and estimate how long things are likely to run for.

Sure, I can add some basic timing to it.

So each block in the pspec pipeline already has a timer, so we can do small runs and get a sense for how long each block takes.
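A per-block timer of the kind mentioned here could look roughly like this. The `block_timer` helper is hypothetical and is not the timer used in pspec_pipe.py.

```python
import time
from contextlib import contextmanager


@contextmanager
def block_timer(name, timings):
    """Record the wall-clock duration of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Stored even if the block raises, so failed runs are timed too.
        timings[name] = time.perf_counter() - start


timings = {}
with block_timer("PSPEC", timings):
    total = sum(i * i for i in range(10000))  # stand-in for real work
```

Timing a small run this way gives a per-block figure that can be scaled up to estimate the cost of a full run.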

philbull · 2018-07-16T06:41:26Z

OK, @nkern feel free to merge when you're ready.

nkern · 2018-07-17T01:41:07Z

@philbull Okay, I'm going to add some unit tests for the pipeline scripts, now that we've settled on keeping the pspec pipeline in hera_pspec and creating a secondary stats_pipe in hera_stats. Should be done in a few hours...

nkern · 2018-07-18T14:46:55Z

...and we are there..

ghost assigned nkern Jul 8, 2018

ghost added the in progress label Jul 8, 2018

This was referenced Jul 9, 2018

pspec_run config for batch pipeline #145

Closed

bootstrap_run #148

Closed

nkern force-pushed the pspec_pipe branch from 460dd3c to c81496b Compare July 9, 2018 06:47

philbull approved these changes Jul 12, 2018

nkern added 23 commits July 16, 2018 00:06

modified: hera_pspec/utils.py

92d405d

modified: hera_pspec/utils.py

f9a3221

modified: setup.py

added config_pspec_pipe function and tests

7b25135

updated pspec_run for better handling of dset loading

c49b815

and new PspecData capabilities

updated preprocess_data.py given new read_miriad_metadata func

21e5662

made argparser in pspec_run handle lists of tuples

f2580f0

propagated pspec_run verbose to pspec

b506dc7

increased pspecdata test coverage

9f551d9

created container.merge_spectra func, and added pspec_type to uvpspec

16e11c3

first round of code edits for grouping.bootstrap_run function

de17dbd

modified: hera_pspec/utils.py

7b72dab

modified: hera_pspec/utils.py

f92afaa

modified: setup.py

added config_pspec_pipe function and tests

7cc80b5

updated pspec_run for better handling of dset loading

0bfdb3f

and new PspecData capabilities

propagated pspec_run verbose to pspec

1481fea

addressed pspec_run_config PR comments: added test for repeated dset …

0bfa626

…labels

fixed lots of PEP8 from analytic error PR. also fixed bug from analyt…

bd271f0

…ic err PR where cov_array was not being propagated to pspec_run, and also wasn't being loaded when read_from_group was called. added tests to check for these issues.

added store_cov and dsets_std to pspec_run_argparser [skip ci]

cb65021

fixed how store_cov is assigned in uvpspec_utils._select

83b7354

increased test_pspecdata coverage

d7dcd25

added tests for container.merge_spectra

eee28fe

added tests for utils.get_blvec_reds

c594113

enabled testing.uvpspec_from_data to take bl_groups

1a6599f

nkern added 7 commits July 16, 2018 01:12

split uvpspec.spw_array into spw_dly_array and spw_freq_array

e146994

updates to container.merge_spectra

907ad36

modified: hera_pspec/container.py modified: pipelines/pspec_pipeline/pspec_pipe.py

fixed bug in uvpspec.combine_uvpspec across blpts for scalar_array

19eb814

created utils.get_reds and modified utils.calc_reds to use it

7b0c88e

modified: pipelines/pspec_pipeline/pspec_batch.sh

cd9f616

modified: pipelines/pspec_pipeline/pspec_pipe.py

addressed power spectrum pipe v1 PR comments

1d6e89a

rebase leftovers

4e937f6

more rebase leftovers

nkern force-pushed the pspec_pipe branch from ccff2e4 to 4e937f6 Compare July 15, 2018 17:32

nkern added 4 commits July 16, 2018 09:56

added stats_array to bootstrap_resample_error and fixed bug

45029b7

in uvpspec.combine_uvpspec when concat across blpts

pep8 and docstrings

46713e1

modified: pipelines/pspec_pipeline/pspec_pipe.py

ae909de

modified: pipelines/pspec_pipeline/pspec_pipe.yaml

modified: pipelines/pspec_pipeline/pspec_pipe.py

6a61a22

nkern mentioned this pull request Jul 17, 2018

fix ambiguity in UVPSpec of spw_array for dlys and freqs when n_dlys != n_freqs #153

Closed

added omit_flags kwarg to UVPSpec.get_* funcs to omit flagged data

e959298

nkern mentioned this pull request Jul 17, 2018

[UVPSpec] add kwarg to uvp.get_data to omit flagged integrations #162

Closed

nkern added 9 commits July 17, 2018 11:04

modified: pipelines/pspec_pipeline/pspec_batch.sh

0803aef

added basic unit tests for pipeline scripts

d23356b

fixed tests in test_pspecdata

a9a6907

enforced numpy>=1.14 in travis

1ac9da4

specified numpy=1.14 in travis

4379c81

added conda update conda in .travis.yml

0445648

added multiprocess installation to travis

261c986

moved multiprocess install from conda to pip

7ea5f11

modified: .travis.yml

26efa40

nkern merged commit 20e7efc into master Jul 18, 2018

ghost removed the in progress label Jul 18, 2018

nkern deleted the pspec_pipe branch July 18, 2018 14:47
