-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
power spectrum pipeline v1 #151
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a few docstring changes needed, and a couple of minor questions. Nothing that should get in the way of doing an initial end-to-end run.
hera_pspec/container.py
Outdated
def merge_spectra(psc, groups=None, dset_split_str='_x_', ext_split_str='_', verbose=True): | ||
""" | ||
Iterate through a PSpecContainer and, within each specified group, | ||
merge spectra of similar name but different psname extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does merge mean average them together, or just combine them into a single object? Should be a bit clearer in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also worth mentioning that this is a destructive operation, i.e. it removes the old unmerged spectra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I changed the name to combine_psc_spectra
to be more suggestive of what its actually doing, which is just setting up a combine_uvpspec
call.
hera_pspec/container.py
Outdated
|
||
Parameters | ||
---------- | ||
groups : list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing docstring for psc
and verbose
kwargs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also worth mentioning that this is an in-place operation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, thanks
hera_pspec/container.py
Outdated
ext_split_str : str | ||
The pattern used to split the dset name from its extension in the psname. | ||
""" | ||
from hera_pspec import uvpspec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this import statement here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I must of thought it was a circular dependency, but obviously its not. thanks
hera_pspec/container.py
Outdated
from hera_pspec import uvpspec | ||
# load container | ||
if isinstance(psc, (str, np.str)): | ||
psc = PSpecContainer(psc, mode='rw') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should maybe pass an overwrite
kwarg as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah
seed : int | ||
Random seed to use in bootstrap resampling. | ||
|
||
normal_std : bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you mention the name of the keys in the stats_array
of the output UVPSpec
where these error estimates can be found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, good point
hera_pspec/utils.py
Outdated
To form cross spectra between these two files, one would feed a group_pair | ||
of: group_pairs = [('even', 'odd'), ...]. | ||
|
||
A baseline-pair is formed by self-matching unique-files in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by this first line. Does it mean it only allows baseline pairs to be formed between files that have the same identifier, e.g. for group_pairs = [('even', 'odd'),]
it would do even.1234
x odd.1234
, but not even.1234
x odd.1235
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it will do the first, but not the second based on how you had it: i.e. it doesn't permute the keys for you. if you wanted to do both, you should feed:
group_pairs = [('even', 'odd'), ('odd', 'even')]
hera_pspec/utils.py
Outdated
|
||
action_name : str | ||
The name of the block in the pipeline | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
M
kwarg is not documented.
hera_pspec/utils.py
Outdated
|
||
def job_monitor(run_func, iterator, action_name, M=map, lf=None, maxiter=1, verbose=True): | ||
""" | ||
Job monitoring function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a bit more explanation.
#------------------------------------------------------------------------------- | ||
if run_diff: | ||
# get algorithm parameters | ||
globals().update(cf['algorithm']['diff']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure about this strategy of updating globals
. It seems a bit opaque, and I can imagine things going wrong. Perhaps we can discuss this, but probably fine to leave it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, we can change to a more explicit call to the input parameter attributes
return 0 | ||
|
||
# launch pspec jobs | ||
failures = hp.utils.job_monitor(pspec, range(len(jobs)), "PSPEC", lf=lf, maxiter=maxiter, verbose=verbose) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this do any timing? If not, it might be neat to print some timing info so we can check on progress and estimate how long things are likely to run for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I can add some basic timing into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So each block in the pspec pipeline already has a timer, so we can do small runs and get a sense for how long each block takes.
modified: setup.py
and new PspecData capabilities
modified: setup.py
and new PspecData capabilities
…ic err PR where cov_array was not being propagated to pspec_run, and also wasn't being loaded when read_from_group was called. added tests to check for these issues.
modified: hera_pspec/container.py modified: pipelines/pspec_pipeline/pspec_pipe.py
modified: pipelines/pspec_pipeline/pspec_pipe.py
more rebase leftovers
in uvpspec.combine_uvpspec when concat across blpts
modified: pipelines/pspec_pipeline/pspec_pipe.yaml
OK, @nkern feel free to merge when you're ready. |
@philbull. Okay, I'm going to add some unittesting for the pipeline scripts, now that we've settled on keeping the pspec pipeline in |
...and we are there.. |
This adds the
•
pipelines/pspec_pipeline/pspec_pipe.py
•
pipelines/pspec_pipeline/pspec_pipe.yaml
•
pipelines/pspec_pipeline/pspec_batch.sh
scripts as version 1 of the power spectrum pipeline, which goes through the analysis blocks in the following order:
This branch should be merged after the
stats_array
branch is merged in.A statistical evaluation step seems appropriate for this script (after 3.) but due to circular dependency with
hera_stats
its not possible to add it in. We should make astats_pipe.py
script andstats_pipe.yaml
script inhera_stats/pipelines/stats_pipe
of similar format to perform this last step of the full power spectrum pipeline.[UPDATE]
The #145 PR and #148 PR were merged into this PR because they were all inter-dependent.