Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ability to exclude same baseline cross spectra #160

Closed
miguelfmorales opened this issue Jul 16, 2018 · 6 comments
Closed

Remove ability to exclude same baseline cross spectra #160

miguelfmorales opened this issue Jul 16, 2018 · 6 comments

Comments

@miguelfmorales
Copy link

I've come to the conclusion that calculating even-odd cross spectra (within a baseline type) while excluding the same antenna crosses is a statistical error, and biases the answer low.

The delay PS within one set of redundant baselines fundamentally assumes that all the baselines are measuring the same physical quantity. However, if we pick a particular delay mode in the window and plot the correlation between all redundant pairs (Matt's correlator diagrams) for PAPER, there is higher power in the self-antenna correlations. This has been used as a justification for removing these in calculating the cross-spectra.

However, we have no accepted physical model for this. Noise is independent across nights. In cross-talk there should be almost no spectral power. Even if there is spectral power in the cross-talk, there is no expectation that it should be stronger on baselines crossed with themselves than others (it is more likely to be real, but the amplitude expectation is not smaller).

There is a model that our baselines are not spectrally redundant (either physically or due to calibration errors). And this might fit the data. However, in this case we'd expect the sky power to be most accurately measured in the same baseline crosses, with artificial signal loss in the cross power of non-redundant baselines.

So we are left with an experimental determination—that the same antenna crosses are stronger than the others. However, this is not a statistically valid reason for exclusion.

It is a variation of the classic image stacking mistake. Astronomer X wants to see if galaxies are faintly emitting (gamma rays, x-rays, CO radio lines, etc.). So they take observations of 30 galaxies, none of which are statistically significant. So they then pick the 15 with positive excesses, and stack those. Low and behold they have a statistically significant excess! Of course, doing the same process with a zero-mean Gaussian noise also gives the same excess. Selection bias.

The code is currently doing the same thing here, in reverse. By throwing the strongest signals out, without an identified physical model, we have artificially selected the smallest values and biased the signal low. This is a statistical error.

The code should be rewritten to not allow exclusion of same-baseline cross spectra unless an agreed upon (and verified) physical source of this effect is identified.

@miguelfmorales miguelfmorales changed the title Remove ability to not include same baseline cross spectra Remove ability to exclude same baseline cross spectra Jul 16, 2018
@nkern
Copy link
Member

nkern commented Jul 17, 2018

@miguelfmorales so right now this isn't something that's hard-coded by any means in hera_pspec. It boils down to a single keyword parameter in hera_pspec.pspecdata.pspec_run (one of the few end-user interfaces), which is exclude_auto_bls, which is by default set to True. I'm happy to change the default to False, but it seems a little excessive to strip the capability from the code if what we are saying is that this is still a "research-level" topic. How do you feel about that compromise?

@acliu
Copy link
Contributor

acliu commented Jul 17, 2018

@miguelfmorales I agree with @nkern that maybe it's a little drastic to forbid the exclusion in the code. I would support a change in the default setting though. The good thing about changing the default is that it'll make sure this issue (explaining why there seems to be a different behaviour) stays on our minds. But I think in keeping with our philosophy of allowing an "expert mode", we shouldn't forbid the option.

@miguelfmorales
Copy link
Author

Hi @nkern and @acliu, I mostly agree. I was a little quiet as I tried to think it through.

In the end my goal is to strongly discourage using this option (I think it is a real error), but expert mode is good, and there may be scenarios in which we want to bring it back. But I don't want people to just change a line in their scripts and keep using the old behavior.

So what do you think if we change the default, and if the old usage is chosen it prints a warning that explicitly links to this gitHub issue? That way we don't loose the connection to why we changed the default, and we don't have future users silently use the option without knowing why the default was changed. But it still enables blackbelt users with a warning.

@acliu
Copy link
Contributor

acliu commented Jul 17, 2018

I would be fine with that. @nkern what about you?

@nkern
Copy link
Member

nkern commented Jul 18, 2018

sounds good. I'll make it part of the "tidy-up before release v0.1" PR I'm going to make later today!

@acliu
Copy link
Contributor

acliu commented Jul 18, 2018

Oh whoops. @nkern I didn't see your last comment until after I had submitted a (tiny) PR to do this. Doesn't matter to me if we just forget my PR and put them in your tidy up PR, or we treat it separately.

Oh wait, real-time update. @philbull just approved it. I'll just merge it into master and you won't have to worry about it.

@acliu acliu closed this as completed Jul 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants