
Adding a check to verify autos are real-only #1110

Merged
merged 26 commits into main from check_real_autos on Jan 14, 2022

Conversation

@kartographer (Contributor) commented Dec 12, 2021

A PR a day keeps the doctor away...

Description

  • Added a new method to UVData called _fix_autos, which will force the auto-correlation data in data_array to be real-only (a minimal sketch of the idea appears after this list).
  • Added a new feature to UVData.check, which checks whether the auto-correlation data have non-zero imaginary components. If they are detected then, depending on the arguments supplied, check can either raise an error or attempt to fix the auto-correlations (via a call to _fix_autos).
  • Did some minor test fixture cleanup (mostly with the SMA MIR test data set).
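For illustration, here is a minimal sketch of the idea behind _fix_autos, not the PR's actual implementation: it assumes the usual UVData attributes (data_array, ant_1_array, ant_2_array) and uses np.abs on the autos, as described later in this PR.

```python
import numpy as np

# Sketch only: force auto-correlation visibilities to be real-valued by
# replacing them with their magnitudes. The attribute names mirror UVData
# (data_array, ant_1_array, ant_2_array), but this is not the PR's code.
def fix_autos_sketch(data_array, ant_1_array, ant_2_array):
    auto_mask = ant_1_array == ant_2_array  # baseline-times where ant1 == ant2
    data_array[auto_mask] = np.abs(data_array[auto_mask])
    return data_array
```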

Motivation and Context

As originally motivated in #1024, and demonstrated in #1102, checking that the autos are real-only is a good way to help verify the health of any given UVData object, to make sure that something weird hasn't happened in the handling of the data set. However, since this check potentially inspects lots of values in data_array, it can be computationally expensive to run, which seems somewhat antithetical to how UVData.check should work -- quoting @bhazelton here from a comment in #1024:

> It'll almost always pass so we want to make sure it's lightweight, but catching this early on a dataset is very useful.

To that end, I've set up this new functionality with the following defaults:

  • On read, if non-zero imaginary components are found in the autocorrelation data, check will raise a warning and subsequently fix the issue by making those values real-only (using np.abs on the autocorrelation data).
  • On write, if non-zero imaginary components are found in the autocorrelation data, check will raise an error.
  • On all other operations, no check of the autos is done. (A hypothetical usage sketch of these defaults follows this list.)
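As a rough usage sketch of these defaults: the file names below are placeholders, and the fix_autos keyword follows the discussion later in this thread rather than a confirmed API.

```python
from pyuvdata import UVData

uv = UVData()
# On read: if the autos had non-zero imaginary parts, a warning would be
# raised and the autos made real-only ("some_file.uvh5" is a placeholder).
uv.read("some_file.uvh5")

# Deliberately inject a tiny imaginary component into the autos for illustration.
auto_mask = uv.ant_1_array == uv.ant_2_array
uv.data_array[auto_mask] += 1e-12j

# On write: the same condition would raise an error instead, so fix the autos
# first (fix_autos here is an assumption based on this PR's discussion).
uv.check(fix_autos=True)
uv.write_uvh5("fixed_file.uvh5")
```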

I think the above is a reasonable balance between ease-of-use, speed, and ensuring the integrity of UVData objects. FWIW, the addition of _fix_autos was motivated in part by the fact that some of the test files do have non-zero imaginary components in their auto-correlation data -- something that I believe is a hangover from the old phase method (something I noticed while working on #979).

Closes #1024

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

New feature checklist:

  • I have added or updated the docstrings associated with my feature using the numpy docstring format.
  • I have updated the tutorial to highlight my new feature (if appropriate).
  • I have added tests to cover my new feature.
  • All new and existing tests pass.
  • I have updated the CHANGELOG.

@codecov (bot) commented Dec 12, 2021

Codecov Report

Merging #1110 (3b77261) into main (c08b766) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##             main    #1110   +/-   ##
=======================================
  Coverage   99.83%   99.83%           
=======================================
  Files          31       31           
  Lines       15339    15370   +31     
=======================================
+ Hits        15313    15344   +31     
  Misses         26       26           
Impacted Files Coverage Δ
pyuvdata/uvdata/fhd.py 99.66% <ø> (ø)
pyuvdata/uvdata/mir.py 100.00% <ø> (ø)
pyuvdata/uvdata/miriad.py 98.73% <ø> (ø)
pyuvdata/uvdata/ms.py 100.00% <ø> (ø)
pyuvdata/uvdata/mwa_corr_fits.py 100.00% <ø> (ø)
pyuvdata/uvdata/uvfits.py 100.00% <ø> (ø)
pyuvdata/uvdata/uvh5.py 100.00% <ø> (ø)
pyuvdata/uvdata/uvdata.py 99.91% <100.00%> (+<0.01%) ⬆️

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@bhazelton (Member)

I like the approach here and it's very timely! The biggest question I have is: do we want to default to making the autos real? I worry that non-real autos are a big enough problem that we should maybe default to erroring rather than warning.

@kartographer (Contributor, Author)

So I don't have particularly strong feelings about warn vs. error, especially given that the remedy (calling _fix_autos by way of setting fix_autos=True) is fairly straightforward and already called out in the error message itself. I generally favor being more tolerant of importing imperfect data versus writing out imperfect data, but that's a nicety, not a requirement.

However, given that the old phase method used to leave behind some (small) imaginary values in the autos, I think it would be prudent to at least have a window during which we are issuing a warning rather than a full-blown error, just to give users some heads up on which of their datasets may be impacted by this new check. That includes a few of our own test datasets (which may warrant a separate issue being filed), as well as those I see passing through the HERA pipeline (or whatever the Azure Pipeline tests are running).

@bhazelton (Member)

@jsdillon This PR is adding a check for non-real autos, which is an issue that comes up from time to time and generally means there's a bug in upstream code (sometimes including the correlator). This new check seems to be breaking some tests on hera_cal, presumably because you have some old data files with these issues.

It is possible those problems originated with pyuvdata; we recently realized that we sometimes had numerical precision issues in the old phasing code that caused the autos to have very small (< one part in 10^10) imaginary parts. If that's the source, I apologize! In this new checking code, there is an option to just set the imaginary parts of the autos to zero, which you might want to use, depending on the source of the problems in your files.

@jsdillon (Contributor)

This is going to be tricky to fix in hera_cal, where the test files do have small imaginary parts in the autos (<1e-10), but I'm also finding that calibrating autos can introduce very small imaginary components, because the calibration involves multiplication by each gain individually (so there are roundoff errors). That may be fixable; I'm not sure.

In the meantime, this PR needs a fix_autos kwarg for this function:

def write_uvh5_part(

@jsdillon (Contributor)

Ok, I've done some digging and opened a draft PR which fixes 9 of the 12 test failures when running under this branch: HERA-Team/hera_cal#749

The rest are in hera_cal.vis_clean and so I'm passing this off to @aewallwi to figure out how he'd like to adapt his code.

@bhazelton (Member)

Thanks @jsdillon! Sorry to make work for you.

@kartographer (Contributor, Author)

Thanks @jsdillon -- I've gone ahead and added the fix_autos keyword to the UVData.write_uvh5_part method in 0369ee1.

@jsdillon (Contributor) commented Jan 8, 2022

Apologies for the delay, but I think we have a PR that fixes the hera_cal issue. Does anyone here want to review it?

@kartographer (Contributor, Author) commented Jan 10, 2022

@jsdillon -- FWIW, I reran the above hera_cal checks post-merge of the aforementioned PR, and it looks like three tests are still failing. At first glance, though, the failing tests themselves are plugging random complex data into data_array, which may be inadvertently injecting complex values into the autos (and hence triggering the error).

@aewallwi (Contributor) commented Jan 10, 2022

Some thoughts on this issue:

While valid auto-correlations are always real, they are represented on the computer with complex numbers. Radio astronomers routinely perform completely valid complex arithmetic on auto-correlations which, in every case, introduces small non-real components from numerical precision errors. Examples include applying calibration solutions, filling in flagged channels, converting from Jy to K sr, etc.

By requiring that the imaginary parts of the auto-correlations be exactly zero, pyuvdata forces any external pipeline that performs arithmetic on visibilities to manually set the imaginary parts of the auto-correlation amplitudes to zero after each operation, even when this is not actually necessary.

In order to avoid lots of downstream manual zeroing of the autos, would it make sense to define the real-autos requirement as the imaginary part of the autos being below some fraction of the real part, rather than exactly equal to zero? (A sketch of such a relative check is below.)
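For concreteness, a minimal sketch of that relative check; the function name and default tolerance are illustrative, not pyuvdata API.

```python
import numpy as np

def autos_are_real(auto_data, rel_tol=1e-10):
    """Return True if the imaginary parts of the autos are negligible
    relative to their real parts (illustrative tolerance, not pyuvdata's)."""
    return bool(np.all(np.abs(auto_data.imag) <= rel_tol * np.abs(auto_data.real)))
```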

@bhazelton (Member)

I think that's probably a reasonable approach. I do think we need a check on this as this has been used to identify bugs in both correlators and downstream processing on multiple occasions now. The question then becomes what should the tolerance be?

@aewallwi (Contributor)

> I think that's probably a reasonable approach. I do think we need a check on this as this has been used to identify bugs in both correlators and downstream processing on multiple occasions now. The question then becomes what should the tolerance be?

Maybe we can figure out what that should be through Monte Carlo simulations of some arithmetic operations? (Something like the sketch below.)
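As an example of the kind of experiment this could be: everything here is illustrative, including the gain distribution, the single-precision data type, and the calibration-style operation of dividing by each complex gain separately.

```python
import numpy as np

# Rough Monte Carlo sketch: apply a calibration-like operation to real-valued
# autos and measure the residual imaginary part introduced by roundoff.
rng = np.random.default_rng(42)
n_trials = 100_000

autos = rng.uniform(1.0, 100.0, n_trials).astype(np.complex64)  # real-valued autos
gains = (rng.normal(1.0, 0.1, n_trials)
         + 1j * rng.normal(0.0, 0.1, n_trials)).astype(np.complex64)

# Calibrate by dividing by each gain separately, as a pipeline might.
calibrated = autos / gains / np.conj(gains)

rel_imag = np.abs(calibrated.imag) / np.abs(calibrated.real)
print(f"max relative imaginary residual: {rel_imag.max():.2e}")
```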

@bhazelton (Member) commented Jan 10, 2022

If you want to do that it'd be interesting input. I think it should also be somewhat physically motivated in terms of how much phase angle it is. In some other parts of pyuvdata we use 1 milliarcsecond (~4.85e-09 radian), but those are usually for physical angles not phase...

@aewallwi (Contributor)

> If you want to do that it'd be interesting input. I think it should also be somewhat physically motivated in terms of how much phase angle it is. In some other parts of pyuvdata we use 1 milliarcsecond, but those are usually for physical angles not phase...

The other option is to just pick a value that we know will never be scientifically relevant.

@bhazelton (Member) commented Jan 10, 2022

I guess the problem with the phase argument is that it is related to the absolute value of the visibility, which we don't want here. A value that won't be scientifically relevant is where I was heading with the "physically motivated" comment before. The problem is that pyuvdata is used across a wide range of instruments: errors that are not scientifically relevant for HERA often are for the SMA.

If you want to do a little monte carlo for the kinds of operations you're worried about I think that would be helpful input.

@kartographer (Contributor, Author)

Hi @aewallwi -- the choice of checking only that the imaginary component was zero was based on three observations/concerns:

  1. In just about every (local) test I ran, doing a relative check between real and imaginary values was considerably slower. Admittedly, my testing wasn't exhaustive, but the difference I was generally seeing was about an order-of-magnitude or worse (np.real_if_close is faster than this, but also has some peculiar behavior that actually makes it a less-than-ideal choice). Because of this, I think I'd still keep the first iscomplex check, and put additional logic under/inside it to handle the isclose case (otherwise this check starts feeling about as "lightweight" as a deep-fried Twinkie).
  2. As perhaps the above discussion already demonstrates, I was a bit concerned that any choice of tolerance would be arbitrary, particularly since the induced errors likely depend on the floating-point precision of the arithmetic in question (not necessarily something intrinsic to the instrument). A Monte Carlo simulation could be quite helpful in that regard.
  3. I was a little worried about what I can only think to call "floating point creep", where multiple subsequent calls to some function inducing complex values result in an error that's not immediately traceable to any single operation. As you point out, needing to do this cleanup manually after each operation is a bit annoying, which was part of the motivation for breaking out UVData._fix_autos as a separate function, since it at least makes "cleanup" of this issue a bit more straightforward (although it's not currently in the public API, that could change).

The only dog I have in this fight is the first item above -- I'd just like it to be fast. But I'm happy to make the code a bit more tolerant if that's what folks desire (a quick sketch of the two styles of check follows). Though FWIW, I don't think the tolerance issue is what's causing the current failures in the hera_cal pipeline.
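For reference, a quick sketch contrasting the two styles of check discussed in this thread; the names are placeholders, not pyuvdata internals.

```python
import numpy as np

# Lightweight check: flag any non-zero imaginary component at all.
def autos_have_imag(auto_data):
    return bool(np.any(np.iscomplex(auto_data)))

# Tolerance-based check: only flag imaginary parts that are large relative
# to the real parts. The extra multiplies and comparisons are why this tends
# to be slower than the simple check above.
def autos_have_large_imag(auto_data, rel_tol=1e-10):
    return bool(np.any(np.abs(auto_data.imag) > rel_tol * np.abs(auto_data.real)))
```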

@david-macmahon

Just another example of non-zero imaginary components of auto-correlations...

xGPU computes(-ed?) auto correlations the same way as cross correlations, because it is (was?) faster to do them the same than to have a separate real-only computation. When configured to use floating-point math, the GPUs used in the early days of xGPU had a fused multiply-add (FMA) instruction but no corresponding fused multiply-subtract, so some terms of the imaginary component of the auto correlations that cancel out in theory did not cancel out completely in practice, due to the difference in precision between the fused multiply-add and the non-fused multiply-subtract. Not sure if that's still the case with today's GPUs (and most folks use xGPU's integer math now, which is not afflicted this way), but I thought I'd add it here for historical context/posterity.

@aewallwi (Contributor) commented Jan 14, 2022

> Hi @aewallwi -- the choice of checking only that the imaginary component was zero was based on three observations/concerns: [...]

I agree that we should prioritize speed. I think the fix_autos option is concise enough that it will let downstream code avoid clunky manual setting of the autos. Thanks for clarifying this! I'm happy with this proceeding.

@bhazelton (Member) left a comment

This looks good and I think the discussion has converged.

@bhazelton bhazelton merged commit f7fe4c1 into main Jan 14, 2022
@bhazelton bhazelton deleted the check_real_autos branch January 14, 2022 19:28
Development

Successfully merging this pull request may close these issues.

Add check that autos are real
5 participants