Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow fiberassign --stdstar duplicates with --mtl #202

Merged
merged 3 commits into from May 30, 2019
Merged

Conversation

sbailey
Copy link
Contributor

@sbailey sbailey commented May 29, 2019

This PR fixes #198 by allowing fiberassign to have duplicate targets in the --stdstar file and the --mtl. It also exposures the fba_run options --sciencemask, --stdmask, etc. to fiberassign.

Background: Legacy fiberassign allowed duplicate std/mtl targets and it worked due to how it did internal bookkeeping. The refactored fba_run does not allow duplicate targets, but that broke some of the ways we had been using it.

This PR: If all --stdstar targets are already in MTL, it just ignores the file. If it has some unique targets, those are filtered out and written to a separate temporary file before passing on to fba_run. This allows users to continue to use fiberassign in the same manner as before, without worrying about std/mtl duplicates.

@sbailey sbailey requested a review from tskisner May 29, 2019 22:09
@sbailey
Copy link
Contributor Author

sbailey commented May 29, 2019

Hmm; a bunch of Travis test failures, but tests pass at NERSC. Newer fitsio is first suspicion since it made some non-backwards compatible changes.

@sbailey
Copy link
Contributor Author

sbailey commented May 30, 2019

fitsio 1.0.x switched to reading string table columns as unicode strings instead of raw bytes, requiring tests like

    science_rows = np.where(fiber_data["DEVICE_TYPE"] == b"POS")[0]

to be changed to

    science_rows = np.where(fiber_data["DEVICE_TYPE"].astype(str) == "POS")[0]

to work with both 0.9.x and 1.0.x

Now I'm getting intermittent unicode parsing crashes from fitsio, e.g.

======================================================================
ERROR: test_io (fiberassign.test.test_assign.TestAssign)
----------------------------------------------------------------------
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb in position 0: ordinal not in range(128)

The above exception was the direct cause of the following exception:

SystemError: <class 'UnicodeDecodeError'> returned a result with an error set

[repeated many times]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/travis/build/desihub/fiberassign/py/fiberassign/test/test_assign.py", line 154, in test_io
    read_assignment_fits_tile((tid, infile))
  File "/home/travis/build/desihub/fiberassign/py/fiberassign/assign.py", line 621, in read_assignment_fits_tile
    fbassign = fd["FIBERASSIGN"].read()
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/fitsio/hdu/table.py", line 604, in read
    data = self._read_all(**keys)
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/fitsio/hdu/table.py", line 649, in _read_all
    array = self._maybe_decode_fits_ascii_strings_to_unicode_py3(array)
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/fitsio/hdu/table.py", line 1341, in _maybe_decode_fits_ascii_strings_to_unicode_py3
    array = array.astype(new_dt, copy=False)
SystemError: <class 'UnicodeDecodeError'> returned a result with an error set

I'm now trying to catch it in the act and cache the problematic test file to see if it is a problem with fitsio or with something we're writing. But it doesn't fail every time the tests run...

@sbailey
Copy link
Contributor Author

sbailey commented May 30, 2019

Unicode parsing errors are our fault. assign.py line 859 creates an empty table to fill with fiber assignments:

    tile_targets = np.empty(len(tile_tgids), dtype=tile_targets_dtype)

That includes a BRICKNAME string column that never gets initialized, and thus randomly sometimes contains non-ASCII characters which aren't allowed in FITS.

For this particular case, I think replacing it with np.zeros and also actually filling in the brickname would fix this problem. However, there are multiple other uses of np.empty throughout fiberassign, so pausing for a moment to consult with @tskisner . I'd prefer to take the minor performance hit and initialize these other cases with np.full(length, bogus_value, dtype), and then include checks at the end of initialization that no bogus values remain. Thoughts?

@sbailey
Copy link
Contributor Author

sbailey commented May 30, 2019

FYI I'm proceeding with my proposed replacement of np.empty -> np.zeros and np.full.

@sbailey sbailey merged commit 340f36f into master May 30, 2019
@sbailey sbailey deleted the duplicate_stds branch May 30, 2019 21:57
@sbailey
Copy link
Contributor Author

sbailey commented May 30, 2019

For the record: I handled the uninitialized variables case, but I didn't add the (slow) code to fill in BRICKNAME in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistent behaviour for duplicate IDs across Science targets & standards
1 participant