Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix for #884: do not allow stitch() to change Quantity data type #891

Closed

Conversation

barentsen
Copy link
Collaborator

The LightCurveCollection.stitch() is unexpectedly changing the data type of integer Quantity columns from int into float. This is because the underlying astropy.table.vstack operation displays this behavior (see astropy/astropy#10958)

For example:
Screen Shot 2020-10-28 at 10 33 37

This PR implements a workaround for this behavior in an attempt to fix #884:

Screen Shot 2020-10-28 at 10 34 36

However, perhaps an alternative solution would be to always use Column instead of Quantity for integer columns. (/cc @orionlee)

@orionlee
Copy link
Collaborator

I am unsure about this. I suppose we have to ask if Quantity is meant to work for integers.

I feel we have 2 choices:

  1. keep the recent behavior of forcing numeric columns as Quantity, and work out the bugs / workarounds. This path is viable only if it's determined Quantity is meant to work for integers.
  2. Retain TimeSeries default behavior. For Lightkurve, that'd mean cadenceno and quality are Column type.
    a. in this case, it'd make life easier if .value is added to Column. But my understanding is that such change can't be released any time soon.
  3. Arguably, we could possibly still force all float type columns as Quantity. I don't know how useful it'd be, however. It won't change anything for the default TESS (and probably Kepler) columns. Lightkurve users would still need to reckon with Column type.

@barentsen
Copy link
Collaborator Author

@orionlee I think we can implement option #2 in your list as follows:

  1. We go ahead and add the .value attribute to Column inside AstroPy. Over in AstroPy issue #10859 the maintainer of astropy.table seems OK with this.
  2. As a temporary workaround, we adapt QTimeSeries._convert_col_for_table to make sure all non-Quantity columns in a LightCurve object are of type QColumn rather than Column, defined as follows:
class QColumn(Column):

    @property
    def value(self):
        return self.data
  1. Whenever column.value is available in a mainstream release of AstroPy, we can remove both QTimeSeries and QColumn from Lightkurve and make that mainstream release a minimum dependency.

@orionlee
Copy link
Collaborator

@barentsen Your suggestion of keeping TimeSeries behavior while adding a QColumn temporarily sounds promising. 1 note and 1 potential concern.

  1. We also need a QMaskedColumn, TESS cadenceno and quality are both read as MaskedColumn
  2. the potential concern: if Lightkurve add a column with their own custom Column subclass, then we probably shouldn't wrap around it. If so, the following codes could still break, even though it seems to be highly unlikely in practice.
  lc.flux = UserCustomColumn(...)
  lc.scatter() # the flux column has no .value, being a custom Column

@orionlee
Copy link
Collaborator

The concern for custom Column class in conjunction with plotting is alleviated.

  • I just checked, and matplotlib can handle both Quantity and Column, without using raw data.
# plot Quantity directly
ax.scatter(lc1.time.value, lc1.flux)

# plot Column directly
ax.scatter(lc1.time.value, Column(lc1.flux.value, unit=lc1.flux.unit, name='flux'))
  • I don't know if such support requires specific version of matplotlib / astropy. The versions I tried is 3.1.0 / 4.1rc1 respectively

  • Lightkurve's plotting codes, e.g., LightCurve._create_plot(), still needs to be modified to take advantage of it.

@barentsen
Copy link
Collaborator Author

Closing this PR as the issue has been resolved by #895, which includes this PR's unit test! 🎉

Thanks @orionlee!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

KeplerLightCurve.to_fits() doesn't work with Lightkurve.read() for stitched data.
2 participants