
Minutes for NDData telecon (16 August 2012)

astrofrog edited this page Aug 17, 2012 · 5 revisions

Participants

Technical issues meant that some people were not present all the time, but the full list of participants is:

  • Kyle Barbary
  • Andy Casey
  • Steve Crawford
  • Matt Davis
  • Wolfgang Kerzendorf
  • Adrian Price-Whelan
  • Thomas Robitaille
  • Erik Tollerud
  • James Turner

Aims

To finalize API/functionality decisions regarding the core NDData class.

Masks

The current implementation of masks is a 'mask' attribute that is required to be a boolean Numpy array with the same dimensions as the data. Values where the mask is 'True' are masked, and values where the mask is 'False' are unmasked. Everyone was happy with the current implementation.

Later on in the call, no objections were raised to having the mask behave consistently with Numpy masked arrays for arithmetic.
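As a minimal sketch of the convention described above (not the NDData implementation itself), the following uses plain Numpy masked arrays to show a boolean mask where True means "masked", and how masks combine under addition. The variable names are illustrative only.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, False, False])  # True means "masked"

# The mask must be boolean and have the same shape as the data.
assert mask.dtype == bool and mask.shape == data.shape

# numpy.ma follows the same convention: True marks a masked element.
a = np.ma.masked_array(data, mask=mask)
b = np.ma.masked_array(data, mask=[False, False, True, False])

total = a + b

# An element of the result is masked if it was masked in either operand.
result_mask = np.ma.getmaskarray(total)
print(result_mask)
```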

Flags

The current implementation of flags is a 'flags' attribute that is required to be a Numpy array that is broadcastable to the dimensions of the data. Wolfgang suggested that we have a Flags class to represent flags, with no real restrictions, since we can just let the user deal with the contents of the class. He also suggested we could include an example BitFlags class. After some more discussion, however, we agreed on the following rules. The flags object can be one of:

  • None
  • A Numpy array with the same dimensions as the data
  • A dictionary of Numpy arrays with the same dimensions as the data

The idea behind the last one was (as suggested by Adrian) that one might want multiple layers of flags, and a dictionary provides a nice way to do that, since it automatically allows the layers to have names. The nice thing about these rules is that if flags are present, the user knows that their shape will always be the same as that of the data, so the flags can be used directly, e.g. to set the mask. Adrian provided some examples:

# With layers:
nddata.mask = (nddata.flags["sextractor"] & 8181) == 0

# Without layers:
nddata.mask = (nddata.flags & 8181) == 0

We also agreed that we would need to create a simple class 'FlagCollection' that would inherit from the dictionary class and provide shape and type checking for the flags. For example, if a user does:

nddata.flags["sextractor"] = 'invalid'

an exception should be raised.
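The shape and type checking described above could be sketched roughly as follows. Only the name 'FlagCollection' and the idea of subclassing the dictionary come from the discussion; the constructor signature (a 'shape' argument) and the choice of exception types are assumptions made for illustration.

```python
import numpy as np


class FlagCollection(dict):
    """Hypothetical sketch: a dict of named flag layers sharing one shape."""

    def __init__(self, shape):
        super().__init__()
        # Shape that every flag layer must match (assumed to be the data shape).
        self.shape = tuple(shape)

    def __setitem__(self, name, value):
        # Type check: only Numpy arrays are accepted as flag layers.
        if not isinstance(value, np.ndarray):
            raise TypeError(f"flag layer {name!r} must be a Numpy array")
        # Shape check: the layer must have the same dimensions as the data.
        if value.shape != self.shape:
            raise ValueError(
                f"flag layer {name!r} has shape {value.shape}, "
                f"expected {self.shape}"
            )
        super().__setitem__(name, value)


flags = FlagCollection(shape=(2, 3))
flags["sextractor"] = np.zeros((2, 3), dtype=int)  # accepted

try:
    flags["sextractor"] = "invalid"  # rejected: not an array
except TypeError as exc:
    print(exc)
```

Note that dict methods such as update() do not go through __setitem__, so a complete implementation would need to cover those as well.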

Wolfgang suggested that we have a mechanism to allow the user to register functions for flag arithmetic, but we then all agreed that this would be too complex for now, and could be added in a future pull request. In the meantime, users can just set the flags manually after an operation.

Errors

The current implementation of errors is just to have broadcastable Numpy arrays. We decided that it would indeed make sense to have classes for different kinds of errors, for example:

  • Standard deviations
  • Poisson errors
  • Analytical probability distribution functions (PDFs)
  • Numerical PDFs

We agreed that we should implement ways to propagate errors when doing arithmetic, but that it would be ok initially to not be able to do so with errors that had different types.

We also decided that explicit is better than implicit, so one should never be able to use a plain Numpy array for errors to mean standard deviations. So:

d = NDData(..., error=np.array(...))

would not work.

We decided that all error types should inherit from NDError to make it easy to check instance types.

The arithmetic for errors should be handled inside the error classes.
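A rough sketch of how these decisions fit together is given below. NDError and StandardDeviationError are names from the discussion, but the method name 'propagate_add', the helper 'check_error', and the assumption of uncorrelated errors are illustrative choices, not agreed API.

```python
import numpy as np


class NDError:
    """Base class, so that callers can test isinstance(error, NDError)."""

    def propagate_add(self, other):
        # Hypothetical hook: each error type implements its own arithmetic.
        raise NotImplementedError


class StandardDeviationError(NDError):
    """Errors stored as standard deviations."""

    def __init__(self, array):
        self.array = np.asarray(array, dtype=float)

    def propagate_add(self, other):
        # Initially, errors of different types cannot be combined.
        if not isinstance(other, StandardDeviationError):
            raise TypeError("cannot combine errors of different types")
        # Assumes uncorrelated errors: add in quadrature.
        return StandardDeviationError(np.sqrt(self.array**2 + other.array**2))


def check_error(error):
    """Reject plain arrays: explicit is better than implicit."""
    if error is not None and not isinstance(error, NDError):
        raise TypeError("error must be an NDError instance, not a plain array")
    return error


combined = StandardDeviationError([3.0]).propagate_add(StandardDeviationError([4.0]))
print(combined.array)
```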

Arithmetic

Arguably one of the more important points was how NDData objects should be able to interact. The discussion is too long to report in detail, but the bottom line is that we decided that NDData should have public methods to allow for arithmetic, and would automatically propagate the errors (if possible).

Initially, until the units framework is complete, only addition and subtraction would be implemented (since they do not require any changes in units), and only for operands whose units match. The WCS would also be required to match exactly.

Once the units framework is implemented, we would add support for handling different units to the addition and subtraction, and then expand to allow multiplication and division.

We also agreed to not overload the +-*/ operators because they are harder to document, and since the operations follow certain strict rules that may often fail, it made more sense to use documented methods.

The spectrum, image, etc. classes inheriting from NDData would then deal with things such as interpolation, changing WCS, etc., but the base class methods would always be there to provide the final arithmetic for objects with matching WCS.

Steve suggested that, in order to allow operations on datasets whose errors cannot be combined, the arithmetic methods could have a 'propagate_errors' option: if set to False, the operation would be performed only on the NDData array and not propagated to the error array.

We also agreed that the error classes should have matching methods for operations that would get called by the NDData methods.
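The arithmetic rules above can be illustrated with a toy container. This is only a sketch under stated assumptions: the 'add' method names, the constructor arguments, the use of plain strings for units and WCS, and the stand-in SDError class are hypothetical; only 'propagate_errors' and the matching-units/matching-WCS rules come from the discussion.

```python
import numpy as np


class SDError:
    """Stand-in standard-deviation error class (hypothetical API)."""

    def __init__(self, array):
        self.array = np.asarray(array, dtype=float)

    def add(self, other):
        # Matching method called by NDData.add; assumes uncorrelated errors.
        return SDError(np.sqrt(self.array**2 + other.array**2))


class NDData:
    """Toy container; units and WCS are plain placeholders here."""

    def __init__(self, data, error=None, units=None, wcs=None):
        self.data = np.asarray(data, dtype=float)
        self.error = error
        self.units = units
        self.wcs = wcs

    def add(self, other, propagate_errors=True):
        # Until a units framework exists, units must match exactly.
        if self.units != other.units:
            raise ValueError("units must match")
        # The base class only handles objects with identical WCS.
        if self.wcs != other.wcs:
            raise ValueError("WCS must match exactly")

        error = None
        if propagate_errors and self.error is not None and other.error is not None:
            if type(self.error) is not type(other.error):
                raise TypeError("cannot combine errors of different types")
            # Delegate the error arithmetic to the error class.
            error = self.error.add(other.error)

        return NDData(self.data + other.data, error=error,
                      units=self.units, wcs=self.wcs)


a = NDData([1.0, 2.0], error=SDError([3.0, 3.0]), units="Jy")
b = NDData([10.0, 20.0], error=SDError([4.0, 4.0]), units="Jy")

c = a.add(b)                          # errors added in quadrature
d = a.add(b, propagate_errors=False)  # data only, no error on the result
```

Using a named method rather than the + operator leaves room for options such as 'propagate_errors' and gives a natural place to document when the operation fails.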

Future

Erik T. pointed out that discussing a general WCS framework (beyond FITS-WCS) would be an ideal topic for the face-to-face meeting at STScI.

Action points

  • Tom R. will prepare a pull request implementing the above decisions
  • Wolfgang will then finalize the SDError or StandardDeviationError class once the main pull request is merged in

Result

Pull request: https://github.com/astropy/astropy/pull/348
