Preserve non-float dtypes in Quantity columns from serialized mixins #12505

dhomeier · 2021-11-19T22:56:11Z

Description

Columns of integer types that are serialised with Quantity information in FITS tables are presently automatically "upcast" when read back in, since Quantity(data) is called with its default settings, converting all numerical input to float[64]-type.

Fixes #12494, enabling both Table.read() and QTable.read() to recover the original (and stored) dtypes from a FITS table.

Fixes #15447 also.

The second commit addresses an additional issue that came up with the above: instantiating a QTable from integer columns also casts every non-float column to float64, overriding any types set with the dtype option as well.
Since the docstring states that dtype=[...] will set the column dtypes, I consider that latter part a bug.
The commit here implements the simple fix of adopting all original dtypes from the column, so creating a QTable from a Table or list of Columns will always preserve their dtypes, whether explicitly set via the option or not (tested interactively to work with Masked as well, but proper tests are to follow when we have agreement on the desired functionality).
The above may be reasonable behaviour and is not breaking any existing tests, but of course changes results; in particular it is not consistent with the default behaviour of Quantity(), so I am leaving this to further discussion.
Honouring only an explicitly set list of dtype=[...] would be a bit lengthier, probably involving replicating a good bit of he Table.__init__ setup in QTable.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

Do the proposed changes actually accomplish desired goals?
Do the proposed changes follow the Astropy coding guidelines?
Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the Extra CI label.
Is a change log needed? If yes, did the change log check pass? If no, add the no-changelog-entry-needed label. If this is a manual backport, use the skip-changelog-checks label unless special changelog handling is necessary.
Is a milestone set? Milestone must be set but astropy-bot check might be missing; do not let the green checkmark fool you.
At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate backport-X.Y.x label(s) before merge.

github-actions · 2021-11-19T22:56:56Z

👋 Thank you for your draft pull request! Do you know that you can use [ci skip] or [skip ci] in your commit messages to skip running continuous integration tests until you are ready?

taldcroft · 2021-11-20T10:58:57Z

There's a bit of a problem with this:

In [5]: t = Table([Time([1,2], format='cxcsec')])
In [6]: t.write('junk.ecsv', overwrite=True)
In [7]: t2 = Table.read('junk.ecsv')
In [8]: t2['col0'].info.dtype
Out[8]: dtype('float64')   # <== dtype('O') in main

For mixin classes like Time, the dtype is left unset which then defaults to object.

I think a better approach is to change how Quantity serializes itself to add a dtype object attribute:

Out[12]: 
<QTable length=2>
 col0
  m  
int64
-----
    1
    2

In [13]: ascii.write(qt, format='ecsv')
# %ECSV 1.0
# ---
# datatype:
# - {name: col0, unit: m, datatype: int64}
# meta: !!omap
# - __serialized_columns__:
#     col0:
#       __class__: astropy.units.quantity.Quantity
#       unit: !astropy.units.Unit {unit: m}
#       value: !astropy.table.SerializedColumn {name: col0}
#       dtype: int64                                         <=== ADD THIS
# schema: astropy-2.0
col0
1
2

taldcroft · 2021-11-20T11:03:32Z

This seems to work:

diff --git a/astropy/table/serialize.py b/astropy/table/serialize.py
index 5dfa402818..62d69ccf44 100644
--- a/astropy/table/serialize.py
+++ b/astropy/table/serialize.py
@@ -342,7 +342,6 @@ def _construct_mixin_from_columns(new_name, obj_attrs, out):
 
         # Now copy the relevant attributes
         for attr, nontrivial in (('unit', lambda x: x not in (None, '')),
-                                 ('dtype', lambda x: x is not None),
                                  ('format', lambda x: x is not None),
                                  ('description', lambda x: x is not None),
                                  ('meta', lambda x: x)):
diff --git a/astropy/units/quantity.py b/astropy/units/quantity.py
index 2c60005a84..35b01159e5 100644
--- a/astropy/units/quantity.py
+++ b/astropy/units/quantity.py
@@ -171,10 +171,16 @@ class QuantityInfo(QuantityInfoBase):
     required when the object is used as a mixin column within a table, but can
     be used as a general way to store meta information.
     """
-    _represent_as_dict_attrs = ('value', 'unit')
     _construct_from_dict_args = ['value']
     _represent_as_dict_primary_data = 'value'
 
+    def _represent_as_dict(self, attrs=None):
+        q = self._parent
+        out = {'value': q.value,
+               'unit': q.unit,
+               'dtype': q.dtype.name}
+        return out
+
     def new_like(self, cols, length, metadata_conflicts='warn', name=None):
         """
         Return a new Quantity instance which is consistent with the

In [3]: qt = QTable([u.Quantity([1,2], unit=u.m, dtype=int)])

In [4]: qt.write('junk.ecsv', overwrite=True)

In [5]: qt2 = Table.read('junk.ecsv')

In [6]: qt2
Out[6]: 
<Table length=2>
 col0
  m  
int64
-----
    1
    2

(astropy) ➜  astropy git:(decode-dtype-mixins) ✗ cat junk.ecsv
# %ECSV 1.0
# ---
# datatype:
# - {name: col0, unit: m, datatype: int64}
# meta: !!omap
# - __serialized_columns__:
#     col0:
#       __class__: astropy.units.quantity.Quantity
#       dtype: int64
#       unit: !astropy.units.Unit {unit: m}
#       value: !astropy.table.SerializedColumn {name: col0}
# schema: astropy-2.0
col0
1
2

taldcroft · 2021-11-20T11:15:25Z

astropy/io/fits/connect.py

+        if 'datatype' in col:
+            if col['name'] in tbl.meta['__serialized_columns__']:
+                tbl.meta['__serialized_columns__'][col['name']]['dtype'] = DATA_TYPES.get(
+                        col['datatype'], col['datatype'])


With the other fix, this is no longer necessary:

In [2]: qt = QTable([u.Quantity([1,2], unit=u.m, dtype=int)]) In [3]: qt.write('junk.fits', overwrite=True) In [4]: qt2 = QTable.read('junk.fits') In [5]: qt2 Out[5]: <QTable length=2> col0 m int64 ----- 1 2

taldcroft · 2021-11-20T11:23:15Z

astropy/table/table.py

@@ -3963,7 +3963,7 @@ def _convert_col_for_table(self, col):
            # Quantity subclasses identified in the unit (such as u.mag()).
            q_cls = Masked(Quantity) if isinstance(col, MaskedColumn) else Quantity
            try:
-                qcol = q_cls(col.data, col.unit, copy=False, subok=True)
+                qcol = q_cls(col.data, col.unit, copy=False, subok=True, dtype=col.dtype)


Looks fine but this should have an explicit test (even if it is implicitly tested in the mixin handling stuff).

Sure; was just waiting whether to test QTable(tab) or only QTable(tab, dtype=tab.dtype).

Added under TestNewFromColumns

mhvk · 2021-11-20T19:03:02Z

Certainly a bug that the dtype does not round-trip. I like @taldcroft's solution! I wondered whether it is worth not storing the dtype if it float64 (i.e., the default), but checking quickly, I saw that Column also stores its dtype in the header. So, I think this is better generally - really seems more like an oversight that we did not do this already!

dhomeier · 2021-11-21T02:26:19Z

This seems to work:

+    def _represent_as_dict(self, attrs=None):
+        q = self._parent
+        out = {'value': q.value,
+               'unit': q.unit,
+               'dtype': q.dtype.name}
+        return out
+

Thanks, this looks much neater and more concise than my approach. And I guess keeping datatype to record the format the columns were written in, while meta stores the format and dtype they are supposed to be returned in, has some logic to it. The only drawback would be that this does not decode files written with earlier versions.
And I found one test failure on the Longitude mixin, due to the wrap_angle not being included with the definition above.
A naïve attempt to just add it to the dict as

             out['wrap_angle'] = {'value': q.wrap_angle.value, 'unit': q.wrap_angle.unit}

fails on decoding with a ValueError: Invalid character at col 0 in angle 'unit', since this is not fully reproducing the previous description

COMMENT     lon:
COMMENT       __class__: astropy.coordinates.angles.Longitude
COMMENT       unit: &id001 !astropy.units.Unit {unit: deg}
COMMENT       value: !astropy.table.SerializedColumn {name: lon}
COMMENT       wrap_angle: !astropy.coordinates.Angle
COMMENT         unit: *id001
COMMENT         value: 360.0

but instead outputs

COMMENT     lon:
COMMENT       __class__: astropy.coordinates.angles.Longitude
COMMENT       dtype: float32
COMMENT       unit: &id001 !astropy.units.Unit {unit: deg}
COMMENT       value: !astropy.table.SerializedColumn {name: lon}
COMMENT       wrap_angle:
COMMENT         unit: *id001
COMMENT         value: 360.0

i.e. the !astropy.coordinates.Angle still needs to be reconstructed, but how? Also, need to make sure there are no other attributes missed. I did not find any failures pointing to others in the io, coordinates or units tests though.

But I do see several more failures in table I cannot really make sense of:

___________________________________________________________________________________ test_vstack ____________________________________________________________________________________

    def test_vstack():
        """
        Vstack tables with mixin cols.
        """
        t1 = QTable(MIXIN_COLS)
        t2 = QTable(MIXIN_COLS)
        with pytest.raises(NotImplementedError):
>           vstack([t1, t2])

opt/lib/python3.10/site-packages/astropy/table/tests/test_mixin.py:504: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
opt/lib/python3.10/site-packages/astropy/table/operations.py:651: in vstack
    out = _vstack(tables, join_type, col_name_map, metadata_conflicts)
opt/lib/python3.10/site-packages/astropy/table/operations.py:1396: in _vstack
    col = col_cls.info.new_like(cols, n_rows, metadata_conflicts, out_name)
opt/lib/python3.10/site-packages/astropy/units/quantity.py:224: in new_like
    out = self._construct_from_dict(map)
opt/lib/python3.10/site-packages/astropy/utils/data_info.py:387: in _construct_from_dict
    args = [map.pop(attr) for attr in self._construct_from_dict_args]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

.0 = <list_iterator object at 0x12adeda20>

>   args = [map.pop(attr) for attr in self._construct_from_dict_args]
E   KeyError: 'value'

opt/lib/python3.10/site-packages/astropy/utils/data_info.py:387: KeyError
...
============================================================================= short test summary info ==============================================================================
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_mixin.py::test_vstack - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::TestJoin::test_col_meta_merge[QTable] - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::TestVStack::test_col_meta_merge_inner[QTable] - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::TestVStack::test_mixin_functionality[latitude] - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::TestVStack::test_mixin_functionality[longitude] - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::TestVStack::test_mixin_functionality[quantity] - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::test_stack_columns - KeyError: 'value'
FAILED opt/lib/python3.10/site-packages/astropy/table/tests/test_operations.py::test_mixin_join_regression - KeyError: 'value'

(all with the same KeyError in map.pop(attr) for attr in self._construct_from_dict_args, but why is 'value' not present in those cases???)

dhomeier · 2021-11-22T17:23:08Z

Stealing the _represent_as_dict implementation from representations seems to do.

mhvk · 2021-11-22T19:24:21Z

astropy/units/quantity.py

+    def _represent_as_dict(self, attrs=None):
+        q = self._parent
+        out = super()._represent_as_dict(attrs)
+        out['dtype'] = q.dtype.name


This won't work for structured quantities; do we need to overwrite _represent_as_dict at all?

I actually wonder why dtype is needed at all; in principle, value has the dtype information, so that should be propagated correctly if it isn't already...

p.s. For the structured types, would be good to include #12509.

It needs some instruction on how to use dtype.name to store the dtype information, apparently. value is simply written as something like
value: !astropy.table.SerializedColumn {name: lon}
so the dtype will have to be passed on to the Quantity constructor through some channel.

OK, thanks, I now see the problem: the .value does have a dtype and that gets used, but then the result is passed through u.Quantity(value, unit) and the dtype is lost at that point. What perhaps we could do is only pass the dtype that actually is converted, i.e.,

if q.dtype.kind in 'iu': out['dtype'] = q.dtype.name

This also solves the problem that what you have now would fail for structured dtype (which probably means there should be a test for that...)

Yes; I thought about something like that for the first implementation to resolve the Time regression either intercepting object or other complex dtypes, or restricting explicitly to integers. The latter choice is probably more prudent; this only leaves the corner case of wrap_angle in Longitude, which is converted to a float64 default; also floats will no longer be returned in native endianness.

I think currently scalars are treated differently from arrays, as they are typeset as string instead of a binary blob. Not sure if that is worth changing...

p.s. For the structured types, would be good to include #12509.

@dhomeier @mhvk, merged!

taldcroft · 2021-11-22T19:35:12Z

astropy/table/tests/conftest.py

@@ -69,7 +69,7 @@ class MyTable(table.Table):
 # and masked (MaskedArray) column.


-@pytest.fixture(params=['unmasked', 'masked', 'subclass'])
+@pytest.fixture(params=['unmasked', 'masked', 'subclass', 'quantity'])


This adds a new case to a lot (hundreds?) of tests since this fixture is used a lot. While on principle it never hurts to add more tests, I'm not sure if this is global addition is helping our coverage or just burning CI time and carbon.

Yes, initially I was only looking for a simple way to add QTable coverage to some tests...
This is adding 339 tests to the existing 2079 tests in table – I found about 0.8 s or 5 % longer runtime for that.
Don't know if that is worth it, or QTable is just not that critical compared to MaskedTable or the subclass.

mhvk · 2021-11-22T19:38:55Z

More general comment: there have been quite a few issues about Quantity automatically converting to float, and whether or not that is desired. My sense remains that generally it is desired, as an integer quantity makes little physical sense (apart from spin!). I think there is a lot to be said for avoiding silent truncation errors from, e.g., in-place operations.

Given this, I wonder if it isn't better to use some kind of option for the cases where one really wants to preserve the data type with which the data are stored internally. After all, conversion to float is not that different from applying BZERO and BSCALE to integer image data.

taldcroft

This is looking good to me apart from the one comment about the conftest update.

dhomeier · 2021-11-22T22:35:28Z

More general comment: there have been quite a few issues about Quantity automatically converting to float, and whether or not that is desired. My sense remains that generally it is desired, as an integer quantity makes little physical sense (apart from spin!). I think there is a lot to be said for avoiding silent truncation errors from, e.g., in-place operations.

Yes, though to me the biggest concern seemed that many in-place operations would simply fail on integer quantities.

Put it another way, operations that may automatically cast to float, in order of decreasing desirability, to me would be

u.Quantity(arr)
QTable(tab), QTable([columns])
QTable.read(file)
QTable(tab, dtype=tab.dtype)
QTable.read(file, do_not_scale_table_data=True)

where the last option would offer a yet to be implemented option analogous to do_not_scale_image_data for ensuring that the data are read in as stored. That was actually a kind of minimal solution I should have pointed out a bit more in the initial comment. But the original specifically issue arose from the large memory footprint of data read in from file, so there should be some form of control at that step.

mhvk · 2021-11-23T01:27:37Z

@dhomeier - good to see your summary - I agree with the order, and that (4) and (5) for sure should respect the dtype. I think I'm actually also on-board with (3) , though maybe prefer (5) with the keyword argument to control it (as in the difference between (2) and (4)).

Also, stopping the auto-conversion to float (i.e., (3)) may well break code that relies on QTable doing that...

dhomeier · 2021-11-23T16:50:30Z

This version implements (3) - (5) now.

There are pros and cons for adding 'f' (and 'c'?) to the dtype kinds to be serialised; the wrap_angle case pointed out above is probably not very relevant, and both keeping read-in float columns in big-endian format and automatically converting them to native has its own points in favour.

Also, stopping the auto-conversion to float (i.e., (3)) may well break code that relies on QTable doing that...

That's probably still worth discussing; I'd think that since creating a QTable with integer quantities already requires some very explicit steps, people would generally be aware of the format. But it might be safer to keep the old behaviour as default.
On adding an option to QTable.read() for controlling this: both outcomes can already be effected relatively easily by forcing conversion with QTable(Table.read(file)) in the current version, or
t = Table.read(file); qt = QTable(t, dtype=t.dtype) if conversion is made the default.

I've checked locally this works with #12509, so this should be safe to rebase once either PR is merged. Need to revert the yaml docstring changes anyway.
Pinging @saimn and @nstarman for opinions.

nstarman · 2021-11-29T17:21:35Z

I've checked locally this works with #12509, so this should be safe to rebase once either PR is merged. Need to revert the yaml docstring changes anyway.

Merged! So good to rebase.

More general comment: there have been quite a few issues about Quantity automatically converting to float, and whether or not that is desired. My sense remains that generally it is desired, as an integer quantity makes little physical sense (apart from spin!). I think there is a lot to be said for avoiding silent truncation errors from, e.g., in-place operations.

I have more mixed feelings about this. When wearing my science hat I agree completely: rarely is the int dtype desirable and if it's necessary should be explicitly stated.

stays int: Quantity([1, 2, 3], dtype=int)
upcast to float: Quantity([1, 2, 3])

When wearing my comp-sci hat, the thought of auto-upcasting is less attractive. Quantity is an extension of ndarray and this is a departure from their treatment of dtype. At the very least we should more explicitly document this. I don't feel I was sufficiently aware of these issues until reading the discussion in this PR.

Given this, I wonder if it isn't better to use some kind of option for the cases where one really wants to preserve the data type with which the data are stored internally. After all, conversion to float is not that different from applying BZERO and BSCALE to integer image data.

I do think there might be a comfortable middle ground:

For a list x = [1, 2, 3], using the following constructor

`dtype=`	`np.array()`	`Quantity()`	comment
`float`	`float`	`float`	specified
`int`	`int`	`int`	specified
`None`	`int`	`float`	upcast
`...`	TypeError	`int`	passed to numpy

The Ellipsis option would essentially mean "pass it on". Numpy would be the end-point here, so a Longitude(x, dtype=...) -> Angle -> SpecificTypeQuantity -> Quantity -> ndarray, and only all the Quantity subclasses understand to delegate dtype to super, until it hits numpy. When passed to numpy the Ellipsis would be changed to None to use numpy's dtype detection machinery.

The advantage of this approach is that there is a simple means of specifying the dtype without knowing what it is. It's essentially equivalent to

x = [1, 2, 3]
arr = np.array(x)
q = Quantity(arr, dtype=arr.dtype)

but allows the user to skip the explicit construction of an intermediate array.

If we apply this consistently, I hope it can solve (or at least alleviate) @dhomeier's list of upcast situations (#12505 (comment)). 🤞
Each class can have appropriate (& well documented) default rules for upcasting, and allow for an easy override to force the dtypes to be kept unchanged.

dhomeier · 2021-12-18T14:03:55Z

Rebased (it works!) to get the discussion going again.

When wearing my comp-sci hat, the thought of auto-upcasting is less attractive. Quantity is an extension of ndarray and this is a departure from their treatment of dtype. At the very least we should more explicitly document this. I don't feel I was sufficiently aware of these issues until reading the discussion in this PR.

The unexpected upcasting was also the initial issue triggering this, so the use case is certainly relevant enough.
I do quite like that proposal @nstarman!

The Ellipsis option would essentially mean "pass it on". Numpy would be the end-point here, so a Longitude(x, dtype=...) -> Angle -> SpecificTypeQuantity -> Quantity -> ndarray, and only all the Quantity subclasses understand to delegate dtype to super, until it hits numpy. When passed to numpy the Ellipsis would be changed to None to use numpy's dtype detection machinery.

The advantage of this approach is that there is a simple means of specifying the dtype without knowing what it is. It's essentially equivalent to
x = [1, 2, 3]
arr = np.array(x)
q = Quantity(arr, dtype=arr.dtype)
but allows the user to skip the explicit construction of an intermediate array.

Just wondering how many in places code needs to be changed to handle the dtype='...' setting...

And for Quantity that would need a decision whether '...' should become the new default (still a change in behaviour), or keep that at None+upcast. But even being able to simply call Quantity(array_like, dtype='...') without having to bother what the actual dtype is would be a great advantage IMO.

github-actions · 2022-07-20T05:39:46Z

I'm going to close this pull request as per my previous message. If you think what is being added/fixed here is still important, please remember to open an issue to keep track of it. Thanks!

If this is the first time I am commenting on this issue, or if you believe I closed this issue incorrectly, please report this here.

hamogu · 2023-10-16T01:38:21Z

One point from #15447 which I opened and then learned that this PR already exists: I'm looking at specific fits tables that are the standard for all of X-ray astronomy, a so-called ARF/RMF/PHA files. All three of those file types are tables where most columns have good, physical fits units, e.g. keV or counts or counts/s or cm**2; thus I'd like to read those files as a QTable. However, there are other columns that need to remain integers; in particular the "channel". A channel is by definition a discrete quantity and downstream software will fail with floating point numbers of channels; also channel numbers are frequently used as indexes into the other vector-valued columns (e.g. in an RMF, there a "response matrix". Since this matrix is very sparse, it's not stored in a rectangular form, but instead only the non-zero elements of a row are stored in the file and other columns give the integer indices for that, e.g. a row in the matrix might be non-zero for elements 567 to 681, and then there would be a column that says N_start=567 - obviously, that column should be an integer).

Without explaining the format in more detail here, there is no way to represent this as a QTable without the ability to have some of the columns as an int.

mhvk · 2023-10-16T11:53:00Z

@hamogu - yes, this PR was a good one that we should try to revive.

One question: does the channel column have a unit?

Also, for a possible rebase: what @nstarman ended up implementing was that u.Quantity(..., dtype=None) is what is needed to avoid the upcast (not the ... mentioned in the comments above; the change that we ended up with is to change the default to dtype=np.inexact to indicate auto-convert to float).

p.s. Apologies, but I won't be able to spend time here before feature-freeze: I'm stuck writing my next 5-year research grant until Nov 1st.

hamogu · 2023-10-16T12:51:32Z

It has a unit set in the fits header that does not parse with astropy, probably because it's not part of the main fits standard, but only defined by NASA's HEASARC fits working group (formerly "Office of guest investigator programs") (e.g. https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/node7.html for PHA type I files)

WARNING: UnitsWarning: 'channel' did not parse as fits unit: At col 0, Unit 'channel' not supported by the FITS standard.

(Links given for future reference, not because it's needed to follow all of them to understand what I need Quantity to do to make this work.)

mhvk

I looked briefly at this PR again and am not sure it is quite complete: missing is some kind of option to tell QTable.read() to use it. Indeed, it may be that just adjusting the reading is the right thing to do.

Nevertheless a few comments on what is here too.

mhvk · 2023-10-16T13:19:02Z

astropy/table/table.py

@@ -3948,6 +3948,13 @@ class QTable(Table):

    """

+    def __init__(self, data=None, masked=False, names=None, dtype=None, **kwargs):


Just a quick look through the PR: here, I think one could write quantity_dtype=np.inexact - and then allow the option None.
We cannot use dtype since that is supposed to give the separate dtype for every column.

mhvk · 2023-10-16T13:19:44Z

astropy/table/table.py

            try:
-                qcol = q_cls(col.data, col.unit, copy=False, subok=True)
+                qcol = q_cls(col.data, col.unit, copy=False, subok=True, dtype=dtype)


Here, we can then just pass on dtype=self._quantity_dtype

mhvk · 2023-10-16T13:20:50Z

astropy/table/tests/test_info.py

    cls = t.ColumnClass.__name__
-    assert np.all(tinfo['class'] == [cls, cls, cls, cls, 'Time', 'SkyCoord'])
+    # QTable does not preserve 'quantity' description - should it?


Yes, it should...

Co-authored-by: Tom Aldcroft <taldcroft@noreply.users.github.com>

dhomeier · 2023-10-27T16:53:52Z

Just rebasing this with the black/merge conflicts resolved; I'll need more time to look into @mhvk's last round of comments, too.

dhomeier added io.fits units Bug labels Nov 19, 2021

github-actions bot added table and removed units labels Nov 19, 2021

dhomeier added the units label Nov 19, 2021

dhomeier requested review from mhvk and taldcroft November 19, 2021 22:58

dhomeier mentioned this pull request Nov 19, 2021

Round tripping non-float Quantities from QTable to fits converts all Quantities to floats #12494

Open

taldcroft reviewed Nov 20, 2021

View reviewed changes

dhomeier force-pushed the decode-dtype-mixins branch 2 times, most recently from e4500cc to 0333c0f Compare November 22, 2021 16:47

dhomeier marked this pull request as ready for review November 22, 2021 17:19

mhvk reviewed Nov 22, 2021

View reviewed changes

taldcroft reviewed Nov 22, 2021

View reviewed changes

dhomeier force-pushed the decode-dtype-mixins branch from 85dfc64 to 3b240a8 Compare December 18, 2021 12:14

github-actions bot added the closed-by-bot label Jul 20, 2022

github-actions bot closed this Jul 20, 2022

dhomeier reopened this Nov 8, 2022

dhomeier removed Close? closed-by-bot labels Nov 8, 2022

dhomeier added this to the v5.3 milestone Nov 8, 2022

github-actions bot added the Close? label Nov 9, 2022

nstarman added keep-open and removed Close? labels Nov 9, 2022

saimn modified the milestones: v5.3, v6.0 Apr 21, 2023

mhvk mentioned this pull request Oct 12, 2023

QTable converts numbers to float, if unit is unknown #15447

Open

mhvk reviewed Oct 16, 2023

View reviewed changes

dhomeier and others added 6 commits October 27, 2023 16:59

BUG: preserve dtypes from serialized mixin columns

af18bc1

Preserve dtypes from columns/kwarg on instantiating QTable

11e1e07

Add Quantity.dtype serialization to _represent_as_dict

9dfa5bd

Co-authored-by: Tom Aldcroft <taldcroft@noreply.users.github.com>

TST: add QTable tests including from columns/Table

83b5d5b

DOC: doctest fix + changelog

2c75c57

Limit serialisation to integer dtypes, QTable() to kwarg

af8f1d3

dhomeier force-pushed the decode-dtype-mixins branch from 3b240a8 to af8f1d3 Compare October 27, 2023 16:29

dhomeier requested a review from saimn as a code owner October 27, 2023 16:29

saimn modified the milestones: v6.0, v6.1 Nov 3, 2023

astrofrog modified the milestones: v6.1.0, v7.0.0 Apr 4, 2024

		@@ -3948,6 +3948,13 @@ class QTable(Table):

		"""

		def __init__(self, data=None, masked=False, names=None, dtype=None, **kwargs):

Preserve non-float dtypes in Quantity columns from serialized mixins #12505

Are you sure you want to change the base?

Preserve non-float dtypes in Quantity columns from serialized mixins #12505

Conversation

dhomeier commented Nov 19, 2021 • edited by pllim

Description

Checklist for package maintainer(s)

github-actions bot commented Nov 19, 2021

taldcroft commented Nov 20, 2021

taldcroft commented Nov 20, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhomeier Nov 22, 2021 • edited

Choose a reason for hiding this comment

mhvk commented Nov 20, 2021

dhomeier commented Nov 21, 2021 • edited

dhomeier commented Nov 22, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhomeier Nov 23, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhomeier Nov 22, 2021 • edited

Choose a reason for hiding this comment

mhvk commented Nov 22, 2021

taldcroft left a comment

Choose a reason for hiding this comment

dhomeier commented Nov 22, 2021

mhvk commented Nov 23, 2021

dhomeier commented Nov 23, 2021 • edited

nstarman commented Nov 29, 2021 • edited

dhomeier commented Dec 18, 2021

github-actions bot commented Jul 20, 2022

hamogu commented Oct 16, 2023

mhvk commented Oct 16, 2023

hamogu commented Oct 16, 2023

mhvk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhomeier commented Oct 27, 2023

dhomeier commented Nov 19, 2021 •

edited by pllim

dhomeier Nov 22, 2021 •

edited

dhomeier commented Nov 21, 2021 •

edited

dhomeier Nov 23, 2021 •

edited

dhomeier Nov 22, 2021 •

edited

dhomeier commented Nov 23, 2021 •

edited

nstarman commented Nov 29, 2021 •

edited