BUG: fix a crash when calling Column.pprint on a scalar column #15749

Open

neutrinoceros
Contributor

Description

Fixes #12584

This is a proof-of-concept fix: it breaks another regression test, and I haven't yet found a general solution that passes all tests. Rather than going in circles, I'd prefer to open this for early feedback.

ping @hamogu @mhvk @taldcroft and @nstarman

  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.


Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • Is a milestone set? Milestone must be set but we cannot check for it on Actions; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@neutrinoceros neutrinoceros changed the title from "TST: add a regression test for bug 15736" to "BUG: fix a crash when calling Column.pprint on a scalar column" Dec 15, 2023
@@ -52,6 +52,9 @@ ctypedef object (*item_getter)(object, object)


cdef inline object base_getitem(object self, object item, item_getter getitem):
    if (<np.ndarray>self).ndim == 0:
Contributor

This change may be the one causing problems... Though see larger comment on what would be the right solution in the original issue.

Contributor Author

It looks like my latest attempt actually works (and doesn't break anything)!

def test_pprint_scalar(self, scalar, show_dtype):
    # see https://github.com/astropy/astropy/issues/12584
    c = Column(scalar)
    c.pprint(show_dtype=show_dtype)
Contributor Author

Note that I am deliberately not checking what's actually printed, because I think the output may currently be incorrect for reasons that go beyond the scope of this PR.
Specifically, I'm looking at

    # If scalar then just convert to correct numpy type and use numpy repr
    if self.ndim == 0:
        return repr(self.item())

which effectively forces scalar Columns to print like pure numerical scalars (leaving the unit out if the data is a Quantity!). I'll open a separate issue for this.

Member

I think it would be valuable to see the scalar aspect of the printing, even if the unit is wrong.

Contributor

I'm with @nstarman here, we should test the actual output, especially now that I think we have a more permanent solution.

Contributor Author

If I understand correctly, #15754 should be closed as "not a bug", and the behaviour as of the current state of this branch should become the baseline for the test?

Contributor

I think for .pprint, we can call the current behaviour fine. But #15754 is about repr, which I'm not sure about. Since it is not addressed here, I think we should just leave it open.

Member

Agreed on testing the output. In fact this whole issue has been focused on pprint but really the problem is in pformat. I would not worry about show_dtype here and just have the two cases of 1 and 1.0 eV. (There is no need here to test formatting of the Quantity itself, so it helps in testing to pick values that have an exact floating point repr).

In a lot of tests I use a pattern like:

c = Column(scalar, name="a")
out = c.pformat()
exp = [' a ', '---', '  1']
assert out == exp

This way if there are diffs then running pytest with -vv will show what's going on. You can parametrize the scalar and exp values.
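
The list-of-lines comparison described above can be sketched without astropy. In the block below, `format_column` is a hypothetical stand-in for `Column.pformat` (it is not the astropy implementation, and its alignment rules are an assumption); it exists only so the assertion pattern can run on its own. In a real test suite the cases would more likely be passed to `pytest.mark.parametrize`.

```python
def format_column(value, name):
    """Hypothetical stand-in for Column.pformat on a scalar column.

    Returns header, separator and value lines, right-aligned to a common width.
    """
    text = str(value)
    width = max(len(name), len(text))
    return [name.rjust(width), "-" * width, text.rjust(width)]


# (scalar, column name, expected lines) triples; hypothetical values for illustration
CASES = [
    (1, "a", ["a", "-", "1"]),
    (1.0, "a", ["  a", "---", "1.0"]),
]


def check_cases():
    for value, name, exp in CASES:
        out = format_column(value, name)
        # Comparing whole lists lets a test runner show a line-by-line diff on failure.
        assert out == exp, (out, exp)
    return len(CASES)
```

Keeping the expected output as a plain list of strings makes a failure easy to read, since each line of the rendered column is compared separately.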

Contributor Author

done !

@pllim

This comment was marked as resolved.

@neutrinoceros
Contributor Author

@pllim actually I think it's good now. Hopefully this is simple enough !

@neutrinoceros neutrinoceros marked this pull request as ready for review December 15, 2023 20:33
@pllim pllim modified the milestones: v6.1.0, v6.0.1 Dec 15, 2023
@pllim pllim added the 💤 backport-v6.0.x on-merge: backport to v6.0.x label Dec 15, 2023
@neutrinoceros
Contributor Author

@mhvk can I ask you to review this again ?

@taldcroft
Member

@neutrinoceros - can you do some quick performance checking to see how much speed impact there is to Column getitem? I would not want to slow down astropy for this fix.

@neutrinoceros
Contributor Author

@taldcroft Here's a very quick benchmark

# benchmark.py
from time import monotonic_ns
from astropy.table import Column
import numpy as np


SMALL_C = Column(np.random.random_sample(16))
BIG_C = Column(np.random.random_sample(2048))
NRUNS = 1_000_000

SMALL_random_points = np.random.randint(0, len(SMALL_C) - 1, NRUNS)
BIG_random_points = np.random.randint(0, len(BIG_C) - 1, NRUNS)
for data, points, label in [
    (SMALL_C, SMALL_random_points, "small column"),
    (BIG_C, BIG_random_points, "big column"),
]:
    tstart = monotonic_ns()
    for p in points:
        data[p]
    tstop = monotonic_ns()
    res_ns = (tstop - tstart) / NRUNS
    print(
        f"Accessed one item from {label} in {res_ns:.2f} ns (averaged over {NRUNS:g} runs)"
    )

Main

Accessed one item from small column in 145.71 ns (averaged over 1e+06 runs)
Accessed one item from big column in 135.75 ns (averaged over 1e+06 runs)

This branch

Accessed one item from small column in 149.53 ns (averaged over 1e+06 runs)
Accessed one item from big column in 139.42 ns (averaged over 1e+06 runs)

So I see about 4 ns of overhead per access (roughly 2.6 to 2.7%) on Column.__getitem__ from this branch. Is this acceptable to you?
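
As a quick sanity check, the four timings reported above can be turned into a relative figure. This is only arithmetic on the posted numbers (the helper name `relative_overhead` is made up for the illustration); with these values the difference is just under 4 ns per access, which works out to somewhat under 3%.

```python
# Timings (ns per access) copied from the benchmark output above.
main = {"small column": 145.71, "big column": 135.75}
branch = {"small column": 149.53, "big column": 139.42}


def relative_overhead(before, after):
    """Return the slowdown of `after` relative to `before`, in percent."""
    return 100.0 * (after - before) / before


overheads = {label: relative_overhead(main[label], branch[label]) for label in main}
for label, pct in overheads.items():
    delta_ns = branch[label] - main[label]
    print(f"{label}: +{delta_ns:.2f} ns ({pct:.1f}%)")
```

A single run like this says nothing about noise; repeating the benchmark several times would be needed before reading much into a difference of a few nanoseconds.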

@mhvk
Contributor

mhvk commented Dec 18, 2023

Copying from #12584 (comment), I think we first need to decide how a zero-dimensional (scalar) column would actually be pretty-printed. Note that the repr is consciously different:

In [11]: Column(1)
Out[11]: 1

In [12]: Column([1])
Out[12]: 
<Column dtype='int64' length=1>
1

I do think pprint() should not fail, but it may be OK to just typeset the number with the format function, without the column name, etc. I.e., I'd advocate some form of if self.ndim == 0: return <something-simple>.

@neutrinoceros
Contributor Author

@mhvk if we want to go with your suggestion (just typesetting the number), then this patch is sufficient and we can close #15754 as "not a bug". In that case I'll just need to complete my test.

@taldcroft
Member

@neutrinoceros - is there any solution that does not require changing that Cython mixin code? I haven't dug into this, but what is driving that exactly? Can't we just do something trivial in pprint() to catch this, like what @mhvk suggested: if self.ndim == 0: return <something-simple>?

I'm a little hesitant to sacrifice 3-4% performance for this bugfix given the low likelihood of normal users hitting it.

@neutrinoceros
Contributor Author

@taldcroft it's very hard to inspect what's happening before the Cython function is called: the call site is visited many times before it crashes, and it happens in repr-related code, which makes it difficult or impossible to inspect variables in a debugger REPL. Patching the Cython function is the only way I have found so far.
Is it possible that my benchmark isn't representative of actual user code that may be impacted? In other words, do we know of an actual performance-critical use case?

@taldcroft
Member

@neutrinoceros - below is a patch to your PR that passes the new tests. This reverts the Cython update and deals with a scalar column only in the pprint code.

In [2]: c = Column(0, name="scalar")

In [3]: c.pprint()  # Not obvious it is scalar, but mostly we just want to prevent a crash
scalar
------
     0

It also occurred to me that the Cython update would have the undesired effect of ignoring the index in getitem. I didn't try, but I think that code would mean that c[250] == 0 for the above column would be True instead of raising IndexError.
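
The index-ignoring concern can be illustrated with a toy model. This is a minimal sketch in plain Python, not astropy or Cython code: `ScalarLike`, `getitem_with_early_return` and `getitem_strict` are hypothetical names, and the first function only mirrors the shape of the early return in the Cython change.

```python
class ScalarLike:
    """Toy stand-in for a 0-dimensional column; not astropy code."""

    ndim = 0

    def __init__(self, value):
        self.data = value


def getitem_with_early_return(col, item):
    # Mirrors the shape of the early return: the index is never looked at.
    if col.ndim == 0:
        return col.data
    return col.data[item]


def getitem_strict(col, item):
    # A 0-d container has no axis to index along, so any integer index is an error.
    if col.ndim == 0:
        raise IndexError("too many indices for a 0-dimensional column")
    return col.data[item]


c = ScalarLike(0)
early_result = getitem_with_early_return(c, 250)  # silently returns the stored value

try:
    getitem_strict(c, 250)
except IndexError:
    strict_raised = True
else:
    strict_raised = False
```

With the early return, any index at all yields the stored value, which is exactly the kind of silent success a regression test should rule out.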

Here is the diff:

diff --git a/astropy/table/_column_mixins.pyx b/astropy/table/_column_mixins.pyx
index bdd075528b..5ab4fe66d3 100644
--- a/astropy/table/_column_mixins.pyx
+++ b/astropy/table/_column_mixins.pyx
@@ -52,9 +52,6 @@ ctypedef object (*item_getter)(object, object)
 
 
 cdef inline object base_getitem(object self, object item, item_getter getitem):
-    if (<np.ndarray>self).ndim == 0:
-        return self.data
-
     if (<np.ndarray>self).ndim > 1 and isinstance(item, INTEGER_TYPES):
         return self.data[item]
 
diff --git a/astropy/table/pprint.py b/astropy/table/pprint.py
index 25d990ec0c..e78f099d35 100644
--- a/astropy/table/pprint.py
+++ b/astropy/table/pprint.py
@@ -424,6 +424,7 @@ class TableFormatter:
         """
         max_lines, _ = self._get_pprint_size(max_lines, -1)
         dtype = getattr(col, "dtype", None)
+        is_onedim = getattr(col, "ndim", 1) == 1
         multidims = getattr(col, "shape", [0])[1:]
         if multidims:
             multidim0 = tuple(0 for n in multidims)
@@ -524,8 +525,11 @@ class TableFormatter:
                     left = format_func(col_format, col[(idx,) + multidim0])
                     right = format_func(col_format, col[(idx,) + multidim1])
                     return f"{left} .. {right}"
-            else:
+            elif is_onedim:
                 return format_func(col_format, col[idx])
+            else:
+                # Scalar column
+                return format_func(col_format, col)
 
         # Add formatted values if within bounds allowed by max_lines
         for idx in indices:
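
The pprint-side change in the diff above boils down to a dispatch on `ndim` with a default of 1 for objects that lack the attribute. The following is a simplified sketch of that idea only: `format_value` and `Scalar` are hypothetical names, the multidimensional branch of the real code is left out, and anything that is not 1-d is treated as a scalar here.

```python
def format_value(col, idx, format_func=str):
    """Format one entry of `col`, treating a 0-d object as a single value.

    Objects without an `ndim` attribute are assumed to be 1-d, mirroring the
    `getattr(col, "ndim", 1)` default used in the diff.
    """
    is_onedim = getattr(col, "ndim", 1) == 1
    if is_onedim:
        return format_func(col[idx])
    # Scalar case: there is nothing to index, so format the object itself.
    return format_func(col)


class Scalar:
    """Hypothetical 0-d container used only for this illustration."""

    ndim = 0

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return str(self.value)


indexed = format_value([10, 20, 30], 1)  # a plain list has no ndim, so it is indexed
scalar = format_value(Scalar(0), 0)      # the scalar is formatted without indexing
```

Handling the scalar case at formatting time leaves the item-access path untouched, which is the point of reverting the Cython change.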

@neutrinoceros
Contributor Author

> I didn't try, but I think that code would mean that c[250] == 0 for the above column would be True instead of raising IndexError.

I can confirm that's exactly what happens, nice catch! I've added this check to the test to make sure this behaviour never gets checked in. I also took your patch and rebased the whole branch; thanks a lot for your help.

Contributor

@mhvk mhvk left a comment

This looks good - just a few nits.


@neutrinoceros neutrinoceros force-pushed the table/bug/pprint_scalar_12584 branch 2 times, most recently from 5236612 to c9f1ca6 Compare December 26, 2023 10:37
@neutrinoceros
Contributor Author

rebased to refresh CI. ping @hamogu and @taldcroft for second review

@saimn saimn modified the milestones: v6.0.1, v6.0.2 Mar 25, 2024
@astrofrog astrofrog modified the milestones: v6.0.2, v6.1.1 Apr 4, 2024
@pllim pllim added backport-v6.1.x on-merge: backport to v6.1.x and removed 💤 backport-v6.0.x on-merge: backport to v6.0.x labels May 6, 2024
@neutrinoceros neutrinoceros force-pushed the table/bug/pprint_scalar_12584 branch from c501c93 to 46c2e30 Compare May 16, 2024 09:24
Labels
backport-v6.1.x on-merge: backport to v6.1.x Bug table
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Column.pprint fails for scalars
7 participants