Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: TypeError when plotting against list of datetime.date where 0th element of list is None #23580

Closed
myUsernameIsNotMyPassword opened this issue Aug 7, 2022 · 13 comments
Labels
Good first issue Open a pull request against these issues if there are no active ones! status: has patch patch suggested, PR still needed topic: date handling
Milestone

Comments

@myUsernameIsNotMyPassword
Copy link

myUsernameIsNotMyPassword commented Aug 7, 2022

Bug summary

pyplot raises an error when a list of dates starts with None.

Code for reproduction

from datetime import *

from matplotlib import pyplot as plt


y = [6, 2, 8, 3, 1, 8, 5, 3, 0, 7]

x = [date.today() + timedelta(days=i) for i in range(10)]
x[5] = None
x[-1] = None

plt.plot(x, y) # works
plt.show()

x[0] = None

plt.plot(x, y) # TypeError
plt.show()

Actual outcome

Traceback (most recent call last):
  File "pyplot_test.py", line 17, in <module>
    plt.plot(x, y)
  File "<python_path>\lib\site-packages\matplotlib\pyplot.py", line 2769, in plot
    return gca().plot(
  File "<python_path>\lib\site-packages\matplotlib\axes\_axes.py", line 1634, in plot
    self.add_line(line)
  File "<python_path>\lib\site-packages\matplotlib\axes\_base.py", line 2288, in add_line
    self._update_line_limits(line)
  File "<python_path>\lib\site-packages\matplotlib\axes\_base.py", line 2311, in _update_line_limits
    path = line.get_path()
  File "<python_path>\lib\site-packages\matplotlib\lines.py", line 999, in get_path
    self.recache()
  File "<python_path>\lib\site-packages\matplotlib\lines.py", line 652, in recache
    x = _to_unmasked_float_array(xconv).ravel()
  File "<python_path>\lib\site-packages\matplotlib\cbook\__init__.py", line 1298, in _to_unmasked_float_array
    return np.asarray(x, float)
  File "<python_path>\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: float() argument must be a string or a number, not 'datetime.date'

Expected outcome

plot data starting at index 1.

Additional information

No response

Operating system

windows 10

Matplotlib Version

3.5.2

Matplotlib Backend

TkAgg

Python version

3.8.2

Jupyter version

No response

Installation

pip

@tacaswell
Copy link
Member

Unfortunately I think this is the expected behavior (I was actually a bit surprised it worked with Nones at all). It works because in the time handling code paths we delegate most of the work to numpy and it will cast None to NaT :

In [7]: np.asarray(x, dtype=np.datetime64)
Out[7]: 
array(['2022-08-07', '2022-08-08', '2022-08-09', '2022-08-10',
       '2022-08-11',        'NaT', '2022-08-13', '2022-08-14',
       '2022-08-15', '2022-08-16'], dtype='datetime64[D]')

In [8]: x
Out[8]: 
[datetime.date(2022, 8, 7),
 datetime.date(2022, 8, 8),
 datetime.date(2022, 8, 9),
 datetime.date(2022, 8, 10),
 datetime.date(2022, 8, 11),
 None,
 datetime.date(2022, 8, 13),
 datetime.date(2022, 8, 14),
 datetime.date(2022, 8, 15),
 datetime.date(2022, 8, 16)]

The reason this fails with None as the first element is that we do not correctly identify the units (and hence which handlers to use). For the automatic unit detection (we handle datetime through the unit system because time is a number that has time-like units attached to it) we look at the first element (because we assume that the input lists are homogeneous).

We could add a special case to that dispatch to keep walking down the sequence if a None is found. On one hand that would fix this issue and smooth out the rough edge identified here. On the other hand it makes that code even more complex.

I could also see a case for exploding a lot earlier if there is a None in the data, but that is just doubling down on a failure mode that depends on the location in the input of a bad value which is less than ideal.

In the short term, the user-side fix is to use np.array(..., dytpe=np.datetime64) before you pass the values to Matplotlib which ensure that the first element is always a type that will go down the correct unit dispatch.

@anntzer
Copy link
Contributor

anntzer commented Aug 7, 2022

diff --git i/lib/matplotlib/cbook/__init__.py w/lib/matplotlib/cbook/__init__.py
index b9d5f5eb26..9cbf273259 100644
--- i/lib/matplotlib/cbook/__init__.py
+++ w/lib/matplotlib/cbook/__init__.py
@@ -1691,19 +1691,12 @@ def safe_first_element(obj):
     This is an type-independent way of obtaining the first element, supporting
     both index access and the iterator protocol.
     """
-    if isinstance(obj, collections.abc.Iterator):
-        # needed to accept `array.flat` as input.
-        # np.flatiter reports as an instance of collections.Iterator
-        # but can still be indexed via [].
-        # This has the side effect of re-setting the iterator, but
-        # that is acceptable.
-        try:
-            return obj[0]
-        except TypeError:
-            pass
-        raise RuntimeError("matplotlib does not support generators "
-                           "as input")
-    return next(iter(obj))
+    if isinstance(obj, np.flatiter):
+        return obj[0]
+    elif isinstance(obj, collections.abc.Iterator):
+        raise RuntimeError("matplotlib does not support generators as input")
+    else:
+        return next(val for val in obj if val is not None)
 
 
 def sanitize_sequence(data):

seems reasonable enough to me? (except that the function becomes a bit misnamed, but such is life).
This assumes that the goal of the first cutout is indeed just to support flatiter (which seems to be the intent of #6712).

@jklymak
Copy link
Member

jklymak commented Aug 7, 2022

None working elsewhere in the list seems to me an implementation accident rather than something we purposely designed. I'm not aware of us using None to mean no data elsewhere in the library (but I could be wrong) so I am not sure we should special case it here.

@tacaswell
Copy link
Member

I think this is an "in for a penny, in for a pound" situation. We are very permissive in terms of what we take as input and currently do take lists of datetimes with None anywhere but the first element. To be consistent we either need to lock down what we take as input more (which is not really feasible) or take this extra complexity.

I'm leaning towards "take the complexity" because that is our job.

@tacaswell tacaswell added this to the v3.7.0 milestone Aug 7, 2022
@tacaswell tacaswell added Good first issue Open a pull request against these issues if there are no active ones! status: has patch patch suggested, PR still needed labels Aug 7, 2022
@tacaswell
Copy link
Member

I'm labeling this as a good first issue because we have a plausbile looking patch and there is no (explicit) new public API.

Steps:

  1. convert the patch in [Bug]: TypeError when plotting against list of datetime.date where 0th element of list is None #23580 (comment) into a commit
  2. add a test (the OP's example is enough)

@MikiPWata
Copy link

Hi there,
Can I work on this?

@tacaswell
Copy link
Member

@MikiPWata Yes, just be aware we do not consider an issue "claimed" until there is a PR and that there is not complete consensus among the devs on this yet.

@jklymak
Copy link
Member

jklymak commented Aug 7, 2022

I checked that None is also ignored in bare lists of floats, so I guess this is OK. We probably should document this somehow though...

@MikiPWata
Copy link

@jklymak
my PR is pretty much good to go(passed all checks), so would love to have a review. Thank you in advance!

@tacaswell
Copy link
Member

well, we finally had two people take up a "good first issue" at about the same time.

@MikiPWata
Copy link

Is there anything I can/should do to fix the test failures from CI?

@MikiPWata
Copy link

Finally passed all checks!
I was wondering if I should keep using the original methods(safe_first_element()) or the _safe_non_none() for all tests that uses safe_first_element.

I am pretty sure that I should be using the new method for the test made for this issue.

@timhoffm
Copy link
Member

Closed by #23587.

@tacaswell tacaswell modified the milestones: v3.7.0, v3.6.0 Aug 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Good first issue Open a pull request against these issues if there are no active ones! status: has patch patch suggested, PR still needed topic: date handling
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants