Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tab completion on DataArray instance fails in REPL when using Readline #8718

Closed
5 tasks done
kbvw opened this issue Feb 7, 2024 · 9 comments
Closed
5 tasks done

Tab completion on DataArray instance fails in REPL when using Readline #8718

kbvw opened this issue Feb 7, 2024 · 9 comments

Comments

@kbvw
Copy link

kbvw commented Feb 7, 2024

What happened?

In the standard Python REPL running on Linux, pressing Tab after a DataArray instance and a dot results in no completions. For example:

>>> import xarray as xr
>>> da = xr.DataArray()
>>> da.  # Press Tab, no completions

If another character is typed after the dot, completions seem to work fine again.

This issue does not occur in IPython launched directly in the shell. (It does, and gets worse, when using IPython from Emacs: see "Anything else?" below.)

After some digging and comparing I suspect this happens specifically in REPLs using Readline, e.g. the standard Python REPL when on Linux. (IPython itself does not use Readline, running it in Emacs does by default.) When using Readline, the completions come from the rlcompleter module in the standard library. Trying to manually obtain the completions of da. from the example above results in a TypeError raised by xarray's CombinedDatetimelikeAccessor: see the minimal example below.

What did you expect to happen?

I expected to see the attributes of the DataArray after pressing Tab, and I did not expect manually obtaining the completions to result in a TypeError raised by xarray.

Minimal Complete Verifiable Example

import xarray as xr
from rlcompleter import Completer

completer = Completer()
da = xr.DataArray()

# The following results in a TypeError pasted below,
# for a DataArray, a dot, and no further characters:
completer.complete('da.', 0)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/projects/code/xarray_completion_bug.py:7
      4 completer = Completer()
      5 da = xr.DataArray()
----> 7 completer.complete('da.', 0)

File ~/.pyenv/versions/3.11.2/lib/python3.11/rlcompleter.py:92, in Completer.complete(self, text, state)
     90 if state == 0:
     91     if "." in text:
---> 92         self.matches = self.attr_matches(text)
     93     else:
     94         self.matches = self.global_matches(text)

File ~/.pyenv/versions/3.11.2/lib/python3.11/rlcompleter.py:189, in Completer.attr_matches(self, text)
    187     matches.append(match)
    188     continue
--> 189 if (value := getattr(thisobject, word, None)) is not None:
    190     matches.append(self._callable_postfix(value, match))
    191 else:

File ~/projects/code/.pyvenv/lib/python3.11/site-packages/xarray/core/utils.py:1162, in UncachedAccessor.__get__(self, obj, cls)
   1159 if obj is None:
   1160     return self._accessor
-> 1162 return self._accessor(obj)

File ~/.pyenv/versions/3.11.2/lib/python3.11/typing.py:1248, in _BaseGenericAlias.__call__(self, *args, **kwargs)
   1245 if not self._inst:
   1246     raise TypeError(f"Type {self._name} cannot be instantiated; "
   1247                     f"use {self.__origin__.__name__}() instead")
-> 1248 result = self.__origin__(*args, **kwargs)
   1249 try:
   1250     result.__orig_class__ = self

File ~/projects/code/.pyvenv/lib/python3.11/site-packages/xarray/core/accessor_dt.py:623, in CombinedDatetimelikeAccessor.__new__(cls, obj)
    617 def __new__(cls, obj: T_DataArray) -> CombinedDatetimelikeAccessor:
    618     # CombinedDatetimelikeAccessor isn't really instantiated. Instead
    619     # we need to choose which parent (datetime or timedelta) is
    620     # appropriate. Since we're checking the dtypes anyway, we'll just
    621     # do all the validation here.
    622     if not _contains_datetime_like_objects(obj.variable):
--> 623         raise TypeError(
    624             "'.dt' accessor only available for "
    625             "DataArray with datetime64 timedelta64 dtype or "
    626             "for arrays containing cftime datetime "
    627             "objects."
    628         )
    630     if is_np_timedelta_like(obj.dtype):
    631         return TimedeltaAccessor(obj)  # type: ignore[return-value]

TypeError: '.dt' accessor only available for DataArray with datetime64 timedelta64 dtype or for arrays containing cftime datetime objects.

Anything else we need to know?

I found this issue by using xarray in IPython, launched from Emacs. Not seeing the completions as described above is barely a nuisance and I probably would not have noticed, but from Emacs, using the default "native completions" (i.e. Readline), pressing Tab freezes the editor for about five seconds, followed by no completions. The next command that is run results in a SyntaxError, printing back the entered command with 0__dum prepended.

This exacerbation is of course an Emacs issue and not an xarray issue, but I am including it here since this is how I found out in the first place, and since the root cause seems to be the xarray error above, which is not properly handled by Emacs.

For Emacs users reading this, the workaround is to disable "native completions": for the case of IPython this uses IPython's own completer directly, which seems to work fine.

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.2 (main, Feb 12 2023, 16:44:30) [GCC 12.2.1 20230201]
python-bits: 64
OS: Linux
OS-release: 6.7.1-arch1-1
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development

xarray: 2024.1.1
pandas: 2.0.3
numpy: 1.25.2
scipy: 1.10.1
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.1.1
distributed: 2024.1.1
matplotlib: 3.7.2
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.6.0
cupy: None
pint: None
sparse: None
flox: 0.9.0
numpy_groupies: 0.10.2
setuptools: 68.1.0
pip: 24.0
conda: None
pytest: 7.4.0
mypy: None
IPython: 8.21.0
sphinx: 7.2.6

@kbvw kbvw added bug needs triage Issue that has not been reviewed by xarray team member labels Feb 7, 2024
Copy link

welcome bot commented Feb 7, 2024

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@dcherian
Copy link
Contributor

dcherian commented Feb 7, 2024

From https://docs.python.org/3/library/rlcompleter.html

Any exception raised during the evaluation of the expression is caught, silenced and None is returned.

This seems like a bug in the rlcompleter?

@dcherian dcherian removed bug needs triage Issue that has not been reviewed by xarray team member labels Feb 7, 2024
@kbvw
Copy link
Author

kbvw commented Feb 7, 2024

Looking at the source of rlcompleter, what seems to be "evaluated" here with its exceptions silenced is the name of the DataArray itself.

Later there is a call to getattr with dt as found in the __dir__ of the DataArray, using a default of None. This results in the TypeError above, which is not silenced.

Example:

import xarray as xr
da = xr.DataArray()

# This results in the same TypeError from xarray
getattr(da, 'dt', None)

While I agree that the wording in the rlcompleter documentation is a bit vague there, I would still argue that this is a rather unexpected result of calling getattr on the DataArray?

@max-sixty
Copy link
Collaborator

Possibly we could only supply .dt in __dir__ iff the array has a datetime type?

__dir__ needs to be really fast, but possibly this is 500 mics we can spare?

@kbvw
Copy link
Author

kbvw commented Feb 8, 2024

@max-sixty, I'm honestly new to xarray even as a user, so unfamiliar with the internals. Hence the aggressive Tab-hitting to explore. :)

And the initial issue is admittedly very minor: after some time and red herrings, I just thought I'd better document the root cause.

Intuitively however, I'd say changing __dir__ might indeed solve the initial problem with rlcompleter, but it wouldn't change the fact that getattr(xr.DataArray(), 'dt', None) raises that TypeError. That might come back in other places or with other tools: in general with Python, if you call getattr on something with a default, you'd expect it to be safe?

@max-sixty
Copy link
Collaborator

And the initial issue is admittedly very minor: after some time and red herrings, I just thought I'd better document the root cause.

For sure, appreciate it!

if you call getattr on something with a default, you'd expect it to be safe?

Maaaybe. I agree it's a bit unusual to have a non-attribute error on an attribute access.

We could use AttributeError; notably pandas does this:

pd.DataFrame(np.array([0., 1.]))[0].dt

AttributeError: Can only use .dt accessor with datetimelike values

...which would also likely avoid any magic with __dir__.

What do others think?

@kbvw
Copy link
Author

kbvw commented Feb 8, 2024

Sure, but just to clarify:

getattr(pd.DataFrame(np.array([0., 1.]))[0], 'dt', None)

does not raise anything.

Of course, if you go around accessing attributes at will, that might not be safe, and you would expect an AttributeError or perhaps another type of error. It's specifically when you supply a default to getattr that it becomes unexpected. getattr(xr.DataArray(), 'foo', None) does not raise anything either for example.

Apparently rlcompleter in the standard library expects that to be safe, and I wouldn't be surprised to find other cases.

@max-sixty
Copy link
Collaborator

does not raise anything.

Right, that's because without the default, it raises an AttributeError. So my suggestion was to switch our current TypeError to an AttributeError...

@kbvw
Copy link
Author

kbvw commented Feb 8, 2024

Yes makes sense, getattr will catch it then, sorry :)

max-sixty added a commit to max-sixty/xarray that referenced this issue Feb 8, 2024
max-sixty added a commit that referenced this issue Feb 9, 2024
* Switch `.dt` to raise an `AttributeError`

Discussion at #8718

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
flamingbear pushed a commit to flamingbear/xarray that referenced this issue Feb 12, 2024
* Switch `.dt` to raise an `AttributeError`

Discussion at pydata#8718

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants