bug: pandas window operator without group_by raises AttributeError: 'numpy.int64' object has no attribute 'reorder_levels' #5417

Closed
1 task done
ogrisel opened this issue Feb 2, 2023 · 4 comments · Fixed by #6217
Labels
bug Incorrect behavior inside of ibis

Comments

Contributor

ogrisel commented Feb 2, 2023

What happened?

This is a follow-up to #4676, which was recently fixed. One problem remains, initially reported in a comment: #4676 (comment)

Here is the reproducer for that part:

import ibis
import pandas as pd

df = pd.DataFrame(
    {
        "x": [0, 1, 2, 3, 4],
        "y": [3, 2, 0, 1, 1],
    }
)
t_duckdb = ibis.memtable(df)
t_pandas = ibis.pandas.connect({"t": df}).table("t")


def simple_window_ops(t):
    w = ibis.window(          # no group_by here!
        order_by=[t.x, t.y],
        preceding=1,
        following=0,
    )
    return t.mutate(
        x_first=t.x.first().over(w),
        x_last=t.x.last().over(w),
        y_first=t.y.first().over(w),
        y_last=t.y.last().over(w),
    )

Here is the traceback when executing:

>>> simple_window_ops(t_pandas).execute()
Traceback (most recent call last):
  Cell In[22], line 1
    simple_window_ops(t_pandas).execute()
  File ~/code/ibis/ibis/expr/types/core.py:300 in execute
    return self._find_backend(use_default=True).execute(
  File ~/code/ibis/ibis/backends/pandas/__init__.py:240 in execute
    return execute_and_reset(node, params=params, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:491 in execute_and_reset
    result = execute(
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:437 in main_execute
    return execute_with_scope(
  File ~/code/ibis/ibis/backends/pandas/core.py:224 in execute_with_scope
    result = execute_until_in_scope(
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:356 in execute_until_in_scope
    result = execute_node(
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:287 in execute_selection_dataframe
    result = build_df_from_projection(
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:251 in build_df_from_projection
    data_pieces = [
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:252 in <listcomp>
    compute_projection(node, op, data, **kwargs) for node in selection_exprs
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:108 in compute_projection
    result = execute(node, scope=scope, timecontext=timecontext, **kwargs)
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:437 in main_execute
    return execute_with_scope(
  File ~/code/ibis/ibis/backends/pandas/core.py:224 in execute_with_scope
    result = execute_until_in_scope(
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:328 in execute_until_in_scope
    scopes = [
  File ~/code/ibis/ibis/backends/pandas/core.py:329 in <listcomp>
    execute_until_in_scope(
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:356 in execute_until_in_scope
    result = execute_node(
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/execution/window.py:389 in execute_window_op
    result = post_process(
  File ~/code/ibis/ibis/backends/pandas/execution/window.py:93 in _post_process_order_by
    series = series.reorder_levels(names)
AttributeError: 'numpy.int64' object has no attribute 'reorder_levels'

while this same query runs fine with the duckdb backend:

>>> simple_window_ops(t_duckdb).execute()
   x  y  x_first  x_last  y_first  y_last
0  0  3        0       0        3       3
1  1  2        0       1        3       2
2  2  0        1       2        2       0
3  3  1        2       3        0       1
4  4  1        3       4        1       1
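
For reference, the duckdb result above can also be reproduced with plain pandas operations. This is only a sketch of the expected semantics for this specific frame (preceding=1, following=0, no nulls in the data), not the way the ibis pandas backend is implemented; the helper name `first_in_frame` is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"x": [0, 1, 2, 3, 4], "y": [3, 2, 0, 1, 1]})

# Order rows by the window's order_by keys; a stable sort keeps ties in place.
ordered = df.sort_values(["x", "y"], kind="mergesort")


def first_in_frame(col):
    # With preceding=1 and following=0 the frame is (previous row, current row),
    # so "first" is the previous value, or the current one on the first row.
    # Assumes the column has no missing values.
    return col.shift(1).fillna(col).astype(col.dtype)


expected = ordered.assign(
    x_first=first_in_frame(ordered["x"]),
    x_last=ordered["x"],  # "last" in the frame is always the current row
    y_first=first_in_frame(ordered["y"]),
    y_last=ordered["y"],
).sort_index()  # restore the original row order

print(expected)
```

The printed frame should match the duckdb output column for column.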

What version of ibis are you using?

Today's master branch.

What backend(s) are you using, if any?

pandas (and duckdb as reference)

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@ogrisel ogrisel added the bug Incorrect behavior inside of ibis label Feb 2, 2023
Contributor Author

ogrisel commented Feb 2, 2023

Note that this reproducer can be simplified further to:

import ibis
import pandas as pd

df = pd.DataFrame({"x": [0, 1, 2, 3, 4]})
t_pandas = ibis.pandas.connect({"t": df}).table("t")


def simple_window_ops(t):
    w = ibis.window(          # no group_by here!
        order_by=t.x,
        preceding=1,
        following=0,
    )
    return t.mutate(
        x_first=t.x.first().over(w),
    )

simple_window_ops(t_pandas).execute()

which then raises a slightly different exception:

Traceback (most recent call last):
  Cell In[26], line 1
    simple_window_ops(t_pandas).execute()
  File ~/code/ibis/ibis/expr/types/core.py:300 in execute
    return self._find_backend(use_default=True).execute(
  File ~/code/ibis/ibis/backends/pandas/__init__.py:240 in execute
    return execute_and_reset(node, params=params, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:491 in execute_and_reset
    result = execute(
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:437 in main_execute
    return execute_with_scope(
  File ~/code/ibis/ibis/backends/pandas/core.py:224 in execute_with_scope
    result = execute_until_in_scope(
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:356 in execute_until_in_scope
    result = execute_node(
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:287 in execute_selection_dataframe
    result = build_df_from_projection(
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:251 in build_df_from_projection
    data_pieces = [
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:252 in <listcomp>
    compute_projection(node, op, data, **kwargs) for node in selection_exprs
  File ~/code/ibis/ibis/backends/pandas/execution/selection.py:108 in compute_projection
    result = execute(node, scope=scope, timecontext=timecontext, **kwargs)
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:437 in main_execute
    return execute_with_scope(
  File ~/code/ibis/ibis/backends/pandas/core.py:224 in execute_with_scope
    result = execute_until_in_scope(
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:328 in execute_until_in_scope
    scopes = [
  File ~/code/ibis/ibis/backends/pandas/core.py:329 in <listcomp>
    execute_until_in_scope(
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/core.py:356 in execute_until_in_scope
    result = execute_node(
  File ~/mambaforge/envs/ibisdev/lib/python3.11/site-packages/multipledispatch/dispatcher.py:278 in __call__
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/trace.py:136 in traced_func
    return func(*args, **kwargs)
  File ~/code/ibis/ibis/backends/pandas/execution/window.py:389 in execute_window_op
    result = post_process(
  File ~/code/ibis/ibis/backends/pandas/execution/window.py:94 in _post_process_order_by
    series = series.iloc[index.argsort(kind='mergesort')]
AttributeError: 'numpy.int64' object has no attribute 'iloc'
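
Both tracebacks are consistent with the post-processing step receiving a NumPy scalar where it expects a pandas Series. That is my reading of the error messages, not something confirmed in the ibis code. The snippet below only illustrates that failure mode in plain pandas; it does not go through the ibis code path.

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3, 4], name="x", dtype="int64")

# Selecting a single element yields a NumPy scalar, not a Series.
value = s.iloc[0]
scalar_type = type(value).__name__

# Series-only attributes such as .iloc are therefore unavailable on it.
try:
    value.iloc[[0]]
except AttributeError as exc:
    message = str(exc)

print(scalar_type, "->", message)
```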

Contributor Author

ogrisel commented Feb 2, 2023

If this operation cannot be easily or efficiently mapped to pandas operations, it would be fine to raise NotImplementedError instead.

But then reporting the Window operation as supported for pandas in the backends matrix would be a bit misleading.
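
To make the fallback concrete, here is a rough sketch of what failing early could look like. `check_window_supported` and its arguments are hypothetical names invented for this example; nothing here is existing ibis API.

```python
def check_window_supported(group_by, order_by):
    """Hypothetical guard: reject ordered windows that have no group_by."""
    if order_by and not group_by:
        raise NotImplementedError(
            "ordered window without group_by is not supported by this backend"
        )


try:
    check_window_supported(group_by=[], order_by=["x"])
except NotImplementedError as exc:
    error = str(exc)

print(error)
```

An explicit error like this would be easier to act on than the AttributeError raised deep inside the execution code.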

Member

cpcloud commented Feb 2, 2023

Thanks @ogrisel!

Let's see if @kszucs's #5014 PR fixes this issue.

@cpcloud cpcloud added this to the 5.0 milestone Feb 2, 2023
Contributor Author

ogrisel commented Feb 2, 2023

I just gave it a try and, at this time (at commit 0346e0b), I can still reproduce the originally reported problem.

@cpcloud cpcloud removed this from the 5.0 milestone Feb 20, 2023
mesejo added a commit to mesejo/ibis that referenced this issue May 15, 2023
mesejo added a commit to mesejo/ibis that referenced this issue May 15, 2023
cpcloud pushed a commit that referenced this issue May 15, 2023