bug: The translate method of strings is not working as expected for pandas #6157

Closed
1 task done
mesejo opened this issue May 5, 2023 · 3 comments · Fixed by #6161
Labels
bug (Incorrect behavior inside of ibis), pandas (The pandas backend)
Milestone

Comments

@mesejo
Contributor

mesejo commented May 5, 2023

What happened?

The following code raises a TypeError exception:

import pandas as pd

import ibis

ibis.options.interactive = True
ibis.set_backend("pandas")

df = pd.DataFrame({
    'fruit': ['apple', 'banana', 'orange', 'kiwi', 'mango'],
    'to_column': ['red', 'ted', 'pet', 'bed', 'sed']
})

table = ibis.memtable(df)

result = table.fruit.translate('abc', table.to_column).execute()
print(result)

Note that passing a column to translate is neither documented nor implemented for DuckDB, but it is implemented here for pandas.

What would be the solution? Delete these methods or fix them?

What version of ibis are you using?

dev

What backend(s) are you using, if any?

pandas

Relevant log output

Traceback (most recent call last):
  File "/Users/daniel.mesejo/PycharmProjects/ibis/try_substr.py", line 15, in <module>
    result = table.fruit.translate('abc', table.to_column).execute()
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/expr/types/core.py", line 303, in execute
    return self._find_backend(use_default=True).execute(
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/__init__.py", line 299, in execute
    return execute_and_reset(node, params=params, **kwargs)
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/core.py", line 482, in execute_and_reset
    result = execute(
  File "/Users/daniel.mesejo/opt/anaconda3/envs/ibis-dev/lib/python3.10/site-packages/multipledispatch/dispatcher.py", line 278, in __call__
    return func(*args, **kwargs)
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/trace.py", line 136, in traced_func
    return func(*args, **kwargs)
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/core.py", line 428, in main_execute
    return execute_with_scope(
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/core.py", line 218, in execute_with_scope
    result = execute_until_in_scope(
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/trace.py", line 136, in traced_func
    return func(*args, **kwargs)
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/core.py", line 350, in execute_until_in_scope
    result = execute_node(
  File "/Users/daniel.mesejo/opt/anaconda3/envs/ibis-dev/lib/python3.10/site-packages/multipledispatch/dispatcher.py", line 278, in __call__
    return func(*args, **kwargs)
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/trace.py", line 136, in traced_func
    return func(*args, **kwargs)
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/execution/strings.py", line 336, in execute_series_translate_scalar_series
    table = to_string.map(lambda y, x=from_string: str.maketrans(x=x, y=y))
  File "/Users/daniel.mesejo/opt/anaconda3/envs/ibis-dev/lib/python3.10/site-packages/pandas/core/series.py", line 4393, in map
    new_values = self._map_values(arg, na_action=na_action)
  File "/Users/daniel.mesejo/opt/anaconda3/envs/ibis-dev/lib/python3.10/site-packages/pandas/core/base.py", line 924, in _map_values
    new_values = map_f(values, mapper)
  File "pandas/_libs/lib.pyx", line 2834, in pandas._libs.lib.map_infer
  File "/Users/daniel.mesejo/PycharmProjects/ibis/ibis/backends/pandas/execution/strings.py", line 336, in <lambda>
    table = to_string.map(lambda y, x=from_string: str.maketrans(x=x, y=y))
TypeError: str.maketrans() takes no keyword arguments

Code of Conduct

  • I agree to follow this project's Code of Conduct
mesejo added the bug (Incorrect behavior inside of ibis) label May 5, 2023
@cpcloud
Member

cpcloud commented May 5, 2023

@mesejo Thanks for the issue!

I think we should try to avoid deleting methods without a strong justification or equivalent functionality.

Is the fix for the pandas backend as simple as removing the caller's explicit keyword arguments?
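
For context on the TypeError in the traceback: `str.maketrans` takes its arguments positionally only, so the `x=`/`y=` keywords in the lambda are rejected before any translation happens. A minimal sketch using only the standard library (the variable names are illustrative and not taken from the ibis code):

```python
# str.maketrans accepts positional arguments only.
keyword_error = None
try:
    str.maketrans(x="abc", y="red")  # same call shape as in the traceback
except TypeError as exc:
    keyword_error = exc

# The positional form builds a table mapping code points to code points.
table = str.maketrans("abc", "red")
translated = "apple".translate(table)
print(type(keyword_error).__name__, translated)
```

This only shows that dropping the keywords makes the call valid; it says nothing about whether the surrounding Series logic is correct.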

cpcloud added the pandas (The pandas backend) label May 5, 2023
@mesejo
Contributor Author

mesejo commented May 5, 2023

Removing the caller's explicit keyword arguments makes the function run, but I don't know if the output makes sense. For the original code, it produces:

0     apple
1    banana
2    orange
3      kiwi
4     mango
Name: Translate(fruit, 'abc', to_column), dtype: object

It also passes the resulting pd.Series:

table = to_string.map(lambda y, x=from_string: str.maketrans(x, y))

to Series.str.translate, whose expected input is a single dictionary made by str.maketrans, not a Series of them.
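
One possible reading of a column-valued `to_string` is that each row should be translated with a table built from that row's own target string. The sketch below illustrates that reading with plain Python lists; `translate_rowwise` is a hypothetical helper, this is an assumption about the intended semantics, and it is not the change made in #6161.

```python
def translate_rowwise(values, from_string, to_strings):
    """Translate each value using a table built from its own row's target string."""
    return [
        value.translate(str.maketrans(from_string, to_string))
        for value, to_string in zip(values, to_strings)
    ]


# Same data as the reproduction in the issue.
fruit = ["apple", "banana", "orange", "kiwi", "mango"]
to_column = ["red", "ted", "pet", "bed", "sed"]

rowwise = translate_rowwise(fruit, "abc", to_column)
print(rowwise)  # ['rpple', 'etntnt', 'orpnge', 'kiwi', 'msngo']
```

Note that `str.maketrans` with two string arguments requires them to have equal length, which holds here because every target string has three characters.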

@mesejo
Contributor Author

mesejo commented May 5, 2023

Just as a side note, translate with two strings as input works:

result = table.fruit.translate('abc', 'efg').execute()
print(result)

Output

0     epple
1    fenene
2    orenge
3      kiwi
4     mengo
Name: Translate(fruit, 'abc', 'efg'), dtype: object
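
The two-string case works because a single translation table is shared by every row. A standard-library sketch that reproduces the output above (names such as `shared_table` are illustrative only):

```python
# One table for all rows: a -> e, b -> f, c -> g.
shared_table = str.maketrans("abc", "efg")

fruit = ["apple", "banana", "orange", "kiwi", "mango"]
scalar_result = [value.translate(shared_table) for value in fruit]
print(scalar_result)  # ['epple', 'fenene', 'orenge', 'kiwi', 'mengo']
```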

mesejo added a commit to mesejo/ibis that referenced this issue May 6, 2023
- use list comprehension in execute_substring_series_series, which is
slightly faster and removes the use of the stateful function
- use str.contains in execute_string_like_series_string, which is
supported by pandas
- remove the duplicated implementation of EndsWith and StartsWith

fixes ibis-project#6157
cpcloud added this to the 6.0 milestone May 7, 2023
cpcloud pushed a commit that referenced this issue May 7, 2023
- use list comprehension in execute_substring_series_series, which is
slightly faster and removes the use of the stateful function
- use str.contains in execute_string_like_series_string, which is
supported by pandas
- remove the duplicated implementation of EndsWith and StartsWith

fixes #6157