This repository has been archived by the owner on Aug 26, 2024. It is now read-only.

process.extract*: Inconsistent behaviour between python2 and python3 when using pandas.Series #181

Closed
normanius opened this issue Nov 3, 2017 · 3 comments

normanius commented Nov 3, 2017

When executing the following code, the process.extract* functions return different results under python2 and python3.

from fuzzywuzzy import process
import pandas as pd

a = pd.Series.from_array(['abc', 'def', 'ghi'])
process.extractOne('abd', a)

In python2 (python 2.7.12, pandas 0.20.3, fuzzywuzzy 0.15.1) the output is:

('abc', 67)

In python3 (python 3.5.2, pandas 0.20.3, fuzzywuzzy 0.15.1) the output is:

('abc', 67, 0)

So apparently (following the documentation in process.py), fuzzywuzzy under python3 interprets a pd.Series as a dictionary-like object rather than a list-like one, and therefore also returns the key of the match (here the index 0).

I'm not entirely sure whether the problem is on the fuzzywuzzy side, but some users might get confused when migrating from python2 to python3.
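
The dictionary-versus-list distinction described above can be illustrated with a minimal, standard-library-only sketch. It assumes the library decides between the two cases by checking whether `choices` exposes an `.items()` method. The names `score`, `extract_one` and `SeriesLike` are hypothetical stand-ins for illustration, not fuzzywuzzy's or pandas' actual code, and `difflib` is swapped in as the scorer.

```python
from difflib import SequenceMatcher


def score(query, choice):
    # Hypothetical stand-in scorer: similarity ratio scaled to 0..100.
    return int(round(100 * SequenceMatcher(None, query, choice).ratio()))


def extract_one(query, choices):
    # Hypothetical simplification of the dispatch described in the issue.
    try:
        # Anything exposing .items() is treated as dictionary-like,
        # so the key is appended to each result.
        results = [(choice, score(query, choice), key)
                   for key, choice in choices.items()]
    except AttributeError:
        # No .items(): fall back to plain iteration (list-like).
        results = [(choice, score(query, choice)) for choice in choices]
    # Return the best-scoring result.
    return max(results, key=lambda r: r[1])


class SeriesLike:
    """Hypothetical container that is iterable and also offers .items()."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return iter(self._values)

    def items(self):
        # Yield (index, value) pairs, similar in spirit to a labelled series.
        return enumerate(self._values)


as_list = extract_one('abd', ['abc', 'def', 'ghi'])
as_series_like = extract_one('abd', SeriesLike(['abc', 'def', 'ghi']))
print(as_list)         # ('abc', 67)
print(as_series_like)  # ('abc', 67, 0)
```

In this sketch, converting the container to a plain list before the call (for example `extract_one('abd', list(SeriesLike([...])))`) forces the list-like path, so the result shape no longer depends on whether `.items()` happens to be available.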

@josegonzalez
Contributor

We're unlikely to fix this as all of our applications are python2. That said, I'd be happy to review a pull request that fixes the issue :)

@normanius
Author

normanius commented Nov 4, 2017

The inconsistency indeed originates from pandas (pandas-dev/pandas#13918) and will be solved for future versions (pandas>=0.21).

No further action required, I guess. Sorry for bothering you. :)

@josegonzalez
Contributor

No worries, thanks for digging!
