Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_ReindexCatIds is broken #52

Closed
thovden opened this issue Apr 29, 2022 · 4 comments · Fixed by #53
Closed

_ReindexCatIds is broken #52

thovden opened this issue Apr 29, 2022 · 4 comments · Fixed by #53

Comments

@thovden
Copy link
Contributor

thovden commented Apr 29, 2022

https://github.com/thovden/pylabel/blob/c17d2c644f1dcb85ef3f9835c8771b96b969a673/pylabel/shared.py#L39

The __ReindexCatIds function is broken, mainly because it's operating on a copy of a DataFrame.

The following code does not work in-place by default:

    df = df.replace(r"^\s*$", np.nan, regex=True)

Hence, the rest of the function operates on a copy of the DataFrame, but the intention is to do in-place replacements of df['cat_id'].

I also suspect the intention of the following code is probably to coerce df['cat_id'] to an numeric type:

pd.to_numeric(df["cat_id"])

However, these values are not written back to the original data frame.

I'm going to cleanup this function a little and submit a PR.

@thovden
Copy link
Contributor Author

thovden commented Apr 29, 2022

The issue seems to be introduced by this commit 23b28bf

@alexheat alexheat reopened this Apr 30, 2022
@alexheat
Copy link
Contributor

Thank you again @thovden . I have created pre-release version 0.1.39a1 with your fix.

Could you give it a try by installing
pip install pylabel==0.1.39a1

@thovden
Copy link
Contributor Author

thovden commented May 1, 2022

Verified to work in 0.1.39a1. Thanks, @alexheat!

@thovden thovden closed this as completed May 1, 2022
@alexheat
Copy link
Contributor

alexheat commented May 1, 2022

Thank you again @thovden . I have released 0.1.39 with the fix.

I also added some test cases to make sure this is not broken again in the future a976224

The test cases run whenever a new PR is made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants