BUG: fixed to_numeric loss of precision when converting decimal type to integer #57213 #58139
Conversation
```python
values = maybe_downcast_numeric(values, dtype)
try:
    if (len(old_values) != 0) and isinstance(
        old_values[0], Decimal
```
I think this fix is too specific.
Excuse me, but I'm confused as to what you mean by "too specific"; the bug was specifically about the decimal type, no?
I'm sorry, but I'm pretty new to open-source contributing.
Hi @Leventide!
I previously looked into this issue and thought about working on it, so I think it is great that you gave it a shot and started working on a fix. I am also a new contributor to open source and have to confess that it was a little tricky to get my first PR accepted. Since this seems to be your first contribution, let's talk about a few things that may help you move forward (these are reflections that helped me keep up my effort to contribute and learn in this community).
- Context: If you look at the size of this project (in terms of issues, for example) and track the number of active maintainers actually working on it, you will realize that they are completely overwhelmed: there is a huge list of bugs, new issues never stop arriving, and they need to evolve the code toward current goals. Just take a look at the PDEPs and at the list of issues for the 3.0 release. Besides, there are a lot of people trying to enter the community and, among the skilled ones, few are willing or have the proper skills to mentor new members. In this direction, I would recommend you watch the talk Brett Cannon: Setting Expectations for Open Source Participation. Considering this, do not take it personally if a core member takes a few months to review your PR; this is actually very common. It takes time for us to understand the details of the library and the social codes of the community, and it also takes time for the members of the community to get to know new members.
- What happens if your code is accepted: pandas is huge (by any measure). Any change you make to the code will eventually be used by a lot of people and, once your contribution is accepted, someone will have to maintain and evolve it (including fixing new bugs, regressions, etc.). You have to see yourself in the place of a maintainer to understand the responsibility of accepting a PR (even one that passes all tests!).
- You got a tricky problem for a first issue! This is not the first issue in pandas and numpy involving problems when trying to round-trip a big integer. There could be many strategies for attacking this problem, but I would look into the representation of floating-point numbers in Python, read a little more about representing integers in this ecosystem, and follow the tip given by @rhshadrach on the original issue.
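The floating-point representation issue mentioned above can be demonstrated in a few lines of plain Python (this is a general illustration of the pitfall, not pandas code): float64 has a 53-bit significand, so any conversion path that passes an integer larger than 2**53 through a float can silently corrupt the value.

```python
from decimal import Decimal

# 2**53 + 1 is the first integer that float64 cannot represent exactly.
exact = Decimal(2**53 + 1)

# Round-tripping through float loses the low bit.
via_float = int(float(exact))

print(int(exact))   # 9007199254740993
print(via_float)    # 9007199254740992
```

This is exactly why a Decimal-to-integer conversion that goes through float can appear to "lose precision" even though both endpoints of the conversion are exact types.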
- Your code: I looked briefly into the function and also into your code. What @mroeschke probably wanted to tell you is that it is necessary to make the logic as general as possible, without introducing too much specificity into the code. You may think that this piece of code is already too specific, that it is the way it is... nevertheless, things ended up how they are for historical reasons, and the idea is, as far as possible, to improve the code and simplify its logic. That is my point of view. Regarding your code, it introduces three new levels of nesting: a `try`/`except` (which does not seem to be used in the current context within this library) and two nested `if`/`else` blocks. I think this could be avoided. Maybe there is some intrinsic pattern for capturing the data type and, for some reason, the data is falling down the wrong path... maybe the problem is in the function `maybe_downcast_numeric`, or it is being used out of context. Those are possibilities, and there are so many that I would recommend debugging it and following through step by step. Then again, maybe you already did.
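To make the point about generality concrete, here is a hypothetical sketch (the helper name and structure are my own, not pandas internals) of a round-trip check that rejects a lossy downcast for any source type, Decimal included, without type-specific branching:

```python
from decimal import Decimal

import numpy as np


def downcast_if_lossless(values: np.ndarray, dtype: np.dtype) -> np.ndarray:
    """Hypothetical helper: downcast only when no information is lost."""
    try:
        candidate = values.astype(dtype)
    except (OverflowError, ValueError, TypeError):
        # The data cannot be represented in the target dtype at all.
        return values
    # Accept the downcast only if converting back reproduces the input.
    if (candidate.astype(values.dtype) == values).all():
        return candidate
    return values


obj = np.array([Decimal("1"), Decimal("2")], dtype=object)
print(downcast_if_lossless(obj, np.dtype(np.int64)).dtype)  # int64
```

A check like this treats Decimal, float, and other object inputs uniformly, which is closer to the "as general as possible" logic the reviewers are asking for than an `isinstance(..., Decimal)` branch.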
Well, it turned out that I was planning to write only the first three points, and the rest ended up appearing here too. I hope you keep working on this PR, continue contributing, and keep learning.
Regards
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.
`doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.