BUG: fixed to_numeric loss of precision when converting decimal type to integer #57213 #58139
Conversation
```python
values = maybe_downcast_numeric(values, dtype)
try:
    if (len(old_values) != 0) and isinstance(
        old_values[0], Decimal
```
I think this fix is too specific.
Excuse me, but I'm confused as to what you mean by "too specific"; the bug was specifically about the decimal type, no?
I'm sorry, but I'm pretty new to open-source contributing.
Hi @Leventide!
I previously looked into this issue and thought about working on it, so I think it is great that you gave it a shot and started working on a fix. I am also a new contributor to open source and have to confess that it was a little tricky to get my first PR accepted. Since this seems to be your first contribution, let's talk about a few things that may help you move forward (these are reflections that helped me keep up my effort to contribute and learn in this community).
- Context: If you look at the size of this project (in terms of issues, for example) and track the number of active maintainers actually working on it, you will realize that they are completely overwhelmed: there is a huge list of bugs, new issues never stop arriving, and they need to evolve the code toward current goals. Just take a look at the PDEPs and at the list of issues for the 3.0 release. Besides, there are a lot of people trying to enter the community and, among the skilled ones, few are willing or have the proper skills to mentor new members. In this direction, I would recommend you watch the talk Brett Cannon: Setting Expectations for Open Source Participation. Considering this, do not take it personally if a core member takes a few months to review your PR; this is actually very common. It takes time for us to understand the details of the library and the social codes of the community, and it also takes time for the members of the community to get to know new members.
- What happens if your code is accepted: pandas is huge (by any measure). Any change you make to the code will eventually be used by a lot of people and, once your contribution is accepted, someone will have to maintain and evolve it (including fixing new bugs, regressions, etc.). You have to see yourself in the place of a maintainer to understand the responsibility of accepting a PR (even one that passes all tests!).
- You got a tricky problem for a first issue! This is not the first issue in pandas and numpy involving problems when trying to round-trip a big integer. There could be many strategies for attacking this problem, but I would look into the representation of floating-point numbers in Python, read a little more about representing integers in this ecosystem, and follow the tip given by @rhshadrach on the original issue.
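The floating-point representation issue mentioned above can be demonstrated in a few lines of plain Python (this is a general illustration of the pitfall, not pandas code): float64 has a 53-bit significand, so any conversion path that passes an integer larger than 2**53 through a float can silently corrupt the value.

```python
from decimal import Decimal

# 2**53 + 1 is the first integer that float64 cannot represent exactly.
exact = Decimal(2**53 + 1)

# Round-tripping through float loses the low bit.
via_float = int(float(exact))

print(int(exact))   # 9007199254740993
print(via_float)    # 9007199254740992
```

This is exactly why a Decimal-to-integer conversion that goes through float can appear to "lose precision" even though both endpoints of the conversion are exact types.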
- Your code: I looked briefly into the function and also into your code. What @mroeschke probably wanted to tell you is that it is necessary to make the logic as general as possible, without introducing too much specificity into the code. You may think that this piece of code is already too specific, that it is the way it is... nevertheless, things ended up how they are for historical reasons, and the idea is, as far as possible, to improve the code and simplify its logic. That is my point of view. Regarding your code, it introduces three new levels of nesting: a `try`/`except` (which does not seem to be used in the current context within this library) and two nested `if`/`else` blocks. I think this could be avoided. Maybe there is some intrinsic pattern for capturing the data type and, for some reason, the data is falling down the wrong path... maybe the problem is in the function `maybe_downcast_numeric`, or it is being used out of context. Those are possibilities, and there are so many that I would recommend debugging it and following through step by step. Then again, maybe you already did.
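To make the point about generality concrete, here is a hypothetical sketch (the helper name and structure are my own, not pandas internals) of a round-trip check that rejects a lossy downcast for any source type, Decimal included, without type-specific branching:

```python
from decimal import Decimal

import numpy as np


def downcast_if_lossless(values: np.ndarray, dtype: np.dtype) -> np.ndarray:
    """Hypothetical helper: downcast only when no information is lost."""
    try:
        candidate = values.astype(dtype)
    except (OverflowError, ValueError, TypeError):
        # The data cannot be represented in the target dtype at all.
        return values
    # Accept the downcast only if converting back reproduces the input.
    if (candidate.astype(values.dtype) == values).all():
        return candidate
    return values


obj = np.array([Decimal("1"), Decimal("2")], dtype=object)
print(downcast_if_lossless(obj, np.dtype(np.int64)).dtype)  # int64
```

A check like this treats Decimal, float, and other object inputs uniformly, which is closer to the "as general as possible" logic the reviewers are asking for than an `isinstance(..., Decimal)` branch.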
Well, it turned out that I was planning to write only the first three points, and the rest ended up appearing here too. I hope you keep working on this PR, continue contributing, and keep learning.
Regards
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.
`doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.