-
-
Notifications
You must be signed in to change notification settings - Fork 19.4k
BUG: fixed to_numeric loss in precision when converting decimal type to integer #57213 #58139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this fix is too specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse me, but i'm confused as to what you mean by too specific, the bug was specifically about the decimal type no?
I'm sorry but I'm pretty new to open source contributing.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Leventide!
I previously looked into this issue and thought about working on it! So, I think it is great that you gave it a shot and started working on a fix for it. I am also a new contributor to open source and have to confess that it was a little bit tricky to get my first PR accepted. Since this seems to be your first contribution, let's talk about a few things that can possibly help you go forward (those are reflections that helped me keep my effort to contribute and learn in this community).
Context: If you look into the size of this project (in terms of issues, for example) and track the number of active maintainers actually working on it you will realize that they are completely overwhelmed by work: there is a huge list of bugs, issues doesn't stop to arrive and they need to evolve the code toward current goals. Just take a look on PDEP and on the list of issues for 3.0 release. Besides, there is a lot of people trying to enter in the community and, among the skilled one, few are willing or have the proper skills to mentor new members. In this direction, I would recommend you to watch this talk Brett Cannon: Setting Expectations for Open Source Participation. Considering this, do not take it to the personal side if a core member take a few months to review your PR, this is actually very common. It takes some time for us to understand the details of the library, the social codes of the community, and it also takes some time for the members of the community to know new members.
What happens if your code is accepted: pandas is huge (by any measure). Any changes you make into the code will eventually be used by a lot of people and, your contribution being accepted, someone will have to maintain and evolve it (including fixing new bugs, regressions, etc.). You got see yourself in the place of a maintainer to understand the responsibility of accepting a PR (even one that is passing all tests!).
You got a trick problem for a first issue! This is not the first issue on pandas and numpy with problems when trying round-trip a big integer number. There could be many strategies for attacking this problem, but I would check into the representation of floating points on python, would read a little bit more about representing integers on this ecosystem, and would follow the tip given by @rhshadrach on the original issue.
Your code: I looked briefly into the function and also into your code. Probably @mroeschke wanted tell you is that it is necessary to make the logic as general as possible, without introducing too much specificity in the code. Well, you may think that this piece of code already is too much specific, that it is the way it is... nevertheless, things ended up how it is for historical reasons and the idea is, as far as possible, improve the code, simplifying its logic. This is my point of view. Regarding you code, it introduces three new levels of nesting: a
try-except(which doesn't seem to used in the current context within this library), and two nestedif-elseblocks. I think this is something that could be avoided. Maybe there is some intrinsic pattern for capturing the data type and, for some reason, the data is falling in the wrong path... maybe the problem is in the functionmaybe_downcast_numeric, or it is being used out of context. Those are possibilities. Possibilities are so much that I would recommend debugging it and following through step by step. In any way, maybe you already did it.Well, it turned out that I was planning to write about the first three points and they ended up appearing here. Hope that you keep on trying this PR, continue contributing, and learning.
Regards