New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Improve conditional_join
#1224
Conversation
🚀 Deployed on https://deploy-preview-1224--pyjanitor.netlify.app |
Codecov Report
@@ Coverage Diff @@
## dev #1224 +/- ##
===========================================
+ Coverage 82.51% 97.64% +15.13%
===========================================
Files 78 78
Lines 3770 3617 -153
===========================================
+ Hits 3111 3532 +421
+ Misses 659 85 -574 |
Hey @samukweku, it looked like the PR was heading in a good direction. Any reason for closing? |
I just did a minor speed test and noticed a regression. Need to spend some more time on the logic. I'll open it again if I can improve on the logic. If not I'll keep the existing logic |
I love the rigor here, @samukweku! ❤️ |
…or scenarios where the range columns are both monotonically increasing when sorted
@samukweku when you're ready for review, do let us know! Maybe we should also try using draft vs. actual PRs in the future? |
@ericmjl my bad... It is ready for review. Thanks for the draft suggestion... I will def use that for future PRs |
@samukweku in the interest of keeping things moving, I will merge this when all of the CI checks are done. |
@samukweku I think this PR can be merged once the pre-commit hooks are satisfied! |
@ericmjl I think there is an issue with the pre commit ci - locally everything checks fine (can't fit push without satisfying d pre commit hooks) |
@ericmjl the precommit hooks times out - no idea why : |
No worries, @samukweku. I realize that this PR is quite old now. I'm going to clean up the PRs here by reviewing or merging them, starting with this one! |
PR Description
Please describe the changes proposed in the pull request:
conditional_join
to allow user return either onlydf
orright
indicator
parameter added - similar to theindicator
parameter inpd.merge
_range_indices
algorithm, taking advantage of binary search for improved performance, for scenarios where both columns on the right dataframe are monotonically increasingThis PR resolves #1223 .
Speed test when both columns on the right are monotonically increasing
indicator
parameter usage:PR Checklist
Please ensure that you have done the following:
<your_username>
:dev
, but rather from<your_username>
:<feature-branch_name>
.AUTHORS.md
.CHANGELOG.md
under the latest version header (i.e. the one that is "on deck") describing the contribution.Automatic checks
There will be automatic checks run on the PR. These include:
Relevant Reviewers
Please tag maintainers to review.