pd.merge_asof() matches out of tolerance when timestamps are duplicated #13709
This is a continuation of #13695.
Starting with the original DataFrames from that issue:
I now get the expected null:
However, if I change the first DataFrame to have duplicate timestamps:
then the bug reappears:
This is in pandas version 0.18.0+418.gc46dcfa.
Alright, I've issued a pull request:
I rewrote the Cython logic to compare the factorized keys directly, since that was the easiest way forward. We don't actually have to factorize the keys at all, though: comparing the timestamps directly would work too, and would be even faster.
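To illustrate what comparing the timestamps directly could look like, here is a rough pure-Python sketch of a backward as-of match with a tolerance. It is not the Cython code from the pull request. The function name `asof_backward` and the sample data are made up for illustration, and it assumes both inputs are already sorted ascending.

```python
from datetime import datetime, timedelta


def asof_backward(left_times, right_times, tolerance, allow_exact_matches=True):
    """Hypothetical helper: for each left timestamp, return the index of the
    last eligible right timestamp within `tolerance`, or None if there is none.
    Both lists are assumed to be sorted in ascending order."""
    matches = []
    j = 0  # count of right rows eligible for the current left row
    for t in left_times:
        # Advance past every right row that is before t (or equal to t,
        # when exact matches are allowed). Duplicated left timestamps
        # simply leave j where it is.
        while j < len(right_times) and (
            right_times[j] < t or (allow_exact_matches and right_times[j] == t)
        ):
            j += 1
        candidate = j - 1
        # The tolerance is re-checked for every left row, duplicates included.
        if candidate >= 0 and t - right_times[candidate] <= tolerance:
            matches.append(candidate)
        else:
            matches.append(None)
    return matches


# Made-up data: the left side has a duplicated timestamp.
base = datetime(2016, 7, 15, 13, 30)
left = [base + timedelta(milliseconds=30), base + timedelta(milliseconds=30)]
right = [base, base + timedelta(milliseconds=30)]
tol = timedelta(milliseconds=10)

print(asof_backward(left, right, tol, allow_exact_matches=False))  # [None, None]
print(asof_backward(left, right, tol, allow_exact_matches=True))   # [1, 1]
```

With exact matches disallowed, the only candidate is 30 ms away, which is outside the 10 ms tolerance, so both duplicated left rows come back unmatched rather than only the first one.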