Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement kvitems in the C extension for increased performance. #21

Closed
rtobar opened this issue Jan 20, 2020 · 5 comments
Closed

Implement kvitems in the C extension for increased performance. #21

rtobar opened this issue Jan 20, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@rtobar
Copy link

rtobar commented Jan 20, 2020

Currently the new kvitems method is implemented in python, which is what all backends use. The C backend however should see performance benefits by seeing this method implemented in C, with an expected increase of something around ~3x, 4x, depending on the use case.

See #18 (comment) and #18 (comment) for reference.

@rtobar rtobar added the enhancement New feature or request label Jan 20, 2020
rtobar added a commit that referenced this issue Jan 21, 2020
This implementation follows the same logic of the python kvitems
function, reusing the builder_t object used by the items function.
Performance wise, kvitems is slightly faster than items, which is what
we expected in the first place.

This commit provides a first implementation for #21.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
@rtobar
Copy link
Author

rtobar commented Jan 21, 2020

@ltalirz could you check this one? As expected I'm getting about ~4x improvement on kvitems times for the C backend compared to the previous method, and (also as expected) kvitems seems to be slightly faster than items.

I also sped up a bit the python implementation of kvitems, which now seems to be ~10, 20% faster depending on the use case, as far as I could measure.

@ltalirz
Copy link

ltalirz commented Jan 21, 2020

Hi @rtobar - thanks a lot for your efforts!

A few observations:

  • Parsing the first 1M items of my JSON now takes ~16s (0.02ms/pair). That is within a factor of 2 of my revised timings for json.load, which is great.
  • It seems like the C implementation leaks memory (it grows up to 2GB in my test)
  • I checked that the pure python implementation uses ~13MB of memory throughout, i.e. the problem seems to be in the C extension

@rtobar
Copy link
Author

rtobar commented Jan 21, 2020

@ltalirz oops, bad reference counting there... I just pushed some further fixes to the kvitems branch (which also boosts performance by yet another tiny bit, at least in my local benchmarks), please give it another nudge when possible.

@ltalirz
Copy link

ltalirz commented Jan 21, 2020

Thanks - I'm now at 13s (0.01ms/pair), memory issues seem gone.

Looking great!

@rtobar
Copy link
Author

rtobar commented Jan 22, 2020

Code merged to the master branch, closing issue now.

@rtobar rtobar closed this as completed Jan 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants