Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improve performance of C implementation #43

Merged
merged 2 commits into from
Jul 29, 2024
Merged

Conversation

akx
Copy link
Contributor

@akx akx commented Jul 27, 2024

This PR:

  • improves the performance of the C implementation by eschewing C++ std::string allocations in favor of simple byte buffers and trying to make sure the Cython code does as little copying and unnecessary non-ASCII decoding as possible.
  • lifts the "get timestamp" function from the tests to a public API and adds a C implementation for it.

Caution

I tried to be extra fastidious and careful about the use of raw buffers here, but it's likely a good idea to look over this with a pretty fine comb.


On my machine, this seems to improve performance approximately thus:

  • ulid_now_bytes: 1,439 OPS -> 1,757 OPS (+22%)
  • ulid_hex: 1,057 OPS -> 1,414 OPS (+33%)
  • ut_decode: 8,862 OPS -> 12,660 OPS (+42%)
  • ulid_at_time: 5,879 OPS -> 6,390 OPS (+9%)
  • ulid_now: 5,972 -> 6,189 OPS (+3%)

The new timestamp parsing API is also pretty snappy:

--------------------- benchmark 'timestamp': 4 tests ---------------------
Name (time in ns)                   Mean            OPS (Kops/s)
--------------------------------------------------------------------------
test_ut_timestamp[c]             81.4350 (1.0)       12,279.7339 (1.0)
test_ut_timestamp[python]     1,855.7862 (22.79)        538.8552 (0.04)
test_ulidpy_uuid              2,843.5127 (34.92)        351.6777 (0.03)
test_ulid2_timestamp          3,261.0542 (40.04)        306.6493 (0.02)
--------------------------------------------------------------------------

Copy link

codecov bot commented Jul 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (975079d) to head (d7c1d00).
Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main       #43      +/-   ##
===========================================
+ Coverage   95.74%   100.00%   +4.25%     
===========================================
  Files           2         2              
  Lines          47        52       +5     
  Branches        4         5       +1     
===========================================
+ Hits           45        52       +7     
+ Misses          2         0       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

akx added 2 commits July 29, 2024 10:39
This improves the performance of the C implementation by eschewing
C++ string allocations and trying to make sure the Cython code
does as little copying and unnecessary non-ASCII decoding as possible.
@akx akx marked this pull request as ready for review July 29, 2024 07:50
@bdraco
Copy link
Owner

bdraco commented Jul 29, 2024

Buffer sizes all look right.

@bdraco
Copy link
Owner

bdraco commented Jul 29, 2024

generated code looks good

cythonize -X language_level=3 -a -i src/ulid_transform/_ulid_impl.pyx
Screenshot 2024-07-29 at 8 00 44 AM

@bdraco
Copy link
Owner

bdraco commented Jul 29, 2024

Screenshot 2024-07-29 at 8 01 47 AM

@bdraco
Copy link
Owner

bdraco commented Jul 29, 2024

pushed on production system.
all good

Copy link
Owner

@bdraco bdraco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @akx
Nice work!

@bdraco bdraco merged commit f07d838 into bdraco:main Jul 29, 2024
22 checks passed
@akx akx deleted the c-perf-2 branch July 29, 2024 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants