Conversation

@ivoanjo (Member) commented Nov 11, 2024

What does this PR do?

This PR builds on the work started by @AlexJF on #607 to introduce "managed string storage" for profiling.

The idea is to introduce another level of string storage for profiling that is decoupled in lifetime from individual profiles, and that is managed by the libdatadog client.

At its core, managed string storage provides a hashtable that stores strings and returns ids. These ids can then be provided to libdatadog instead of CharSlices when recording profiling samples.

For FFI users, this PR adds the following APIs to manage strings (see the usage sketch right after this list):

  • ddog_prof_ManagedStringStorage_new
  • ddog_prof_ManagedStringStorage_intern(String)
  • ddog_prof_ManagedStringStorage_unintern(id)
  • ddog_prof_ManagedStringStorage_advance_gen
  • ddog_prof_ManagedStringStorage_drop
  • ddog_prof_ManagedStringStorage_get_string
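
To give an idea of how these fit together, here's a minimal sketch of the lifecycle from C. Treat it as pseudocode: the exact signatures, result types, and error handling are assumptions (the real functions return result types that must be checked), and the generated FFI header is the source of truth.

#include <datadog/profiling.h> // header name assumed

void managed_string_storage_sketch(void) {
  // Create a storage instance whose lifetime is decoupled from any profile.
  ddog_prof_ManagedStringStorage storage = ddog_prof_ManagedStringStorage_new();

  // Interning returns a stable id and bumps the string's usage counter.
  ddog_prof_ManagedStringId name_id =
    ddog_prof_ManagedStringStorage_intern(storage, DDOG_CHARSLICE_C("my_method"));

  // ...record samples referencing name_id instead of passing a CharSlice...

  // Uninterning decreases the usage counter again.
  ddog_prof_ManagedStringStorage_unintern(storage, name_id);

  // At advance_gen time, strings whose counter is zero get dropped.
  ddog_prof_ManagedStringStorage_advance_gen(storage);

  // Frees the storage and every string still in it.
  ddog_prof_ManagedStringStorage_drop(storage);
}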

A key detail of the current implementation is that each intern call for the same string increases an internal usage counter, and each unintern call decreases it.

Then, at advance_gen time, any string whose counter is zero gets dropped.

Then, to interact with profiles, there's a new ddog_prof_Profile_with_string_storage API to create a profile with a given ManagedStringStorage, and all structures that make up a Sample (Mapping, Function, Label, etc.) have been extended so that they take either a CharSlice or a ManagedStringId.

Thus, after interning all strings for a sample, it's possible to add a sample to a profile entirely by referencing strings by ids, rather than CharSlices.
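
As a sketch (struct and field names for the id variants are assumptions based on the description above, and the variables are presumed to have been set up as in the previous snippet):

// Create a profile that shares the managed string storage. Signature is
// assumed here: presumably mirrors ddog_prof_Profile_new plus the storage.
ddog_prof_Profile_NewResult profile_result =
  ddog_prof_Profile_with_string_storage(sample_types, &period, storage);

// Sample building blocks now accept either form; fill exactly one of the
// CharSlice field or the id field (field names illustrative).
ddog_prof_Function function = {
  .name_id = name_id,         // a previously interned ManagedStringId
  .filename_id = filename_id, // instead of the .name / .filename CharSlices
};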

Motivation

The initial use-case is to support heap profiling -- "samples" related to heap profiling usually live across multiple profiles (as long as a given object is alive) and so this data must be kept somewhere.
Previously for Ruby we were keeping this on the Ruby profiler side, but having libdatadog manage this instead presents a few optimization opportunities.

We also hope to replace a few other "string tables" that other profilers had to build outside of libdatadog for similar use-cases.

This topic was also discussed in the following two documents (Datadog-only, sorry!):

Additional Notes

In keeping with the experimental nature of this feature, I've tried really hard to not disturb existing profiling API users with the new changes.

That is -- I was going for: if you're not using managed string storage, you should NOT be affected AT ALL by it -- be it API changes or overhead.

(This is why on the pure-Rust profiling crate side, I ended up duplicating a bunch of structures and functions. I couldn't think of a great way to not disturb existing API users other than introducing alternative methods, but to be honest the duplication is all in very simple methods so I don't think this substantially increases complexity/maintenance vs trying to be smarter to bend Rust to our will.)

There's probably a lot of improvements we can make, but with this PR I'm hoping to have something in a close to "good enough" state, that we can merge this in and then start iterating on master, rather than have this continue living in a branch for a lot longer.

This doesn't mean we shouldn't fix or improve things before merging, but I'll be trying to identify what needs to go in now and what can go in as separate, follow-up PRs.

As an addendum, there's still a bunch of `expect`s sprinkled around that should be turned into proper errors. I plan to do a pass on all of those. (But again, none of the panics affect existing code, so they're harmless and inert unless you're experimenting with the new APIs.)

How to test the change?

The branch in https://github.com/DataDog/dd-trace-rb/tree/ivoanjo/prof-9476-managed-string-storage-try2 is where I'm testing the changes on the Ruby profiler side.

It may not be entirely up-to-date with the latest ffi changes on the libdatadog side (I've been prettying up the API), but it shows how to use this concept, while passing all the profiling unit/integration tests, and has shown improvements in memory and latency in the reliability environment.

@ivoanjo ivoanjo requested a review from a team as a code owner November 11, 2024 15:41
@github-actions github-actions bot added the profiling Relates to the profiling* modules. label Nov 11, 2024

pr-commenter bot commented Nov 11, 2024

Benchmarks

Comparison

Benchmark execution time: 2025-01-20 11:57:04

Comparing candidate commit 022e6d8 in PR branch ivoanjo/prof-9476-managed-string-storage-try3-clean with baseline commit de64524 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 52 metrics, 2 unstable metrics.

Candidate

Candidate benchmark details

Group 1

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sql/obfuscate_sql_string execution_time 69.491µs 69.668µs ± 0.152µs 69.636µs ± 0.046µs 69.705µs 69.827µs 69.923µs 71.309µs 2.40% 7.062 69.048 0.22% 0.011µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sql/obfuscate_sql_string execution_time [69.647µs; 69.689µs] or [-0.030%; +0.030%] None None None

Group 2

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching deserializing traces from msgpack to their internal representation execution_time 59.839ms 60.046ms ± 0.185ms 59.999ms ± 0.049ms 60.051ms 60.419ms 60.813ms 61.267ms 2.11% 3.292 13.405 0.31% 0.013ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching deserializing traces from msgpack to their internal representation execution_time [60.021ms; 60.072ms] or [-0.043%; +0.043%] None None None

Group 3

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_trace/test_trace execution_time 259.549ns 272.537ns ± 14.992ns 268.460ns ± 5.227ns 274.833ns 306.404ns 328.636ns 329.682ns 22.80% 2.247 4.906 5.49% 1.060ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_trace/test_trace execution_time [270.459ns; 274.615ns] or [-0.762%; +0.762%] None None None

Group 4

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
ip_address/quantize_peer_ip_address_benchmark execution_time 5.395µs 5.468µs ± 0.038µs 5.470µs ± 0.031µs 5.494µs 5.530µs 5.539µs 5.613µs 2.61% 0.247 -0.216 0.69% 0.003µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark execution_time [5.463µs; 5.473µs] or [-0.095%; +0.095%] None None None

Group 5

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
write only interface execution_time 1.461µs 3.314µs ± 1.476µs 3.116µs ± 0.033µs 3.149µs 3.767µs 14.372µs 15.428µs 395.07% 7.485 56.601 44.43% 0.104µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
write only interface execution_time [3.110µs; 3.519µs] or [-6.173%; +6.173%] None None None

Group 6

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
two way interface execution_time 18.156µs 27.831µs ± 14.485µs 18.647µs ± 0.315µs 36.308µs 46.600µs 52.071µs 151.101µs 710.32% 3.996 28.135 51.92% 1.024µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
two way interface execution_time [25.823µs; 29.838µs] or [-7.213%; +7.213%] None None None

Group 7

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
tags/replace_trace_tags execution_time 2.654µs 2.715µs ± 0.018µs 2.718µs ± 0.006µs 2.723µs 2.744µs 2.752µs 2.755µs 1.36% -1.282 2.986 0.65% 0.001µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
tags/replace_trace_tags execution_time [2.713µs; 2.718µs] or [-0.090%; +0.090%] None None None

Group 8

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time 179.461µs 182.084µs ± 1.322µs 182.023µs ± 0.994µs 182.968µs 184.512µs 185.145µs 185.668µs 2.00% 0.473 -0.443 0.72% 0.093µs 1 200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput 5385949.028op/s 5492244.217op/s ± 39730.554op/s 5493804.808op/s ± 30001.323op/s 5525297.266op/s 5545195.777op/s 5553833.907op/s 5572241.323op/s 1.43% -0.445 -0.482 0.72% 2809.374op/s 1 200
normalization/normalize_name/normalize_name/bad-name execution_time 21.055µs 21.282µs ± 0.143µs 21.253µs ± 0.078µs 21.336µs 21.540µs 21.675µs 22.046µs 3.73% 1.457 3.691 0.67% 0.010µs 1 200
normalization/normalize_name/normalize_name/bad-name throughput 45359112.034op/s 46990479.736op/s ± 311999.308op/s 47053124.222op/s ± 172369.821op/s 47200270.232op/s 47369961.083op/s 47423748.116op/s 47494401.856op/s 0.94% -1.388 3.293 0.66% 22061.683op/s 1 200
normalization/normalize_name/normalize_name/good execution_time 14.201µs 14.346µs ± 0.077µs 14.341µs ± 0.054µs 14.401µs 14.479µs 14.532µs 14.636µs 2.06% 0.483 0.057 0.54% 0.005µs 1 200
normalization/normalize_name/normalize_name/good throughput 68323502.340op/s 69706296.278op/s ± 373230.843op/s 69727832.200op/s ± 263618.043op/s 69981683.873op/s 70281850.793op/s 70370977.865op/s 70417173.088op/s 0.99% -0.454 -0.009 0.53% 26391.406op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time [181.901µs; 182.268µs] or [-0.101%; +0.101%] None None None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput [5486737.944op/s; 5497750.490op/s] or [-0.100%; +0.100%] None None None
normalization/normalize_name/normalize_name/bad-name execution_time [21.262µs; 21.302µs] or [-0.093%; +0.093%] None None None
normalization/normalize_name/normalize_name/bad-name throughput [46947239.633op/s; 47033719.840op/s] or [-0.092%; +0.092%] None None None
normalization/normalize_name/normalize_name/good execution_time [14.336µs; 14.357µs] or [-0.074%; +0.074%] None None None
normalization/normalize_name/normalize_name/good throughput [69654570.073op/s; 69758022.483op/s] or [-0.074%; +0.074%] None None None

Group 9

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
concentrator/add_spans_to_concentrator execution_time 6.357ms 6.375ms ± 0.011ms 6.374ms ± 0.004ms 6.378ms 6.385ms 6.424ms 6.472ms 1.54% 4.981 39.010 0.17% 0.001ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
concentrator/add_spans_to_concentrator execution_time [6.373ms; 6.376ms] or [-0.023%; +0.023%] None None None

Group 10

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
redis/obfuscate_redis_string execution_time 38.534µs 39.224µs ± 1.163µs 38.707µs ± 0.057µs 38.798µs 41.725µs 41.797µs 42.863µs 10.74% 1.722 1.055 2.96% 0.082µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
redis/obfuscate_redis_string execution_time [39.063µs; 39.386µs] or [-0.411%; +0.411%] None None None

Group 11

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time 700.769µs 702.117µs ± 0.809µs 702.072µs ± 0.422µs 702.481µs 703.091µs 704.071µs 708.094µs 0.86% 2.954 17.900 0.11% 0.057µs 1 200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput 1412242.338op/s 1424265.113op/s ± 1634.760op/s 1424354.713op/s ± 857.366op/s 1425226.102op/s 1426309.803op/s 1426671.992op/s 1427004.390op/s 0.19% -2.916 17.559 0.11% 115.595op/s 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time 480.237µs 480.910µs ± 0.351µs 480.876µs ± 0.197µs 481.079µs 481.454µs 481.922µs 483.225µs 0.49% 1.785 8.871 0.07% 0.025µs 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput 2069428.453op/s 2079393.922op/s ± 1517.476op/s 2079537.949op/s ± 852.613op/s 2080354.230op/s 2081443.032op/s 2081998.754op/s 2082305.652op/s 0.13% -1.769 8.745 0.07% 107.302op/s 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time 190.765µs 191.261µs ± 0.285µs 191.213µs ± 0.135µs 191.367µs 191.816µs 192.225µs 192.363µs 0.60% 1.296 2.310 0.15% 0.020µs 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput 5198510.453op/s 5228482.108op/s ± 7784.933op/s 5229763.982op/s ± 3704.748op/s 5233112.344op/s 5239081.065op/s 5241235.533op/s 5242050.330op/s 0.23% -1.285 2.273 0.15% 550.478op/s 1 200
normalization/normalize_service/normalize_service/[empty string] execution_time 46.209µs 46.420µs ± 0.068µs 46.417µs ± 0.043µs 46.464µs 46.529µs 46.580µs 46.621µs 0.44% 0.067 0.228 0.15% 0.005µs 1 200
normalization/normalize_service/normalize_service/[empty string] throughput 21449775.200op/s 21542419.992op/s ± 31466.950op/s 21544015.600op/s ± 19773.192op/s 21561981.035op/s 21591547.153op/s 21606255.411op/s 21640914.740op/s 0.45% -0.057 0.229 0.15% 2225.049op/s 1 200
normalization/normalize_service/normalize_service/test_ASCII execution_time 48.915µs 49.090µs ± 0.064µs 49.091µs ± 0.039µs 49.129µs 49.192µs 49.244µs 49.312µs 0.45% 0.056 0.609 0.13% 0.005µs 1 200
normalization/normalize_service/normalize_service/test_ASCII throughput 20278963.955op/s 20370805.755op/s ± 26627.443op/s 20370175.255op/s ± 16166.191op/s 20387017.071op/s 20416661.348op/s 20432468.657op/s 20443828.466op/s 0.36% -0.046 0.601 0.13% 1882.845op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time [702.005µs; 702.229µs] or [-0.016%; +0.016%] None None None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput [1424038.551op/s; 1424491.675op/s] or [-0.016%; +0.016%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time [480.861µs; 480.958µs] or [-0.010%; +0.010%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput [2079183.615op/s; 2079604.230op/s] or [-0.010%; +0.010%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time [191.221µs; 191.300µs] or [-0.021%; +0.021%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput [5227403.191op/s; 5229561.025op/s] or [-0.021%; +0.021%] None None None
normalization/normalize_service/normalize_service/[empty string] execution_time [46.411µs; 46.430µs] or [-0.020%; +0.020%] None None None
normalization/normalize_service/normalize_service/[empty string] throughput [21538058.976op/s; 21546781.009op/s] or [-0.020%; +0.020%] None None None
normalization/normalize_service/normalize_service/test_ASCII execution_time [49.081µs; 49.099µs] or [-0.018%; +0.018%] None None None
normalization/normalize_service/normalize_service/test_ASCII throughput [20367115.448op/s; 20374496.063op/s] or [-0.018%; +0.018%] None None None

Group 12

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching string interning on wordpress profile execution_time 138.976µs 139.631µs ± 0.288µs 139.566µs ± 0.152µs 139.767µs 140.175µs 140.551µs 140.823µs 0.90% 1.098 1.839 0.21% 0.020µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching string interning on wordpress profile execution_time [139.591µs; 139.670µs] or [-0.029%; +0.029%] None None None

Group 13

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 022e6d8 1737373504 ivoanjo/prof-9476-managed-string-storage-try3-clean
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
credit_card/is_card_number/ execution_time 4.273µs 4.289µs ± 0.008µs 4.288µs ± 0.001µs 4.290µs 4.293µs 4.296µs 4.395µs 2.48% 11.770 154.469 0.19% 0.001µs 1 200
credit_card/is_card_number/ throughput 227547397.011op/s 233158849.419op/s ± 423526.559op/s 233193791.888op/s ± 65754.033op/s 233256503.495op/s 233418250.866op/s 233566687.150op/s 234006723.706op/s 0.35% -11.669 152.753 0.18% 29947.850op/s 1 200
credit_card/is_card_number/ 3782-8224-6310-005 execution_time 89.214µs 90.675µs ± 0.643µs 90.631µs ± 0.369µs 91.017µs 91.581µs 91.916µs 95.849µs 5.76% 2.634 19.568 0.71% 0.045µs 1 200
credit_card/is_card_number/ 3782-8224-6310-005 throughput 10433074.086op/s 11028949.297op/s ± 76909.939op/s 11033745.123op/s ± 44897.710op/s 11074747.036op/s 11132431.892op/s 11189714.669op/s 11209015.522op/s 1.59% -2.340 16.745 0.70% 5438.354op/s 1 200
credit_card/is_card_number/ 378282246310005 execution_time 83.220µs 83.716µs ± 0.389µs 83.699µs ± 0.136µs 83.826µs 83.939µs 84.057µs 88.612µs 5.87% 10.037 124.395 0.46% 0.027µs 1 200
credit_card/is_card_number/ 378282246310005 throughput 11285150.069op/s 11945416.059op/s ± 53027.861op/s 11947613.304op/s ± 19442.362op/s 11967794.666op/s 11989483.924op/s 11998883.541op/s 12016289.282op/s 0.57% -9.680 118.522 0.44% 3749.636op/s 1 200
credit_card/is_card_number/37828224631 execution_time 4.277µs 4.289µs ± 0.004µs 4.289µs ± 0.001µs 4.291µs 4.294µs 4.296µs 4.339µs 1.17% 7.419 83.972 0.10% 0.000µs 1 200
credit_card/is_card_number/37828224631 throughput 230479443.695op/s 233141266.420op/s ± 233740.274op/s 233168720.428op/s ± 76468.716op/s 233232360.990op/s 233344968.013op/s 233453782.300op/s 233828976.564op/s 0.28% -7.326 82.627 0.10% 16527.933op/s 1 200
credit_card/is_card_number/378282246310005 execution_time 80.579µs 80.879µs ± 0.101µs 80.883µs ± 0.055µs 80.936µs 81.021µs 81.131µs 81.288µs 0.50% 0.115 1.650 0.13% 0.007µs 1 200
credit_card/is_card_number/378282246310005 throughput 12301987.604op/s 12364165.808op/s ± 15494.561op/s 12363542.391op/s ± 8434.675op/s 12372712.614op/s 12390151.983op/s 12403497.762op/s 12410168.892op/s 0.38% -0.101 1.633 0.13% 1095.631op/s 1 200
credit_card/is_card_number/37828224631000521389798 execution_time 58.522µs 58.690µs ± 0.051µs 58.683µs ± 0.027µs 58.713µs 58.784µs 58.851µs 58.866µs 0.31% 0.673 1.520 0.09% 0.004µs 1 200
credit_card/is_card_number/37828224631000521389798 throughput 16987807.423op/s 17038586.652op/s ± 14912.008op/s 17040796.864op/s ± 7743.657op/s 17046832.210op/s 17058903.396op/s 17068675.458op/s 17087693.443op/s 0.28% -0.665 1.511 0.09% 1054.438op/s 1 200
credit_card/is_card_number/x371413321323331 execution_time 6.434µs 6.443µs ± 0.004µs 6.443µs ± 0.002µs 6.445µs 6.450µs 6.453µs 6.455µs 0.19% 0.440 1.363 0.05% 0.000µs 1 200
credit_card/is_card_number/x371413321323331 throughput 154919715.867op/s 155202483.510op/s ± 84550.174op/s 155215030.637op/s ± 36254.337op/s 155246513.291op/s 155339541.730op/s 155412232.658op/s 155427054.550op/s 0.14% -0.435 1.359 0.05% 5978.600op/s 1 200
credit_card/is_card_number_no_luhn/ execution_time 4.274µs 4.288µs ± 0.003µs 4.288µs ± 0.001µs 4.290µs 4.293µs 4.295µs 4.298µs 0.24% -0.153 3.932 0.06% 0.000µs 1 200
credit_card/is_card_number_no_luhn/ throughput 232642704.402op/s 233195878.523op/s ± 149720.722op/s 233203474.734op/s ± 78474.194op/s 233277261.429op/s 233418193.519op/s 233549640.627op/s 233957554.562op/s 0.32% 0.165 3.959 0.06% 10586.854op/s 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time 68.942µs 69.734µs ± 0.399µs 69.701µs ± 0.291µs 70.026µs 70.392µs 70.776µs 70.855µs 1.66% 0.278 -0.341 0.57% 0.028µs 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput 14113336.956op/s 14340700.023op/s ± 82028.084op/s 14347031.608op/s ± 60068.214op/s 14398130.056op/s 14464907.691op/s 14499769.928op/s 14504965.181op/s 1.10% -0.251 -0.365 0.57% 5800.261op/s 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time 63.847µs 64.520µs ± 0.363µs 64.496µs ± 0.248µs 64.745µs 65.156µs 65.375µs 66.046µs 2.40% 0.763 0.632 0.56% 0.026µs 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 throughput 15140943.132op/s 15499598.479op/s ± 86771.121op/s 15504815.113op/s ± 59837.340op/s 15564624.320op/s 15614058.054op/s 15645110.200op/s 15662414.540op/s 1.02% -0.729 0.520 0.56% 6135.645op/s 1 200
credit_card/is_card_number_no_luhn/37828224631 execution_time 4.270µs 4.288µs ± 0.003µs 4.288µs ± 0.002µs 4.290µs 4.292µs 4.294µs 4.294µs 0.15% -1.215 8.490 0.06% 0.000µs 1 200
credit_card/is_card_number_no_luhn/37828224631 throughput 232860103.978op/s 233208703.680op/s ± 145648.164op/s 233218326.942op/s ± 85965.858op/s 233293391.965op/s 233417501.207op/s 233481674.285op/s 234176546.230op/s 0.41% 1.231 8.611 0.06% 10298.880op/s 1 200
credit_card/is_card_number_no_luhn/378282246310005 execution_time 61.533µs 61.928µs ± 0.085µs 61.942µs ± 0.047µs 61.981µs 62.041µs 62.076µs 62.084µs 0.23% -0.987 2.019 0.14% 0.006µs 1 200
credit_card/is_card_number_no_luhn/378282246310005 throughput 16107107.092op/s 16147885.619op/s ± 22192.230op/s 16144210.458op/s ± 12329.742op/s 16159998.760op/s 16186264.831op/s 16209645.960op/s 16251396.667op/s 0.66% 1.000 2.068 0.14% 1569.228op/s 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time 58.533µs 58.688µs ± 0.051µs 58.683µs ± 0.028µs 58.713µs 58.773µs 58.834µs 58.879µs 0.33% 0.456 1.283 0.09% 0.004µs 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput 16984103.872op/s 17039212.644op/s ± 14901.678op/s 17040585.443op/s ± 8049.579op/s 17048265.335op/s 17059189.984op/s 17072614.798op/s 17084432.665op/s 0.26% -0.448 1.276 0.09% 1053.708op/s 1 200
credit_card/is_card_number_no_luhn/x371413321323331 execution_time 6.275µs 6.442µs ± 0.013µs 6.443µs ± 0.002µs 6.445µs 6.450µs 6.454µs 6.457µs 0.21% -11.935 157.108 0.19% 0.001µs 1 200
credit_card/is_card_number_no_luhn/x371413321323331 throughput 154881672.993op/s 155221178.197op/s ± 309631.231op/s 155205635.982op/s ± 51650.369op/s 155258132.881op/s 155372234.982op/s 155434769.520op/s 159363350.593op/s 2.68% 12.032 158.794 0.20% 21894.234op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
credit_card/is_card_number/ execution_time [4.288µs; 4.290µs] or [-0.026%; +0.026%] None None None
credit_card/is_card_number/ throughput [233100152.712op/s; 233217546.127op/s] or [-0.025%; +0.025%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 execution_time [90.586µs; 90.764µs] or [-0.098%; +0.098%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 throughput [11018290.319op/s; 11039608.275op/s] or [-0.097%; +0.097%] None None None
credit_card/is_card_number/ 378282246310005 execution_time [83.662µs; 83.770µs] or [-0.064%; +0.064%] None None None
credit_card/is_card_number/ 378282246310005 throughput [11938066.908op/s; 11952765.211op/s] or [-0.062%; +0.062%] None None None
credit_card/is_card_number/37828224631 execution_time [4.289µs; 4.290µs] or [-0.014%; +0.014%] None None None
credit_card/is_card_number/37828224631 throughput [233108872.266op/s; 233173660.574op/s] or [-0.014%; +0.014%] None None None
credit_card/is_card_number/378282246310005 execution_time [80.865µs; 80.893µs] or [-0.017%; +0.017%] None None None
credit_card/is_card_number/378282246310005 throughput [12362018.411op/s; 12366313.205op/s] or [-0.017%; +0.017%] None None None
credit_card/is_card_number/37828224631000521389798 execution_time [58.683µs; 58.697µs] or [-0.012%; +0.012%] None None None
credit_card/is_card_number/37828224631000521389798 throughput [17036519.991op/s; 17040653.313op/s] or [-0.012%; +0.012%] None None None
credit_card/is_card_number/x371413321323331 execution_time [6.443µs; 6.444µs] or [-0.008%; +0.008%] None None None
credit_card/is_card_number/x371413321323331 throughput [155190765.669op/s; 155214201.351op/s] or [-0.008%; +0.008%] None None None
credit_card/is_card_number_no_luhn/ execution_time [4.288µs; 4.289µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/ throughput [233175128.671op/s; 233216628.376op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time [69.679µs; 69.789µs] or [-0.079%; +0.079%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput [14329331.719op/s; 14352068.326op/s] or [-0.079%; +0.079%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time [64.470µs; 64.570µs] or [-0.078%; +0.078%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 throughput [15487572.836op/s; 15511624.121op/s] or [-0.078%; +0.078%] None None None
credit_card/is_card_number_no_luhn/37828224631 execution_time [4.288µs; 4.288µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/37828224631 throughput [233188518.245op/s; 233228889.114op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/378282246310005 execution_time [61.916µs; 61.940µs] or [-0.019%; +0.019%] None None None
credit_card/is_card_number_no_luhn/378282246310005 throughput [16144809.989op/s; 16150961.248op/s] or [-0.019%; +0.019%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time [58.681µs; 58.695µs] or [-0.012%; +0.012%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput [17037147.414op/s; 17041277.873op/s] or [-0.012%; +0.012%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 execution_time [6.441µs; 6.444µs] or [-0.027%; +0.027%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 throughput [155178266.287op/s; 155264090.108op/s] or [-0.028%; +0.028%] None None None

Baseline

Omitted due to size.

#[must_use]
#[no_mangle]
/// TODO: @ivoanjo Should this take a `*mut ManagedStringStorage` like Profile APIs do?
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_advance_gen(
Contributor

What does this function do?
It's not clear from the name what "advance gen" means. Does it mean increasing the reference counter/generation number so the string is not collected?

@morrisonlevi (Contributor) commented Nov 25, 2024

I would appreciate some more exposition here too. This basically seems like a re-implementation of reference-counted strings? Do we not use those std lib types for some specific reason? Edit: I mean internally. Obviously Rust "references" and lifetimes cannot be accurately tracked across FFI. Maybe they just provide the requisite API?

@ivoanjo (Member Author)

This is an excellent question. Quoting Alex's original notes:

String storage cleanup in the PoC is based on usage counts.

The initial implementation was not based on usage counts but on last_usage_gen: if we didn't use a string while building the current profile, we can drop it for the next profile. This led to crashes when we implemented the optimization to not report objects with age == 0 though (the interned strings associated with those objects would only be used after a GC but if a profile flush occurred in between the 2 events, the interned strings would become invalidated).

A trivial change to that would have been to only clean up when unused for x generations. But this felt brittle (what if we have a usecase where we'll skip objects with age < 10) and memory wasteful.

So yeah the original intent was to automatically clean up unused strings based on "was this used in the last profile or not".

This (almost) matches really well with heap profiling, because heap profiling is all about repeating a sample on every profile until the object gets collected, so this was a somewhat natural fit.

But, as Alex pointed out, this becomes a bit thornier as we have an optimization on the Ruby heap profiler to not report objects that haven't at least survived a single GC cycle.

Thus, in the current version, this is purely done based on the caller doing all the tracking work, rather than the generational approach.

I'd say the code is a bit... weird right now because it's not very confident about what to do here. So here are my questions for y'all:

  • Does the purely reference-counted mechanism work for you?
  • Would the usage-in-generation one work for you?

I'm thinking we could even easily support both (as a setting when the managed string table gets constructed), but I don't want to get too feature-happy if nobody's interested in it yet.

(I think once we settle this, it does make sense to re-examine if we can clean up the code as much as possible, as Levi pointed out -- I'm happy to have as little custom fancy stuff as possible)
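
To make the purely reference-counted option concrete, the heap profiler drives it roughly like this (a sketch; the helper and variable names are from my Ruby working branch and are illustrative here, and error handling is elided):

// On allocation: intern the strings for each frame of the sample. Every
// intern bumps the usage counter, so the ids stay valid for as long as at
// least one live tracked object references them.
frame->name = intern_or_raise(recorder->string_storage, location->function.name);

// On object collection: drop the references again.
ddog_prof_ManagedStringStorage_unintern(recorder->string_storage, frame->name);

// After each profile flush: strings whose counter reached zero get freed.
ddog_prof_ManagedStringStorage_advance_gen(recorder->string_storage);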

@nsrip-dd

Forgot to comment sooner, but I tried out using this in the Python profiler. Commit here, very WIP because I was trying out other approaches as well at the time. The biggest issue I ran into was around forking. If the program forks while the RwLock in the table is held, the child process will deadlock trying to access the table. I think we'd need to just make a new table on fork. Not the end of the world, but I guess just something to look out for.

@taegyunkim taegyunkim self-requested a review November 25, 2024 16:45
@ivoanjo (Member Author) commented Nov 26, 2024

If the program forks while the RwLock in the table is held, the child process will deadlock trying to access the table. I think we'd need to just make a new table on fork. Not the end of the world, but I guess just something to look out for.

Yeap, this is a good point. I think even without the lock, it's probably not a great idea to do anything other than throw away the table after the fork.

(It would, on the other hand, be cool to have a lock-free implementation for this, but I fear the cost would probably not be great).

Would having a way to clear the table ignoring the lock work as a temporary solution for Python? How do y'all handle this issue for the regular profile data structures? Reset on fork as well?
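
For reference, the "throw the table away" idea could be wired up via pthread_atfork, something like the sketch below. (Whether it's ever safe to drop the old handle in the child is exactly the open question, so this version just leaks it.)

#include <pthread.h>

static ddog_prof_ManagedStringStorage g_string_storage; // illustrative global

static void on_fork_in_child(void) {
  // Don't touch (or drop) the old storage here: its internal RwLock may
  // have been held at fork time, so any access could deadlock. Leak it
  // and start fresh; strings get re-interned as new samples are recorded.
  g_string_storage = ddog_prof_ManagedStringStorage_new();
}

// Once, at profiler startup:
//   pthread_atfork(/* prepare */ NULL, /* parent */ NULL, /* child */ on_fork_in_child);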

@codecov-commenter commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 11.71171% with 392 lines in your changes missing coverage. Please review.

Project coverage is 70.72%. Comparing base (de64524) to head (022e6d8).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #725      +/-   ##
==========================================
- Coverage   71.28%   70.72%   -0.57%     
==========================================
  Files         319      321       +2     
  Lines       46871    47300     +429     
==========================================
+ Hits        33414    33453      +39     
- Misses      13457    13847     +390     
Components Coverage Δ
crashtracker 39.73% <ø> (-0.03%) ⬇️
crashtracker-ffi 5.74% <ø> (ø)
datadog-alloc 98.73% <ø> (ø)
data-pipeline 91.48% <ø> (ø)
data-pipeline-ffi 90.08% <ø> (ø)
ddcommon 80.24% <ø> (ø)
ddcommon-ffi 62.11% <ø> (ø)
ddtelemetry 59.51% <ø> (ø)
ddtelemetry-ffi 22.46% <ø> (ø)
dogstatsd 90.29% <ø> (ø)
dogstatsd-client 79.77% <ø> (ø)
ipc 82.69% <ø> (ø)
profiling 78.96% <11.71%> (-5.34%) ⬇️
profiling-ffi 67.66% <13.48%> (-9.90%) ⬇️
serverless 0.00% <ø> (ø)
sidecar 41.79% <ø> (ø)
sidecar-ffi 10.78% <ø> (ø)
spawn-worker 54.37% <ø> (ø)
tinybytes 93.60% <ø> (ø)
trace-mini-agent 72.48% <ø> (ø)
trace-normalization 98.23% <ø> (ø)
trace-obfuscation 95.96% <ø> (ø)
trace-protobuf 77.67% <ø> (ø)
trace-utils 94.13% <ø> (ø)

@r1viollet (Contributor)

Would everyone agree with merging this?

  • It does not impact other users' workflows.
  • It allows us to experiment with / measure this approach.

@nsrip-dd

Would everyone agree with merging this?

Yeah. I followed up on this with @ivoanjo on slack but forgot to update here. It turns out we don't need this right now for Python so nothing about this needs to change on our account :)

}

#[no_mangle]
/// TODO: @ivoanjo Should this take a `*mut ManagedStringStorage` like Profile APIs do?
@morrisonlevi (Contributor) commented Jan 3, 2025

Ugh, this is something even in C I go back and forth on. It's one of those "do I trust the user or do I be more defensive?" things. Setting the C pointer to null makes it easier to debug when things go wrong, and can sometimes even prevent further things from going wrong. Sometimes it also makes it harder to debug because you don't get a use-after-free warning from ASAN, so that can swing both ways. But it's theoretically wholly wasted work because nobody should use the thing after it's been dropped...

@ivoanjo (Member Author)

Yeah, I think the mut option is nice since it means we return an error on API calls that wrongly use a dropped pointer, and since the client should handle those errors anyway, we turn something that's definitely a bug into a nice error message.

Contributor

I still don't have the "answer" to this one. All I can say is that I made the others re-assign the pointer null on drop when it was feasible, so I clearly thought at the time that it was a good idea.


#[must_use]
#[no_mangle]
/// TODO: Consider having a variant of intern (and unintern?) that takes an array as input, instead
Contributor

Do you have a use case for this in your PoC for Ruby? If so, I'd do it, and if not, I'd pass.

@ivoanjo (Member Author) commented Jan 6, 2025

I do! We're literally interning in a loop whenever we need to consume a whole stack. (Note: intern_or_raise here is just a nice helper to call ddog_prof_ManagedStringStorage_intern and check if there's an error in the result.)

heap_stack* heap_stack_new(heap_recorder *recorder, ddog_prof_Slice_Location locations) {
  uint16_t frames_len = locations.len;
  // ...some error checking...
  heap_stack *stack = ruby_xcalloc(1, sizeof(heap_stack) + frames_len * sizeof(heap_frame));
  stack->frames_len = frames_len;
  for (uint16_t i = 0; i < stack->frames_len; i++) {
    const ddog_prof_Location *location = &locations.ptr[i];
    stack->frames[i] = (heap_frame) {
      .name = intern_or_raise(recorder->string_storage, location->function.name),
      .filename = intern_or_raise(recorder->string_storage, location->function.filename),
      // location->line is an int64_t. We don't expect to have to profile files with more than
      // 2M lines, so this cast should be fairly safe?
      .line = (int32_t) location->line,
    };
  }
  return stack;
}

(from my working branch, which is based off of DataDog/dd-trace-rb#3628).

My thinking is that, unlike most other libdatadog APIs, where either a) calls are very small but we don't make them very often (e.g. setup and reporting), or b) we do a big chunk of work on every call (profile add), this API does c) very little work per call AND gets called many times.

Thus, it seems like a prime candidate for turning (c) into (b): a more coarse-grained call that amortizes the overhead of the FFI crossing and locking.
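
Concretely, I'm imagining a declaration along these lines (entirely hypothetical -- not something this PR adds, and the slice/error types are illustrative):

// Hypothetical batched variant, NOT an existing libdatadog API: one FFI
// crossing and one lock acquisition to intern all of a stack's strings.
ddog_MaybeError ddog_prof_ManagedStringStorage_intern_all(
  ddog_prof_ManagedStringStorage storage,
  ddog_prof_Slice_CharSlice strings,     // e.g. every frame name for one sample
  ddog_prof_ManagedStringId *out_ids);   // caller-allocated, strings.len entries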

Comment on lines 17 to 18
cached_seq_num_for: Cell<Option<*const StringTable>>,
cached_seq_num: Cell<Option<StringId>>,
Contributor

We are storing the cache on each string, but aren't these all added in a batch to the same string table? I think you said that you add all these managed strings to the Profile's string table just before serialization, right? Couldn't we perform a larger batch operation and store the cache there? That way the memory is only used on serialization rather than kept around but largely not being used.

@ivoanjo (Member Author)

As I was looking at this, I did a small tweak to store these as a tuple, rather than as separate entries, since they're related anyway -- 1f2b953.

Your suggestion is interesting, but I'm curious how far were you thinking about the "large batch operation". In particular, were you thinking of moving the cache entirely away from the string table, to the profile? Or even to the caller of the profile?

@morrisonlevi (Contributor) left a comment

Not quite finished but publishing what I have.

@morrisonlevi (Contributor) left a comment

Is this still considered "experimental"? What's the rough plan to either revert it or make it non-experimental? Asking mostly because it does intrude, albeit only a little, onto the FFI structs of the "main" thing.

Approved in general, I've blocked this long enough I think ^_^

@ivoanjo (Member Author) commented Jan 17, 2025

Is this still considered "experimental"? What's the rough plan to either revert it or make it non-experimental?

Good question. I guess my current thinking about calling it "experimental" is because it's a feature we may end up iterating a lot on, including breaking APIs and whatnot, so if anyone's building on top of this, I was trying to get that across.

Since the early benchmarks we had for Ruby showed good results for this I wouldn't expect this to be entirely reverted; at most maybe we could throw it in a Ruby-specific folder if it turns out this use-case is too specific and nobody else needs it.

Asking mostly because it it does intrude, albeit only a little, onto the FFI structs of the "main" thing.

My thinking is that it's still harmless, even if ffi callers don't properly zero out those fields, for two reasons:

  1. The code that picks whether to use ids at the ffi level goes "is there a regular string? yes, then use that and don't look at the id"
  2. If somehow we missed a spot and somehow the garbage ids get accidentally used instead of strings, then, because there's no string table installed (e.g. because only Ruby will have one), the FFI will report a clear "there's no string table" error, and we can use that to spot the issue.

To be honest, I'm somewhat more annoyed about the duplication in the profile implementation. But there, because I think the right solution is to have separate structures (thus enforcing that you have either ids OR strings), the only way of avoiding the duplication would be to introduce a bunch of macros to "hide" the fact that we'd have two versions of the code with only very minor differences. (At least given the current API...)

What's the rough plan [...]

I'd say:

  1. Get this out for use in a stable libdatadog version
  2. If we spot any issues with other libraries caused by these APIs, change them
  3. Consider it no longer experimental once we've validated it and put out the first Ruby profiler version that's using this for heap profiling (although still very open for feedback/changes, if anyone also wants to use it)

Approved in general, I've blocked this long enough I think ^_^

cc @danielsn any concerns with me going ahead and merging the PR?

/// TODO: Consider having a variant of intern (and unintern?) that takes an array as input, instead
/// of just a single string at a time.
/// TODO: @ivoanjo Should this take a `*mut ManagedStringStorage` like Profile APIs do?
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_intern(
Contributor

Should these have `# Safety` comments?

@ivoanjo (Member Author)

I'm not sure they'd be especially helpful here.

In particular, the only comment I can think of is "the charslice needs to be valid or empty, and the managed string storage needs to be valid or null", which... IDK... seems to describe every function in our ffi that takes a pointer? 👀

This will later allow us to introduce the new StringId code without
disturbing existing API clients.

This duplication is intended to be only an intermediate step to
introducing support for StringId.
Credit goes to @AlexJF, this is lifted from his earlier PR #607.
…called with id 0

This is much nicer than having a weird panic in there.
…safer

With the current structure of the code, the `expect` inside `resolve`
should never fail; hopefully we don't introduce a bug in the future
that changes this.

(I know that ideally in Rust we would represent such constraints
in the type system, but I don't think I could do so without a lot of
other changes, so I decided to go for the more self-contained
solution for now.)
In particular, in the unlikely event that we would overflow the id,
we signal an error back to the caller rather than impacting the
application.

The caller is expected to stop using this string table and create
a new one instead. In the future we may make it easy to do so, by
e.g. having an API to create a new table from the existing strings
or something like that.
This will enable us to propagate failures when a ManagedStringId is not
known, which will improve debuggability and code quality by allowing us
to signal the error.
This string is supposed to live for as long as the managed string
storage does.

Treating it specially in intern matches what we do in other functions
and ensures that we can never overflow the reference count (or
something weird like that).
Adding it as `pub` was an oversight, since the intent is for this to
be an inner helper that's only used by `intern` and by `new`.

Having this as `pub` is quite dangerous as this method can easily
be used to break a lot of the assumptions for the string storage.
There's currently nothing that can fail in this conversion, so let's
take advantage of this in the code.

(The `TryFrom` was somewhat of a leftover from copy/pasting these
conversion functions from the variants that did need to deal with
Strings, but in the id variants we can simplify).
This should be safer than the existing helper, since according to Levi
it doesn't rely on the string being null-terminated (and I'm guessing
doesn't need to measure it either).
I spotted during code review that this was incorrect -- `unintern` is
a mutable operation on the managed string table (it decreases the
refcount of items) so the write lock must be used for it.
Not sure if this was there from the beginning or the result of
successive refactors, but yeah, we were unpacking and repacking
the `ManagedStringId`s uselessly (the input and output types were
the same!).

I'm pretty sure the compiler would optimize all of this away -- in
the end this is a struct with an int -- but our code sure does look
better.
@ivoanjo (Member Author) commented Jan 20, 2025

I think the commit history for this PR is worth preserving, so I'm preparing to merge this to master with a regular merge commit (not a squash). As part of it, I'll force-push a rebase so that there are no more "merge from master" commits.

@ivoanjo ivoanjo force-pushed the ivoanjo/prof-9476-managed-string-storage-try3-clean branch from 8de5289 to 022e6d8 Compare January 20, 2025 11:46
@ivoanjo ivoanjo enabled auto-merge (rebase) January 20, 2025 11:48
@ivoanjo ivoanjo merged commit 723277d into main Jan 20, 2025
31 checks passed
@ivoanjo ivoanjo deleted the ivoanjo/prof-9476-managed-string-storage-try3-clean branch January 20, 2025 12:19