-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(profiling)!: add arena-based string table #306
Conversation
9a650e3
to
eb600be
Compare
In PHP, I got a 3-4% memory savings, ~2.22MiB, on the containers I tested. I'm happy with this number for PHP and look forward to hearing numbers from any other runtime which can test it. |
This is a breaking change: 1. Many routines which weren't fallable now are. 2. The minimum string arena capacity must be passed to the profile.
600678b
to
f613721
Compare
6353b90
to
751cc07
Compare
profiling-ffi/src/profiles.rs
Outdated
/// and must have the correct number of elements for the slice. | ||
#[no_mangle] | ||
#[must_use] | ||
pub unsafe extern "C" fn ddog_prof_Profile_new( | ||
sample_types: Slice<ValueType>, | ||
period: Option<&Period>, | ||
start_time: Option<&Timespec>, | ||
string_arena_min_capacity: libc::size_t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to expose this to the client libraries ? I feel like that should be internal.
To be honest, when I create a profile object I do not know the minimal.
I feel like it's worth hiding it (since it's a technical detail)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's exposed because every runtime has different ideas about how much memory usage is okay. You making this comment made me realize that "min_capacity" is not the best name: it rounds up to a page size, and if you exceed that it will fail. So this number should be close to the maximum you want. It's lazily allocated, so you don't need to worry too much about the memory unless you actually use it, as long as your request is sane (you know, like 1 GiB, not 1 TiB).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's exposed because every runtime has different ideas about how much memory usage is okay.
Hum I see, but for .NET I do not know what would be a good value.
It will push libraries to use a big value (since it's lazy) and forget about it. Going that way, having it internal still makes sense.
If you still want to let the libraries to set this value, you could make it optional no?
ProfileNewResult::Ok(ffi_profile) | ||
match internal::Profile::new(start_time, &types, period, string_arena_min_capacity) { | ||
Ok(internal_profile) => ProfileNewResult::Ok(Profile::new(internal_profile)), | ||
Err(err) => ProfileNewResult::Err(Error::from(err.context("ddog_prof_Profile_new failed"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a test that actually fails no?
@@ -468,8 +469,13 @@ pub unsafe extern "C" fn ddog_prof_Profile_set_endpoint( | |||
} | |||
}; | |||
let endpoint = endpoint.to_utf8_lossy(); | |||
profile.add_endpoint(local_root_span_id, endpoint); | |||
ProfileResult::Ok(true) | |||
if let Err(err) = profile.add_endpoint(local_root_span_id, endpoint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment about testing
This helps keep the APIs for various platforms in sync. It also helps with testing, to ensure the code that handles virtual_alloc failures actually works.
This is a more accurate name that suggests the exact value will not necessarily be used.
@@ -884,6 +893,7 @@ mod test { | |||
Slice::from_raw_parts(sample_type, 1), | |||
None, | |||
None, | |||
4096, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any tests that vary this value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not specifically in FFI, no. There are tests on arena and string table for small values like 64
, and there is a test on string table which does 4 * u16::MAX
or in other words, a number just under 256 KiB.
But not currently testing other values in FFI.
@@ -151,7 +151,8 @@ fn main() -> anyhow::Result<()> { | |||
replayer.start_time, | |||
&replayer.sample_types, | |||
replayer.period, | |||
); | |||
u32::MAX as usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to provide some value. A max of 4 GiB seemed okay for request replayer.
let layout = layout.pad_to_align(); | ||
let size = layout.size(); | ||
if size == 0 { | ||
return Ok(NonNull::from(&mut [])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not NonNull::dangling()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's a wide pointer NonNull<[u8]>
, so we also need size info.
error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
--> profiling/src/alloc/arena.rs:70:23
|
70 | return Ok(NonNull::dangling());
| ^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `Sized` is not implemented for `[u8]`
note: required by a bound in `NonNull::<T>::dangling`
--> /Users/levi.morrison/.rustup/toolchains/1.69.0-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:67:6
|
67 | impl<T: Sized> NonNull<T> {
| ^ required by this bound in `NonNull::<T>::dangling`
// but in practice, it won't fail because it's rounded to a page size, | ||
// and I've never seen pages that small, even for 16 bit. However, in | ||
// any case, it should not panic, which is the point of the call. | ||
_ = std::hint::black_box(arena.allocate_zeroed(Layout::new::<u8>())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL that we don't need a let
before _
|
||
#[test] | ||
fn test_arena_simple_alignment() -> anyhow::Result<()> { | ||
const DISTANCE: usize = 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason that DISTANCE
has the opposite meaning in this vs the previous test?
profiling/src/alloc/arena.rs
Outdated
let third = arena.allocate_zeroed(pointer)?; | ||
|
||
expect_distance(first, second, mem::size_of::<*const ()>()); | ||
expect_distance(second, third, mem::size_of::<*const ()>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also directly test that the pointers are aligned
pub r#type: StringId, | ||
pub unit: StringId, | ||
pub(crate) struct ValueType { | ||
pub r#type: (Option<LengthPrefixedStr>, StringId), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need both of these?
use std::sync::Once; | ||
|
||
#[derive(Copy, Clone, Debug, Eq, PartialEq)] | ||
pub struct AllocError; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want opaque or discriminated errors on this API
static INIT: Once = Once::new(); | ||
static mut PAGE_SIZE: usize = 0; | ||
|
||
unsafe { | ||
INIT.call_once(|| { | ||
PAGE_SIZE = os::page_size().unwrap(); | ||
}); | ||
PAGE_SIZE | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be a lazy_static
?
profiling/src/alloc/virtual.rs
Outdated
/// The fatptr must have been previously allocated by [virtual_alloc] by | ||
/// this allocator, and must have the same address and length as it was | ||
/// returned with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And must not have been previously freed
/// The language infers !Send because the value types have length-prefixed | ||
/// strings inside of them, and these are not Send by themselves. However, if | ||
/// the whole profile is moved, including the string table, then this is safe. | ||
unsafe impl Send for Profile {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have code that actually takes advantage of this?
unit: profile.intern(vt.unit), | ||
.map(|vt| -> anyhow::Result<ValueType> { | ||
Ok(ValueType { | ||
r#type: profile.strings.insert_full(vt.r#type)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is going on here
let arena = self.strings.arena(); | ||
api::Period { | ||
r#type: api::ValueType { | ||
r#type: arena.fetch(t.1.r#type.0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment explaining what's going on here
let mut iter = self.strings.into_iter(); | ||
while let Some(item) = iter.next() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we use the for loop? Does into_iter
not actually give an iter here?
profiling/src/internal/profile.rs
Outdated
#[inline] | ||
fn intern<S>(&mut self, item: &S) -> anyhow::Result<StringId> | ||
where | ||
S: ?Sized + Borrow<str>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
|
||
let period = ( | ||
10_000_000, | ||
ValueType { | ||
r#type: profile.intern("wall-time"), | ||
unit: profile.intern("nanoseconds"), | ||
r#type: profile.strings.insert_full("wall-time")?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because of the weird type we have for valuetypes
What happens if HUGE_PAGES are enabled? |
@@ -133,7 +143,8 @@ impl Profile { | |||
start_time: SystemTime, | |||
sample_types: &[api::ValueType], | |||
period: Option<api::Period>, | |||
) -> Self { | |||
string_arena_capacity_hint: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and in the ffi, it may be worth specifying the unit here is in bytes maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a hint, or an actual max ?
Do we need to put a big value and assume that only when we start using them will there be a RSS usage ?
There seems to be some concerns around providing accurate hints here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It gets rounded to a page size. There's mostly no consequence for setting this high, because it's virtual memory so you pay for what you actually use. There are some edges though, like on Linux, OOM kill will be more attracted to your process if you have high virtual memory use. And .NET supports 32-bit processes which only has 2 GiB of virtual memory available per process.
I met with Christophe and Gregory and they are concerned about hitting a max, any max, because that will screw up the accuracy of the upscaling, and they would consider this inaccuracy intolerable in .NET. I knew that the data would incomplete, and that may have been okay. But it will also be inaccurate, and that's not.
Making a more complicated and advanced allocator is a possibility, but then the risk surface goes up too. I'm currently exploring an arena which has fixed-sized chunks (maybe 1-2 MiB in size), and it allocates new chunks as needed. It uses the system allocator to hold these in a Vec
, which isn't ideal, but that's far less memory than is represented by the chunks themselves, so it would still greatly reduce the effects of commingling with the system allocator. I could avoid this by using an intrusive linked-list, but this slows performance and makes it harder to provide a global offset for it. Anyway, this is the direction I'm looking at right now, while also searching for crates that may do something we want when I think of a new search term to maybe find new ones.
Closing in favor of #404. |
What does this PR do?
Creates a string table that stores the bytes of the strings into an arena, instead of using individual allocations. The arena has a maximum capacity, and uses virtual memory to acquire pages lazily.
This is a breaking change:
ddog_prof_Profile_new
.Motivation
I'm working on reducing overhead of the PHP profiler, and reducing memory of the string table is a nice step. It also puts less pressure onto the system allocator, which should help others like the Ruby profiler as well.
Additional Notes
The PR is quite large. Maybe I should split it into two pieces, but it's kind of hard to know where to split it.
PROF-9125
How to test the change?
The function
ddog_prof_Profile_new
has a new parameter for the minimum size of the arena, which will round up to the nearest page size for you. The memory is virtually allocated, so it will be paged in by the OS as needed. You don't need to worry too much about its size, as long as you are being reasonable (e.g. 2 GiB may be fine, 2 TiB is not).For Reviewers
@DataDog/security-design-and-guidance
.