-
-
Notifications
You must be signed in to change notification settings - Fork 400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add string builder to build JsString
#3915
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3915 +/- ##
==========================================
+ Coverage 47.24% 52.88% +5.64%
==========================================
Files 476 483 +7
Lines 46892 46934 +42
==========================================
+ Hits 22154 24821 +2667
+ Misses 24738 22113 -2625 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Check my comment if they make sense :)
@HalidOdat, one more thing is that I used fn reserve(&mut self, new_layout: Layout) {
let new_ptr = if self.is_dangling() {
let new_ptr = unsafe { alloc(new_layout) };
let Some(new_ptr) = NonNull::new(new_ptr.cast::<RawJsString>()) else {
std::alloc::handle_alloc_error(new_layout)
};
new_ptr
} else {
use std::alloc::realloc;
let old_ptr = self.inner.as_ptr();
let ptr = unsafe { realloc(ptr, new_layout, new_layout.size()) };
let Some(new_ptr) = NonNull::new(new_ptr.cast::<RawJsString>()) else {
std::alloc::handle_alloc_error(new_layout)
};
new_ptr
};
self.inner = new_ptr;
let new_arr_size = new_layout.size() - DATA_OFFSET;
self.cap = new_arr_size / Self::DATA_SIZE;
} |
let ptr = unsafe { realloc(ptr, new_layout, new_layout.size()) }; This may be unsafe because in the realloc's safety section it has the requirement that the layout passed to it is See: https://doc.rust-lang.org/stable/core/alloc/trait.GlobalAlloc.html#method.realloc |
I benchmarked with the current implementation (alloc) and with realloc and found that it's considerably faster (14x), here is my benchmark code if you want to reproduce it. fn main() {
let mut sb = JsStringBuilder::<u8>::new();
for _ in 0..10000 {
sb.push(b'a');
sb.extend_from_slice(b"bcde".as_slice());
sb.extend([b'f', b'g'].into_iter());
}
let s = sb.build();
drop(s);
} hyperfine -N --warmup=10 ./sb-alloc ./sb-realloc
NOTE: Changed |
@HalidOdat Benchmark 1: ./jsstringbuilder-alloc.exe
Time (mean ± σ): 3.2 ms ± 0.2 ms [User: 0.7 ms, System: 1.9 ms]
Range (min … max): 3.0 ms … 6.0 ms 952 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: ./jsstringbuilder-realloc.exe
Time (mean ± σ): 3.0 ms ± 0.2 ms [User: 1.0 ms, System: 1.6 ms]
Range (min … max): 2.8 ms … 6.1 ms 1028 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
./jsstringbuilder-realloc.exe ran
1.07 ± 0.08 times faster than ./jsstringbuilder-alloc.exe |
I use Linux (EndeavourOS).
Yes, there is no drawback to using realloc and even on windows (from the results you posted) there is still a small |
Ok |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the suggested adjustments, this looks great! :)
JsStringBuilder
to build JsString
JsString
#[doc(hidden)] | ||
mod private { | ||
pub trait Sealed {} | ||
|
||
impl Sealed for u8 {} | ||
impl Sealed for u16 {} | ||
} | ||
|
||
/// Inner elements represented for `JsStringBuilder`. | ||
pub trait JsStringData: private::Sealed {} | ||
|
||
impl JsStringData for u8 {} | ||
impl JsStringData for u16 {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing a second pass I realized that maybe this abstraction is not needed? More so because pushing u16
is always valid, but pushing u8
needs to do latin1 checks, and having separate builders for each type would allow us to reflect that in the API.
Since we now use an enum for the generic StringBuilder
that doesn't need the JsStringData
trait, maybe we can just have different types for each builder, then just extract the common paths into utility functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing a second pass I realized that maybe this abstraction is not needed?
JsStringData
could be removed since I only export specific builders.
maybe we can just have different types for each builder, then just extract the common paths into utility functions.
Have we already done this by using generic?
More so because pushing u16 is always valid, but pushing u8 needs to do latin1 checks, and having separate builders for each type would allow us to reflect that in the API.
This could be done with implementations for generics seperately like this:
impl<T> JsStringBuilder<T> {
fn _build(mut self, latin1_check: Option<bool>) -> JsString { ... }
}
impl JsStringBuilder<u8> {
// ---check inside---
pub fn build(self) -> JsString {
self._build(None)
}
// ---check outside---
pub unsafe fn build_as_utf16(self) -> JsString {
self._build(Some(false))
}
pub unsafe fn build_as_latin1(self) -> JsString {
self._build(Some(true))
}
}
impl JsStringBuilder<u16> {
// no need to check
pub fn build(self) -> JsString {
self._build(Some(false))
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that could potentially work, maybe calling it something like JsStringTypedBuilder
to distinguish it from the plain JsStringBuilder
that can accept str, utf16 and latin1
There are many places where
Vec
is used as a builder to createJsString
.This adds
JsStringBuilder
as a more natural choice.