Add BlobSchema V2 with ZeroTrie #4207

Merged
sffc merged 10 commits into unicode-org:main from the blobv2 branch
Nov 2, 2023

@sffc sffc commented Oct 22, 2023

Part of #3865

Relates to #2699

I still need to investigate the test failures, but this should be mostly ready.

@sffc sffc mentioned this pull request Oct 22, 2023

@robertbastian robertbastian left a comment



sffc commented Oct 24, 2023

I added this PR to the 1.4 milestone for tracking the new feature, left a comment on #2699 about investigating ZeroHashMap, and moved the issue to backlog.


sffc commented Oct 24, 2023

Also filed #4216 for further discussion.

sffc added a commit that referenced this pull request Oct 25, 2023
Related: #4216, #4207, #2699

See the numbers in locale_aux_test.rs
@sffc sffc marked this pull request as ready for review November 2, 2023 19:50
@sffc sffc requested review from Manishearth and a team as code owners November 2, 2023 19:50

sffc commented Nov 2, 2023

Bench results:

provider/construct/v1   time:   [58.462 ns 58.652 ns 58.854 ns]
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  3 (3.00%) high severe

provider/construct/v2   time:   [58.448 ns 58.615 ns 58.781 ns]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

provider/read/v1        time:   [2.1391 µs 2.1436 µs 2.1498 µs]
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high severe

provider/read/v2        time:   [1.9701 µs 1.9753 µs 1.9838 µs]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

provider/blob/Cargo.toml (resolved)
Comment on lines 153 to 155
let mut version = 1;

{
let mut exporter = BlobExporter::new_with_sink(Box::new(&mut blob));
while version <= 2 {
Member

Suggested change
let mut version = 1;
{
let mut exporter = BlobExporter::new_with_sink(Box::new(&mut blob));
while version <= 2 {
for version in [1, 2] {

Member Author

done
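
For readers following the suggestion above, here is a minimal sketch of the two loop shapes side by side. `export_for_version` is a hypothetical stand-in for the per-version work in the bench, not a function from this PR; the point is only that iterating over `[1, 2]` removes the mutable counter and the manual increment.

```rust
/// Hypothetical stand-in for the per-version work done in the bench.
fn export_for_version(version: u8) -> String {
    format!("blob-v{}", version)
}

/// Original shape: a mutable counter driven by `while`.
fn with_while() -> Vec<String> {
    let mut out = Vec::new();
    let mut version = 1;
    while version <= 2 {
        out.push(export_for_version(version));
        version += 1; // easy to forget, which would loop forever
    }
    out
}

/// Suggested shape: iterate directly over the versions.
fn with_for() -> Vec<String> {
    let mut out = Vec::new();
    for version in [1, 2] {
        out.push(export_for_version(version));
    }
    out
}

fn main() {
    // Both shapes visit the same versions in the same order.
    assert_eq!(with_while(), with_for());
}
```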

exporter.close().unwrap();
let mut version = 1;

while version <= 2 {
Member

same

Member Author

done

b.iter(|| {
for locale in black_box(&locales).iter() {
let _: DataResponse<HelloWorldV1Marker> = black_box(&provider)
.as_deserializing()
Member

nit: I'd prefer benching this as a BufferProvider without deserialization

Member Author

done

provider/blob/src/blob_data_provider.rs (resolved)
.locales
.get(idx0)
.ok_or_else(|| DataError::custom("Invalid blob bytes").with_req(key, req))?;
// TODO: Add a lookup function to zerotrie so we don't need to stringify
Member

issue?

Member Author

.ok_or_else(|| DataError::custom("Invalid blob bytes").with_req(key, req))?;
// TODO: Add a lookup function to zerotrie so we don't need to stringify
let locale_str = req.locale.write_to_string();
let idx1 = ZeroTrieSimpleAscii::from_store(zerotrie)
Member

blob_index?

Member Author

Done
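
On the TODO in the snippet above about not needing to stringify: one possible shape for such a lookup is a cursor that implements `core::fmt::Write`, so anything that can be formatted is walked through the trie byte by byte without building an intermediate `String`. The sketch below uses a toy trie; `ToyTrie`, `Cursor`, and `lookup` are hypothetical names for illustration and are not the zerotrie API.

```rust
use std::collections::BTreeMap;
use std::fmt::{self, Write};

/// Toy byte trie; NOT the zerotrie data structure, only an illustration.
#[derive(Default)]
struct ToyTrie {
    children: BTreeMap<u8, ToyTrie>,
    value: Option<usize>,
}

impl ToyTrie {
    fn insert(&mut self, key: &str, value: usize) {
        let mut node = self;
        for b in key.bytes() {
            node = node.children.entry(b).or_default();
        }
        node.value = Some(value);
    }
}

/// Cursor that walks the trie as bytes are written into it.
struct Cursor<'a> {
    node: Option<&'a ToyTrie>,
}

impl Write for Cursor<'_> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for b in s.bytes() {
            // Once a byte is missing, the cursor stays in the "no match" state.
            self.node = self.node.and_then(|n| n.children.get(&b));
        }
        Ok(())
    }
}

/// Looks up anything that implements `Display` without allocating a `String`.
fn lookup(trie: &ToyTrie, key: &impl fmt::Display) -> Option<usize> {
    let mut cursor = Cursor { node: Some(trie) };
    write!(cursor, "{}", key).ok()?;
    cursor.node.and_then(|n| n.value)
}

fn main() {
    let mut trie = ToyTrie::default();
    trie.insert("en-US", 0);
    trie.insert("fr", 1);
    // The key is streamed into the cursor piece by piece.
    println!("{:?}", lookup(&trie, &format_args!("{}-{}", "en", "US")));
}
```

A real implementation would walk the serialized trie bytes rather than a `BTreeMap`, but the calling pattern would look similar.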

run_driver(exporter).unwrap();
assert_eq!(BLOB_V1, blob.as_slice());

let blob_privider = BlobDataProvider::try_new_from_blob(blob.into_boxed_slice()).unwrap();
Member

typo

Member Author

fixed

provider/blob/src/export/blob_exporter.rs (outdated, resolved)
  pub fn new_with_sink(sink: Box<dyn std::io::Write + Sync + 'w>) -> Self {
  Self {
- resources: Mutex::new(Vec::new()),
+ resources: Mutex::new(BTreeMap::new()),
Member

nit: change to Default::default() where possible

Member Author

done
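
A minimal sketch of what the nit above amounts to, using only the standard library. `ExporterState` and its field type are assumptions made for illustration; the real exporter struct has other fields and a different value type. Because both `Mutex<T>` and `BTreeMap<K, V>` implement `Default`, the field initializer no longer has to name the container.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the exporter's state.
struct ExporterState {
    resources: Mutex<BTreeMap<String, Vec<u8>>>,
}

/// Explicit constructors, as in the line under review.
fn explicit() -> ExporterState {
    ExporterState {
        resources: Mutex::new(BTreeMap::new()),
    }
}

/// Same value via `Default`; the type is inferred from the field.
fn via_default() -> ExporterState {
    ExporterState {
        resources: Default::default(),
    }
}

fn main() {
    let a = explicit();
    let b = via_default();
    // Both start out with an empty map behind the mutex.
    assert!(a.resources.lock().unwrap().is_empty());
    assert!(b.resources.lock().unwrap().is_empty());
}
```

One practical upside is that changing the container type later (as this PR does, from `Vec` to `BTreeMap`) would not require touching the initializer.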


sffc commented Nov 2, 2023

New bench results after switching to testing only load_buffer:

provider/construct/v1   time:   [58.328 ns 58.559 ns 58.827 ns]
                        change: [-0.2722% +0.7391% +2.1320%] (p = 0.28 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  6 (6.00%) high severe

provider/construct/v2   time:   [58.070 ns 58.497 ns 59.049 ns]
                        change: [-0.6234% -0.1470% +0.4131%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

provider/read/v1        time:   [1.4992 µs 1.5020 µs 1.5051 µs]
                        change: [-30.699% -30.154% -29.699%] (p = 0.00 < 0.05)
                        Performance has improved.

provider/read/v2        time:   [1.1215 µs 1.1232 µs 1.1246 µs]
                        change: [-43.400% -43.242% -43.088%] (p = 0.00 < 0.05)
                        Performance has improved.
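
As a rough reading of the `load_buffer` numbers above, the relative difference between the two schema versions can be worked out from the middle value of each `time:` triple (Criterion's point estimate). The helper name `percent_faster` is made up for this sketch.

```rust
/// Percentage by which `new` is faster than `old` (both in the same unit).
fn percent_faster(old: f64, new: f64) -> f64 {
    (old - new) / old * 100.0
}

fn main() {
    // Point estimates in µs, copied from the results above.
    let read_v1 = 1.5020;
    let read_v2 = 1.1232;
    // (1.5020 - 1.1232) / 1.5020 is about 0.252
    println!(
        "v2 read is {:.1}% faster than v1",
        percent_faster(read_v1, read_v2)
    );
    // prints "v2 read is 25.2% faster than v1"
}
```

The construct benchmarks are within noise of each other, so the difference is in the read path.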

sffc and others added 2 commits November 2, 2023 13:51
Co-authored-by: Robert Bastian <4706271+robertbastian@users.noreply.github.com>
robertbastian previously approved these changes Nov 2, 2023
provider/blob/src/export/blob_exporter.rs (outdated, resolved)
provider/blob/src/export/blob_exporter.rs (outdated, resolved)
Co-authored-by: Robert Bastian <4706271+robertbastian@users.noreply.github.com>
@sffc sffc merged commit f223e11 into unicode-org:main Nov 2, 2023
28 checks passed
@sffc sffc deleted the blobv2 branch November 2, 2023 22:33
3 participants