Improve memory usage when packing a model #161

Open
VivekPanyam opened this issue Sep 29, 2023 · 0 comments
Labels
good first issue Good for newcomers

The zip library we previously used during packing required each file to be fully available before it could be stored, so we had to load large (possibly multi-GB) files into memory.

This is no longer required. The following two places in the packing code can be refactored to read each file, compute its sha256, and store it in a streaming/incremental fashion:

// Load the data and compute the sha256
let mut hasher = Sha256::new();
let data = tokio::fs::read(entry.path()).await.unwrap();
hasher.update(&data);
let sha256 = format!("{:x}", hasher.finalize());
manifest_contents.insert(relative_path.clone(), Some(sha256));
// Add the entry to the zip file
writer = tokio::task::spawn_blocking(move || {
    writer
        .start_file(
            relative_path,
            zip::write::FileOptions::default()
                .compression_method(zip::CompressionMethod::Zstd),
        )
        .unwrap();
    writer.write_all(&data).unwrap();
    writer
})
.await
.unwrap();

// Load the data and compute the sha256
let mut hasher = Sha256::new();
let data = tokio::fs::read(entry.path()).await.unwrap();
log::trace!("Done reading file {}", &relative_path);
let (data, sha256) = tokio::task::spawn_blocking(move || {
    hasher.update(&data);
    (data, format!("{:x}", hasher.finalize()))
})
.await
.unwrap();
log::trace!("Computed sha256 of {}", &relative_path);
// Only store the file in the zip if (1) we don't have any linked files or (2) the linked files don't include this sha256
if linked_files
    .as_ref()
    .map_or(true, |v| !v.urls.contains_key(&sha256))
{
    // Add the entry to the zip file
    let relative_path = relative_path.clone();
    writer = tokio::task::spawn_blocking(move || {
        writer
            .start_file(
                relative_path,
                zip::write::FileOptions::default()
                    .compression_method(zip::CompressionMethod::Zstd)
                    .large_file(data.len() >= 4 * 1024 * 1024 * 1024),
            )
            .unwrap();
        writer.write_all(&data).unwrap();
        writer
    })
    .await
    .unwrap();
}
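
As a rough starting point for whoever picks this up, here is a minimal sketch of the chunked read/hash/store loop using only the standard library. The names `stream_copy` and `on_chunk` and the 64 KiB buffer size are hypothetical and not part of the existing packing code. The assumption is that `on_chunk` would call `hasher.update(chunk)` and that `writer` would be the zip writer after `start_file`, since both accept data incrementally.

```rust
use std::io::{self, Read, Write};

/// Hypothetical helper: copies `reader` into `writer` in fixed-size chunks and
/// hands every chunk to `on_chunk` (e.g. to feed a hasher) before writing it.
/// Returns the total number of bytes copied.
pub fn stream_copy<R, W, F>(mut reader: R, mut writer: W, mut on_chunk: F) -> io::Result<u64>
where
    R: Read,
    W: Write,
    F: FnMut(&[u8]),
{
    // Only this buffer is held in memory, regardless of the file size.
    // 64 KiB is an arbitrary choice for the sketch.
    let mut buf = vec![0u8; 64 * 1024];
    let mut total: u64 = 0;

    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        on_chunk(&buf[..n]); // incremental hashing would go here
        writer.write_all(&buf[..n])?; // incremental store
        total += n as u64;
    }

    Ok(total)
}
```

Two things would likely need care when applying this. In the second snippet, the decision to store a file depends on its sha256, so a streaming version would probably need to hash the file in one pass and copy it in a second pass (passing `std::io::sink()` as the writer for the hash-only pass is one option). Also, `data.len()` is no longer available for the `large_file` flag, so the size would have to come from file metadata instead.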

VivekPanyam added the good first issue label on Sep 29, 2023