Use Zstd instead of gzip for compression #552

Closed
jrmuizel opened this issue Oct 22, 2019 · 10 comments
@jrmuizel
Contributor

We're currently spending an appreciable amount of CPU time doing zlib compression:

  32.67%  sccache-dist     sccache-dist        [.] miniz_oxide::deflate::core::compress_inner
  11.67%  sccache-dist     sccache-dist        [.] miniz_oxide::deflate::core::compress_block

Using Zstd should give better compression and better performance.
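For a rough sense of the speed/ratio trade-off being discussed, here is a minimal sketch in Python (used for illustration because its stdlib ships `zlib`; a real zstd comparison would need a third-party binding, so only zlib levels are compared, and the payload is made up):

```python
import time
import zlib

# Hypothetical payload standing in for a compressible cache entry.
payload = b'fn main() { println!("hello"); }\n' * 4000

for level in (1, 6, 9):  # 1 = fastest, 9 = best ratio
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    ratio = len(compressed) / len(payload)
    print(f"level {level}: ratio {ratio:.3f}, {elapsed * 1000:.2f} ms")
```

The same knob exists in zstd with a much wider range (and much better speed at comparable ratios), which is the motivation for the switch.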

@jrmuizel
Contributor Author

FWIW, icecream uses lzo which is also a lot faster.

@froydnj
Contributor

froydnj commented Oct 22, 2019

Isn't that just because miniz_oxide is written in pure Rust without the massive optimization work that's gone into zlib?

@jrmuizel
Contributor Author

I'm not sure how the performance of miniz_oxide compares to zlib, but even zlib is appreciably slower than Zstd.

@luser
Contributor

luser commented Oct 22, 2019

Currently cache entries are just zip files. It wouldn't be hard to change that; all the code for creating and unpacking cache entries is in this file:
https://github.com/mozilla/sccache/blob/d3093392af6f593361c229deb36dd14c964c0778/src/cache/cache.rs

You'd need a container format of some sort since there can be multiple output files + stdout/stderr, but it should be straightforward to use tar files.
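A minimal sketch of such a container, in Python for illustration since its stdlib has `tarfile` built in (the entry names and contents are hypothetical; a zstd-compressed variant would wrap the same archive in a zstd encoder instead of gzip):

```python
import io
import tarfile

# Hypothetical compile outputs plus captured stdout/stderr.
entries = {
    "output/main.o": b"\x7fELF" + b"\x00" * 12,
    "stdout": b"",
    "stderr": b"warning: unused variable\n",
}

buf = io.BytesIO()
# "w:gz" gzip-compresses the stream as it is written.
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name, data in entries.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

archive = buf.getvalue()
```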

@luser
Contributor

luser commented Oct 22, 2019

Oh, it strikes me that since you're using sccache-dist, this is probably about toolchain/inputs/outputs packaging.

Toolchain:

sccache/src/dist/pkg.rs

Lines 146 to 159 in d309339

pub fn into_compressed_tar<W: Write>(self, writer: W) -> Result<()> {
    use flate2::write::GzEncoder;
    let ToolchainPackageBuilder { dir_set, file_set } = self;
    let mut builder = tar::Builder::new(GzEncoder::new(writer, flate2::Compression::default()));
    for (tar_path, dir_path) in dir_set.into_iter() {
        builder.append_dir(tar_path, dir_path)?
    }
    for (tar_path, file_path) in file_set.into_iter() {
        let file = &mut fs::File::open(file_path)?;
        builder.append_file(tar_path, file)?
    }
    builder.finish().map_err(Into::into)
}

Inputs packaging:

let mut compressor = ZlibWriteEncoder::new(&mut body, Compression::fast());

Inputs unpackaging:
let inputs_rdr = InputsReader(Box::new(ZlibReadDecoder::new(body)));
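The packaging/unpackaging pair above amounts to wrapping a byte stream in a compressor on the write side and a decompressor on the read side. A Python-stdlib sketch of the same shape (the function names are mine, not sccache's):

```python
import zlib

def compress_stream(chunks, level=1):
    """Compress an iterable of byte chunks, yielding compressed pieces
    (analogous to the write-side ZlibWriteEncoder over the request body)."""
    co = zlib.compressobj(level)
    for chunk in chunks:
        out = co.compress(chunk)
        if out:
            yield out
    yield co.flush()

def decompress_stream(chunks):
    """Inverse of compress_stream, analogous to the read-side decoder."""
    do = zlib.decompressobj()
    for chunk in chunks:
        out = do.decompress(chunk)
        if out:
            yield out
    tail = do.flush()
    if tail:
        yield tail

inputs = [b"source file one\n", b"source file two\n" * 100]
body = b"".join(compress_stream(inputs))
restored = b"".join(decompress_stream([body]))
```

Swapping the format means replacing only the two wrapper objects, which is why the interop question below is the harder part.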

Outputs are handled here:

sccache/src/dist/mod.rs

Lines 424 to 446 in d309339

#[derive(Clone, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct OutputData(Vec<u8>, u64);

impl OutputData {
    #[cfg(any(feature = "dist-server", all(feature = "dist-client", test)))]
    pub fn try_from_reader<R: Read>(r: R) -> io::Result<Self> {
        use flate2::Compression;
        use flate2::read::ZlibEncoder as ZlibReadEncoder;
        let mut compressor = ZlibReadEncoder::new(r, Compression::fast());
        let mut res = vec![];
        io::copy(&mut compressor, &mut res)?;
        Ok(OutputData(res, compressor.total_in()))
    }

    pub fn lens(&self) -> OutputDataLens {
        OutputDataLens { actual: self.1, compressed: self.0.len() as u64 }
    }

    #[cfg(feature = "dist-client")]
    pub fn into_reader(self) -> impl Read {
        use flate2::read::ZlibDecoder as ZlibReadDecoder;
        let decompressor = ZlibReadDecoder::new(io::Cursor::new(self.0));
        decompressor
    }
}
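The OutputData pattern above — keep the compressed bytes alongside the uncompressed ("actual") length so `lens()` needs no decompression — can be sketched in Python-stdlib terms (the class and method names mirror the Rust code but this is illustrative, not sccache code):

```python
import io
import zlib

class OutputData:
    """Compressed payload plus the original length, mirroring the
    Rust OutputData(Vec<u8>, u64) shown above."""

    def __init__(self, compressed: bytes, actual: int):
        self._compressed = compressed
        self._actual = actual

    @classmethod
    def try_from_reader(cls, reader) -> "OutputData":
        raw = reader.read()
        # Level 1 corresponds to Compression::fast() in flate2.
        return cls(zlib.compress(raw, 1), len(raw))

    def lens(self):
        return {"actual": self._actual, "compressed": len(self._compressed)}

    def into_reader(self):
        return io.BytesIO(zlib.decompress(self._compressed))

data = OutputData.try_from_reader(io.BytesIO(b"object file bytes" * 50))
```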

Unfortunately if you change this you'll either need to support both compression formats for interop or change the whole dist API to a v2.
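One way to support both formats during a transition is to sniff the payload: zstd frames start with a fixed 4-byte magic number, and zlib streams with the common settings begin with a 0x78 CMF byte. A hedged Python sketch (`decompress_any` is a hypothetical helper; Python's stdlib has no zstd decoder, so that branch just signals the need for a binding):

```python
import zlib

ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"  # zstd frame magic number (little-endian)

def decompress_any(blob: bytes) -> bytes:
    """Sniff the compression format so old (zlib) and new (zstd) peers
    could interoperate without bumping the whole dist API."""
    if blob.startswith(ZSTD_MAGIC):
        # A real implementation would call into a zstd binding here.
        raise NotImplementedError("zstd payload: needs a zstd library")
    # Fall back to zlib for payloads from older clients.
    return zlib.decompress(blob)

blob = zlib.compress(b"legacy zlib payload")
restored = decompress_any(blob)
```

Whether sniffing is acceptable versus an explicit v2 of the API is a protocol-design call; the sketch only shows that the two formats are distinguishable from their first bytes.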

@luser
Contributor

luser commented Oct 23, 2019

Related: while looking at that code, I realized that (at least for Rust compilation) packaging compile inputs is probably doing a lot of redundant work, so I filed an issue on a potential optimization: #558

@froydnj
Copy link
Contributor

froydnj commented Nov 19, 2019

A clobber build that solely distributes compiles to a remote does appear to spend a significant amount of time in the local sccache server writing cache entries and therefore doing a bunch of deflate compression (50%, I think?). perf is being its usual unhelpful self, but I do not see any flate2-related stuff regarding any of the dist API in the performance profile.

I guess we could modify zip-rs to support zstd for individual data entries in the zip file? It's unmaintained, which I guess would be a good thing for something like this. I don't know that I'd feel comfortable hard-coding some constant in zip entries to be zstd, so we'd have to figure out how to plumb that down into the guts of ZipArchive with some kind of trait? (Something like rc-zip, which is only a "statement of intent" at the moment.)
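For reference, the zip format already records a compression method per entry (the APPNOTE spec assigns method ID 93 to Zstandard, though most libraries don't implement it). Python's stdlib `zipfile` shows the per-entry knob, using deflate/stored here since it likewise lacks zstd:

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # Each entry carries its own compression method in the zip format.
    # stdlib zipfile offers stored/deflate/bzip2/lzma; zstd would need
    # support plumbed into the library, much as described above for zip-rs.
    zf.writestr("a.o", b"data" * 100, compress_type=zipfile.ZIP_DEFLATED)
    zf.writestr("stderr", b"", compress_type=zipfile.ZIP_STORED)

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
```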

@froydnj
Contributor

froydnj commented Nov 19, 2019

A clobber build that solely distributes compiles to a remote does appear to spend a significant amount of time in the local sccache server writing cache entries and therefore doing a bunch of deflate compression (50%, I think?).

I should note that these measurements were taken on a binary that changed input compression to be done with zstd...but I'll also note that changing zip and flate2 to use native zlib (while still using zstd compression for the inputs) made deflate compression basically disappear from the profile (!). Maybe that's just perf being stupid (it insists on telling me cache entry compression and writing is being done from the kernel's interrupt handler...), but maybe not.

I think I'll try putting together a patch that enables native zlib at the very least; I think that should go in regardless. We'll see how things look and whether zstd compression should be enabled for the inputs.

chmanchester added a commit to chmanchester/sccache that referenced this issue Jan 22, 2020
This is just one of a few things to try related to mozilla#552. As a first step it
would be interesting to see what kind of impact this has on Firefox automation
builds. Some initial measurements are promising for builds with lots of cache
misses, but this sort of thing is cumbersome to measure, so I'd like to try
enabling this for the sccache client built in Mozilla automation to see the
impact it has there.
chmanchester added a commit that referenced this issue Jan 22, 2020
@FooBarWidget
Contributor

I think this issue can be closed now that #784 is merged.

@sylvestre
Collaborator

Thanks
