Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

consider using decompress? #617

Closed
jondot opened this issue Dec 19, 2022 · 5 comments
Closed

consider using decompress? #617

jondot opened this issue Dec 19, 2022 · 5 comments

Comments

@jondot
Copy link

jondot commented Dec 19, 2022

Hi,
You might want to consider decompress, which can:

  • Replace your existing archive handling logic and save you from maintaining code
  • Offer better archive detection, more extensive testing
  • Add in features you can expose (stripping wrapping folders, filtering unwanted files, and more)

I'd do this myself, but I got stopped short when seeing the async handling in downloader, not sure I'm fully capable of refactoring that reliably.

@NobodyXu
Copy link
Member

Thanks, I would take a look at the crate at my spare time.

Currently, I'm on my way adding support of pax extension to tokio-ta.
After that is done, I plan to move to tokio-tar completely and simplify the Download API by replacing the existing one with Download::and_visit while also provides an API for efficiently extract a file.

Download::and_visit will enable cargo-binstall to filter the files/dirs and remove all the block_in_place call for issuing blocking fs operations.

I will prefer if decompress also supports async, since binstalk-downloader deals with http through reqwest and we'd like to keep using async.

@NobodyXu
Copy link
Member

I skim through the API and dependencies of decompress and there are a few issues:

  • decompress always pulls in regex, something that is quite huge (cargo-binstall does not use regex)
  • decompress use tar v0.4.38 which doesn't support pax extension (GNU sparse extension) yet. We could work around this by patching the dependency tar to use our fork.
  • decompress requires the archive to be stored on the disk, while in cargo-binstall we would like it to be decompressed on stream to avoid temporary files
  • Documentation needs improvement. TBF, doc of binstalk-downloader is also poor, but since we develop it in house, we know what it's doing pretty well.

In conclusion, it looks to me decompress is not completely suit for cargo-binstall, but other maintainers might have different opinions.

@jondot
Copy link
Author

jondot commented Dec 19, 2022

Thanks, I will already use your input to make some improvement - we don't have to use regex, I didn't know it was that much of a consideration but would be happy to offer opt-out.

Can you elaborate on what is missing in the docs? I would be happy to improve it and add anything specific that you miss.

Last point - about in-memory, actually it's a design decision since current uses of decompress rely on a downloaded file that's already cached, so if you consider adding a cache to binstall (e.g. etag based) as an improvement in the future, that means in-mem isn't relevant anymore.

@NobodyXu
Copy link
Member

Can you elaborate on what is missing in the docs? I would be happy to improve it and add anything specific that you miss.

The trait Decompressor really needs doc.
I don't know what Decompressor::test is for.

Also ExtractOpts::strip needs some more explanation.

so if you consider adding a cache to binstall (e.g. etag based) as an improvement in the future, that means in-mem isn't relevant anymore.

That's not planned and probably has zero benefit for cargo-binstall.

cargo-binstall is designed to be as simple as possible and adding cache makes no sense, since

  • it's rare for people to request installation of the same crate twice
  • cargo-binstall maintains a manifest of what has been installed and only reinstalls when a new version is available

@passcod
Copy link
Member

passcod commented Dec 21, 2022

Yeah, I'm going to concur with Jiahao and say that it's a nice little crate you've got, but for the reasons stated and because it's not a good fit for binstall at the moment, we're not going to use it. Of course, we may revisit this as time goes! Thanks for letting us know about it and best of luck

@passcod passcod closed this as not planned Won't fix, can't repro, duplicate, stale Dec 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants