opened on Nov 20, 2014
(Continuing a discussion started here.)
The cld2 library is a natural-language detection library from Google, and it does some pretty cool stuff. I've packaged it as two Rust libraries, cld2 and cld2-sys. But because the upstream cld2 library is packaged by very few Linux distributions, I've chosen to distribute the source code with the cld2-sys package and build it using the Rust gcc library. So far, so good—all this works quite nicely.
But I can't upload the package to crates.io because it contains statistical language models, and those models are just too big:
```
$ du -sh target/package/cld2-sys-0.0.1.crate
35M target/package/cld2-sys-0.0.1.crate
```
I can shrink this down somewhat (by omitting everything I don't need for the build), but I almost certainly can't get it under the 10MB limit. I can think of a couple of ways to address this issue:
- Accept that certain `*-sys` packages will be larger than 10MB, and provide some way to override the limit selectively.
- Store compressed source code in an S3 bucket, and ask `build.rs` to download it. But this introduces a dependency on an outside data source that may go away.
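For concreteness, the second option might look roughly like the sketch below. It is only an illustration under some assumptions: the URL and archive name are hypothetical placeholders (no bucket exists yet), it shells out to a `curl` binary that would have to be present on the build machine, and it leaves out unpacking the archive and compiling the C++ sources.

```rust
// build.rs (sketch): fetch the compressed sources on demand instead of
// shipping them inside the .crate file.
use std::env;
use std::path::{Path, PathBuf};
use std::process::Command;

// Hypothetical URL; no real bucket or archive name has been chosen.
const SOURCE_URL: &str = "https://example-bucket.s3.amazonaws.com/cld2-src.tar.gz";

/// Where the downloaded archive is cached inside the build output directory.
fn archive_path(out_dir: &Path) -> PathBuf {
    out_dir.join("cld2-src.tar.gz")
}

/// Build (but do not run) the `curl` invocation that fetches `url` into `dest`.
fn download_command(url: &str, dest: &Path) -> Command {
    let mut cmd = Command::new("curl");
    cmd.arg("--fail") // treat HTTP errors as a failed download
        .arg("--location") // follow redirects
        .arg("--silent")
        .arg("--show-error")
        .arg("--output")
        .arg(dest)
        .arg(url);
    cmd
}

fn main() {
    // Cargo sets OUT_DIR for build scripts. Without it we are not running
    // under Cargo, so do nothing rather than touch the network.
    let out_dir = match env::var_os("OUT_DIR") {
        Some(dir) => PathBuf::from(dir),
        None => {
            println!("OUT_DIR not set; skipping download");
            return;
        }
    };

    let archive = archive_path(&out_dir);
    if archive.exists() {
        // Already fetched by an earlier build; reuse the cached copy.
        return;
    }

    let status = download_command(SOURCE_URL, &archive)
        .status()
        .expect("failed to run curl");
    if !status.success() {
        // This is the failure mode I'm worried about: the outside source is gone.
        panic!("could not download {}", SOURCE_URL);
    }
}
```

Even in this small form, every fresh build needs network access and a working `curl`, and the build breaks as soon as the archive disappears.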
Any thoughts on the best way to handle this? Thank you for your advice, and for a great package-management system!