local cache of binary packages #629

Open
UnixJunkie opened this Issue May 30, 2013 · 17 comments

@UnixJunkie
Contributor

UnixJunkie commented May 30, 2013

Hello,

It would be nice if, when I build a package, uninstall it,
and then reinstall it, the already-built one were reused
instead of everything being recompiled.

Thanks,
F.

@samoht

Member

samoht commented May 30, 2013

Adding the rest of the discussion that happened on the caml-list.

Chet Murthy wrote:

OK. A little more. OPAM is already a tremendous improvement. But to
really make it possible to build -systems- in Ocaml, you have to be
able to distribute collections of programs, config, and libraries,
across multiple (admittedly identical) machines. And distribute
updates to same. OPAM is in some ways like BSD ports -- it works
great for maintaining single machines from source-code.

But what's needed is a way to maintain -many- machines, and to
distribute updates in a granular manner that can be -managed- -- rolled
forward, rolled back, with full knowledge of which versions of which
packages are being installed. And everything with -zero- version
skew. So any nondeterminism happened at build-time -- by
deploy-time, all machines are getting identical files with identical
timestamps.

It's a tall order, b/c OPAM will need to figure out how to capture
enough of the environment (in order to check it on the target machine
where a binary is installed) to verify whether it's safe to install
that binary. But boy would it be nice.

And as a bonus, we could wrap opam in the debian apparatus (I
think) and get a really nice way to turn opam packages into debian
packages.

Malcolm Matalka wrote:

I think it would be wrong for opam to try to solve this problem. There
are already many tools available for deploying (Ansible, Puppet, Chef,
Fabric, Capistrano). Such a layer can be built on top of opam if need be.

Chet Murthy wrote:

I think this is incorrect. Let me explain.

(1) when we look at deploying complex collections of code/libs/data
onto multiple machines, usually we assume that the code has already
been built.

(2) but let's first dispatch the case where the code has -not- been
built. In such a case, I presume you're proposing that the code be
built on each machine, yes?

(a) this drastically increases the CPU required to perform upgrades
and deploys

(b) but far, far, far more importantly, it means that on each
machine, a nontrivially complex script runs that builds the actual
installed binaries. If that script contains -any- nondeterminism or
environmental sensitivity, it could produce different results on
different machines. The technical term is "version skew".

In scale-out systems, this sort of "skew" is absolutely fatal, because
it means that machines/nodes are not a priori interchangeable. And
all of fast-fail fault-tolerance depends on nodes being
interchangeable.

(3) But let's say that what you really mean is that we should use
tools like puppet/chef/capistrano to copy collections of
binaries/libs/data to target machines and install them. These
scripts/recipes are written by some person. You could have equally
well suggested that that person build Debian packages (or RPMs) of
each OPAM package, writing out all the descriptions and manifests.

And manually specifying all dependencies and requirements.

Either way, that person is doing a job that OPAM already does a lot
of, and does quite well. Gosh, wouldn't it be nice if OPAM could
generate those RPMs? Well, it's a little more complicated than that,
but really, not much more. The complexity comes in that you -might-
(I'm not saying I have this part figured out yet) want ways to
-generalize- (say) the camlp5 package so that it could be installed on
many different base OPAM installations.

But setting aside that nice-to-have, imagine that OPAM knew how to
generate RPMs from each package it installed, and from the ocaml+opam
base itself. You combine those, and you can:

(i) install ocaml, opam, and a bunch of packages

(ii) push a button, and out come a pile of RPMs, along with
dependencies amongst them (and hopefully on the relevant
environmental RPMs (e.g., libpcre-dev for pcre-ocaml, etc.)) so that
you can just stuff those RPMs into a YUM repo, go to a second box,
and say

"yum install opam ocaml pcre-ocaml"

and get everything slurped down and installed, just as if OPAM had
installed it all, package-by-package.

P.S. And this doesn't even get into the unsuitability of chef/puppet
for managing software package installation. There's a reason that no
distro uses such schemes to install the large and complex sets of
packages needed to run a modern Linux box. And why there is no Linux
version of Microsoft's "DLL Hell". Linux distros by and large (and
esp Debian and Ubuntu) have worked hard to make package installation
foolproof -- and chef/puppet etc are anything but.

@UnixJunkie

Contributor

UnixJunkie commented May 31, 2013

fpm looks extremely interesting:
https://github.com/jordansissel/fpm

I think OPAM doesn't need to implement everything.
If it uses the right tool already out there to create
packages, then so be it.
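
As a rough illustration of what handing an install tree to fpm might look like, here is a sketch that only prints a candidate command line. The flags used (`-s dir`, `-t deb`, `-n`, `-v`, `-C`, `--prefix`) are fpm's documented ones, but the function name `fpm_command`, the staging directory and the target prefix are hypothetical, and dependencies between packages are not handled at all.

```shell
#!/bin/sh
# Print an fpm command line that would package an install tree as a .deb.
# Nothing is executed here; the command is only echoed for inspection.
fpm_command() {
    name=$1      # package name, e.g. an opam package name
    version=$2   # package version
    staging=$3   # directory holding the files to package (hypothetical)
    prefix=$4    # where the files should land on the target (hypothetical)
    printf 'fpm -s dir -t deb -n %s -v %s -C %s --prefix %s .\n' \
        "$name" "$version" "$staging" "$prefix"
}

cmd=$(fpm_command astring 0.8.3 /tmp/astring-staging /opt/ocaml)
echo "$cmd"
```

Swapping `-t deb` for `-t rpm` would target RPM instead; how opam would fill in the staging directory is exactly the open question in this thread.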

The thread had distinct parts:

  • I (the impatient unix user) want a cache of binary packages
  • some (the sysadmins) also want to have real packages generated from the OPAM ones: deb, rpm.
    That would also fit my binary cache of packages request, in fact
  • some (the lawyers) want to separate cache from config from permanent data and follow a specification

Regards,
F.

@UnixJunkie

Contributor

UnixJunkie commented Jun 24, 2013

Hello,

Can someone provide hints on how to plug fpm into OPAM?
For example, what OPAM source files to look into, which step of the build of a package to hook to, etc.

Let's say this will be an experimental feature, just try and play with it.

I'd like to give it a try, but I have no idea when I'll have time for this.

Regards,
F.

@UnixJunkie

Contributor

UnixJunkie commented Oct 29, 2013

Funnily, on a machine with low RAM (1 GB) and no swap, the compiler cannot be compiled.
I guess the same may be true for several packages included in OPAM.

How are things going on the front of supporting binary package repositories in OPAM?

@AltGr

Member

AltGr commented Dec 15, 2017

There is a prototype of this using hooks, see https://github.com/ocaml/opam/blob/master/shell/opam-bin-cache.sh

@Khady

Contributor

Khady commented Aug 22, 2018

Is it possible to combine those cache hooks with the sandbox hooks?

@Khady

Contributor

Khady commented Aug 22, 2018

A quick comment to mention esy too; I don't know if this is the right place, and you probably know about it already. It also has a cache system, which seems to be pretty efficient. But rather than copying the directories from the cache to the "switch", it puts the paths to all the packages required by the switch in the environment. I think this also avoids some problems, such as relocating ocaml (esy does something extra to make ocaml relocatable from one computer to another, but that doesn't seem as important to me as a good local cache system). I thought it was worth mentioning.
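
To make the contrast concrete, here is a rough shell sketch of the "paths in the environment" idea. It is not how esy is implemented; the store layout (`$store/<package>/bin`) and the function name `env_path_for` are assumptions made up for the example.

```shell
#!/bin/sh
# Build a PATH value exposing the bin/ directory of each listed package,
# assuming every package was installed to its own subtree under $store.
env_path_for() {
    store_dir=$1
    shift
    path=
    for pkg in "$@"; do
        dir="$store_dir/$pkg/bin"
        # Skip packages that ship no executables.
        [ -d "$dir" ] || continue
        path="${path:+$path:}$dir"
    done
    printf '%s\n' "$path"
}

# Demo: hypothetical packages in a throwaway store.
store=$(mktemp -d)
mkdir -p "$store/ocamlfind/bin" "$store/dune/bin" "$store/astring/lib"
switch_path=$(env_path_for "$store" ocamlfind dune astring)
echo "$switch_path"
```

Nothing is copied in this scheme: the same cached subtree can be shared by several switches, and a "switch" reduces to a list of paths. Library lookup (for instance through ocamlfind) would need a similar treatment, which this sketch does not attempt.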

@AltGr

Member

AltGr commented Aug 22, 2018

Is it possible to combine those cache hooks with the sandbox hooks?

yes, of course

rather than to copy the directories from the cache to the "switch", they put in the environment the paths to all the packages that are required by the switch.

this is where there is quite a bit of difference: IIUC, in esy, every package is installed to its own subtree, which is a nice property that allows easy mix & matching of already compiled packages. It requires quite a bit of cooperation from the underlying systems, though, in this case — correct me if I am wrong — significant ocamlfind hacks.

This is a mode that we would definitely be interested in supporting in opam, but it also remains an important part of the project philosophy to be agnostic and pragmatic on what the packages do (hence the simple shell commands for build: instructions, for example).

@Drup

Contributor

Drup commented Aug 22, 2018

@AltGr I would encourage you to publish and promote this caching hook to get feedback. It's a very nice feature and even if it doesn't work perfectly just yet, I'm sure lots of people would be interested. Enabling it would allow you to get feedback quickly.

@Khady

Contributor

Khady commented Aug 22, 2018

@AltGr

Member

AltGr commented Aug 22, 2018

I have tested it for a while, and while it works correctly at first, once you e.g. remove the original switch the cache was made from, in my experience it doesn't behave well...
I tried some workarounds, e.g. forcing some env variables, but that added more problems.

Of course, there might be progress in ocamlfind configuration since then, that makes this more reliable?

You can have more detail at ocaml/opam-repository#10863

@AltGr

Member

AltGr commented Aug 22, 2018

@Khady thanks for sharing your conf; I could publish a small script to easily enable/disable it, for those interested in testing?

@Drup

Contributor

Drup commented Aug 22, 2018

@AltGr please do, this is a really good feature and I think it's worth advertising it, even if it's not completely finished.

@ELLIOTTCABLE

ELLIOTTCABLE commented Aug 26, 2018

I've given this a stab, and it looks like it needs a small tweak for macOS. It's currently causing the failure of any package that's already been installed elsewhere:

#=== ERROR while installing astring.0.8.3 =====================================#
# context     2.0.0 | macos/x86_64 | ocaml-base-compiler.4.07.0 | https://opam.ocaml.org/2.0#0bce4f9a
# path        ~/.opam/default/.opam-switch/build/astring.0.8.3
# command     ~/.opam/opam-init/hooks/opam-bin-cache.sh restore 021592f72a3781d5db0a804a656335022129f66149419979d89f88e3b4460c83 astring
# exit-code   64
# env-file    ~/.opam/log/astring-41810-f19885.env
# output-file ~/.opam/log/astring-41810-f19885.out
### output ###
# [...]
# + shift
# + '[' -z 021592f72a3781d5db0a804a656335022129f66149419979d89f88e3b4460c83 ']'
# + CACHE_DIR=/Users/ec/.cache/opam-bin-cache/021592f72a3781d5db0a804a656335022129f66149419979d89f88e3b4460c83
# + case $COMMAND in
# + NAME=astring
# + shift
# + '[' -d /Users/ec/.cache/opam-bin-cache/021592f72a3781d5db0a804a656335022129f66149419979d89f88e3b4460c83 ']'
# + rm -f astring.install
# + cp -aT /Users/ec/.cache/opam-bin-cache/021592f72a3781d5db0a804a656335022129f66149419979d89f88e3b4460c83/ /Users/ec/.opam/default/
# cp: illegal option -- T
# usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvXc] source_file target_file
#        cp [-R [-H | -L | -P]] [-fi | -n] [-apvXc] source_file ... target_directory

Neither the macOS nor the BSD cp has a -T flag (though it looks like the coreutils cp does?)
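
One way to get the same effect without `-T`, offered as a sketch rather than as the fix the maintainers adopted: `cp -a SRC/. DST/` copies the contents of SRC into DST, and `-a` is accepted both by GNU cp and by the macOS cp whose usage message is shown above. The function name `copy_tree_contents` and the throwaway directories are only for illustration.

```shell
#!/bin/sh
# Copy the *contents* of $1 into $2 without the GNU-only `cp -T`.
# The trailing "/." makes cp copy what is inside the source directory
# instead of creating a nested copy of the directory itself.
copy_tree_contents() {
    src=$1
    dst=$2
    mkdir -p "$dst" && cp -a "$src/." "$dst/"
}

# Demo with temporary directories standing in for the cache and the switch.
demo=$(mktemp -d)
mkdir -p "$demo/cache/lib/astring" "$demo/switch/lib"
echo "cached" > "$demo/cache/lib/astring/META"
copy_tree_contents "$demo/cache" "$demo/switch"
ls "$demo/switch/lib/astring"
```

The existing `lib` directory in the destination is merged into rather than replaced, which is what restoring a cached package into a populated switch needs.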

@Khady

Contributor

Khady commented Aug 27, 2018

I created a 300 USD bounty for this issue. To be clear, I consider that solving it also requires solving ocaml/opam-repository#10863. Actually most of the work is probably on ocaml/opam-repository#10863 as opam-bin-cache.sh already works pretty well. I hope it can help to attract contributions.
https://www.bountysource.com/issues/1250468-local-cache-of-binary-packages

@UnixJunkie

Contributor

UnixJunkie commented Aug 28, 2018

@Khady I'm curious, is this out of your pocket or is this your company/employer?

@Khady

Contributor

Khady commented Aug 28, 2018
