
A New Infrastructure for Erlang Build and Packaging

hyperthunk edited this page Feb 29, 2012 · 10 revisions

Packaging Tools Philosophy

  • Packages (OTP applications, libraries and releases) should be published as pre-built binary artefacts (archives)
  • Packages must be accessible via an HTTP URL
  • Packages should be richly described using metadata
  • Package metadata should be published in a remote index

Any more for this section?

Parts of the Packaging Machine

  • Remote Index
  • Local Index
  • Local Repository
  • Erlang Consumable App Dir

The Remote Index

  • A remote index will be a repository of human readable metadata, organized by directories
  • The index will be stored in git and targeted initially at GitHub
  • Actual package tarballs will be stored in arbitrary locations, with URLs in the repository metadata pointing to those locations
  • Indexes are peers and there is no 'central' repo
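
As a rough illustration of what "human readable metadata, organized by directories" might look like, here is one possible index entry written as plain Erlang terms (a format that `file:consult/1` can read). The directory layout, file name, keys, organization name and URL are all assumptions made up for this example; none of them has been agreed.

```erlang
%% Hypothetical file: <index-root>/acme/mylib/1.2.0/package.meta
%% Every key and value below is illustrative, not part of any agreed format.
{organization, "acme"}.
{name, mylib}.
{version, "1.2.0"}.
{description, "An example library"}.
%% The tarball itself lives elsewhere; the index only records where.
{url, "http://example.com/packages/mylib-1.2.0.tar.gz"}.
%% Dependencies are named by organization, app name and a version constraint.
{dependencies, [{"acme", otherlib, ">= 0.3.0"}]}.
```

Keeping one small file per organization/app/version means a git diff of the index reads as a list of published packages.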

Question(s):

do we want to tie ourselves to git+github in the spec?
My thoughts are that people may use any mechanism they like
to build alternative repository implementations (which we should
support via a plugins mechanism) - not everyone uses git

In light of this, can we rephrase item 2, as
"Repositories will initially be stored in git and targeted at github"?

Organizations and namespacing

All metadata will be organized around 'Organization' names, so the namespace for an OTP app is organization, app name, version. The same app name can appear in multiple organizations.

The organization name must, of course, be flattened out (i.e., removed) by the time the dependencies are resolved for use in an OTP release.
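
A minimal sketch of that flattening step, assuming the solver hands back a list of `{Organization, App, Version}` tuples. The module name `ns_flatten`, the tuple shape and the error format are hypothetical. The one rule it illustrates is that, once organizations are dropped, an app name may appear only once.

```erlang
-module(ns_flatten).
-export([flatten/1]).

%% Resolved :: [{Org :: string(), App :: atom(), Vsn :: string()}]
%% Returns {ok, [{App, Vsn}]} with the organization removed, or an error
%% if two different packages would collide on the same app name.
flatten(Resolved) ->
    flatten(Resolved, []).

flatten([], Acc) ->
    {ok, [{App, Vsn} || {App, Vsn, _Org} <- lists:reverse(Acc)]};
flatten([{Org, App, Vsn} | Rest], Acc) ->
    case lists:keyfind(App, 1, Acc) of
        false ->
            %% First time we see this app name: keep it.
            flatten(Rest, [{App, Vsn, Org} | Acc]);
        {App, Vsn, Org} ->
            %% Exact duplicate of an entry we already hold: ignore it.
            flatten(Rest, Acc);
        {App, OtherVsn, OtherOrg} ->
            %% Same app name from another organization or version.
            {error, {conflict, App, {OtherOrg, OtherVsn}, {Org, Vsn}}}
    end.
```

For example, `ns_flatten:flatten([{"acme", mylib, "1.0.0"}, {"initech", mylib, "2.0.0"}])` returns a `conflict` error instead of silently picking one of the two.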

The Local Index

The local index is a searchable, queryable cache of all Remote Indexes that the user has expressed interest in. It allows the various tools to resolve dependencies, even when internet access is unavailable.

Question: how do we know when to refresh this cache?

The Local Repository

As an option, the tools may make use of a local repository that serves as a cache of what is in the Remote Repositories. It has a structure similar to the Local Index, but also contains resolved binaries. This should never be used directly by any Erlang systems as a source for code.

Question: is this really optional? I certainly want the tools that *we*
build to do this, as I don't want to download <package-x>-1.0.1 every
time I use it.

I'm not convinced this is really an optimization rather than the only
sane way to do things. Think about a build plugin for example, or a 
logging library that you use in all your applications/systems. Do you
really want to resolve this to some local app directory each time you
start a new project? This is what rebar does with dependencies that I
actually hate - the local repository (along with a good index and binary
artefacts) is part of the main reason I want to do this stuff.

Erlang Consumable App Dir

When a set of dependencies is solved, it is resolved down to a single global namespace (the exact same scheme Erlang uses). These dependencies are then resolved into an Erlang Consumable App Dir, which can be used as a source of code for the Erlang VM.

Question: I agree that we need the namespace flattening, but I'm
wondering why we need to implement it in this way. On Windows, for
example, this is a royal pain as there are no symlinks.

If we make the local repository a core part of the solution, then
there is another option which IMO is viable.

All the use-cases for the dependencies involve getting a running
emulator to 'know about' them, for example compile time ones (such
as parse transforms and build plugins) need to be on the code path
during the 'build tool' operations. Runtime ones need to be available
during testing, local development and packaging a release. Test scoped
ones need to be on the path during testing. In all cases, the following
things need to work

code:which(AModuleInTheDependency) /= non_existing.
code:lib_dir(TheAppName) /= {error, bad_name}.

-include_lib("dependency/include/header.hrl").

So why not just add the correct directories to the code path at the
appropriate time? This saves copying (or symlinking) and avoids 
cluttering the local directory structure. It also avoids having to
manage an 'app dir' when dependencies are removed for example.

Personally I find this approach preferable, as long as the code
path modification process is documented clearly.
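
A minimal sketch of the code path approach described above, assuming a local repository laid out as `<root>/<organization>/<app>-<version>/ebin`. That layout, the module name `dep_path` and both function names are assumptions for illustration only. The only OTP calls used are `filename:join/1` and `code:add_pathz/1`.

```erlang
-module(dep_path).
-export([ebin_dir/2, add_deps/2]).

%% Build the ebin directory for one resolved dependency.
%% Assumed layout: <RepoRoot>/<Org>/<App>-<Vsn>/ebin
ebin_dir(RepoRoot, {Org, App, Vsn}) ->
    filename:join([RepoRoot, Org, atom_to_list(App) ++ "-" ++ Vsn, "ebin"]).

%% Append every dependency's ebin directory to the code path.
%% Returns one {App, Result} pair per dependency, where Result is whatever
%% code:add_pathz/1 returned (true, or {error, bad_directory}).
add_deps(RepoRoot, Deps) ->
    [{App, code:add_pathz(ebin_dir(RepoRoot, Dep))}
     || {_Org, App, _Vsn} = Dep <- Deps].
```

Because each directory is named `<app>-<version>/ebin`, `code:lib_dir/1` (and therefore `-include_lib`) should be able to locate the application once its path has been added. Nothing is copied or symlinked, and removing a dependency amounts to not adding its path on the next run.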

Tools

Goals
  • Convention over Configuration
  • Everything is configurable

No tool should hardcode a path. All paths must have both sane defaults and the ability to be overridden both via configuration files and the command line.
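
One way to read this rule is as a three-level lookup: command line first, then configuration file, then the built-in default. The sketch below assumes options arrive as property lists; the module name `tool_paths`, the option keys and the default directories are all hypothetical.

```erlang
-module(tool_paths).
-export([defaults/0, resolve/3]).

%% Built-in defaults (illustrative values only).
defaults() ->
    [{local_index, ".pkg/index"},
     {local_repo,  ".pkg/repo"}].

%% Resolve one path setting. Precedence, highest first:
%% command line options, config file options, built-in default.
resolve(Key, CliOpts, FileOpts) ->
    Default  = proplists:get_value(Key, defaults()),
    FromFile = proplists:get_value(Key, FileOpts, Default),
    proplists:get_value(Key, CliOpts, FromFile).
```

With this shape no tool needs a literal path anywhere other than `defaults/0`.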

Ideas
  • Certain behaviours can be overridden by user defined modules
  • Overriding behaviours (i.e., plugins/extensions) can be obtained as dependencies (another +1 for local repositories and code path munging, IMHO)

Repository Management Tool

manager: interacts with the remote repos and handles index management and resolution. apt-get for Erlang.

solver: the dependency solver. Given a set of constraints and one or more indexes, it resolves all dependencies to hard (exact) versions and outputs those in a consumable way.
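
To make the solver's input and output concrete, here is a deliberately small sketch using depth-first search with backtracking. It is not a proposal for the real algorithm. The module name `dep_solver`, the in-memory index shape, the `{Major, Minor, Patch}` version tuples and the three constraint forms (`any`, `{eq, Vsn}`, `{gte, Vsn}`) are all assumptions, and organizations are left out for brevity.

```erlang
-module(dep_solver).
-export([solve/2]).

%% Goals      :: [{App, Constraint}]
%% Index      :: [{App, Vsn, Deps}] where Deps has the same shape as Goals
%% Vsn        :: {Major, Minor, Patch}
%% Constraint :: any | {eq, Vsn} | {gte, Vsn}
%% Returns {ok, [{App, Vsn}]} or {error, Reason}.
solve(Goals, Index) ->
    solve(Goals, Index, []).

solve([], _Index, Chosen) ->
    {ok, lists:reverse(Chosen)};
solve([{App, Constraint} | Rest], Index, Chosen) ->
    case lists:keyfind(App, 1, Chosen) of
        {App, Vsn} ->
            %% Already picked a version: it must satisfy this constraint too.
            case satisfies(Vsn, Constraint) of
                true  -> solve(Rest, Index, Chosen);
                false -> {error, {conflict, App, Vsn, Constraint}}
            end;
        false ->
            Candidates = [{V, Deps} || {A, V, Deps} <- Index,
                                       A =:= App, satisfies(V, Constraint)],
            %% Try the newest matching version first.
            try_candidates(App, lists:reverse(lists:sort(Candidates)),
                           Rest, Index, Chosen)
    end.

try_candidates(App, [], _Rest, _Index, _Chosen) ->
    {error, {unsatisfiable, App}};
try_candidates(App, [{Vsn, Deps} | More], Rest, Index, Chosen) ->
    case solve(Deps ++ Rest, Index, [{App, Vsn} | Chosen]) of
        {ok, _} = Ok -> Ok;
        %% This version led to a dead end: back off and try an older one.
        {error, _}   -> try_candidates(App, More, Rest, Index, Chosen)
    end.

satisfies(_Vsn, any)       -> true;
satisfies(Vsn, {eq, Want}) -> Vsn =:= Want;
satisfies(Vsn, {gte, Min}) -> Vsn >= Min.
```

Given an index in which `a` 1.1.0 needs a `b` that does not exist while `a` 1.0.0 needs `b` >= 2.0.0, the search first tries `a` 1.1.0, fails, and falls back to `a` 1.0.0 with the newest matching `b`.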

fetcher: this may be integrated with the manager (probably will be). Given a set of app-name/version repos, it will fetch the correct binary for those repos to a specified location.
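
A minimal sketch of the fetch step, assuming packages are served over plain HTTP and that the `inets` application shipped with OTP is available. The module name `pkg_fetch`, the function names and the choice to name the local file after the last segment of the URL are assumptions.

```erlang
-module(pkg_fetch).
-export([target_path/2, fetch/2]).

%% Where a download from Url would be stored under DestDir.
%% Assumption: the last URL segment is a usable file name.
target_path(DestDir, Url) ->
    filename:join(DestDir, filename:basename(Url)).

%% Download Url into DestDir, streaming the body straight to disk.
fetch(Url, DestDir) ->
    {ok, _} = application:ensure_all_started(inets),
    Path = target_path(DestDir, Url),
    ok = filelib:ensure_dir(Path),
    case httpc:request(get, {Url, []}, [], [{stream, Path}]) of
        {ok, saved_to_file} ->
            {ok, Path};
        {ok, {{_HttpVsn, Status, _Reason}, _Headers, _Body}} ->
            %% Anything that was not streamed to the file (e.g. a 404).
            {error, {http_status, Status}};
        {error, Reason} ->
            {error, Reason}
    end.
```

A real fetcher would presumably also verify a checksum recorded in the index metadata before handing the archive to anything else.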

builder: given dependencies and a project dir, builds binaries. This will probably be served by rebar and/or the build functionality of sinan.

assembler: given the resources above and the built code, assembles an Erlang release, building the script, rel and boot files as needed.
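
As a sketch of the first part of that job, the flattened `{App, Version}` list can be turned into the term that a `.rel` file contains. The module name `rel_assemble` and its functions are hypothetical; the `{release, ...}` tuple shape is the standard OTP release resource format.

```erlang
-module(rel_assemble).
-export([rel_term/4, write_rel/2]).

%% Build the term stored in a .rel file.
%% Apps :: [{App :: atom(), Vsn :: string()}], already flattened
%% (no organization names).
rel_term(Name, Vsn, ErtsVsn, Apps) ->
    {release, {Name, Vsn}, {erts, ErtsVsn}, Apps}.

%% Write the term to File in a form that file:consult/1 can read back.
write_rel(File, Rel) ->
    file:write_file(File, io_lib:format("~p.~n", [Rel])).

%% The script and boot files could then be generated with
%% systools:make_script/2, pointing its {path, Dirs} option at the ebin
%% directories of the resolved dependencies.
```

The ERTS version for the running emulator is available from `erlang:system_info(version)`.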

packager: packages up a release for distribution, in any number of formats.
