edpm stands for EIC development packet manager helper
The goal of edpm is to provide esier experience of building EIC simulation and reconstruction framework and supporting packages on a user machine with development reasons.
- Motivation
- edpm installation
- Get eicrecon installed
- Manage environment
- Troubleshooting
- Manual or devel installation
- Adding a package
Cheat sheet:
Install edpm:
# install edpm
sudo python -m pip install edpm
# OR without sudo: add --user flag and ensure ~/.local/bin is in your PATH
python -m pip install --user -U edpm
JLab machines with certificate problems - add these flags to the command above:
--trusted-host pypi.python.org --trusted-host files.pythonhosted.org --trusted-host pypi.org
see Troubleshooting chapter for details
Install everything else
# install prerequesties
edpm req centos eicrecon # get list of OS packets required to build jana and deps
sudo yum install ... # install whatever 'edpm req' shows
# setup installation dir and existing packets, introspect
edpm --top-dir=<where-to> # Directory where packets will be installed
edpm # To see how edpm is configured
edpm install eicrecon --explain # To see what is to be installed
edpm set root `$ROOTSYS` # if you have CERN.ROOT. Or skip this step
edpm set <packet> <path> # set other existing packets. Or skip this step!!!
edpm config global cxx_standard=17 # It is recommended if compiler supports C++17
# Build and install the rest
edpm install eicrecon # install eicrecon and dependencies (like genfit, jana and rave)
edpm install g4e # install Geant-4-Eic and dependencies (Geant4, etc)
# Set environment
source<(edpm env) # set environment variables
edpm env csh > your.csh # if you are still on CSH
# If that worked don't read the next...
(!) If you use your version of ROOT, all packages depending on ROOT should be installed with the same C++ standard flags as root. So it it was C++11 or C++17, it should be used everywhere. To set it in edpm
edpm config global cxx_standard=17
TL;DR; Major HEP and NP scientific packages are not supported by some major distros and usually are crappy (at least in terms of dependency requirements). Everybody have to reinvent the wheel to include such packages in their software chains and make users' lives easier. And we do.
Longer reading
edpm is here as there is no standard convention in HEP and NP of how to distribute and install software packages with its dependencies. Some packages (like eigen, xerces, etc.) are usually supported by OS maintainers, while others (Cern ROOT, Geant4, Rave) are usually built by users or other packet managers and could be located anywhere. We also praise "version hell" (e.g. when GenFit doesn't compile with CLHEP from ubuntu repo) and lack of software manpower (e.g. to sufficiently and continuously maintain packages for major distros or even to fix some simple issues on GitHub).
What about Spack? - Spack works and shines on clusters with supervision of experts. In failed countless times when the task was to install something working for students. Spack requires to know Spack and its concepts to debug its deep dependencies failures
At this points edpm tries to unify experience and make it simple to deploy eicrecon for:
- Users on RHEL 7 and CentOS
- Users on Ubutnu (and Windows with WSL)
- Docker and other containers
Design features
-
Essentials:
- edpm is written in pure python (2 and 3 compatible) with minimum dependencies
- it is shipped by pip (python official repo), so can be installed with one command on all major platforms
- CLI (command line interface) - provides users with commands to manipulate packets
- JSON database stores the current state and packets locations
- It makes easy to... rebuild packets, deploy missing packets, continue after fail, etc.
-
Under the hood:
- Each packet has a single python file that defines how it will be installed and configured
- Each such file is easy to read and modify by inexperienced users in case they would love to
- Installation steps written in a style close to Dockerfile (same command names, etc)
Alternatives
Is there something existing? What others do? - Simple bash build scripts quickly get bloated and complex. Dockerfiles and similar stuff are too-tool-related. Build systems like scons or cmake also too centric on compiling something rather than managing packets chains. Full featured package managers and tools like Homebrew are pretty complex to tame (for dealing with just 5 deps).
So edpm is something more advanced than build scripts, but less cumbersome than real packet managers, it is in pure python, and being focused on our specific problems.
edpm is not:
- It is not a real package manager which automatically solves dependency trees
- edpm is not a requirment for eJANA. It is not a part of eJANA build system and one can compile and install eJANA without edpm
Users are pretty much encouraged to change the code and everything is done here to be user-change-friendly
TL;DR;
sudo pip install --upgrade edpm # system level installation
pip install --user --upgrade edpm # User level. $HOME/.local/bin should be in $PATH
If you have certificate problems on JLab machines: (more options on certificates):
# System level copy-paste:
sudo python -m pip install --trusted-host pypi.python.org --trusted-host files.pythonhosted.org --trusted-host pypi.org -U edpm
# User level copy-paste:
python -m pip install --trusted-host pypi.python.org --trusted-host files.pythonhosted.org --trusted-host pypi.org --user -U edpm
- See INSTALLATION TROUBLESHOOTING If you don't have pip or right python version.
- See Jlab root certificate problems and how to solve them
- See Manual or development installation to use this repo directly, develop edpm or don't want to mess with pip at all?
(or crash course to edpm)
TL;DR; example for CentOS/RHEL7
edpm req centos eicrecon # get list of OS packets required to build jana and deps
sudo yum install ... # install watever 'edpm req' shows
edpm --top-dir=<where-to> # Directory where packets will be installed
edpm set root `$ROOTSYS` # if you have CERN.ROOT. Or skip this step
edpm install eicrecon --missing # install eicrecon and dependencies (like genfit, jana and rave)
source<(edpm env) # set environment variables
Step by step explained instruction:
-
Install (or check) required packages form OS:
edpm req ubuntu # for all packets that edpm knows edpm req centos eicrecon # for eicrecon and its dependencies only
At this point only 'ubuntu' and 'centos' are known words for
req
command. Put:ubuntu
for debian familycentos
for RHEL and CentOS systems.
In future macOS and more detailed os-versions will be supported
-
Set top-dir. This is where all missing packets will be installed.
edpm --top-dir=<where-to-install-all>
-
You may have CERN.ROOT installed (req. version >= 6.14.00). Run this:
edpm set root `$ROOTSYS`
You may set paths for other installed dependencies combining:
edpm install eicrecon --missing --explain # to see missing dependencies edpm set <name> <path> # to set dependency path edpm set jana <path to jana2 install> # JANA2 as an example
Or you may skip this step and just get everything installed by edpm
-
Then you can install edpm and all missing dependencies:
edpm install eicrecon
-
Set right environment variables (right in the next section)
TL;DR; Just source it like:
source <(edpm env)
# or
source ~/.local/share/edpm/env.sh # .csh for CSH/TCSH
edpm env # To generate env & regenerate env files
edpm_DATA_PATH
- sets the path where the configuration db.json and env.sh, env.csh are located
longer reading:
Every time configuration is changed (something installed or deleted) or edpm env command is called, edpm creates 2 environment files with the current environment:
~/.local/share/edpm/env.sh # bash
~/.local/share/edpm/env.csh # bash
There is a command to print our the environement. One can then redirect it to a file or feed to source directly (examples are below)
edpm env # type with --help for options
Examples:
-
Dynamically source output of
edpm env
command (recommended)source <(edpm env) # works on bash
-
Save output of
edpm env
command to a file (can be useful)edpm env sh > your-file.sh # get environment for bash or compatible shells edpm env csh > your-file.csh # get environment for CSH/TCSH
-
Use edpm generated
env.sh
andenv.csh
files (lazy and convenient)$HOME/.local/share/edpm/env.sh # bash and compatible $HOME/.local/share/edpm/env.csh # for CSH/TCSH
(!) The files are regenerated each time
edpm
changes something. If you changedb.json
by yourself, edpm doesn't track it automatically, so calledpm env
to regenerate these 2 files
edpm stores the states in ``db.json``` file. There are certain parameters relevant for all/most of the packages such as cxx_standard or a number of compilation threads. Then there are parameters to configure each package installation.
To view and change those configuration:
edpm config <packet name> <config name> = <new value>
examples:
edpm config # edpm config global
edpm config global # Show global configs
edpm config root # Show configs for packet root
edpm config global cxx_standard=14 # Set globally to use C++14 for all new packages
# (if that is not overwritten by the package config)
edpm config acts cxx_standard=17 # Set cxx standard for root (overwrites global level)
Config allows
Where edpm data is stored:
There are standard directories for users data for each operating system. edpm use them to store db.json and generated environment files (edpm doesnt use the files by itself).
For linux it is XDG_DATA_HOME*:
~/.local/share/edpm/env.sh # sh version
~/.local/share/edpm/env.csh # csh version
~/.local/share/edpm/db.json # open it, edit it, love it
* - XDG is the standard POSIX paths to store applications data, configs, etc.
edpm_DATA_PATH
- You can control where edpm stores data by setting edpm_DATA_PATH
environment variable.
But... there is no pip:
Install it!
sudo easy_install pip # system level
easy_install pip --user # user level
For JLab lvl1&2 machines, there is a python installations that have pip
:
/apps/python/ # All pythons there have pip and etc
/apps/anaconda/ # Moreover, there is anaconda (python with all major math/physics libs)
But there is no 'pip' command?
If easy_install
installed something, but pip
command is not found after, do:
- If
--user
flag was used, make sure~/.local/bin
is in your$PATH
variable - you can fallback to
python -m pip
instead of usingpip
command:python -m pip install --user --upgrade edpm
But... there is no easy_install!
Install it!
sudo yum install python-setuptools python-setuptools-devel # centos and RHEL/CentOS
sudo apt-get install python-setuptools # Ubuntu and Debian
# Gentoo. I should not write this for its users, right?
For python3 it is easy_install3
and python3-setuptools
I dont have sudo privileges!
Add "--user" flag both for pip and easy_install for this. Read SO here
If you get errors like:
Retrying (...) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED]...
The problem is that pip
is trustworthy enough to use secure connection to get packages.
And JLab is helpful enough to put its root level certificates in the middle.
- The easiest solution is to declare PiPy sites as trusted:
python -m pip install --trusted-host pypi.python.org --trusted-host files.pythonhosted.org --trusted-host pypi.org edpm
- Or to permanently add those sites as trusted in pip.config
(The link where to find pip.config on your system)
[global] trusted-host= pypi.python.org pypi.org files.pythonhosted.org
- You may want to be a hero and kill the dragon. The quest is to take JLab certs. Then Convert them to pem. Then add certs to pip. Then check it really works on JLab machines. And bring the dragon's head back (i.e. please, add the exact instruction to this file)
TL;DR; Get edpm, install requirements, ready:
git clone https://gitlab.com/eic/edpm.git
pip install -r edpm/requirements.txt
# OR clone and add edpm/bin to your PATH
export PATH=`pwd`/edpm/bin:$PATH
requirments:
Click
and appdirs
are the only requirements. If you have pip do:
pip install --upgrade click appdirs
If for some reason you don't have pip, you don't know python well enough and don't want to mess with it, pips, shmips and doh... Just download and add to
PYTHONPATH
: this 'click' folder and some folder with appdirs.py
Each packet is represented by a single python file - a recipe which has instructions of how to get and build the package. Usually it provides:
- download/clone command
- build command
- setup of environment variables
- system dependencies (which can be installed by OS packet managers: yum, apt)
For simplicity (at this point) all recipes are located in a folder inside this repo: edpm/recipes.
The most of packages served now by edpm use git to get source code and cmake to build the package. As git + cmake became a 'standard' there is a basic recipe class which makes adding new git+cmake packets straight forward.
As a dive-in example of adding packets, lets look on how to add such packet using Genfit as a copy-paste example.
1. Set packet name and where to clone from
One should change 3 lines:
class GenfitRecipe(GitCmakeRecipe):
def __init__(self):
"""Installs Genfit track fitting framework"""
# This name is used in edpm commands like 'edpm install genfit'
super(GenfitRecipe, self).__init__('genfit')
# The branch or tag to be cloned (-b flag)
self.config['branch'] = 'master'
# Repo address
self.config['repo_address'] = 'https://github.com/GenFit/GenFit'
Basically that is enough to build the package and one can test:
edpm install yourpacket
2. Set environment variables
This is a done in gen_env
function. By using this function edpm generates environments for
csh/tcsh, bash and python*. So 3 commands to be used in this function:
Set(name, value)
- equalsexport name=value
in bashAppend(name, value)
- equalsexport name=$name:value
in bashPrepend(name, value)
- equalsexport name=value:$name
in bash
@staticmethod
def gen_env(data):
path = data['install_path'] # data => installation information
yield Set('GENFIT_HOME', path)
# add bin to PATH
yield Prepend('PATH', os.path.join(path, 'bin'))
# add lib to LD_LIBRARY_PATH
yield Append('LD_LIBRARY_PATH', os.path.join(path, 'lib'))
One can test gen_env with:
edpm env
* - if other python packages use edpm programmatically to build something
3. System requirments
If packet has some dependencies that can be installed by OS packet managers such as apt, one can add them to os_dependencies array.
os_dependencies = {
'required': {
'ubuntu': "libboost-dev libeigen3-dev",
'centos': "boost-devel eigen3-devel"
},
'optional': {
'ubuntu': "",
'centos': ""
},
}
(!) don't remove any sections from the map, leave them blank
To test it one can run:
edpm req ubuntu
edpm req centos
Compared to the previous example, several more functions should be added:
setup
- configures the packagestep_clone
,step_build
,step_install
- execute commands to perform the step
1. Setup
Setup should provide all values, that are going to be used later in 'step_xxx' functions. Usually it is just 3 things:
def setup(self):
#
# use_common_dirs_scheme() sets standard package variables:
# source_path = {app_path}/src/{branch} # Where the sources for the current version are located
# build_path = {app_path}/build/{branch} # Where sources are built. Kind of temporary dir
# install_path = {app_path}/root-{branch} # Where the binary installation is
self.use_common_dirs_scheme()
# Git download link. Clone with shallow copy
self.clone_command = "git clone --depth 1 -b {branch} {repo_address} {source_path}".format(**self.config)
# make command:
self.build_command = './configure && make -j{build_threads} install'.format(**self.config)
2. Step functions
3 docker alike functions that helps to execute stuff:
run(command)
- executes the console commandworkdir(dir)
- changes the working directoryenv(name, value)
- sets an environment variable
run(self.clone_command) # Execute git clone command
workdir(self.source_path) # Go to our build directory
run('./bootstrap') # This command required to run by rave once...
env('RAVEPATH', self.install_path)