II. Installation

Alvin X. Han edited this page Oct 6, 2020 · 11 revisions

PhyCLIP is written in Python 2.7 and depends on several Python libraries and at least one ILP solver.

To simplify the installation process, we highly recommend that you use Anaconda, a free and open-source Python distribution and package management system. Visit http://www.anaconda.com/download/ to download and install the Python 2.7 distribution for your operating system.

There is no support for Python 3 currently. However, if you are using a Python 3 Conda environment, you can still install/run PhyCLIP by first building a Python 2 Conda environment (see below).

Prerequisite 1: Python libraries

PhyCLIP depends on several Python libraries:

  • numpy, scipy, statsmodels (mathematical/statistical operations)
  • cython
  • ete3 (parsing phylogenetic trees)

To install the dependencies, open Terminal (Mac/Linux) or Command/Anaconda Prompt (Windows) and run:

$ conda install -c etetoolkit ete3
$ conda install -c anaconda cython
$ conda install numpy scipy statsmodels
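
Once the commands above have finished, a quick import check can confirm that the libraries are visible to the interpreter you intend to use. The following is a minimal sketch, not part of PhyCLIP: the helper name missing_modules is hypothetical, and the list assumes the usual import names (the cython package is imported as Cython).

```python
import importlib

# Import names for the libraries listed above (assumed; the cython
# package is imported as "Cython").
REQUIRED = ["numpy", "scipy", "statsmodels", "Cython", "ete3"]


def missing_modules(names):
    """Return the names in `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries: " + ", ".join(missing))
    else:
        print("All PhyCLIP dependencies can be imported.")
```

Run it with the same python executable that you will use for PhyCLIP; any name it reports can be reinstalled with the corresponding conda command.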

Prerequisite 2: ILP solver

PhyCLIP currently supports two ILP solvers. Install either ONE of them, depending on which you have access to:

  1. Gurobi optimizer (http://www.gurobi.com/) is a commercial linear and quadratic programming solver with FREE licenses available for academic users.
  2. GLPK (GNU Linear Programming Kit, http://www.gnu.org/software/glpk/) is a free and open-source package intended for solving large-scale linear programming, mixed integer programming, and other related problems.

If you are a university user (i.e. you have internet access from a recognized academic domain, e.g. a '.edu' address), we highly recommend running PhyCLIP with the Gurobi solver. GLPK performs poorly in terms of both speed and solvability: Gurobi v. 8.1 solved all 40 benchmark simplex LP problems, whereas GLPK v. 4.65 solved only 31 of them, with a geometric mean time 52 times longer than Gurobi's (see http://plato.asu.edu/talks/informs2018.pdf).

Furthermore, as with any linear programming problem, multiple optimal solutions may exist. Currently, GLPK returns only ONE solution, which is guaranteed to be the global optimum if and only if the feasible region is convex and bounded. However, this may not always be the case. Gurobi, on the other hand, generates a solution pool, which may include more than one optimal solution.
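
To see how more than one optimal solution can arise, consider a toy integer program: maximize x + y subject to x + y <= 4 with x and y non-negative integers. The sketch below simply enumerates the feasible points; it is an illustration only, the function name all_optimal_solutions is hypothetical, and the problem is not PhyCLIP's actual ILP formulation.

```python
def all_optimal_solutions(limit=4):
    """Enumerate every integer point maximising x + y subject to
    x + y <= limit and x, y >= 0 (a toy problem, not PhyCLIP's model)."""
    feasible = [(x, y)
                for x in range(limit + 1)
                for y in range(limit + 1)
                if x + y <= limit]
    best = max(x + y for x, y in feasible)
    # Keep every feasible point that attains the best objective value.
    return best, [(x, y) for x, y in feasible if x + y == best]


best, optima = all_optimal_solutions()
print(best)    # 4
print(optima)  # [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]
```

All five points share the same objective value, so a solver that reports a single solution returns just one of them, while a solution pool can expose several.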

Gurobi

IMPORTANT: Take note of the version of Gurobi you are using (printed in the summary-stats_*.txt file, see Output files). Gurobi is updated periodically to enhance solver performance, and in some cases we find minor changes to PhyCLIP's clustering results as a result of these updates.

The easiest way to install Gurobi is via the Anaconda platform:

  1. Make sure you have Anaconda for Python 2.7 installed (see above).

  2. Install the Gurobi package via conda:

$ conda config --add channels http://conda.anaconda.org/gurobi
$ conda install gurobi

  3. Next, install a Gurobi license. Visit https://www.gurobi.com/registration/academic-license-reg to register for a free Gurobi account. Follow the instructions in the verification email from Gurobi to set your password and log in to your Gurobi account via http://www.gurobi.com/login.

  4. You can now access https://www.gurobi.com/downloads/end-user-license-agreement-academic/ to request a free academic license. Once requested, you will be brought to the License Detail webpage.

  5. To install the license, go to Terminal/Command Prompt:

$ grbgetkey XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

where XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX is your unique license key shown on the License Detail webpage. Note that an active internet connection from a recognized academic domain (e.g. a '.edu' address) is required for this step.

GLPK

You can easily install GLPK via Anaconda as well:

$ conda install -c conda-forge glpk

Alternatively, you can install GLPK from source: go to http://ftp.gnu.org/gnu/glpk/ and download the latest distribution of glpk as a tarball. Installation instructions are included in the documentation provided with the tarball.
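
After installing either solver, you may want to check what your environment can actually see. The sketch below is a hedged example rather than part of PhyCLIP: gurobipy is the Python module provided by the Gurobi package and glpsol is the command-line solver shipped with GLPK, but whether PhyCLIP relies on these exact entry points is an assumption. The helper names has_module and find_on_path are hypothetical.

```python
import importlib
import os


def has_module(name):
    """Return True if the Python module `name` can be imported."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


def find_on_path(program, path=None):
    """Return the full path of `program` if it is found in one of the
    PATH directories, otherwise None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        # Also try the ".exe" suffix used on Windows.
        for candidate in (program, program + ".exe"):
            full = os.path.join(directory, candidate)
            if os.path.isfile(full) and os.access(full, os.X_OK):
                return full
    return None


if __name__ == "__main__":
    print("gurobipy importable: %s" % has_module("gurobipy"))
    print("glpsol on PATH: %s" % (find_on_path("glpsol") is not None))
```

Note that a successful import of gurobipy does not by itself confirm that a valid license is installed.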

WINDOWS users

You may also need the Microsoft Visual C++ Compiler for Python to compile C extensions so that Cython works properly. You can install it from http://www.microsoft.com/en-us/download/details.aspx?id=44266.

Install PhyCLIP

Finally, install phyclip.py by running:

$ cd PhyCLIP-master/ 
$ python setup.py install

You may need sudo privileges for a system-wide installation. Alternatively, you can use phyclip.py locally by adding the phyclip_modules folder to your $PYTHONPATH.

Building a Python 2 Conda environment

If your base Conda environment runs on Python 3, you can build a separate Python 2 environment to install/run PhyCLIP.

In your Terminal/Command Prompt:

$ conda create -n python2env python=2.7 anaconda

This creates the Python 2 environment and installs the Anaconda packages into it.

Activate the environment (with conda 4.4 or later, "conda activate python2env" is the preferred form):

$ source activate python2env

Continue to install the prerequisites, the solver and PhyCLIP as per the instructions above, all within the same Python 2 environment. Remember to activate the Python 2 environment every time you use PhyCLIP.
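
If you are unsure whether the Python 2 environment is active, a short check of the interpreter version can help. This is a minimal sketch and not part of PhyCLIP; the function name is_python27 is hypothetical.

```python
import sys


def is_python27(version_info=None):
    """Return True if the given (or current) interpreter version is 2.7."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    if is_python27():
        print("Running Python 2.7.")
    else:
        print("Running Python %d.%d; activate the Python 2 environment first."
              % tuple(sys.version_info[:2]))
```

Running it with the python executable on your current PATH shows which interpreter a subsequent PhyCLIP call would use.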

When you are done using the Python 2 environment, you can return to your base environment with (or "conda deactivate" on conda 4.4 or later):

(python2env)$ source deactivate