Setting up your own Chipster server in EGI Federated Cloud
Chipster is an easy-to-use data analysis platform for bioinformatics. It provides a uniform graphical interface for over 350 commonly used bioinformatics tools, including several R/Bioconductor based tools and standalone programs (e.g. BWA, TopHat).
Chipster is based on a client-server system where the user runs a Chipster client locally, and the client submits analysis tasks to a Chipster server. Even though Chipster is an open source tool, there is no public Chipster server that is open to everybody. Because of that, a researcher needs access to one of the existing Chipster servers to be able to use this platform. Alternatively, a researcher can set up their own Chipster server.
This document describes how a Chipster server can be launched in the EGI Federated Cloud environment. This cloud environment provides resources for all European researchers. With the instructions provided here, any European researcher can launch and manage their own Chipster server, suited for the needs of a small research group or a bioinformatics course.
The setup described here is based on the collaboration of several European institutions. Chipster is developed by CSC – IT Center for Science Ltd. in Finland (http://chipster.csc.fi). European Grid Infrastructure (EGI) has fitted Chipster to the cloud environment and provides the cloud computing resources. Finally, Rutherford Appleton Laboratory hosts the CVMFS server that provides the scientific tools and data sets for the Chipster servers running in the EGI Federated Cloud.
1. Preparatory steps
The EGI Federated Cloud environment can be used from Linux or Mac OS X machines. In order to launch a Chipster server in EGI Federated Cloud, the machine that is used to manage the Chipster server must have the following tools and files installed:
- a valid X.509 certificate,
- the rOCCI command line client for managing the cloud computing environment,
- the voms-proxy-init command for creating proxy certificates,
- settings for connecting to the VOMS server that hosts the chipster.csc.fi VO.
In addition, you must join the chipster.csc.fi Virtual Organization.
The manager of the virtual Chipster server needs to do these preparatory steps only once. After that, Chipster servers can be managed with the FedCloud_chipster_manager tool. Note that end users who just wish to use a Chipster server running in EGI Federated Cloud do not need to do any of these preparatory steps.
1.1 Grid certificates and VO membership
EGI Federated Cloud uses X.509 certificates for user authentication. Researchers from the member countries of the GEANT network can use the DigiCert certificate service (https://www.digicert.com/sso) to obtain a personal grid certificate. Users from other countries should use their local certification authorities.
Once you have a grid certificate installed in your browser, you can join the chipster.csc.fi Virtual Organization at: https://voms.fgi.csc.fi:8443/voms/chipster.csc.fi
1.2 Installing rOCCI and VOMS client
Federated Cloud resources are managed using rOCCI, a Ruby-based implementation of the OCCI standard. Authentication in EGI Federated Cloud is done using proxy certificates generated with the command voms-proxy-init. The instructions for installing these tools on your local machine can be found at: https://wiki.egi.eu/wiki/Fedcloud-tf:CLI_Environment
Once you have installed the rOCCI and voms-proxy-init commands, you must still define the connection to the chipster.csc.fi VO management server (VOMS). To do this, first create the directory /etc/grid-security/vomsdir/chipster.csc.fi and go to this directory:
mkdir -p /etc/grid-security/vomsdir/chipster.csc.fi
cd /etc/grid-security/vomsdir/chipster.csc.fi
Then create a file "voms.fgi.csc.fi.lsc" that contains the following 2 lines:
/O=Grid/O=NorduGrid/CN=host/voms.fgi.csc.fi
/O=Grid/O=NorduGrid/CN=NorduGrid Certification Authority
If you already have a file /etc/vomses, move it to "/etc/vomses/old_vomses" (so that vomses becomes a directory). Then create a file "chipster.csc.fi-voms.fgi.csc.fi" in "/etc/vomses" and write the following line in it:
"chipster.csc.fi" "voms.fgi.csc.fi" "15010" "/O=Grid/O=NorduGrid/CN=host/voms.fgi.csc.fi" "chipster.csc.fi"
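The configuration steps of this section can also be collected into a small shell function. The following is only a sketch: the function name setup_chipster_voms and its optional prefix argument are hypothetical helpers (not part of any Chipster or EGI tool), added so that the function can be tried out in a scratch directory before it is run against /etc, which normally requires root privileges.

```shell
#!/bin/sh
# Sketch: write the VOMS client configuration for the chipster.csc.fi VO.
# $1 is an optional prefix directory (hypothetical helper argument).
# With no argument the files are written under /etc.
setup_chipster_voms() {
    prefix="${1:-}"
    vomsdir="$prefix/etc/grid-security/vomsdir/chipster.csc.fi"
    vomses="$prefix/etc/vomses"

    mkdir -p "$vomsdir" || return 1
    # The .lsc file holds the two DN lines given above.
    printf '%s\n' \
        '/O=Grid/O=NorduGrid/CN=host/voms.fgi.csc.fi' \
        '/O=Grid/O=NorduGrid/CN=NorduGrid Certification Authority' \
        > "$vomsdir/voms.fgi.csc.fi.lsc" || return 1

    # If vomses is a plain file, keep it as old_vomses inside a new directory.
    if [ -f "$vomses" ]; then
        mv "$vomses" "$vomses.tmp" &&
            mkdir -p "$vomses" &&
            mv "$vomses.tmp" "$vomses/old_vomses" || return 1
    fi
    mkdir -p "$vomses" || return 1
    printf '%s\n' \
        '"chipster.csc.fi" "voms.fgi.csc.fi" "15010" "/O=Grid/O=NorduGrid/CN=host/voms.fgi.csc.fi" "chipster.csc.fi"' \
        > "$vomses/chipster.csc.fi-voms.fgi.csc.fi"
}
```

Calling setup_chipster_voms /tmp/voms-test writes everything under /tmp/voms-test/etc so that the result can be inspected first. Calling it without arguments (as root) writes to the real locations.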
1.3 Obtaining keys and FedCloud_chipster_manager
FedCloud_chipster_manager is a helper tool that can be used to launch Chipster instances in EGI Federated Cloud. It constructs the rOCCI commands needed to launch, list or delete virtual machines that have a Chipster server running inside. It can also be used to restart or check the status of the Chipster servers running inside the virtual machines.
Some of the FedCloud_chipster_manager operations require the user to provide a key pair that is used to access the virtual machine. The key pair can be created, for example, with command:
ssh-keygen -t rsa -b 2048 -f FedCloudKey
The command above asks you to define a password for your key files and then creates two files: a private key (in this case: FedCloudKey) and a public key (in this case: FedCloudKey.pub). The key files need to be created only once: you can use the same key files for several virtual machines.
The actual FedCloud_chipster_manager tool can be downloaded from the Chipster GitHub repository with command:
After downloading, remember to give execution permission for the file:
chmod u+x FedCloud_chipster_manager
2. Managing Chipster server
2.1 Setting up VOMS proxy
Before launching or managing virtual Chipster servers, you have to create a temporary proxy certificate that is used to authenticate to EGI Federated Cloud environment. If you have the voms-proxy-init command installed and a valid X.509 certificate in your .globus directory, you can create a temporary proxy certificate with command:
voms-proxy-init --voms chipster.csc.fi --rfc --dont_verify_ac
The command above asks for the password of your X.509 certificate and creates a proxy certificate that is valid for 12 hours. Note that voms-proxy-init requires an OpenJDK based Java environment. Other Java environments cause error messages like: Credentials couldn't be loaded.
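Because the proxy expires after 12 hours, it can be useful to check the remaining lifetime before starting a long operation such as a launch. The sketch below assumes that the voms-proxy-info command from the same VOMS client package is installed and that its --timeleft option prints the remaining lifetime in seconds. The function names and the one hour threshold are hypothetical choices made for this example.

```shell
#!/bin/sh
# Returns 0 (true) when fewer than the given number of seconds remain.
# $1 = seconds left on the proxy, $2 = threshold (hypothetical default: 1 hour)
proxy_needs_renewal() {
    seconds_left="$1"
    min_seconds="${2:-3600}"
    [ "$seconds_left" -lt "$min_seconds" ]
}

# Formats a number of seconds as HH:MM:SS.
format_timeleft() {
    printf '%02d:%02d:%02d\n' \
        $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}

# Example use (assumes voms-proxy-info is on PATH):
#   left=$(voms-proxy-info --timeleft 2>/dev/null || echo 0)
#   echo "Proxy time left: $(format_timeleft "$left")"
#   if proxy_needs_renewal "$left"; then
#       voms-proxy-init --voms chipster.csc.fi --rfc --dont_verify_ac
#   fi
```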
2.2 Launching a Chipster server
Once you have done all the preparations, you can launch a new Chipster Virtual Server with command (assuming you have the FedCloud_chipster_manager tool in your current working directory):
./FedCloud_chipster_manager -key keyfile -launch
This launching command uses default values for the resources and user accounts linked to the Chipster server. The option -volume_size can be added to modify the size of the data volume (in gigabytes) that is used to store the data during computing. The default size of the volume is only 20 GB, which is enough for testing, but for real usage a bigger data volume may be needed.
By default, only one Chipster account (user: chipster, password: chipster) is created to a new Chipster server. A list of user accounts for a new Chipster server can be defined with option -users. The argument for this option should be a file containing a list of accounts in format:
The expiration date is defined with format: yyyy-mm-dd. For example, the file "accounts.txt" could look like the following:
trng1:4eoU8hmx:2016-11-30
trng2:4eoU8hmx:2016-11-30
trng3:4eoU8hmx:2016-11-30
Note that these accounts are just Chipster server accounts, not Linux accounts that could be used to open terminal connections to the virtual machine. Launching a Chipster server with these accounts and 100 GB storage size could be done with command:
./FedCloud_chipster_manager -launch -key FedCloudKey -volume_size 100 -users accounts.txt
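Since launching takes a long time, it may be worth checking the accounts file before passing it to -users. The following sketch assumes, based on the example file above, that each line consists of three colon-separated fields and that the last field is the expiration date in yyyy-mm-dd format. The function name check_accounts_file is hypothetical, and skipping blank lines is a choice made for this example, not documented behaviour of FedCloud_chipster_manager.

```shell
#!/bin/sh
# Checks that every non-empty line of the file given as $1 has three
# non-empty-name colon-separated fields and a yyyy-mm-dd style third field.
# Malformed lines are reported on stderr; the return value is non-zero
# if at least one malformed line was found.
check_accounts_file() {
    awk -F: '
        NF == 0 { next }    # skip blank lines
        NF != 3 || $1 == "" || $2 == "" ||
        $3 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
            printf "line %d is malformed: %s\n", NR, $0 > "/dev/stderr"
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example use:
#   check_accounts_file accounts.txt && \
#       ./FedCloud_chipster_manager -launch -key FedCloudKey -users accounts.txt
```

The check only looks at the shape of the date, not at whether it is a real calendar date, and it would reject a password that itself contains a colon.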
The launching process can take tens of minutes. In the end the launching process prints out information about how the server can be accessed. For example:
Your new Chipster server is now running in a virtual machine with ID:
/compute/00d55c95-98ff-4b27-a708-c041949723c6
In EGI Federated Cloud endpoint:
https://prisma-cloud.ba.infn.it:8787
The IP-addess of the chipster virtual server is:
188.8.131.52
You can now connect your virtual machine with command:
ssh -i FedCloudKey firstname.lastname@example.org
The Chipster server can be connected with URL:
http://184.108.40.206:8081
End users can now connect to the Chipster server with the URL, while the ssh connection is intended for managing the Chipster server.
2.3 Other management tasks
In addition to launching Chipster servers, the FedCloud_chipster_manager tool can be used to manage an existing server. You can use FedCloud_chipster_manager with option -list to list your virtual Chipster servers running in the EGI Federated Cloud. The option -status makes FedCloud_chipster_manager look for Chipster VMs launched by the user and check the status of the Chipster server running in each VM found. In this case you must also use the -key option to define the key file that was used to launch the server. The password for the key file is asked for each server to be connected.
$ ./FedCloud_chipster_manager -key FedCloudKey -status
------------------------------------------------------------
Remaining validity time for your proxy certificate: 07:02:41
------------------------------------------------------------
Listing Virtual Machines with name: chipster-vm-kkmattil-at-csc.fi in endpoint https://prisma-cloud.ba.infn.it:8787/
This may take some time.
--------------------------------------------------------------
https://prisma-cloud.ba.infn.it:8787/compute/86b97ed5-e256-4bce-83b5-aa3a41920975
occi.compute.hostname = chipster-vm-kkmattil-at-csc.fi
IP: 220.127.116.11
Enter passphrase for key 'FedCloudKey': ********
ActiveMQ Broker is running (5995).
Chipster Fileserver Service is running (PID:6118).
Chipster Webstart Service is running (PID:6229).
Chipster Authentication Service is running (PID:6345).
Chipster Computing Service is running (PID:6883).
Chipster Manager Service is running (PID:6579).
chipster-jobmanager RUNNING pid 6635, uptime 1:10:06
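For a quick health check, the status listing can be scanned for the service lines shown in the sample output. This is only a sketch: the function name all_services_running is hypothetical, the list of expected lines is copied from the sample above rather than from any specification, and it assumes that the output of -status can be piped to another command while the passphrase prompt still works on the terminal.

```shell
#!/bin/sh
# Reads a -status listing from stdin and checks that each service line
# seen in the sample output is present. Returns non-zero and names the
# first missing service on stderr otherwise.
all_services_running() {
    out=$(cat)
    for pattern in \
        'ActiveMQ Broker is running' \
        'Chipster Fileserver Service is running' \
        'Chipster Webstart Service is running' \
        'Chipster Authentication Service is running' \
        'Chipster Computing Service is running' \
        'Chipster Manager Service is running' \
        'chipster-jobmanager  *RUNNING'
    do
        if ! printf '%s\n' "$out" | grep -q "$pattern"; then
            echo "not reported as running: $pattern" >&2
            return 1
        fi
    done
    return 0
}

# Example use:
#   ./FedCloud_chipster_manager -key FedCloudKey -status | all_services_running
```

If several VMs are listed, this check passes as soon as each service line appears at least once, so it is best suited to setups with a single Chipster server.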
The option -restart makes FedCloud_chipster_manager restart the Chipster server running in the given Federated Cloud VM instance. This option can be used, for example, to fix the server if the Chipster server is using an internal IP address instead of the public IP address. For example, restarting the Chipster server running in instance /compute/86b97ed5-e256-4bce-83b5-aa3a41920975 can be done with command:
$ ./FedCloud_chipster_manager -key FedCloudKey -restart \
/compute/86b97ed5-e256-4bce-83b5-aa3a41920975
To completely delete the virtual machine running in EGI Federated Cloud, you can use option -delete:
./FedCloud_chipster_manager -delete instance-ID
For more detailed management, you can open an ssh connection to the virtual machine running the Chipster server. Detailed instructions for maintaining Chipster server can be found from the GitHub pages of Chipster:
3. Using your Chipster server in EGI Federated Cloud
The Chipster Virtual Organization can provide only limited resources for the Chipster user community. By default, FedCloud_chipster_manager starts a Chipster server on a virtual machine that has 4 computing cores and a total of 8 GB of memory. This is not much, but it should be enough to serve the needs of a small research group (only a few simultaneous users). If you wish to use a larger virtual machine, please contact the Chipster VO manager.
Once launched, the server can be kept up and running as long as the data processing continues. This can be weeks or months, but eventually the Chipster server should be shut down by the server manager. If your server has been running for longer than 4 months, the VO manager can ask the owner of the Chipster server to send a report about the usage of the server.
When using Chipster in EGI Federated Cloud, you should remember that the intermediate data on the servers is not backed up. If you need to rebuild your Chipster server, the data in the previous version will be lost when the old version is removed. You should also remember that the current setup for running Chipster in EGI Federated Cloud is still under testing and development. We do not guarantee uninterrupted access to the resources at all times.