http://www.cse.ohio-state.edu/~harveywi/ayla
Ayla is a free, open source visualization tool for researchers in biochemistry, molecular dynamics, and protein folding. It is cross-platform, running on Windows, Mac OS X, and Linux systems.
Ayla features a rich visual analytics environment in which many researchers can work together. A storyboard metaphor is used to organize interesting events in the data.
The Ayla website hosts the latest binaries (currently version 0.1) as well as a screencast demonstrating the functionality of the software.
See the Ayla Github repository for source code and additional documentation.
The software is written in Scala and Oracle Java Version 6+ must be installed to run the binaries. Processing large datasets (on the order of hundreds of thousands of protein conformations) can be memory-intensive; hence we recommend running the software on a machine with 8 GB of RAM or more. (As the software matures, it will become faster and leaner.)
The software uses Java 3D to render topological landscapes. Before you can use the software you will need to download and install Java 3D 1.5.1.
Ayla uses port 9010 to communicate. Thus, you must configure your firewall to allow communication over this port.
Before getting started with Ayla, you will need a collection of protein conformations in the form of Protein Data Bank files. The following steps will show you how to transform the collection of files into a format that you can work with using Ayla.
In order to get started, you need to create a new directory where you will keep all of your Ayla datasets (call it datasets_root_dir
). Within datasets_root_dir
, create a directory where your first Ayla dataset will reside; let’s call it my_first_dataset
. Within my_first_dataset
you must create two additional directories which are needed by Ayla that must be called collab_projects
and scalar_functions
. Thus, you should have created a directory structure which looks like this:
datasets_root_dir
my_first_dataset
collab_projects*
scalar_functions*
Items marked with asterisks must be named exactly as above. Now, you need to put some additional data into the my_first_dataset
directory to tell Ayla about your PDB files. Here are your choices:
If your PDB conformations are inside of a single zip archive, then create a link/shortcut inside of the my_first_dataset
directory called conformations.zip
which links to the archive. You must also create a text file in my_first_dataset
called conformation_zip_paths.txt
in which each line contains the path of one of the PDB files with respect to the zip file’s internal directory. Thus, your directory structure should look like this:
datasets_root_dir
my_first_dataset
collab_projects*
scalar_functions*
conformations.zip*
conformation_zip_paths.txt*
where values marked with an asterisk must be named exactly as shown.
If your PDB conformations are stored in one or more directories, then all you need to do is create a text file in my_first_dataset
called conformation_filenames.txt
in which each line contains the absolute path to a PDB file. Thus, your directory structure should look like this:
datasets_root_dir
my_first_dataset
collab_projects*
scalar_functions*
conformation_filenames.txt*
where values marked with an asterisk must be named exactly as shown.
In order to use Ayla, you must have one or more scalar function defined over the conformations (i.e. a scalar function assigns a real number to each conformation). Some simple examples of scalar functions include potential energy, contact density, compactness, and radius of gyration. Concretely, this means that if you have 20,000 PDB files, then you must have a text file containing 20,000 lines where each line is a floating-point number. The correspondence between numbers and PDB files is established by the conformation_filenames.txt
or conformation_zip_paths.txt
files.
Assuming that you have stored the function values in a text file called my_function.txt
, copy it into the scalar_functions
directory. If you have additional scalar function text files, you can copy those into the scalar_functions
directory as well.
The first preprocessing step is to use the pointCloudMaker
utility to convert your PDB files into a cloud of points in a high-dimensional Euclidean space. The idea is that, if two conformations are structurally similar, then they will map to points which are closer, and vice versa. Usage of the pointCloudMaker
utility is as follows:
pointCloudMaker.sh my_first_dataset
This program will generate a file in my_first_dataset
called pcd_v3.dat
which contains the point representations of your conformations.
The second step in preprocessing is to (optionally) filter out conformations which are uninteresting (or equivalently, select a subset of conformations which are particularly interesting) and approximate their putative conformational manifold structure by assembling a proximity graph. The domainApproximator
utility performs these tasks. Usage is as follows:
Usage: domainApproximator [-w whitelist | -b blacklist] k dataset_dir outputFunctionName
The -w option will keep the conformations in the given whitelist text file and ignore all others.
The -b option will discard the conformations in the given blacklist text file and ignore all others.
k is the number of nearest neighbors that are used when reconstructing the domain.
dataset_dir is the directory of the Ayla dataset.
The -w
and -b
options allow you to specify a whitelist (or blacklist) text file in which each line is the absolute (zip) path to a whitelisted (or blacklisted) conformation.
The -k
parameter should be set to something small. Usually a value somewhere between 10 and 20 is sufficient.
Assuming that you have generated one or more datasets using the above instructions, you should be ready to (1) launch the Ayla server, then (2) launch one or more Ayla clients.
To run the Ayla server, use the aylaServer
command. Usage is as follows:
aylaServer datasets_root_dir
To start the Ayla client, use the aylaClient.sh
command.
Basic usage of the Ayla client is covered in the screencast which can be seen at the Ayla Visual Analytics website.
If you have started the Ayla server but are unable to connect to it using the Ayla client, make sure that your firewall allows communication over port 9010.
If you are trying to run the software using OpenJDK and having problems, try switching to Oracle Java SE 6+. I recall having some weird scala serialization problems on OpenJDK, but switching to Oracle’s VM seemed to do the trick.
For other issues please contact William Harvey (harveywi at cse dot ohio-state dot edu).