seriation

This software is an efficient solver for the seriation problem, which 'finds a suitable linear order for a set of objects'. It has been used to order a network of proteins such that 'related' nodes are closer in the order.

Authors

The Seriation Package was developed by Felipe Kuentzer, in collaboration with Douglas G. Ávila, Alexandre Pereira, Gabriel Perrone, Samoel da Silva, Alexandre Amory, and Rita de Almeida.

Contact information: Alexandre Amory (amamory @ gmail com)

Inputs

The input file is a plain-text file describing an undirected network of nodes (in our examples the nodes are protein names). Each line names two connected nodes, separated by a space. Example:

L7007 L7008
L7008 L7007
L7010 L7011
L7011 L7010
L7014 L7015
L7015 L7014
L7017 Z1275
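As an illustration of the format above, the following is a minimal sketch of how one such line could be parsed in C. It is not taken from cfm-seriation.c: the names `Edge`, `parse_edge`, `same_edge`, and the `MAX_NAME` limit are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 64  /* assumed upper bound on a node name, not from the package */

/* One undirected edge between two named nodes. */
typedef struct {
    char a[MAX_NAME];
    char b[MAX_NAME];
} Edge;

/* Parse one line of the form "NAME1 NAME2" into *e.
 * Returns 1 on success, 0 if the line does not hold two names. */
int parse_edge(const char *line, Edge *e)
{
    assert(line != NULL && e != NULL);
    /* %63s limits each name to MAX_NAME - 1 characters plus the terminator. */
    return sscanf(line, "%63s %63s", e->a, e->b) == 2;
}

/* Treat "A B" and "B A" as the same undirected edge. */
int same_edge(const Edge *x, const Edge *y)
{
    return (strcmp(x->a, y->a) == 0 && strcmp(x->b, y->b) == 0) ||
           (strcmp(x->a, y->b) == 0 && strcmp(x->b, y->a) == 0);
}
```

Because the network is undirected, a reader of this format may want to treat pairs such as "L7007 L7008" and "L7008 L7007" as one edge, which is what `same_edge` checks.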

In the Files tab you can find networks for several species, such as Escherichia coli, Mus musculus, Saccharomyces cerevisiae, and Homo sapiens.

Outputs

The output is a text file with the order of the network nodes. Example:

Protein	dim1
Z5822	0
Z5823	1
Z2911	2
Z2910	3
Z2909	4
Z4123	5
Z4124	6
Z3105	7
Z3106	8
...
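The layout above (a header line, then one tab-separated name and position per line) could be produced by a routine like the sketch below. The function names `format_order_line` and `write_order` are hypothetical and are not part of the package; the header text is copied from the example.

```c
#include <assert.h>
#include <stdio.h>

/* Format one "NAME<TAB>POSITION" line into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_order_line(char *buf, size_t size, const char *name, int pos)
{
    assert(buf != NULL && name != NULL);
    int len = snprintf(buf, size, "%s\t%d\n", name, pos);
    return (len >= 0 && (size_t)len < size) ? len : -1;
}

/* Write names[0..n-1] as an order file: the header line, then one line
 * per node with positions counted from 0.
 * Returns the number of node lines written, or -1 on error. */
int write_order(FILE *out, const char *const names[], int n)
{
    assert(out != NULL && n >= 0);
    char line[128];  /* assumed to be long enough for any node name */
    if (fputs("Protein\tdim1\n", out) == EOF)
        return -1;
    for (int i = 0; i < n; i++) {
        if (format_order_line(line, sizeof line, names[i], i) < 0)
            return -1;
        if (fputs(line, out) == EOF)
            return -1;
    }
    return n;
}
```

Calling `write_order(stdout, names, 3)` with the first three names from the example prints the header followed by three numbered lines.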

The following image represents the Homo sapiens network with a random ordering.

[Image: initial]

The next image represents the Homo sapiens network 'seriated'.

[Image: final]

Download and Installation

The Seriation Package is developed in C and tested on Ubuntu 14.04.

  • Download the package.
  • It is recommended to update your package lists before the installation:

sudo apt-get update

  • To install, you can double-click it or execute:

sudo dpkg -i cfm-seriation_1.0-1_amd64.deb

  • In case of missing dependencies, try:

sudo apt-get install -f

  • To uninstall:

sudo dpkg -r cfm-seriation

  • This distribution contains the following files:
/usr/share/cfm-seriation/bin/             Executable file
/usr/share/cfm-seriation/etc/             Auxiliary files used to plot charts with gnuplot
/usr/share/cfm-seriation/data/            Biological input networks
/usr/share/cfm-seriation/src/             Source code in C

Download and Compilation

sudo apt-get install git

git clone https://github.com/amamory/seriation.git

cd seriation

gcc cfm-seriation.c -lm -lpthread -lrt -o cfm-seriation

How to Use

Type 'cfm-seriation' to show the options:

cfm-seriation

Usage: cfm-seriation [OPTION...]

 Seriation Parameters:
   f=[NETWORK FILE].dat       Network file path name
   o=[ORDER FILE].dat         Apply initial order
   i=[INTERVAL]               Number of isothermal steps
   m=[STEPS]                  Number of steps
   c=[FACTOR]                 Cooling factor
   a=[ALPHA]                  Alpha value
   p=[PERCENTUAL]             Percentual energy for initial temperature
   s=[SEDD]                   Random seed
   P                          Plot graphs
   v                          Generate video
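The option names (isothermal steps, cooling factor, initial temperature) suggest an annealing-style search, although this README does not describe the algorithm. The sketch below shows one common way such parameters interact. It is an assumption, not the package's code: the readings of p, i, and c given in the comments, and the function names `initial_temperature` and `temperature_at`, are hypothetical.

```c
#include <assert.h>

/* Assumed reading of p: the starting temperature is `percent` percent of
 * the initial energy. */
double initial_temperature(double initial_energy, double percent)
{
    assert(percent >= 0.0);
    return initial_energy * percent / 100.0;
}

/* Assumed reading of i and c: the temperature is held for `interval`
 * steps (the isothermal steps), then multiplied by `factor`.
 * Returns the temperature in force at step number `step`, counted from 0. */
double temperature_at(double t0, double factor, int interval, int step)
{
    assert(interval > 0 && step >= 0);
    double t = t0;
    /* Integer division: the number of completed isothermal plateaus. */
    for (int k = 0; k < step / interval; k++)
        t *= factor;
    return t;
}
```

Under these assumptions, a cooling factor closer to 1 or a larger interval slows the cooling, so more of the m steps run at a high temperature.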

Type the following command to execute the seriation. This process can take about 12 minutes, depending on the CPU.

./cfm-seriation f=data/Homo_sapiens.dat m=3000 P
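The command line above mixes key=value options (f=, m=) with single-letter flags (P). A minimal sketch of how arguments of this shape can be told apart in C follows. It is illustrative only and is not the parser used by cfm-seriation; `split_option` and `is_flag` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split an argument of the form "k=value", where k is a single letter.
 * On success stores the letter in *key and returns a pointer to the value
 * inside arg. Returns NULL for flags such as "P" or for malformed input. */
const char *split_option(const char *arg, char *key)
{
    assert(arg != NULL && key != NULL);
    if (arg[0] == '\0' || arg[1] != '=')
        return NULL;
    *key = arg[0];
    return arg + 2;
}

/* Return 1 if arg is exactly the single-letter flag `flag`, else 0. */
int is_flag(const char *arg, char flag)
{
    assert(arg != NULL);
    return strlen(arg) == 1 && arg[0] == flag;
}
```

A caller would loop over argv[1..argc-1], try `split_option` first, and fall back to `is_flag` for arguments such as P and v.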

If you want a video of the process, add the v option:

./cfm-seriation f=/usr/share/cfm-seriation/data/Homo_sapiens.dat m=3000 P v

Generating the video takes some extra time.

Usage: cfm-seriation [OPTION...]

 Seriation Parameters:
   f=[NETWORK FILE].dat       Network file path name
   o=[ORDER FILE].dat         Apply initial order
   i=[INTERVAL]               Number of isothermal steps
   m=[STEPS]                  Number of steps
   c=[FACTOR]                 Cooling factor
   a=[ALPHA]                  Alpha value
   p=[PERCENTUAL]             Percentual energy for initial temperature
   s=[SEDD]                   Random seed
   P                          Plot graphs
   v                          Generate video

Reading file...
	Proteins: 9684
	Interactions: 163509
Applying random order...
Saving and plotting initial order...
INITIAL Energy: 4123514310
Ordering...
100% [====================================================================================================]

For a quicker test, you can run a smaller dataset, such as Escherichia coli.

./cfm-seriation f=Escherichia_coli.dat

Reading file...
	Proteins: 3598
	Interactions: 13687
Applying random order...
Saving initial order...
INITIAL Energy: 129449102
Ordering...
100% [====================================================================================================]
FINAL Energy: 129025784
Saving final order...
Done!
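The INITIAL and FINAL energies printed in the Escherichia coli run above can be compared as a relative reduction. The helper below is a small sketch for that arithmetic; `relative_reduction` is a hypothetical name and is not part of the package, and the README does not define how the energy itself is computed.

```c
#include <assert.h>

/* Fraction by which the energy dropped between the start and the end of
 * a run: (initial - final) / initial. */
double relative_reduction(long long initial_energy, long long final_energy)
{
    assert(initial_energy > 0);
    return (double)(initial_energy - final_energy) / (double)initial_energy;
}
```

For the run shown, `relative_reduction(129449102LL, 129025784LL)` gives 423318 / 129449102, a reduction of about 0.33% with the default parameters.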

The results are saved in a different directory for each execution.

Further Information

License

The source code is distributed under the terms of the GNU General Public License v3 (GPLv3).

How to Cite this Package

If you are using this package in your research, please cite our paper:

KUENTZER, Felipe A. et al. Optimization and analysis of seriation algorithm for ordering protein networks. 
In: IEEE International Conference on Bioinformatics and Bioengineering (BIBE), 2014. p. 231-237.

Where Seriation is Used

If you are using the Seriation Package, please send an email to alexandre.amory at pucrs.br so we can update this list of users:

Similar Packages
