Martin Asser Hansen edited this page Oct 2, 2015 · 7 revisions

Biopiece: plot_distribution

Description

plot_distribution creates a distribution plot of the values for a specified key from all records in the stream. Plotting is done using GNUplot, which supports several output types; the default is crufty ASCII graphics.

GNUplot must be installed for plot_distribution to work. Read more here:

http://www.gnuplot.info/

The GNUplot gem for Ruby is also required. Install it with: gem install gnuplot

Usage

... | plot_distribution -k <key> [options]

Options

[-?          | --help]               #  Print full usage description.
[-k <string> | --key=<string>]       #  Key to use for plotting.
[-o <file>   | --data_out=<file>]    #  Write result to file.
[-x          | --no_stream]          #  Do not emit records.
[-t <string> | --terminal=<string>]  #  Terminal for output: dumb|post|svg|x11|aqua|png|pdf  -  Default=dumb
[-T <string> | --title=<string>]     #  Set plot title                                       -  Default="Distribution"
[-X <string> | --xlabel=<string>]    #  Set x-axis label                                     -  Default=<key>
[-Y <string> | --ylabel=<string>]    #  Set y-axis label                                     -  Default="n"
[-L          | --logscale_y]         #  Set y-axis to log scale.
[-I <file!>  | --stream_in=<file!>]  #  Read input from stream file                          -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output to stream file                          -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

Here we plot the distribution of sequence lengths from a FASTA file:

read_fasta -i test.fna | plot_distribution -k SEQ_LEN -x

                                  Distribution
       +             +            +            +            +             +
  90 +++-------------+------------+------------+------------+-------------+++
      |                                                                    |
  80 ++                                                                  **++
      |                                                                  **|
  70 ++                                                                  **++
  60 ++                                                                  **++
      |                                                                  **|
  50 ++                                                                  **++
      |                                                                  **|
  40 ++                                                                  **++
      |                                                                  **|
  30 ++                                                                  **++
  20 ++                                                                  **++
      |                                                                  **|
  10 ++                                                                  **++
      |                                                              ******|
   0 +++-------------+------------+**--------**+--***-------+**--**********++
       +             +            +            +            +             +
       0             10           20           30           40            50
                                     SEQ_LEN
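
As a rough illustration of what a plot like the one above summarizes, the sketch below tallies how many sequences have each length. It assumes the distribution is simply a count of records per SEQ_LEN value, as the y-axis label n suggests. The function name seq_len_distribution is hypothetical and not part of Biopieces, and it reads plain FASTA on standard input rather than a Biopieces record stream.

```shell
# Hypothetical helper, not part of Biopieces: tally sequence lengths
# from FASTA text on stdin and print "length<TAB>count" lines.
seq_len_distribution() {
  awk '
    # Header line: close the previous record (if any) and start a new one.
    /^>/ { if (seen) count[len]++; seen = 1; len = 0; next }
    # Sequence line: add its length to the current record.
         { len += length($0) }
    # End of input: close the last record and print all tallies.
    END  { if (seen) count[len]++
           for (l in count) print l "\t" count[l] }
  ' | sort -n   # order the tallies by sequence length
}
```

Running seq_len_distribution < test.fna prints one line per observed length, which corresponds to the x-axis (SEQ_LEN) and y-axis (n) of the plot.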

To render X11 output (i.e. instant view) use -t x11:

read_fasta -i test.fna | plot_distribution -k SEQ_LEN -t x11 -x

To generate a PNG image:

read_fasta -i test.fna | plot_distribution -k SEQ_LEN -t png -o plot_distribution.png -x

The plot is written to the file plot_distribution.png.

If you choose -t svg instead of -t png, the output will be in SVG, which is neat since it can easily be modified using e.g. Inkscape to add labels and such.

Read more about Inkscape here:

http://www.inkscape.org/

See also

read_fasta

plot_histogram

plot_lines

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

May 2011

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

plot_distribution is part of the Biopieces framework.

http://www.biopieces.org
