<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>File: README.DEV</title>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
  <meta http-equiv="Content-Script-Type" content="text/javascript" />
  <link rel="stylesheet" href=".././rdoc-style.css" type="text/css" media="screen" />
  <script type="text/javascript">
  // <![CDATA[

  function popupCode( url ) {
    window.open(url, "Code", "resizable=yes,scrollbars=yes,toolbar=no,status=no,height=150,width=400")
  }

  function toggleCode( id ) {
    if ( document.getElementById )
      elem = document.getElementById( id );
    else if ( document.all )
      elem = eval( "document.all." + id );
    else
      return false;

    elemStyle = elem.style;
    
    if ( elemStyle.display != "block" ) {
      elemStyle.display = "block"
    } else {
      elemStyle.display = "none"
    }

    return true;
  }
  
  // Make codeblocks hidden by default
  document.writeln( "<style type=\"text/css\">div.method-source-code { display: none }</style>" )
  
  // ]]>
  </script>

</head>
<body>


  <div id="fileHeader">
    <h1>README.DEV</h1>
    <table class="header-table">
    <tr class="top-aligned-row">
      <td><strong>Path:</strong></td>
      <td>README.DEV
      </td>
    </tr>
    <tr class="top-aligned-row">
      <td><strong>Last Update:</strong></td>
      <td>Fri Oct 05 18:50:32 +0100 2007</td>
    </tr>
    </table>
  </div>
  <!-- banner header -->

  <div id="bodyContent">


  <div id="contextContent">

    <div id="description">
      <p>
Copyright (C) 2007 Jan Aerts &lt;jan.aerts@bbsrc.ac.uk&gt;
</p>
<h2>README for developers</h2>
<p>
This README is mainly meant to explain how the code works (rather than how
to <em>use</em> the library). It should help if you&#8216;re interested in
contributing, or if you think you found a bug.
</p>
<h3>Overview</h3>
<p>
I&#8216;ve tried to document as much as possible in the code itself; see,
for example, the comments that accompany the setting of the defaults for <a
href="../classes/Bio/Graphics.html">Bio::Graphics</a> in the panel.rb file.
However, the bigger picture cannot be explained that way.
</p>
<h3>The files</h3>
<p>
There&#8216;s one file for each class: panel, track, feature, ruler and
image_map. See the tutorial for a breakdown of what each of these does. All
of these except the image_map make up a picture. The image_map is used to
describe the HTML map that can be created to make a picture clickable.
</p>
<p>
Classes are embedded in each other: instead of
</p>
<pre>
  Bio::Graphics::Panel
  Bio::Graphics::Ruler
  Bio::Graphics::Track
  Bio::Graphics::Feature
</pre>
<p>
we have:
</p>
<pre>
  Bio::Graphics::Panel
  Bio::Graphics::Panel::Ruler
  Bio::Graphics::Panel::Track
  Bio::Graphics::Panel::Track::Feature
</pre>
<p>
There&#8216;s a reason for this. A track can only exist within the confines
of a panel (i.e. a panel is a container for tracks), and a feature can only
exist within the confines of a track. In addition, there are quite a few
cases where the track needs information from the panel, and the features
need information from the track.
</p>
<h3>The workflow</h3>
<h4>1. Creating the panel</h4>
<p>
The user has to start with a
</p>
<pre>
  my_panel = Bio::Graphics::Panel.new(length, width, clickable, display_start, display_stop)
</pre>
<p>
When this happens, among other things, the instance variable @tracks is
created that will later contain the actual Track objects. In addition,
there&#8216;s @number_of_times_bumped. You&#8216;ll later see that each
Track object also has its @number_of_times_bumped. The panel needs this
information to know how far it has to go down before it can start drawing a
track: the first track will be just below the ruler, but the vertical
coordinates of the second one depend on the height of all the ones that
were drawn previously. And <em>that</em> in turn is defined by the number
of times a feature would overlap with another one and therefore had to be
<em>bumped</em> down.
</p>
<p>
@display_start and @display_stop are used for zooming in on a region. Even
though the full @length of the sequence can be really long, setting
@display_start and @display_stop restricts the picture to just that region.
</p>
<p>
Then there is @rescale_factor, which plays a crucial role in drawing the
stuff: it tells the script how many basepairs are contained in one pixel.
This variable will be used <em>very</em> extensively in the drawing code.
</p>
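<p>
As a rough illustration of that idea, here is a minimal sketch of how a
basepairs-per-pixel factor could be computed and then used to turn a
sequence position into a horizontal pixel. The helper names rescale_factor
and bp_to_pixel and the exact formula are assumptions made for this
example; the real calculation lives in panel.rb and may differ in detail.
</p>

```ruby
# Hypothetical helper: number of basepairs represented by one pixel.
# Assumes the factor is simply (region length) / (picture width).
def rescale_factor(display_start, display_stop, width)
  (display_stop - display_start).to_f / width
end

# Hypothetical helper: convert a basepair position to a horizontal pixel,
# measured from the left edge of the displayed region.
def bp_to_pixel(position, display_start, factor)
  ((position - display_start) / factor).round
end

# A 4000 bp region drawn on an 800 pixel wide picture.
factor = rescale_factor(1000, 5000, 800)
pixel = bp_to_pixel(2000, 1000, factor)
puts "#{factor} bp per pixel; position 2000 maps to pixel #{pixel}"
```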
<p>
So this covered the Panel#initialize&#8230;
</p>
<h4>2. Adding tracks to the panel</h4>
<p>
Because tracks are inherently part of a panel and cannot exist on their
own, they can only be created by using a Panel method rather than a Track
method.
</p>
<pre>
  my_track_1 = my_panel.add_track(name, feature_colour = [0,0,1], feature_glyph = 'generic')
</pre>
<p>
This creates a new Track object and adds it to the @tracks array of the
Panel object. Several instance variables are set for the Track object,
including @features (which is an array of Feature objects for that track)
and @number_of_times_bumped. Every time a feature cannot be drawn because
it would overlap with another one, it will be &#8216;bumped&#8217; down
until it can be drawn. This effectively results in <em>rows</em> that
contain the features. The @number_of_times_bumped is just the number of
rows (to be able to calculate the height of the track afterwards). I admit
that this variable should be renamed to something like
@number_of_feature_rows, because the value is actually the
number of times bumped + 1. In the example below, @number_of_times_bumped
is 3 (instead of 2). (I&#8216;ll change that later&#8230;)
</p>
<pre>
  ------------------------------------------------------
    *******    ****  *********         *****    *****
         *****                       ********
                                    **
</pre>
<p>
The Panel#add_track method returns the Track object itself, because the
latter has to be accessible to be able to assign features to it.
</p>
<h4>3. Adding features to a track</h4>
<p>
Same thing as adding a track to a panel: the feature can only be added by
the user by using the Track#add_feature method. Parameters are the name of
the feature, the location and the link.
</p>
<p>
The location of a feature can be something like
&#8216;complement(join(10..20,50..70))&#8217;. To be able to parse this, I
use the Bio::Locations object from bioruby (see <a
href="http://www.bioruby.org">www.bioruby.org</a>). A Bio::Locations
(plural) object contains one or more Bio::Location (singular) objects,
which are the subfeatures: 10..20 and 50..70. It&#8216;s these
Bio::Location objects we use to calculate the ultimate start and stop of
the feature.
</p>
<p>
The Track#add_feature method returns the Track object itself.
</p>
<p>
Now let&#8216;s look at the other end: the Feature object that gets
created. In the Feature#initialize method, you&#8216;ll notice, apart from
the obvious variables, the following instance variables:
@pixel_range_collection, @chopped_at_start, @chopped_at_stop,
@hidden_subfeatures_at_start and @hidden_subfeatures_at_stop. Let&#8216;s
take these one by one:
</p>
<h5>@pixel_range_collection</h5>
<p>
Now <em>this</em> is the crucial bit: it will hold the information on what
pixels (on the horizontal axis) should be covered. This means that any part
of the feature that does not fall within the view is <em>not</em> in this
collection. Basically, for every subfeature (e.g. exon for a gene), the
location of that subfeature is compared to the region of the view. If a
subfeature is not in the view at all, its positions are discarded (but
other stuff does happen, see below); if a subfeature is at the left of the
picture but actually extends outwith the view, the start pixel will become
1. You get the picture. Also see the mini diagrams in the code itself.
</p>
<p>
These start and stop positions are used to create
Bio::Graphics::Panel::Track::PixelRange objects. Unspliced features will
have an array @pixel_range_collection with just one element.
</p>
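<p>
The clipping described above could be sketched roughly as follows. This is
not the library code: the PixelRange struct is a stand-in for
Bio::Graphics::Panel::Track::PixelRange, the method names are made up for
the example, and the pixel convention (leftmost pixel of the view is 1) is
an assumption based on the description.
</p>

```ruby
# Stand-in for Bio::Graphics::Panel::Track::PixelRange (illustration only).
PixelRange = Struct.new(:start_pixel, :stop_pixel)

# Hypothetical conversion: the leftmost pixel of the view is pixel 1.
def to_pixel(position, display_start, factor)
  ((position - display_start) / factor).floor + 1
end

# Clip every subfeature ([start, stop] in basepairs) to the view and convert
# what is left to pixels. Subfeatures entirely out of view are dropped.
def pixel_range_collection(subfeatures, display_start, display_stop, factor)
  ranges = subfeatures.map do |from, to|
    clipped_start = [from, display_start].max
    clipped_stop = [to, display_stop].min
    next if clipped_start > clipped_stop # nothing of it is visible
    PixelRange.new(to_pixel(clipped_start, display_start, factor),
                   to_pixel(clipped_stop, display_start, factor))
  end
  ranges.compact
end

# View 1000..2000 at 5 bp per pixel. The first exon sticks out on the left,
# the second is fully visible, the third lies completely out of view.
exons = [[800, 1100], [1500, 1600], [2500, 2600]]
collection = pixel_range_collection(exons, 1000, 2000, 5.0)
collection.each { |range| puts "#{range.start_pixel}..#{range.stop_pixel}" }
```

<p>
In this sketch the exon that extends past the left edge gets start pixel 1,
and the exon outside the view contributes nothing to the collection.
</p>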
<h5>@chopped_at_start and @chopped_at_stop</h5>
<p>
Suppose you&#8216;ve got a directed feature (so one with an arrow), and the
3&#8217; end falls outside of the view. What would happen is that the
3&#8217; end that&#8216;s out of view would be chopped off (that&#8216;s
good), but also that the end of the glyph (which is <em>not</em> the end of
the feature) becomes an arrow. I don&#8216;t want that. Instead, the arrow
should be removed.
</p>
<p>
That&#8216;s where the @chopped_at_start and @chopped_at_stop come in. If
these are set to true (while building the @pixel_range_collection), the
arrow is not drawn.
</p>
<h5>@hidden_subfeatures_at_start and @hidden_subfeatures_at_stop</h5>
<p>
For spliced features, it might be that one or more of the subfeatures (e.g.
exons) lies outwith the view. We normally draw e.g. genes by drawing the
exons as boxes and connecting them with small lines. The drawing code
itself (see later) takes all exons within view and draws those connections.
However, if an exon is outside of the viewing area, this line is not drawn.
The @hidden_subfeatures_at_start and @hidden_subfeatures_at_stop are just
flags to capture this.
</p>
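<p>
To make the four flags concrete, here is a small sketch of how they could
be derived while walking over the subfeatures. The method name view_flags
and the use of a hash are assumptions for the example (the library stores
these as instance variables of the Feature), and strand is ignored:
&quot;start&quot; simply means the left side of the view here.
</p>

```ruby
# Hypothetical helper: derive the flags for a feature whose subfeatures are
# given as [start, stop] pairs in basepairs.
def view_flags(subfeatures, display_start, display_stop)
  flags = { chopped_at_start: false, chopped_at_stop: false,
            hidden_subfeatures_at_start: false,
            hidden_subfeatures_at_stop: false }
  subfeatures.each do |from, to|
    if display_start > to
      # Entire subfeature lies left of the view.
      flags[:hidden_subfeatures_at_start] = true
    elsif from > display_stop
      # Entire subfeature lies right of the view.
      flags[:hidden_subfeatures_at_stop] = true
    else
      # Partly visible: note on which side it was cut.
      flags[:chopped_at_start] = true if display_start > from
      flags[:chopped_at_stop] = true if to > display_stop
    end
  end
  flags
end

exons = [[800, 900], [950, 1100], [1500, 1600], [1900, 2100]]
flags = view_flags(exons, 1000, 2000)
flags.each { |name, value| puts "#{name}: #{value}" }
```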
<h4>4. Drawing the thing</h4>
<p>
The Cairo library (<a
href="http://cairographics.org">cairographics.org</a>) is used for the
actual drawing. The main concepts in the Cairo drawing model are (please
also see <a
href="http://cairographics.org/tutorial">cairographics.org/tutorial</a>):
</p>
<ul>
<li><b>source</b>: the <em>paint</em> you&#8216;ll be using

</li>
<li><b>destination</b>: the <em>surface</em> (Cairo::ImageSurface) that you
want to draw onto

</li>
<li><b>mask</b>: controls where you apply the source to the destination. Stuff
like &#8216;line_to&#8217;.

</li>
<li><b>context</b>: tracks one source, one mask and one destination.

</li>
</ul>
<p>
From the cairo tutorial: &quot;Before you can start to draw something with
cairo, you need to create the context. &lt;SNIP&gt; When you create a cairo
context, it must be tied to a specific surface - for example, an image
surface if you want to create a PNG file.&quot; So that&#8216;s what we
have to do: create a Cairo::ImageSurface and connect a Cairo::Context to
it.
</p>
<p>
Now let&#8216;s walk through the code itself&#8230;
</p>
<p>
When a user draws a panel, the first thing that happens is the creation of
a Cairo::ImageSurface (the <em>destination</em>). To be able to do this, we
need to know the dimensions. But there&#8216;s a slight problem: we
can&#8216;t know the height of the picture until it&#8216;s actually drawn.
The way we circumvent this is to create a really high
picture (called &quot;huge_panel_drawing&quot;) that we&#8216;ll crop
afterwards.
</p>
<h5>Drawing the ruler</h5>
<p>
A ruler consists of a line with tickmarks on it. The major issue with
drawing the ruler is determining the distance between those ticks. If
we have zoomed into a small region, we&#8216;d still want to see usable
ticks; and if we&#8216;ve zoomed out to a huge region, we don&#8216;t want
to have those ticks all bumping into each other.
</p>
<p>
To calculate the distance between consecutive ticks, we start with a
distance of 1 basepair, and increase it until the minimal distance
criterion is met. We also set the distance between major tickmarks (which
are the ones that will get a number). There&#8216;s a small issue when you
actually start drawing the ticks. Most of the time, we don&#8216;t want the
first tick on the very first basepair of the view. Suppose that would be
position 333 in the sequence. Then the numbers under the major tickmarks
would be: 343, 353, 363, 373 and so on. Instead, we want 350, 360, 370,
380. So we want to find the position of the first tick. If we&#8216;ve
found that one, it&#8216;s simple to add the rest of them.
</p>
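<p>
A minimal sketch of that procedure is shown below. Several details are
assumptions made for the example and are not taken from the library: the
minimum of 5 pixels between ticks, growing the distance by factors of ten,
and the method names. The point is only to show the two steps: find a tick
distance that leaves enough room, then snap the first tick to a round
position at or after the start of the view.
</p>

```ruby
# Hypothetical minimal distance criterion (in pixels).
MIN_PIXELS_BETWEEN_TICKS = 5

# Start at 1 bp and grow the distance until consecutive ticks are far enough
# apart on screen. factor is a Float: basepairs per pixel.
def tick_distance(factor, min_pixels = MIN_PIXELS_BETWEEN_TICKS)
  distance = 1
  distance *= 10 until distance / factor >= min_pixels
  distance
end

# First round position (a multiple of the tick distance) that is not before
# the start of the view.
def first_tick(display_start, distance)
  (display_start.to_f / distance).ceil * distance
end

# All tick positions within the view.
def tick_positions(display_start, display_stop, distance)
  first_tick(display_start, distance).step(display_stop, distance).to_a
end

distance = tick_distance(0.5) # zoomed in: 2 pixels per basepair
ticks = tick_positions(333, 380, distance)
puts "tick distance: #{distance}, ticks: #{ticks.join(', ')}"
```

<p>
With a view starting at position 333, the ticks in this sketch land on
multiples of the tick distance rather than on 333 itself.
</p>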
<p>
The ruler height @height consists of the height of the ruler itself plus
the height of the numbers.
</p>
<h5>Drawing the tracks</h5>
<p>
Drawing each track starts out with the general header: a line above it and
the title. Obviously, the more challenging part is drawing the features
themselves.
</p>
<p>
First thing we have to do is figure out what the <b>vertical coordinates</b>
of the glyph should be (i.e. the row). To keep track of
what parts of the screen are already occupied by features (so that we know
when a new feature has to be bumped down), I make use of a <b>grid</b>. The
grid is basically a hash with the keys being the row number, and the values
arrays of ranges. (These ranges use basepair units rather than pixels, but
that&#8216;s completely arbitrary.) For each feature, we first check if we
can draw it at the top of the track (i.e. row 1); if we can&#8216;t, we move
it down a row at a time until there&#8216;s room for it.
</p>
<p>
So for example, suppose we&#8216;ve already drawn two features that have
the following positions: 100..150 and 200..225. The grid would then look
like this:
</p>
<pre>
  grid = { 1 =&gt; [(100..150),(200..225)] }
</pre>
<p>
If we&#8216;d like to draw a new feature from 125..175 (which overlaps the
first of the two ranges above), we see that row_available becomes false,
and the row number is increased. The grid after adding this feature looks
like:
</p>
<pre>
  grid = { 1 =&gt; [(100..150),(200..225)],
           2 =&gt; [(125..175)] }
</pre>
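<p>
The bumping logic can be sketched in a few lines. The method names
overlap? and place are made up for this example, and the real drawing code
tracks availability with a row_available flag rather than this compact
loop, so treat it as an illustration of the idea only.
</p>

```ruby
# Do two basepair ranges share at least one position?
def overlap?(a, b)
  [a.last, b.last].min >= [a.first, b.first].max
end

# Put the range in the first row where it overlaps nothing, and return that
# row number. Rows are numbered from 1 (the top of the track).
def place(grid, range)
  row = 1
  row += 1 while grid[row].any? { |taken| overlap?(taken, range) }
  grid[row].push(range)
  row
end

# Rows that do not exist yet start out as empty arrays.
grid = Hash.new { |hash, row| hash[row] = [] }

place(grid, 100..150)
place(grid, 200..225)
bumped_row = place(grid, 125..175) # overlaps 100..150, so it is bumped

number_of_rows = grid.keys.max
puts "125..175 went to row #{bumped_row}; the track has #{number_of_rows} rows"
```

<p>
Once all features are placed, the highest row number in the grid is the
number of rows needed to calculate the height of the track.
</p>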
<p>
So now we know what the vertical coordinates of the glyph should be. Next
step is to check if there are reasons we would like to <b>change the
requested glyph type from directed to undirected</b>. If the user asks for directed
glyphs (i.e. ones with an arrow at the end), but the view is zoomed
<em>way</em> out, there&#8216;s no way the arrow will be visible. If
we tried to draw that arrow anyway, it would become bigger than the
feature itself. Another reason would be if the feature&#8216;s 3&#8217; end
extends outwith the picture.
</p>
<p>
Finally, we can <b>draw</b>. The actual drawing bit should be quite
self-explanatory (<em>move_to</em>, <em>line_to</em>, &#8230;).
</p>
<p>
For the spliced features (<em>spliced</em> itself and
<em>directed_spliced</em>), we first draw the components (i.e. the exons)
keeping track of the start and stop positions of the gaps (i.e. introns).
We then add the connections in those gaps. In addition, we draw a line that
extends to the side of the picture if there are exons out of view. This
flag was set when the feature was created (see above:
@hidden_subfeatures_at_start and @hidden_subfeatures_at_stop).
</p>
<p>
When the user wants a clickable map, we also have to record that this
region should be added to the image map.
</p>
<p>
When everything has been drawn, we finally know the number of rows for that
track (i.e. the number_of_times_bumped).
</p>
<h5>Finalizing the panel</h5>
<p>
So now we have a huge panel (see &quot;huge_panel_drawing&quot; above)
which is way too high. This is converted to a panel of the right size by
creating a new panel (i.e. the cairo destination), and then using the huge
panel as a source to be transferred onto that new destination.
</p>
<p>
Then we just write the PNG to a file. If the user wanted a clickable map,
we also create the HTML file.
</p>

    </div>


   </div>


  </div>


    <!-- if includes -->

    <div id="section">


    <!-- if method_list -->


  </div>



</body>
</html>