An efficient peak finder for high-coverage ChIP-seq experiments, written in Python.

The pique package is a high-efficiency peak finder for ChIP-seq
experiments that yield high-coverage alignments to the reference
genome. It was developed for studying gene expression in Halobacterium
salinarum sp. NRC1, sequenced using barcoded 40bp Illumina reads as
part of an ongoing project in Marc Facciotti's lab at UC Davis.

INSTALLING
----------

Do the usual python thing.

    $ cd pique
    $ python setup.py install --prefix=/someplace/PYTHONPATH/knows/about


TESTS
-----

Tests are designed to be used with nose. 

    http://somethingaboutorange.com/mrl/projects/nose/

The tests (stupidly) depend on hard-coded paths to find the test data
files, so you have to run nosetests from the main module directory
(i.e., the one that has setup.py in it).

    $ cd pique
    $ nosetests
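
If you want to write a test that does not care where nosetests is run
from, one option is to resolve data files relative to the test module
rather than the working directory. This is only a sketch: data_path()
and the "data" directory name are hypothetical and are not part of
pique.

```python
import os


def data_path(filename, anchor):
    """Return the absolute path of `filename` inside a 'data' directory
    that sits next to `anchor` (normally the test module's __file__).

    The 'data' directory name is an assumption made for this example.
    """
    here = os.path.dirname(os.path.abspath(anchor))
    return os.path.join(here, "data", filename)
```

A test module would call it as data_path("reads.txt", __file__), so
the result is the same no matter which directory the test runner was
started in.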


USAGE
-----

The workflow pique is designed for goes something like this :

    [1]  Perform a chromatin immunoprecipitation, extract the DNA, put
         it in the sequencing machine, get sequence data back.

    [2]  Align the reads to the reference genome of your organism.

    [3]  Generate coverage tracks for the forward and reverse strands for
         the ChIP data and for the controls. You did remember to do 
         controls, right?

    [4]  Some regions of the genome may be repetitive or ambiguous, such 
         as IS elements. Create a masking file for these regions.

    [5]  Use a genome browser to select some regions that are
         definitely not peaks. Make a bookmark file.

    [6]  Run piquify.py to generate normalized background tracks using
         the bookmarked non-peak regions.

    [7]  Run piquant.py to find the noise threshold.

    [8]  Run pique.py to build a bookmark file of putative enrichment
         regions.

    [9]  View the bookmarks in your favorite genome browser (pique was
         designed around the Gaggle Genome Browser). Delete false 
         positives, manually add false negatives, and adjust any 
         start:stop coordinates that look badly off. Save the bookmark 
         file.

    [10] Run piquancy.py to re-annotate the curated bookmark file with
         enrichment ratios.
    
    [11] Run piquilate.py to re-annotate the curated bookmark file
         with genes likely to be associated with the enrichment site.
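
To make steps [3], [4], [7] and [8] a little more concrete, here is a
minimal single-strand sketch in plain Python. It is not the code pique
runs. The function names, the (start, stop) interval convention with
an exclusive stop, and the "mean plus three standard deviations" noise
threshold are all assumptions made for illustration.

```python
def coverage(reads, genome_length):
    """Per-base coverage from (start, stop) read intervals, stop exclusive."""
    diff = [0] * (genome_length + 1)
    for start, stop in reads:
        diff[start] += 1   # a read begins covering here
        diff[stop] -= 1    # and stops covering here
    track, depth = [], 0
    for delta in diff[:genome_length]:
        depth += delta
        track.append(depth)
    return track


def noise_threshold(track, background_regions, n_sigma=3.0):
    """Mean + n_sigma * stdev of coverage inside bookmarked non-peak
    regions. The choice of statistic is an assumption of this sketch."""
    values = [v for start, stop in background_regions
              for v in track[start:stop]]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean + n_sigma * variance ** 0.5


def enriched_regions(track, threshold, masked=()):
    """Return (start, stop) runs where coverage exceeds the threshold,
    treating masked positions as never enriched."""
    skip = set()
    for start, stop in masked:
        skip.update(range(start, stop))
    regions, begin = [], None
    for i, value in enumerate(track):
        above = value > threshold and i not in skip
        if above and begin is None:
            begin = i
        elif not above and begin is not None:
            regions.append((begin, i))
            begin = None
    if begin is not None:
        regions.append((begin, len(track)))
    return regions
```

In a real run you would build one track per strand for both the ChIP
sample and the control, and the regions returned here would be the
starting point for the manual curation in step [9].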


CAVEATS
-------

Binding coordinates : The peak calling step (pique.py) will attempt to
calculate a binding coordinate for the putative peaks it finds. The
method used is extremely naive, but we find that it generally comes
within about 40bp of the expected binding site.
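
For a feel of what a naive estimate can look like, here is one common
approach, which is not necessarily the one pique.py uses. Forward
strand reads tend to pile up on one side of a binding site and reverse
strand reads on the other, so the midpoint between the two strand
summits is a reasonable first guess. The function name and arguments
below are hypothetical.

```python
def estimate_binding_site(forward, reverse, start, stop):
    """Guess a binding coordinate inside [start, stop) as the midpoint
    between the forward-strand and reverse-strand coverage summits.

    `forward` and `reverse` are per-base coverage lists for each strand.
    This is an illustrative heuristic, not pique's own method.
    """
    positions = range(start, stop)
    forward_summit = max(positions, key=lambda i: forward[i])
    reverse_summit = max(positions, key=lambda i: reverse[i])
    return (forward_summit + reverse_summit) // 2
```

When a strand has a flat summit, max() returns the leftmost position,
which is one of several reasons to treat the result as approximate.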

Double peaks : Some enriched regions contain two binding sites, but
the algorithm is unable to deconvolve them. We have tried running
these regions through several other peak callers with little success.
We have included a script for identifying sites that are likely to be
double peaks to assist with curation.
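
The sketch below shows one simple way such a check could work. It is
not the script shipped with pique. It flags a region when a second
summit is separated from the main summit by a valley that drops below
a fraction of the second summit's height. The function name and both
tuning parameters are assumptions.

```python
def looks_like_double_peak(track, start, stop,
                           dip_fraction=0.5, min_summit_fraction=0.3):
    """Return True if coverage in [start, stop) has a second summit
    separated from the main summit by a sufficiently deep valley.

    A hypothetical heuristic for illustration only.
    """
    region = track[start:stop]
    if len(region) < 3:
        return False
    summit = max(range(len(region)), key=lambda i: region[i])
    # Ignore bumps much smaller than the main summit (tail noise).
    floor = min_summit_fraction * region[summit]
    # Walk outward from the main summit in both directions.
    left = region[:summit + 1][::-1]
    right = region[summit:]
    for side in (left, right):
        lowest = side[0]
        for value in side[1:]:
            lowest = min(lowest, value)
            # Coverage has climbed back up well above the valley floor.
            if value >= floor and lowest < dip_fraction * value:
                return True
    return False
```

Noisy tracks would probably need smoothing first, and the two
fractions would need tuning against regions you have already curated
by hand.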

CONFIG FILES
------------

INSPIRATIONAL QUOTE
-------------------

I met men at every turn who owned from one thousand to thirty thousand
"feet" in undeveloped silver mines, every single foot of which they
believed would shortly be worth from fifty to a thousand dollars—and
as often as any other way they were men who had not twenty-five
dollars in the world. Every man you met had his new mine to boast of,
and his "specimens" ready; and if the opportunity offered, he would
infallibly back you into a corner and offer as a favor to you, not to
him, to part with just a few feet in the "Golden Age," or the "Sarah
Jane," or some other unknown stack of croppings, for money enough to
get a "square meal" with, as the phrase went. And you were never to
reveal that he had made you the offer at such a ruinous price, for it
was only out of friendship for you that he was willing to make the
sacrifice. Then he would fish a piece of rock out of his pocket, and
after looking mysteriously around as if he feared he might be waylaid
and robbed if caught with such wealth in his possession, he would dab
the rock against his tongue, clap an eyeglass to it, and exclaim:

"Look at that! Right there in that red dirt! See it? See the specks of
gold? And the streak of silver? That's from the Uncle Abe. There's a
hundred thousand tons like that in sight! Right in sight, mind you!
And when we get down on it and the ledge comes in solid, it will be
the richest thing in the world! Look at the assay! I don't want you to
believe me—look at the assay!"

Then he would get out a greasy sheet of paper which showed that the
portion of rock assayed had given evidence of containing silver and
gold in the proportion of so many hundreds or thousands of dollars to
the ton.

I little knew, then, that the custom was to hunt out the richest piece
of rock and get it assayed! Very often, that piece, the size of a
filbert, was the only fragment in a ton that had a particle of metal
in it—and yet the assay made it pretend to represent the average value
of the ton of rubbish it came from!

    - Mark Twain, Roughing It


LEGAL STUFF
-----------

Copyright (c) 2011, Russell Neches
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

    * Redistributions of source code must retain the above copyright 
      notice, this list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in 
      the documentation and/or other materials provided with the 
      distribution.

    * Neither the name of the University of California, Davis nor the 
      names of its contributors may be used to endorse or promote 
      products derived from this software without specific prior 
      written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
