[MIM: Motif Independent Metric - Luca Pinello]

This software calculates a measure of sequence specificity called the Motif Independent Metric (MIM).

[INSTALLATION AND REQUIREMENTS]

In order to use the MIM software you need:

-Python 2.7 (http://www.python.org/getit/)

and the following modules:

-Numpy (http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/)
-Scipy (http://sourceforge.net/projects/scipy/files/scipy/0.9.0/)

Note: 
If you are using the 64-bit version of Python for Windows, you must download the 64-bit versions of the modules from:

http://www.lfd.uci.edu/~gohlke/pythonlibs/

[USAGE]

The program takes as input:

1) A valid BED file (http://genome.ucsc.edu/FAQ/FAQformat.html#format1) containing the coordinates of your sequences on a reference genome

2) The FASTA files of that reference genome (http://hgdownload.cse.ucsc.edu/downloads.html)
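
As an illustration of how the two inputs relate, here is a minimal sketch of reading BED coordinates and slicing the matching sequence. It is not taken from mim.py: the helper names (parse_bed_line, extract) and the toy data are hypothetical, and the way mim.py actually loads the genome is not shown here. The only fact it relies on is that BED coordinates are 0-based and half-open.

```python
def parse_bed_line(line):
    """Return (chrom, start, end) from one BED line, or None for header/blank lines."""
    line = line.strip()
    if not line or line.startswith(("track", "browser", "#")):
        return None
    fields = line.split()
    # Only the first three BED columns are needed to locate a sequence.
    return fields[0], int(fields[1]), int(fields[2])


def extract(chrom_seq, start, end):
    """BED coordinates are 0-based and half-open: start is included, end is not."""
    return chrom_seq[start:end]


# Toy data standing in for a BED file and a chromosome sequence (assumption).
bed_text = "track name=example\nchr1\t2\t6\tpeak1\nchr1\t0\t3\n"
toy_genome = {"chr1": "ACGTACGTAC"}

regions = [r for r in map(parse_bed_line, bed_text.splitlines()) if r is not None]
sequences = [extract(toy_genome[c], s, e) for c, s, e in regions]
```

Because the end coordinate is exclusive, a region "chr1 2 6" yields 4 bases, not 5.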

To launch the program from a shell, type:

python mim.py bed_file.bed /path_to_your_genome_directory/

To see a list of the optional arguments with a short explanation, type:

python mim.py

At the end of the execution, the program reports the MIM value for the sequences extracted from the BED file, together with its p-value.

[TESTING EXAMPLE]

1) Download the human genome (hg18) FASTA files from http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip
and extract all the files from chromFa.zip into a directory (for example hg18_directory).

2) Run the program with the following command:

python mim.py sample_hg18.bed hg18_directory

Notes: If you want to build a more reliable null model, you can use the parameter --null_rep to increase the number of null samples and therefore the sampling accuracy (the default value is 1000):

python mim.py sample_hg18.bed hg18_directory --null_rep 5000
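
To see why more null replicates can help, here is a minimal sketch of one common way to turn null replicates into an empirical p-value. This is an assumption for illustration only: the exact formula used by mim.py is not documented here, and both function names and the "larger is more extreme" direction are hypothetical.

```python
def empirical_p_value(observed, null_values):
    """Fraction of null replicates at least as large as the observed score.

    The +1 terms count the observed value itself, so the estimate is never zero.
    """
    hits = sum(1 for v in null_values if v >= observed)
    return (hits + 1.0) / (len(null_values) + 1.0)


def smallest_p(n_replicates):
    """Smallest p-value this estimator can report with n_replicates null samples."""
    return 1.0 / (n_replicates + 1.0)
```

Under this estimator, 1000 replicates cannot resolve p-values below about 0.001, while 5000 replicates reach about 0.0002, at the cost of a longer run.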

You can also use a null model that shuffles the input sequences, instead of randomly sampling sequences from the genome, with the optional parameter --null_model:

python mim.py sample_hg18.bed hg18_directory --null_model shuffle
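
The difference between the two kinds of null model can be sketched as follows. This is an illustrative assumption, not the code in mim.py: the function names are hypothetical, and the real program may shuffle or sample differently (for example, it may preserve more than single-base composition).

```python
import random


def shuffle_sequence(seq, rng):
    """Permute the bases of seq: composition is kept, base order is destroyed."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def sample_region(chrom_seq, length, rng):
    """Draw a random window of the given length from a chromosome sequence."""
    start = rng.randrange(len(chrom_seq) - length + 1)
    return chrom_seq[start:start + length]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = "ACGTTGCAAC"
toy_chromosome = "ACGTACGTACGTTTGA"  # stand-in for a real chromosome

shuffled = shuffle_sequence(original, rng)
sampled = sample_region(toy_chromosome, len(original), rng)
```

A shuffled null keeps the base composition of each input sequence, whereas a sampled null reflects the composition of the genome as a whole.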


