Motif Independent Metric
[MIM: Motif Independent Metric - Luca Pinello]

This software calculates a measure of sequence specificity called the Motif Independent Metric (MIM).

[INSTALLATION AND REQUIREMENTS]

To use the MIM software you need:

- Python 2.7 (http://www.python.org/getit/)

and the following modules:

- Numpy (http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/)
- Scipy (http://sourceforge.net/projects/scipy/files/scipy/0.9.0/)

Note: If you are using the Windows 64 bit version of Python, you must download the 64 bit version of the modules from: http://www.lfd.uci.edu/~gohlke/pythonlibs/

[USAGE]

The program takes as input:

1) A proper BED file (http://genome.ucsc.edu/FAQ/FAQformat.html#format1) containing the coordinates of your sequences for some reference genome
2) The fasta files of your reference genome (http://hgdownload.cse.ucsc.edu/downloads.html)

To launch the program from a shell, type:

    python mim.py bed_file.bed /path_to_your_genome_directory/

To see a list of the optional arguments with a short explanation, type:

    python mim.py

At the end of the execution the program reports the MIM value for the sequences extracted from the BED file and its p-value.

[TESTING EXAMPLE]

1) Download the human genome (hg18) fasta files from here: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip and extract all the files from chromFa.zip into a directory (for example hg18_directory).
2) Run the program with the following command:

    python mim.py sample_hg18.bed hg18_directory

Notes:

If you want to build a more reliable null model, you can use the parameter --null_rep to increase the sampling accuracy (the default value is 1000):

    python mim.py sample_hg18.bed hg18_directory --null_rep 5000

You can also use a null model that shuffles the input sequences instead of randomly sampling sequences from the genome, with the optional parameter --null_model:

    python mim.py sample_hg18.bed hg18_directory --null_model shuffle
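
[CHECKING YOUR BED FILE]

Because the program expects a proper BED file, it can help to check your coordinates before launching a long run. The sketch below is not part of mim.py and does not reflect how mim.py parses its input. The helper names parse_bed_line and read_bed are hypothetical. It only assumes the three mandatory BED fields (chromosome, start, end) and, as a further assumption, treats empty or reversed intervals as errors, since no sequence could be extracted from them.

```python
def parse_bed_line(line):
    """Return (chrom, start, end) for one BED line, or None for blank/header lines."""
    line = line.strip()
    # Skip blank lines, comments, and UCSC "track"/"browser" header lines.
    if not line or line.startswith(("#", "track", "browser")):
        return None
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("BED line needs at least 3 fields: %r" % line)
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Assumption: intervals must be non-empty and start at a non-negative position.
    if start < 0 or end <= start:
        raise ValueError("invalid interval: %r" % line)
    return chrom, start, end


def read_bed(lines):
    """Collect the valid intervals from an iterable of BED lines."""
    intervals = []
    for line in lines:
        parsed = parse_bed_line(line)
        if parsed is not None:
            intervals.append(parsed)
    return intervals


# Small in-memory example; extra columns after the third are ignored.
example = [
    "track name=example",
    "chr1\t100\t200",
    "chr2\t5000\t5600\tpeak_2",
]
intervals = read_bed(example)
print(intervals)
```

To check a real file, pass an open file handle instead of the list, for example read_bed(open("sample_hg18.bed")). A ValueError then points at the first line that needs fixing.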