Meta gene plotting with conflicting trend filtering

MetageneCluster

MetageneCluster generates metagene analysis plots for a given feature within a .gff/.gtf file when paired with a corresponding SAM file.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact

About The Project

Built With

  • Python 3
  • Matplotlib
  • NumPy

(back to top)

Getting Started

Python 3 is required to install and run this program.

Prerequisites

Run in a terminal:

pip3 install matplotlib numpy

Installation

Clone the repo:

git clone https://github.com/aasaporito/MetageneCluster.git

(back to top)

Usage

To run, open a terminal in the MetageneCluster directory.

Examples:

Minimal parameters:

  python3 run.py RNA_seq.sam hg38.gff CDS 500 1000

Full parameters:

  python3 run.py -c -s -r H3K36me3.sam input.sam TAIR9.gff CDS 500 1000 0.25

Parameters

Argument: Function

  • -c, -u, -cu: Controls clustering by similarity. Use -c to produce clustered metagene plots, -u to produce a single unclustered metagene plot, or -cu to produce both unclustered and clustered plots. Default: -c.
  • -s, -m: Controls whether clustering uses shape only or also magnitude. Use -s to cluster by the overall shape of the plot, or -m to take signal magnitude into account when clustering. Has no effect in unclustered mode. Default: -s.
  • -r, -R: Computes and plots the ratio of the first alignment file to the second. If -r or -R is set, it must be followed by two .sam files. Default: disabled.
  • file_name.sam: Path to the aligned input .sam file. Must be two files separated by whitespace if -r is enabled.
  • file_name.gff: Path to the input annotation file.
  • feature: The feature to build metagene plots from, e.g. gene or CDS.
  • streamDistance: Integer distance upstream and downstream of the feature of interest to include in the plot. Included for context only; not used for calculating which features cluster together.
  • norm_length: Integer length, in nucleotides, that features are normalized to.
  • dist_reduct: Determines how many clusters, k, to build. The cluster number is selected when the reduction in total distance relative to the previous cluster number, k-1, is smaller than this value. Must be between 0 and 1. Default: 0.25.
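The roles of norm_length, -s/-m and dist_reduct can be illustrated with a small, self-contained sketch. This is not the code in run.py; the function names (resample, shape_only, choose_k) are hypothetical, and three choices are assumptions made only for illustration: features are normalized by linear interpolation, shape-only clustering is modeled by scaling each profile to a total of 1, and the "reduction" for dist_reduct is measured as a fraction of the previous total distance.

```python
def resample(coverage, norm_length):
    """Stretch or shrink a per-base coverage list to norm_length points.

    Assumption: linear interpolation; the README does not say how
    features are normalized to norm_length.
    """
    n = len(coverage)
    if n == 0 or norm_length <= 0:
        raise ValueError("coverage must be non-empty and norm_length positive")
    if n == 1 or norm_length == 1:
        return [float(coverage[0])] * norm_length
    step = (n - 1) / (norm_length - 1)
    out = []
    for i in range(norm_length):
        pos = i * step
        lo = min(int(pos), n - 2)   # left neighbour, clamped at the end
        frac = pos - lo
        out.append(coverage[lo] * (1 - frac) + coverage[lo + 1] * frac)
    return out


def shape_only(profile):
    """Remove magnitude so only the shape is compared (models -s).

    Assumption: scaling to a total of 1; -m would skip this step.
    """
    total = sum(profile)
    if total == 0:
        return list(profile)
    return [value / total for value in profile]


def choose_k(total_distances, dist_reduct=0.25):
    """Pick k from total distances, where total_distances[i] is for k = i + 1.

    Assumption: k is chosen as the first cluster number whose fractional
    reduction in total distance relative to k-1 is below dist_reduct.
    """
    for k in range(2, len(total_distances) + 1):
        prev, cur = total_distances[k - 2], total_distances[k - 1]
        if prev <= 0:
            return k - 1            # nothing left to reduce
        if (prev - cur) / prev < dist_reduct:
            return k
    return len(total_distances)
```

Under these assumptions, two features with the same profile at different signal levels (for example [1, 1, 2] and [2, 2, 4]) become identical after shape_only, which is the intuition behind clustering with -s rather than -m.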

The program will store all generated output in ~/MetageneCluster/Outputs/
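For batch runs, the command line described above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of MetageneCluster; it only arranges the documented flags and positional arguments in the order used in the examples and checks the constraints stated in the parameter list (two .sam files with -r, dist_reduct between 0 and 1).

```python
def build_command(sam_files, gff, feature, stream_distance, norm_length,
                  dist_reduct=None, cluster_flag=None, shape_flag=None,
                  ratio=False):
    """Return an argument list for run.py (hypothetical helper)."""
    expected = 2 if ratio else 1
    if len(sam_files) != expected:
        raise ValueError(f"expected {expected} .sam file(s), got {len(sam_files)}")
    if cluster_flag is not None and cluster_flag not in {"-c", "-u", "-cu"}:
        raise ValueError("cluster_flag must be -c, -u or -cu")
    if shape_flag is not None and shape_flag not in {"-s", "-m"}:
        raise ValueError("shape_flag must be -s or -m")
    if dist_reduct is not None and not 0 <= dist_reduct <= 1:
        raise ValueError("dist_reduct must be between 0 and 1")

    cmd = ["python3", "run.py"]
    # Optional flags come first, in the order shown in the README example.
    if cluster_flag is not None:
        cmd.append(cluster_flag)
    if shape_flag is not None:
        cmd.append(shape_flag)
    if ratio:
        cmd.append("-r")
    # Positional arguments: alignment file(s), annotation, feature, distances.
    cmd += list(sam_files)
    cmd += [gff, feature, str(stream_distance), str(norm_length)]
    if dist_reduct is not None:
        cmd.append(str(dist_reduct))
    return cmd
```

The returned list can be passed to subprocess.run with cwd set to the MetageneCluster directory, since the program is run from there.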

(back to top)

License

Distributed under the Mozilla Public License Version 2.0. See LICENSE.txt for more information.

(back to top)

Contact

Primary Author:

Clayton Carter - LinkedIn - GitHub

Current Maintainer:

Aaron Saporito - LinkedIn - GitHub

Project Link: https://github.com/aasaporito/MetageneCluster

(back to top)
