PINTS is available on PyPI, which means you can install it with the following command:
pip install pyPINTS
Alternatively, you can clone this repo to a local directory, then in the directory, run the following command:
python setup.py install
Python packages
- biopython
- matplotlib
- numpy
- pandas
- pybedtools
- pyBigWig
- pysam
- requests
- scipy
- statsmodels
PINTS can call peaks directly from BAM files. To do so,
you need to provide the tool with the path to the BAM file and the type of experiment it came from.
If it's from a standard protocol, like PROcap, then you can set --exp-type PROcap.
Other supported experiments include GROcap, CoPRO, csRNAseq, NETCAGE, CAGE, RAMPAGE, and STRIPEseq. For a comprehensive list of directly supported assays, please run
pints_caller --help
If the data was generated by other methods, you need to tell the tool where it can find the ends of the RNAs you are interested in.
For example, --exp-type R_5 tells the tool that:
- this alignment is from a single-end library;
- the tool should look at the 5' end of reads.
Other supported values are R_3, R1_5, R1_3, R2_5, and R2_3.
If reads represent the reverse complement of original RNAs, like PROseq, then you need to use --reverse-complement
(not necessary for standard protocols).
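The options described above can also be assembled from a script. The following is a minimal sketch, not part of PINTS: `build_pints_command` is a hypothetical helper that only builds the argument list from flags mentioned in this README, and it does not run the tool.

```python
import shlex


def build_pints_command(bam_files, save_to, file_prefix, exp_type,
                        reverse_complement=False, threads=1):
    """Assemble a pints_caller argument list (hypothetical helper, not part of PINTS)."""
    cmd = [
        "pints_caller",
        "--bam-file", *bam_files,
        "--save-to", save_to,
        "--file-prefix", file_prefix,
        "--exp-type", exp_type,
        "--thread", str(threads),
    ]
    # Only add the switch when reads are the reverse complement of the RNAs.
    if reverse_complement:
        cmd.append("--reverse-complement")
    return cmd


cmd = build_pints_command(["input.bam"], "output_dir", "output_prefix",
                          exp_type="R_5", reverse_complement=True, threads=16)
print(shlex.join(cmd))
```

Once PINTS is installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`.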
One example for calling peaks from a BAM file:
pints_caller --bam-file input.bam --save-to output_dir --file-prefix output_prefix --thread 16 --exp-type PROcap
Or you can call peaks from BigWig files:
pints_caller --save-to output_dir --file-prefix output_prefix --bw-pl path_to_pl.bw --bw-mn path_to_mn.bw --thread 16
The following files will be generated:
- prefix+_{SID}_divergent_peaks.bed: Divergent TREs;
- prefix+_{SID}_bidirectional_peaks.bed: Bidirectional TREs (divergent + convergent);
- prefix+_{SID}_unidirectional_peaks.bed: Unidirectional TREs, maybe lncRNAs transcribed from enhancers (e-lncRNAs) as suggested here.
{SID} will be replaced with the ordinal number (index) of the sample that the peaks are called from.
If you only provide PINTS with one sample, then {SID} will be replaced with 1.
If you use PINTS with three replicates (--bam-file A.bam B.bam C.bam), then {SID} for peaks identified from A.bam will be replaced with 1.
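As a rough illustration of the naming scheme, the sketch below lists the file names one might expect for a given prefix. `expected_outputs` is a hypothetical helper, and it assumes that samples are numbered 1, 2, 3, ... in the order the inputs are given; this README only confirms that the first sample gets 1.

```python
from pathlib import Path

PEAK_TYPES = ("divergent", "bidirectional", "unidirectional")


def expected_outputs(save_to, file_prefix, n_samples):
    """Map each sample index to its expected peak files (hypothetical helper)."""
    outputs = {}
    # Assumption: sample IDs run from 1 to n_samples in input order.
    for sid in range(1, n_samples + 1):
        outputs[sid] = {
            kind: Path(save_to) / f"{file_prefix}_{sid}_{kind}_peaks.bed"
            for kind in PEAK_TYPES
        }
    return outputs


for sid, files in expected_outputs("output_dir", "output_prefix", 3).items():
    for kind, path in files.items():
        print(sid, kind, path)
```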
For divergent or bidirectional TREs, there will be 6 columns in the outputs:
- Chromosome
- Start site: 0-based
- End site: 0-based
- Confidence about the peak pair. Can be:
  - Stringent(qval), which means the two peaks on both forward and reverse strands are significant based on their q-values;
  - Stringent(pval), which means one peak is significant according to q-value while the other one is significant according to p-value;
  - Relaxed, which means only one peak is significant in the pair.
  - A combination of the three types above, because of overlap for nearby elements.
  - If epigenomic annotation is enabled by --epig-annotation <biosample>, then peaks that are less significant (--relaxed-fdr-target, default is 2*fdr_target) but overlap with epigenomic annotations from the PINTS web server will be listed with the confidence level Marginal.
- Major TSSs on the forward strand; if there are multiple major TSSs, they will be separated by commas (,)
- Major TSSs on the reverse strand; if there are multiple major TSSs, they will be separated by commas (,)
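The six-column layout above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the file is tab-delimited, the TSS columns hold integer coordinates, and the dictionary keys are names chosen here for illustration. The example record is invented, not real PINTS output.

```python
import csv
import io


def parse_bidirectional(handle):
    """Parse divergent/bidirectional peak records (column names are illustrative)."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6 or row[0].startswith("#"):
            continue
        chrom, start, end, confidence, fwd_tss, rev_tss = row[:6]
        records.append({
            "chrom": chrom,
            "start": int(start),
            "end": int(end),
            "confidence": confidence,
            # Multiple major TSSs are comma-separated within one column.
            "fwd_tss": [int(p) for p in fwd_tss.split(",") if p],
            "rev_tss": [int(p) for p in rev_tss.split(",") if p],
            # Any further columns (e.g. epigenomic annotation) are kept as-is.
            "extra": row[6:],
        })
    return records


# Invented record, for illustration only.
example = io.StringIO("chr1\t1000\t1300\tStringent(qval)\t1210,1225\t1080\n")
for rec in parse_bidirectional(example):
    print(rec["chrom"], rec["confidence"], rec["fwd_tss"], rec["rev_tss"])
```

To read a real file, pass an open handle instead, for example `open(path, newline="")`.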
For unidirectional TREs, there will be 9 columns in the output:
- Chromosome
- Start
- End
- Peak ID
- Q-value
- Strand
- Read counts
- Position of the summit TSS
- Height of the summit
For all three types of TREs, if a valid biosample name for --epig-annotation is provided, then an additional column with epigenomic annotation for each TRE will show up in the final output.
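The nine-column layout for unidirectional TREs, with the optional annotation column, could be parsed as sketched below. The field names are chosen here for illustration, and the sketch assumes tab-delimited input with numeric q-values, read counts, and summit heights. The example records are invented.

```python
import csv
import io
from typing import NamedTuple, Optional


class UnidirectionalPeak(NamedTuple):
    chrom: str
    start: int
    end: int
    peak_id: str
    q_value: float
    strand: str
    read_count: float
    summit_pos: int
    summit_height: float
    annotation: Optional[str] = None  # present only with --epig-annotation


def parse_unidirectional(handle):
    """Parse unidirectional peak records (field names are illustrative)."""
    peaks = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 9:
            continue
        peaks.append(UnidirectionalPeak(
            row[0], int(row[1]), int(row[2]), row[3], float(row[4]), row[5],
            float(row[6]), int(row[7]), float(row[8]),
            row[9] if len(row) > 9 else None,
        ))
    return peaks


# Invented records, for illustration only.
example = io.StringIO(
    "chr1\t500\t620\tpeak_1\t0.001\t+\t42\t560\t12\n"
    "chr2\t900\t980\tpeak_2\t0.05\t-\t17\t930\t6\tsome_annotation\n"
)
for peak in parse_unidirectional(example):
    print(peak.peak_id, peak.strand, peak.q_value, peak.annotation)
```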
- If you want to use BAM files as inputs:
  - --bam-file: input bam file(s);
  - --exp-type: Type of experiment. If the experiment is not listed as a choice, or you know the position of RNA ends on the reads and you want to override the defaults, you can specify:
    - R_5 (5' of the read for single-end lib),
    - R_3 (3' of the read for single-end lib),
    - R1_5 (5' of the read1 for paired-end lib),
    - R1_3 (3' of the read1 for paired-end lib),
    - R2_5 (5' of the read2 for paired-end lib),
    - or R2_3 (3' of the read2 for paired-end lib)
  - --reverse-complement: Set this switch if 1) exp-type is Rx_x and 2) reads in this library represent the reverse complement of RNAs, like PROseq;
  - --ct-bam: Bam file for input/control (optional);
- If you want to use bigwig files as inputs:
  - --bw-pl: Bigwig for signals on the forward strand;
  - --bw-mn: Bigwig for signals on the reverse strand;
  - --ct-bw-pl: Bigwig for input/control signals on the forward strand (optional);
  - --ct-bw-mn: Bigwig for input/control signals on the reverse strand (optional);
- --save-to: save peaks to this path (a folder), by default, current folder
- --file-prefix: prefix to all outputs
- --epig-annotation <biosample>: Use this option together with the name of the biosample that the library was derived from, for example K562; then epigenomic annotations will be downloaded from the PINTS web server and used for annotating and augmenting TREs identified by PINTS (for hg38 only);
- --relaxed-fdr-target <relaxed fdr>: In the presence of --epig-annotation, peaks that do not pass the original FDR cutoff but pass this relaxed cutoff and have support from DNase-seq and H3K27ac ChIP-seq will also be included in final outputs. By default, 2*fdr;
- --mapq-threshold <min mapq>: Minimum mapping quality, by default: 30 or None;
- --close-threshold <close distance>: Distance threshold for two peaks (on opposite strands) to be merged, by default: 300;
- --fdr-target <fdr>: FDR target for multiple testing, by default: 0.1;
- --chromosome-start-with <chromosome prefix>: Only keep reads mapped to chromosomes with this prefix, if it's set to None, then all reads will be analyzed;
- --thread <n thread>: Max number of threads the tool can create;
- --borrow-info-reps: Borrow information from reps to refine calling of divergent elements;
- --output-diagnostic-plot: Save diagnostic plots (independent filtering and pval dist) to local folder
More parameters can be seen by running pints_caller -h.
- pints_boundary_extender: Extend peaks from summits.
- pints_visualizer: Generate bigwig files for the inputs.
- pints_normalizer: Normalize inputs.
- Be cautious with reads mapped to scaffolds instead of main chromosomes (for example the notorious chrUn_gl000220 in hg19); they may be rRNA contamination!
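One way to spot such scaffolds is to compare chromosome names against a pattern for main chromosomes. The sketch below is illustrative only: it assumes UCSC-style names (chr1..chr22, chrX, chrY, chrM), and `split_by_chromosome` is a hypothetical helper, not a PINTS function. Note that a prefix such as chr passed to --chromosome-start-with would presumably still match chrUn_gl000220, since that name also begins with chr.

```python
import re

# Assumption: UCSC-style names for main chromosomes.
MAIN_CHROM = re.compile(r"^chr(\d+|X|Y|M)$")


def split_by_chromosome(chroms):
    """Separate main chromosomes from scaffolds and other contigs."""
    main, other = [], []
    for name in chroms:
        (main if MAIN_CHROM.match(name) else other).append(name)
    return main, other


names = ["chr1", "chrX", "chrUn_gl000220", "chr1_gl000191_random"]
main, other = split_by_chromosome(names)
print("main chromosomes:", main)
print("scaffolds/other:", other)
```

The same check could be applied to the first column of the peak files to flag TREs that fall on scaffolds.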
Please submit an issue with any questions or if you experience any issues/bugs. If you use PINTS in your work, please cite: https://www.nature.com/articles/s41587-022-01211-7.