# Comp_Report API usage

## Import module

In [1]:
# Import main module 
from pycoMeth.Comp_Report import Comp_Report

# Optionally import Jupyter helper functions
from pycoMeth.common import head, jhelp

## Getting help

In [2]:
jhelp(Comp_Report)

**Comp_Report** (methcomp_fn, gff3_fn, ref_fasta_fn, outdir, n_top, max_tss_distance, pvalue_threshold, min_diff_llr, n_len_bin, verbose, quiet, progress, kwargs)

Generate an HTML report of significantly differentially methylated CpG intervals from `Meth_Comp` text output. Significant intervals are annotated with their closest transcript TSS.

---

* **methcomp_fn** (required) [str]

Input TSV file generated by Meth_Comp (can be gzipped). At the moment, only data binned by intervals with Interval_Aggregate is supported.

* **gff3_fn** (required) [str]

Path to an **Ensembl GFF3** file containing genomic annotations. Only transcript details are extracted.

* **ref_fasta_fn** (required) [str]

Reference file used for alignment, in FASTA format (ideally already indexed with samtools faidx).

* **outdir** (default: "") [str]

Directory in which to write the HTML reports. Defaults to the current directory.

* **n_top** (default: 100) [int]

Number of top interval candidates for which to generate an interval report. If there are not enough significant candidates this is automatically scaled down.

* **max_tss_distance** (default: 500000) [int]

Maximal distance to a transcription start site (TSS) when searching for transcripts close to interval candidates

* **pvalue_threshold** (default: 0.01) [float]

p-value cutoff for top interval candidates (applied to the adjusted p-value)

* **min_diff_llr** (default: 1) [float]

Minimal llr boundary for negative and positive median llr. 1 is recommended for visualization purposes.

* **n_len_bin** (default: 500) [int]

Number of genomic intervals for the longest chromosome of the ideogram figure

* **verbose** (default: False) [bool]

* **quiet** (default: False) [bool]

* **progress** (default: False) [bool]

* **kwargs**
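
To make the interplay between `pvalue_threshold`, `n_top` and `max_tss_distance` more concrete, here is a minimal sketch in plain Python. It illustrates the selection logic as the parameter descriptions state it and is not pycoMeth's actual implementation. The `Interval` record, the helper names `select_top` and `closest_tss`, and the choice to measure distance from the interval midpoint are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Interval:
    # Hypothetical record for illustration; not pycoMeth's internal data model
    chrom: str
    start: int
    end: int
    adj_pvalue: float


def select_top(intervals: List[Interval], pvalue_threshold: float = 0.01, n_top: int = 100) -> List[Interval]:
    """Keep intervals strictly below the cutoff, most significant first, capped at n_top."""
    significant = sorted(
        (iv for iv in intervals if iv.adj_pvalue < pvalue_threshold),
        key=lambda iv: iv.adj_pvalue,
    )
    # Slicing naturally "scales down" when fewer than n_top intervals are significant
    return significant[:n_top]


def closest_tss(
    interval: Interval,
    tss_positions: List[Tuple[str, int]],
    max_tss_distance: int = 500000,
) -> Optional[Tuple[int, int]]:
    """Return (distance, tss_position) of the nearest TSS on the same chromosome, or None."""
    # Assumption: distance is measured from the interval midpoint
    midpoint = (interval.start + interval.end) // 2
    candidates = [
        (abs(pos - midpoint), pos)
        for chrom, pos in tss_positions
        if chrom == interval.chrom and abs(pos - midpoint) <= max_tss_distance
    ]
    return min(candidates) if candidates else None
```

With this sketch, lowering `pvalue_threshold` shrinks the pool of candidates, `n_top` only caps how many of them get a report, and `max_tss_distance` decides whether a candidate gets a TSS annotation at all.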



## Example usage

#### Example with a single significant result

In [5]:
Comp_Report(
    methcomp_fn="./data/Yeast_CGI_meth_comp.tsv.gz",
    gff3_fn="./data/yeast.gff3",
    ref_fasta_fn="./data/yeast.fa",
    outdir="yeast_html",
    pvalue_threshold=0.05,
    verbose=True)

## Checking options and input files ##
	[DEBUG]: Options summary
	[DEBUG]: 	Package name: pycoMeth
	[DEBUG]: 	Package version: 0.4.3
	[DEBUG]: 	Timestamp: 2020-05-18 23:03:11.505853
	[DEBUG]: 	methcomp_fn: ./data/Yeast_CGI_meth_comp.tsv.gz
	[DEBUG]: 	gff3_fn: ./data/yeast.gff3
	[DEBUG]: 	ref_fasta_fn: ./data/yeast.fa
	[DEBUG]: 	outdir: yeast_html
	[DEBUG]: 	n_top: 100
	[DEBUG]: 	max_tss_distance: 500000
	[DEBUG]: 	pvalue_threshold: 0.05
	[DEBUG]: 	min_diff_llr: 1
	[DEBUG]: 	n_len_bin: 1000
	[DEBUG]: 	verbose: True
	[DEBUG]: 	quiet: False
	[DEBUG]: 	progress: False
	[DEBUG]: 	kwargs
## Loading and preparing data ##
	Loading Methcomp data from TSV file
	Loading transcripts info from GFF file
	Loading chromosome info from reference FASTA file
	Number of significant intervals found (a
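
Since `Comp_Report` depends on three input files, a quick pre-flight check can catch a mistyped path before the run starts. The sketch below uses only the standard library. `check_inputs` is a hypothetical helper, not part of pycoMeth. The only external fact it relies on is that `samtools faidx` writes its index next to the FASTA file as `<fasta>.fai`.

```python
from pathlib import Path
from typing import List, Tuple


def check_inputs(methcomp_fn: str, gff3_fn: str, ref_fasta_fn: str) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) describing problems with the input paths."""
    errors: List[str] = []
    warnings: List[str] = []

    named_paths = {
        "methcomp_fn": methcomp_fn,
        "gff3_fn": gff3_fn,
        "ref_fasta_fn": ref_fasta_fn,
    }
    for name, path in named_paths.items():
        if not Path(path).is_file():
            errors.append(f"{name}: no such file: {path}")

    # samtools faidx writes its index next to the FASTA file, as <fasta>.fai.
    # The index is only described as "ideal", so a missing one is a warning.
    fai = Path(str(ref_fasta_fn) + ".fai")
    if Path(ref_fasta_fn).is_file() and not fai.is_file():
        warnings.append(f"ref_fasta_fn: no faidx index at {fai}")

    return errors, warnings


errors, warnings = check_inputs(
    "./data/Yeast_CGI_meth_comp.tsv.gz", "./data/yeast.gff3", "./data/yeast.fa"
)
for message in errors + warnings:
    print(message)
```

Keeping errors and warnings separate lets a notebook stop on missing files while merely flagging an unindexed reference.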

#### Usage with large dataset

In [3]:
Comp_Report(
    methcomp_fn="./data/Medaka_CGI_meth_comp.tsv.gz",
    gff3_fn="./data/medaka.gff3",
    ref_fasta_fn="./data/medaka.fa",
    outdir="medaka_html",
    n_top=50,
    progress=True)

## Checking options and input files ##
## Loading and preparing data ##
	Loading Methcomp data from TSV file
	Loading transcripts info from GFF file
	Loading chromosome info from reference FASTA file
	Number of significant intervals found (adjusted pvalue<0.01): 3532
	Generating file names for top candidates reports
	Computing source md5
## Parsing methcomp data ##
	Iterating over significant intervals and generating top candidates reports
	Progress: 100%|██████████| 3.53k/3.53k [00:09<00:00, 384 intervals/s]
	Generating summary report
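
Once a run has finished, it can be handy to list what was written to `outdir`. The following sketch assumes only that the reports are `.html` files somewhere under that directory. The exact file names and layout are not documented here, so the search is recursive, and `list_reports` is a hypothetical helper rather than a pycoMeth function.

```python
from pathlib import Path
from typing import List


def list_reports(outdir: str) -> List[Path]:
    """Return all .html files under outdir, sorted by path (empty if outdir is missing)."""
    root = Path(outdir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.html") if p.is_file())


# "medaka_html" is the outdir used in the example above
for report in list_reports("medaka_html"):
    print(report)
```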
