# DMPs and DMRs

In [None]:
from pylluminator.samples import Samples
from pylluminator.visualizations import dmr_manhattan_plot, dmp_heatmap, visualize_gene, show_chromosome_legend
from pylluminator.dm import DM
from pylluminator.utils import save_object, set_logger

set_logger('WARNING')  # set the verbosity level, can be DEBUG, INFO, WARNING, ERROR

## Load pylluminator Samples

We assume that you have already processed the .idat files according to your preferences and saved them. If not, please refer to notebook `1 - Read data and get beta values` before going any further.

In [None]:
my_samples = Samples.load('preprocessed_samples')

Here, we mask the probes located on the X and Y chromosomes so that they are excluded from the analysis.

In [None]:
my_samples.mask_xy_probes()

To speed up the demo, we will only calculate DMPs and DMRs on the first 10% of the probes:

In [None]:
ten_pct_probes = int(0.1 * my_samples.nb_probes)
probe_ids = my_samples.probe_ids[:ten_pct_probes]
print(f'Selected the first {ten_pct_probes:,} probes')

## Differentially Methylated Probes

The second parameter needed to create a DM object (here `~ sample_type`) is an R-like formula that describes the model and is used to build the design matrix. You can use one or more predictors in the formula, e.g. `~ age + sex`. The predictor names must match column names of the sample sheet.

More info on design matrices and formulas:
- https://www.statsmodels.org/devel/gettingstarted.html
- https://patsy.readthedocs.io/en/latest/overview.html
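
To give an intuition of what a formula such as `~ sample_type` turns into, here is a minimal pure-Python sketch of treatment (dummy) coding for a single categorical predictor. It is only an illustration: the helper `treatment_design_matrix` and the sample labels are hypothetical, and a formula library such as patsy handles many more cases (several predictors, numeric columns, interactions). The column naming imitates the `predictor[T.level]` convention used by patsy.

```python
def treatment_design_matrix(predictor, levels):
    """Build an intercept + dummy-coded design matrix for one categorical predictor.

    Hypothetical helper for illustration only, not part of pylluminator.
    """
    categories = sorted(set(levels))
    # the first level in sorted order is the reference and gets no column
    others = categories[1:]
    columns = ['Intercept'] + [f'{predictor}[T.{c}]' for c in others]
    # one row per sample: 1 for the intercept, then one indicator per non-reference level
    rows = [[1] + [int(level == c) for c in others] for level in levels]
    return columns, rows


# illustrative labels, standing in for the `sample_type` column of a sample sheet
sample_type = ['PREC', 'LNCAP', 'PREC', 'LNCAP']
columns, rows = treatment_design_matrix('sample_type', sample_type)

print(columns)
for row in rows:
    print(row)
```

Each non-intercept column corresponds to one contrast against the reference level, which is why a two-level predictor yields a single contrast.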


In [None]:
my_samples.sample_sheet

In [None]:
my_dms = DM(my_samples, '~ sample_type', probe_ids=probe_ids)

You can now plot the results for the 25 most variable probes:

In [None]:
dmp_heatmap(my_dms, my_dms.contrasts[0], nb_probes=25, figsize=(8, 5))

## Differentially Methylated Regions

We can then identify the DMRs by grouping neighboring probes with similar methylation patterns for a given predictor contrast. Similarity is calculated based on the Euclidean distance between the probes’ beta values.
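
The grouping idea can be sketched on toy data as follows. This is a simplified illustration, not pylluminator's actual implementation: the function `segment_probes`, the `max_distance` threshold and the beta values are all assumptions made for the example. Probes are assumed to be ordered by genomic position, and a new segment starts whenever the Euclidean distance between two consecutive probes' beta values exceeds the threshold.

```python
from math import dist


def segment_probes(betas, max_distance):
    """Group consecutive probes into segments of similar methylation.

    Hypothetical helper: `betas` holds one row per probe (ordered by position)
    and one column per sample. Returns lists of probe indices.
    """
    if not betas:
        return []
    segments = [[0]]
    for i in range(1, len(betas)):
        # Euclidean distance between neighboring probes across all samples
        if dist(betas[i - 1], betas[i]) <= max_distance:
            segments[-1].append(i)  # similar enough: extend the current segment
        else:
            segments.append([i])    # too different: start a new segment
    return segments


# toy beta values: 5 probes x 4 samples
betas = [
    [0.10, 0.12, 0.80, 0.85],
    [0.12, 0.10, 0.82, 0.83],
    [0.11, 0.13, 0.79, 0.86],
    [0.90, 0.88, 0.91, 0.89],
    [0.92, 0.90, 0.90, 0.91],
]
segments = segment_probes(betas, max_distance=0.2)
print(segments)  # → [[0, 1, 2], [3, 4]]
```

In this toy example the first three probes differ between the two pairs of samples while the last two do not, so they end up in separate segments.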

In [None]:
my_dms.compute_dmr(my_dms.contrasts)
save_object(my_dms, 'dms')

In [None]:
# get top DMRs and their associated genes for the first contrast, PREC
my_dms.get_top('DMR', my_dms.contrasts[0])

In [None]:
# visualize the DMRs for the first contrast
dmr_manhattan_plot(my_dms, my_dms.contrasts[0])

## Gene visualization

We can then look at a particular gene identified as differentially methylated, for example the first one, ISM1. The heatmap of the beta values of the probes associated with this gene shows a clear methylation difference between the healthy cells (PrEC) and the prostate cancer cells (LNCAP).

In [None]:
show_chromosome_legend()  # display the legend for the chromosome region colors, which correspond to Giemsa staining
visualize_gene(my_samples, 'ISM1', figsize=(10, 5))