[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/rsinghlab/pyaging/blob/main/tutorials/tutorial_histonemarkchipseq.ipynb) [![Open In nbviewer](https://img.shields.io/badge/View%20in-nbviewer-orange)](https://nbviewer.jupyter.org/github/rsinghlab/pyaging/blob/main/tutorials/tutorial_histonemarkchipseq.ipynb)

# Bulk histone mark ChIP-Seq

This tutorial is a brief guide to using the seven histone-mark-specific clocks and the pan-histone-mark clock that we developed. See the [preprint](https://www.biorxiv.org/content/10.1101/2023.08.21.554165v3) for details.

We just need two packages for this tutorial.

In [1]:
import pandas as pd
import pyaging as pya

## Download and load example data

Let's download an example H3K4me3 ChIP-Seq bigWig file from the ENCODE project.

In [2]:
pya.data.download_example_data('ENCFF386QWG')

|-----> 🏗️ Starting download_example_data function
|-----------> Data found in pyaging_data/ENCFF386QWG.bigWig
|-----> 🎉 Done! [0.4776s]


To show that multiple bigWig files can be converted into a single DataFrame at once, we simply pass the same file path twice.
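
In practice the list would usually hold several distinct files. As a minimal sketch, assuming the bigWig files sit in one directory and share the `.bigWig` extension, the paths could be gathered with the standard library. `collect_bigwigs` is a hypothetical helper for illustration, not part of pyaging.

```python
import tempfile
from pathlib import Path


def collect_bigwigs(directory, pattern="*.bigWig"):
    """Return the sorted paths of all files in `directory` matching `pattern`."""
    return sorted(str(path) for path in Path(directory).glob(pattern))


# Demonstrate on a throwaway directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.bigWig", "a.bigWig", "notes.txt"]:
        (Path(tmp) / name).touch()
    found = [Path(p).name for p in collect_bigwigs(tmp)]

print(found)
```

A list built this way could then be passed to `pya.pp.bigwig_to_df` in place of the hand-written list used here.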

In [3]:
df = pya.pp.bigwig_to_df(['pyaging_data/ENCFF386QWG.bigWig', 'pyaging_data/ENCFF386QWG.bigWig'])

|-----> 🏗️ Starting bigwig_to_df function
|-----> ⚙️ Load Ensembl genome metadata started
|-----------> Data found in pyaging_data/Ensembl-105-EnsDb-for-Homo-sapiens-genes.csv
|-----> ✅ Load Ensembl genome metadata finished [1.0212s]
|-----> ⚙️ Processing bigWig files started
|-----------> Processing file: pyaging_data/ENCFF386QWG.bigWig
|-----------> in progress: 100.0000%
|-----------> Processing file: pyaging_data/ENCFF386QWG.bigWig
|-----------> in progress: 100.0000%
|-----> ✅ Processing bigWig files finished [21.9402s]
|-----> 🎉 Done! [47.5545s]


In [4]:
df.index = ['sample1', 'sample2'] # give the samples unique names to avoid an anndata warning about duplicate names
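
With more than a handful of files, typing labels by hand gets tedious. The sketch below derives one label per file from its name and numbers any repeats. `unique_sample_names` is a hypothetical helper, not a pyaging function, and it assumes no file stem already ends in a `_1`-style suffix.

```python
from collections import Counter
from pathlib import Path


def unique_sample_names(paths):
    """Derive one label per path from the file stem, numbering repeated stems."""
    stems = [Path(p).stem for p in paths]
    totals = Counter(stems)  # how often each stem occurs overall
    seen = Counter()         # how often each stem has been used so far
    names = []
    for stem in stems:
        if totals[stem] == 1:
            names.append(stem)  # already unique, keep as-is
        else:
            seen[stem] += 1
            names.append(f"{stem}_{seen[stem]}")
    return names


paths = ["pyaging_data/ENCFF386QWG.bigWig", "pyaging_data/ENCFF386QWG.bigWig"]
sample_names = unique_sample_names(paths)
print(sample_names)
```

The resulting list could be assigned with `df.index = sample_names` instead of the hard-coded labels.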

In [5]:
df.head()

| | ENSG00000223972 | ENSG00000227232 | ENSG00000278267 | ENSG00000243485 | ENSG00000284332 | ENSG00000237613 | ENSG00000268020 | ENSG00000240361 | ENSG00000186092 | ENSG00000238009 | ... | ENSG00000237801 | ENSG00000237040 | ENSG00000124333 | ENSG00000228410 | ENSG00000223484 | ENSG00000124334 | ENSG00000270726 | ENSG00000185203 | ENSG00000182484 | ENSG00000227159 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sample1 | 0.028616 | 0.030415 | 0.027783 | 0.028616 | 0.028616 | 0.028616 | 0.044171 | 0.036474 | 0.030784 | 0.03181 | ... | 0.034435 | 0.006822 | 1.413119 | 0.029424 | 0.140005 | 0.049786 | 0.069296 | 0.332126 | 0.028596 | 0.028616 |
| sample2 | 0.028616 | 0.030415 | 0.027783 | 0.028616 | 0.028616 | 0.028616 | 0.044171 | 0.036474 | 0.030784 | 0.03181 | ... | 0.034435 | 0.006822 | 1.413119 | 0.029424 | 0.140005 | 0.049786 | 0.069296 | 0.332126 | 0.028596 | 0.028616 |


## Convert data to AnnData object

AnnData objects are highly flexible and are thus our preferred method of organizing data for age prediction.

In [6]:
adata = pya.preprocess.df_to_adata(df)

|-----> 🏗️ Starting df_to_adata function
|-----> ⚙️ Create anndata object started
|-----> ✅ Create anndata object finished [0.0625s]
|-----> ⚙️ Add metadata to anndata started
|-----------? No metadata provided. Leaving adata.obs empty
|-----> ⚠️ Add metadata to anndata finished [0.0008s]
|-----> ⚙️ Log data statistics started
|-----------> There are 2 observations
|-----------> There are 62241 features
|-----------> Total missing values: 0
|-----------> Percentage of missing values: 0.00%
|-----> ✅ Log data statistics finished [0.0104s]
|-----> ⚙️ Impute missing values started
|-----------> No missing values found. No imputation necessary
|-----> ✅ Impute missing values finished [0.0045s]
|-----> 🎉 Done! [0.0874s]


Note that the original DataFrame is stored in `X_original` under layers. This is what the `adata` object looks like:

In [7]:
adata

AnnData object with n_obs × n_vars = 2 × 62241
    var: 'percent_na'
    layers: 'X_original'
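
The data statistics logged during conversion (observations, features, missing values) can be reproduced on any samples-by-genes table with plain pandas. The sketch below uses a made-up toy table rather than the real data. Whether pyaging stores `percent_na` as a fraction or as a percentage is not stated here, so the per-feature fraction shown is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the samples x genes table; the values are made up.
toy = pd.DataFrame(
    {
        "gene_a": [0.03, np.nan],
        "gene_b": [1.41, 1.38],
        "gene_c": [0.14, 0.15],
    },
    index=["sample1", "sample2"],
)

n_obs, n_vars = toy.shape
total_missing = int(toy.isna().sum().sum())       # missing cells in the whole table
percent_missing = 100 * total_missing / toy.size  # share of all cells that are missing
na_fraction_per_gene = toy.isna().mean()          # assumed analogue of `percent_na`

print(f"There are {n_obs} observations")
print(f"There are {n_vars} features")
print(f"Total missing values: {total_missing}")
print(f"Percentage of missing values: {percent_missing:.2f}%")
```

Running the same three expressions on `df` before conversion is a quick way to anticipate whether imputation will be triggered.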

## Predict age

We can predict with a single clock or with several clocks in one call. For convenience, let's pass a few clocks of interest at once. Clock names are case-insensitive.
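
The log output reports clock names in lowercase (for example `camilloh3k4me3`) and stores each prepared input matrix under a key of the form `X_<name>` in `adata.obsm`. The sketch below illustrates that kind of normalization. `normalize_clock_names` is a hypothetical helper and an assumption about the behavior, not pyaging's actual implementation.

```python
def normalize_clock_names(names):
    """Lowercase clock names and drop case-insensitive duplicates, keeping order."""
    seen = set()
    result = []
    for name in names:
        key = name.strip().lower()
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


requested = ["CamilloH3K4me3", "camilloh3k4ME3", "CamilloPanHistone"]
clock_keys = normalize_clock_names(requested)

# Keys following the `X_<name>` pattern seen in the log output.
obsm_keys = [f"X_{key}" for key in clock_keys]

print(clock_keys)
print(obsm_keys)
```

Normalizing names up front like this makes it easy to spot accidental duplicates in a long list of requested clocks.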

In [8]:
pya.pred.predict_age(adata, ['CamilloH3K4me3', 'CamilloH3K9me3', 'CamilloPanHistone'])

|-----> 🏗️ Starting predict_age function
|-----> ⚙️ Set PyTorch device started
|-----------> Using device: cpu
|-----> ✅ Set PyTorch device finished [0.0105s]
|-----> 🕒 Processing clock: camilloh3k4me3
|-----------> ⚙️ Load clock started
|-----------------> Data found in pyaging_data/camilloh3k4me3.pt
|-----------> ✅ Load clock finished [0.6478s]
|-----------> ⚙️ Check features in adata started
|-----------------> All features are present in adata.var_names.
|-----------------> Added prepared input matrix to adata.obsm[X_camilloh3k4me3]
|-----------> ✅ Check features in adata finished [0.0992s]
|-----------> ⚙️ Predict ages with model started
|-----------------> There is no preprocessing necessary
|-----------------> There is no postprocessing necessary
|-----------------> in progress: 100.0000%
|-----------> ✅ Predict ages with model finished [0.0169s]
|-----------> ⚙️ Add predicted ages and clock metadata to adata started
|-----------> ✅ Add predicted ages and clock metadata to adata

|-----> 🕒 Processing clock: camilloh3k9me3


|-----------> ⚙️ Load clock started


|-----------------> Downloading data to pyaging_data/camilloh3k9me3.pt


|-----------------> in progress: 1.1913%

|-----------------> in progress: 2.3827%

|-----------------> in progress: 3.5740%

|-----------------> in progress: 4.7654%

|-----------------> in progress: 5.9567%

|-----------------> in progress: 7.1481%

|-----------------> in progress: 8.3394%

|-----------------> in progress: 9.5308%

|-----------------> in progress: 10.7221%

|-----------------> in progress: 11.9135%

|-----------------> in progress: 13.1048%

|-----------------> in progress: 14.2962%

|-----------------> in progress: 15.4875%

|-----------------> in progress: 16.6789%

|-----------------> in progress: 17.8702%

|-----------------> in progress: 19.0616%

|-----------------> in progress: 20.2529%

|-----------------> in progress: 21.4443%

|-----------------> in progress: 22.6356%

|-----------------> in progress: 23.8270%

|-----------------> in progress: 25.0183%

|-----------------> in progress: 26.2097%

|-----------------> in progress: 27.4010%

|-----------------> in progress: 28.5924%

|-----------------> in progress: 29.7837%

|-----------------> in progress: 30.9751%

|-----------------> in progress: 32.1664%

|-----------------> in progress: 33.3578%

|-----------------> in progress: 34.5491%

|-----------------> in progress: 35.7405%

|-----------------> in progress: 36.9318%

|-----------------> in progress: 38.1232%

|-----------------> in progress: 39.3145%

|-----------------> in progress: 40.5059%

|-----------------> in progress: 41.6972%

|-----------------> in progress: 42.8886%

|-----------------> in progress: 44.0799%

|-----------------> in progress: 45.2713%

|-----------------> in progress: 46.4626%

|-----------------> in progress: 47.6540%

|-----------------> in progress: 48.8453%

|-----------------> in progress: 50.0366%

|-----------------> in progress: 51.2280%

|-----------------> in progress: 52.4193%

|-----------------> in progress: 53.6107%

|-----------------> in progress: 54.8020%

|-----------------> in progress: 55.9934%

|-----------------> in progress: 57.1847%

|-----------------> in progress: 58.3761%

|-----------------> in progress: 59.5674%

|-----------------> in progress: 60.7588%

|-----------------> in progress: 61.9501%

|-----------------> in progress: 63.1415%

|-----------------> in progress: 64.3328%

|-----------------> in progress: 65.5242%

|-----------------> in progress: 66.7155%

|-----------------> in progress: 67.9069%

|-----------------> in progress: 69.0982%

|-----------------> in progress: 70.2896%

|-----------------> in progress: 71.4809%

|-----------------> in progress: 72.6723%

|-----------------> in progress: 73.8636%

|-----------------> in progress: 75.0550%

|-----------------> in progress: 76.2463%

|-----------------> in progress: 77.4377%

|-----------------> in progress: 78.6290%

|-----------------> in progress: 79.8204%

|-----------------> in progress: 81.0117%

|-----------------> in progress: 82.2031%

|-----------------> in progress: 83.3944%

|-----------------> in progress: 84.5858%

|-----------------> in progress: 85.7771%

|-----------------> in progress: 86.9685%

|-----------------> in progress: 88.1598%

|-----------------> in progress: 89.3512%

|-----------------> in progress: 90.5425%

|-----------------> in progress: 91.7339%

|-----------------> in progress: 92.9252%

|-----------------> in progress: 94.1166%

|-----------------> in progress: 95.3079%

|-----------------> in progress: 96.4992%

|-----------------> in progress: 97.6906%

|-----------------> in progress: 98.8819%

|-----------------> in progress: 100.0733%

|-----------------> in progress: 100.0000%


|-----------> ✅ Load clock finished [1.3816s]


|-----------> ⚙️ Check features in adata started


|-----------------> All features are present in adata.var_names.


|-----------------> Added prepared input matrix to adata.obsm[X_camilloh3k9me3]


|-----------> ✅ Check features in adata finished [0.0269s]


|-----------> ⚙️ Predict ages with model started


|-----------------> There is no preprocessing necessary


|-----------------> There is no postprocessing necessary


|-----------------> in progress: 100.0000%


|-----------> ✅ Predict ages with model finished [0.0018s]


|-----------> ⚙️ Add predicted ages and clock metadata to adata started


|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0006s]


|-----> 🕒 Processing clock: camillopanhistone


|-----------> ⚙️ Load clock started


|-----------------> Downloading data to pyaging_data/camillopanhistone.pt


|-----------------> in progress: 100.0000%


|-----------> ✅ Load clock finished [7.8131s]


|-----------> ⚙️ Check features in adata started


|-----------------> All features are present in adata.var_names.


|-----------------> Added prepared input matrix to adata.obsm[X_camillopanhistone]


|-----------> ✅ Check features in adata finished [0.3039s]


|-----------> ⚙️ Predict ages with model started


|-----------------> There is no preprocessing necessary


|-----------------> There is no postprocessing necessary


|-----------------> in progress: 100.0000%


|-----------> ✅ Predict ages with model finished [0.0077s]


|-----------> ⚙️ Add predicted ages and clock metadata to adata started


|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0006s]


|-----> 🎉 Done! [15.1410s]


In [9]:
adata.obs.head()

| | camilloh3k4me3 | camilloh3k9me3 | camillopanhistone |
|---|---|---|---|
| sample1 | 53.998544 | 44.3229 | 54.021884 |
| sample2 | 53.998544 | 44.3229 | 54.021884 |
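
Both rows are identical because the same bigWig file was loaded twice, once as `sample1` and once as `sample2`. As a small illustration of working with these predictions, the sketch below computes, for each sample, the gap between the highest and lowest clock estimate. The values are copied from the output above; the helper name `clock_spread` is hypothetical and not part of pyaging. With a real object, a dictionary of the same shape could be obtained with `adata.obs.to_dict(orient="index")`.

```python
# Predicted ages copied from the adata.obs.head() output above.
predictions = {
    "sample1": {
        "camilloh3k4me3": 53.998544,
        "camilloh3k9me3": 44.3229,
        "camillopanhistone": 54.021884,
    },
    "sample2": {
        "camilloh3k4me3": 53.998544,
        "camilloh3k9me3": 44.3229,
        "camillopanhistone": 54.021884,
    },
}


def clock_spread(per_sample):
    """Return, per sample, the difference between the largest and smallest prediction."""
    return {
        sample: max(ages.values()) - min(ages.values())
        for sample, ages in per_sample.items()
    }


spread = clock_spread(predictions)

# Both samples come from the same file, so their predictions should match exactly.
samples_match = predictions["sample1"] == predictions["sample2"]

for sample, gap in spread.items():
    print(f"{sample}: spread across clocks = {gap:.2f} years")
print("samples identical:", samples_match)
```

A spread like this is only a descriptive summary. It says nothing about which clock is more appropriate for a given histone mark.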


Printing this much information can be overwhelming, particularly when running several clocks at once. In such cases, set `verbose=False`.

In [None]:
pya.data.download_example_data('ENCFF386QWG', verbose=False)
df = pya.pp.bigwig_to_df(['pyaging_data/ENCFF386QWG.bigWig', 'pyaging_data/ENCFF386QWG.bigWig'], verbose=False)
df.index = ['sample1', 'sample2']
adata = pya.preprocess.df_to_adata(df, verbose=False)
pya.pred.predict_age(adata, ['CamilloH3K4me3', 'CamilloH3K9me3', 'CamilloPanHistone'], verbose=False)

In [None]:
adata.obs.head()

After age prediction, each clock's predicted ages are stored as a column of `adata.obs` named after the clock. In addition, the percentage of missing values for each clock and other metadata are stored in `adata.uns`.

In [None]:
adata
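
When several clocks have been run, `adata.uns` holds entries for all of them, so it can help to group the entries by clock. The following is a minimal sketch that uses a plain dictionary as a stand-in for `adata.uns`. Only the `<clockname>_metadata` key pattern appears in this tutorial; the `percent_na` suffix, the placeholder values, and the helper name `entries_for_clock` are assumptions made for illustration.

```python
def entries_for_clock(uns, clock_name):
    """Collect entries of an `adata.uns`-like mapping whose keys start with '<clock_name>_'.

    The clock name is lowercased first, matching the lowercase column
    names seen in `adata.obs`. The prefix is stripped from the returned keys.
    """
    prefix = clock_name.lower() + "_"
    return {
        key[len(prefix):]: value
        for key, value in uns.items()
        if key.startswith(prefix)
    }


# Stand-in for adata.uns. Key suffixes other than 'metadata' and all values
# are placeholders, not real pyaging output.
uns = {
    "camilloh3k4me3_percent_na": 0.0,
    "camilloh3k4me3_metadata": {"doi": "<placeholder>"},
    "camillopanhistone_percent_na": 0.0,
}

h3k4me3_entries = entries_for_clock(uns, "CamilloH3K4me3")
print(sorted(h3k4me3_entries))
```

With a real AnnData object, the same helper could be called as `entries_for_clock(adata.uns, 'CamilloH3K4me3')`, since `adata.uns` behaves like a dictionary.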

## Get citation

The DOI, citation, and other metadata for each clock are automatically added to the AnnData object under `adata.uns['<clockname>_metadata']`, where `<clockname>` is the lowercase clock name.

In [None]:
adata.uns['camilloh3k4me3_metadata']
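
To gather the references for every clock used in an analysis, the metadata entries can be looped over. The sketch below assumes each metadata entry is a dictionary with `citation` and `doi` keys, as the description above suggests. The exact key names are an assumption, the helper `collect_citations` is hypothetical, and the dictionary contents are placeholders rather than real output.

```python
def collect_citations(uns, clock_names):
    """Return (clock, citation, doi) tuples from an `adata.uns`-like mapping.

    Missing clocks or missing fields are reported as 'n/a' instead of raising.
    """
    rows = []
    for name in clock_names:
        meta = uns.get(f"{name.lower()}_metadata", {})
        rows.append((name, meta.get("citation", "n/a"), meta.get("doi", "n/a")))
    return rows


# Stand-in for adata.uns with placeholder values (not real pyaging output).
example_uns = {
    "camilloh3k4me3_metadata": {
        "citation": "<citation text>",
        "doi": "<doi link>",
    },
}

citations = collect_citations(example_uns, ["CamilloH3K4me3", "CamilloPanHistone"])
for clock, citation, doi in citations:
    print(f"{clock}: {citation} ({doi})")
```

Using `.get` with a default keeps the loop from failing when a clock was not run or a field is absent, which is convenient when the list of clocks changes between analyses.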