[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/rsinghlab/pyaging/blob/main/tutorials/tutorial_dnam.ipynb) [![Open In nbviewer](https://img.shields.io/badge/View%20in-nbviewer-orange)](https://nbviewer.jupyter.org/github/rsinghlab/pyaging/blob/main/tutorials/tutorial_dnam.ipynb)

# Illumina Human Methylation Arrays

This tutorial is a brief guide to running several bulk DNA-methylation epigenetic clocks that predict age in humans. In this notebook, we demonstrate the breadth of epigenetic clock models available in `pyaging` by showing:

- Horvath's 2013 ElasticNet-based clock ([paper](https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r115));
  
- AltumAge, a highly accurate deep-learning-based clock ([paper](https://www.nature.com/articles/s41514-022-00085-y));
  
- PCGrimAge, a principal-component-based version of the GrimAge clock ([paper](https://www.nature.com/articles/s43587-022-00248-2));

- GrimAge2, the latest version of GrimAge ([paper](https://www.aging-us.com/article/204434/text));

- DunedinPACE, a biomarker of the pace of aging ([paper](https://elifesciences.org/articles/73420)).

We just need two packages for this tutorial.

In [1]:
import pandas as pd
import pyaging as pya

## Download and load example data

Let's download the publicly available dataset GSE139307, generated with Illumina's 450k array. The CpG coverage of the 450k array should be sufficient for most clocks.

In [2]:
pya.data.download_example_data('GSE139307')

|-----> 🏗️ Starting download_example_data function
|-----------> Data found in pyaging_data/GSE139307.pkl
|-----> 🎉 Done! [0.5322s]


In [3]:
df = pd.read_pickle('pyaging_data/GSE139307.pkl')

In [4]:
df.head()

| | dataset | tissue_type | age | gender | cg00000029 | cg00000108 | cg00000109 | cg00000165 | cg00000236 | cg00000289 | ... | ch.X.93511680F | ch.X.938089F | ch.X.94051109R | ch.X.94260649R | ch.X.967194F | ch.X.97129969R | ch.X.97133160R | ch.X.97651759F | ch.X.97737721F | ch.X.98007042R |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GSM4137709 | GSE139307 | sperm | 84.0 | M | 0.084811 | 0.920696 | 0.856851 | 0.084567 | 0.838699 | 0.247273 | ... | 0.061751 | 0.045942 | 0.037631 | 0.056455 | 0.249872 | 0.049022 | 0.085691 | 0.037435 | 0.07782 | 0.106234 |
| GSM4137710 | GSE139307 | sperm | 69.0 | M | 0.099626 | 0.919073 | 0.890024 | 0.115541 | 0.852584 | 0.198103 | ... | 0.075077 | 0.041849 | 0.032573 | 0.08979 | 0.250245 | 0.079095 | 0.079756 | 0.046229 | 0.091256 | 0.120241 |
| GSM4137711 | GSE139307 | sperm | 69.0 | M | 0.117228 | 0.920276 | 0.894317 | 0.117127 | 0.839258 | 0.21341 | ... | 0.068679 | 0.049515 | 0.058097 | 0.079919 | 0.299758 | 0.079305 | 0.089815 | 0.065364 | 0.086864 | 0.156005 |
| GSM4137712 | GSE139307 | sperm | 69.0 | M | 0.077096 | 0.910204 | 0.9084 | 0.073885 | 0.861615 | 0.163276 | ... | 0.070091 | 0.033289 | 0.038836 | 0.108213 | 0.295428 | 0.050731 | 0.099943 | 0.047597 | 0.07848 | 0.10748 |
| GSM4137713 | GSE139307 | sperm | 67.0 | M | 0.063524 | 0.911608 | 0.884643 | 0.079877 | 0.864654 | 0.176169 | ... | 0.082368 | 0.038411 | 0.048787 | 0.088631 | 0.316694 | 0.041873 | 0.079303 | 0.048823 | 0.08901 | 0.117903 |


For PCGrimAge and GrimAge2, both age and sex are features. Therefore, to get the full prediction, let's convert the column `gender` into a column called `female`, with 1 being female and 0 being male.

In [5]:
# the clocks need numerical input, so encode sex as 0/1 rather than as strings
df['female'] = (df['gender'] == 'F').astype(int)
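
The one-liner above maps every value other than `'F'` to 0, which includes missing values and differently spelled labels. The following is a minimal sketch of a stricter encoding on a toy DataFrame. The `encode_female` helper is hypothetical and not part of `pyaging`; it assumes the only valid labels are `'F'` and `'M'`.

```python
import pandas as pd


def encode_female(gender: pd.Series) -> pd.Series:
    """Map 'F' -> 1 and 'M' -> 0, raising on any other label (hypothetical helper)."""
    mapping = {"F": 1, "M": 0}
    unexpected = set(gender.dropna().unique()) - set(mapping)
    if unexpected or gender.isna().any():
        raise ValueError(f"unexpected or missing gender labels: {sorted(unexpected)}")
    return gender.map(mapping).astype(int)


# Toy data for illustration only.
toy = pd.DataFrame({"gender": ["M", "F", "M"]}, index=["s1", "s2", "s3"])
toy["female"] = encode_female(toy["gender"])
print(toy)
```

Failing loudly on an unknown label avoids silently treating it as male, which matters for clocks that use sex as a feature.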

## Convert data to AnnData object

AnnData objects are highly flexible and are thus our preferred method of organizing data for age prediction.

In [6]:
adata = pya.pp.df_to_adata(df, metadata_cols=['gender', 'tissue_type', 'dataset'], imputer_strategy='knn')

|-----> 🏗️ Starting df_to_adata function
|-----> ⚙️ Create anndata object started
|-----------? Dropping 1 columns with only NAs: ['cg01550828'], etc.
|-----> ⚠️ Create anndata object finished [0.6261s]
|-----> ⚙️ Add metadata to anndata started
|-----------> Adding provided metadata to adata.obs
|-----> ✅ Add metadata to anndata finished [0.0012s]
|-----> ⚙️ Log data statistics started
|-----------> There are 37 observations
|-----------> There are 485513 features
|-----------> Total missing values: 489
|-----------> Percentage of missing values: 0.00%
|-----> ✅ Log data statistics finished [0.0301s]
|-----> ⚙️ Impute missing values started
|-----------> Imputing missing values using knn strategy
|-----> ✅ Impute missing values finished [63.3845s]
|-----> ⚙️ Add imputer strategy to adata.uns started
|-----> ✅ Add imputer strategy to adata.uns finished [0.0011s]
|-----> 🎉 Done! [64.1721s]
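
The logged `0.00%` is a rounded figure, not a sign that nothing is missing. The short calculation below reproduces it from the counts in the log, assuming the percentage is taken over every cell of the observations-by-features matrix (an assumption about how the statistic is computed).

```python
# Counts copied from the log output above.
n_obs = 37
n_features = 485513
n_missing = 489

total_cells = n_obs * n_features
percent_missing = 100 * n_missing / total_cells

print(f"{percent_missing:.2f}%")  # two decimals, as in the log
print(f"{percent_missing:.4f}%")  # more digits show the nonzero fraction
```

With so few missing cells, the choice of imputation strategy should have little effect on this dataset.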


Note that the original DataFrame is stored in `X_original` under layers. This is what the `adata` object looks like:

In [7]:
adata

AnnData object with n_obs × n_vars = 37 × 485513
    obs: 'gender', 'tissue_type', 'dataset'
    var: 'percent_na'
    uns: 'imputer_strategy'
    layers: 'X_original', 'X_imputed'

## Predict age

We can either predict one clock at a time or all of them at once. For convenience, let's simply pass all five clocks of interest in a single call. The function ignores the capitalization of the clock names.

In [None]:
pya.pred.predict_age(adata, ['Horvath2013', 'AltumAge', 'PCGrimAge', 'GrimAge2', 'DunedinPACE'])

|-----> 🏗️ Starting predict_age function
|-----> ⚙️ Set PyTorch device started
|-----------> Using device: cpu
|-----> ✅ Set PyTorch device finished [0.0021s]
|-----> 🕒 Processing clock: horvath2013
|-----------> ⚙️ Load clock started
|-----------------> Data found in pyaging_data/horvath2013.pt
|-----------> ✅ Load clock finished [0.7568s]
|-----------> ⚙️ Check features in adata started
|-----------------> All features are present in adata.var_names.
|-----------------> Added prepared input matrix to adata.obsm[X_horvath2013]
|-----------> ✅ Check features in adata finished [0.0961s]
|-----------> ⚙️ Predict ages with model started
|-----------------> There is no preprocessing necessary
|-----------------> The postprocessing method is anti_log_linear
|-----------------> in progress: 100.0000%
|-----------> ✅ Predict ages with model finished [0.0098s]
|-----------> ⚙️ Add predicted ages and clock metadata to adata started
|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0005s]


|-----> 🕒 Processing clock: altumage


|-----------> ⚙️ Load clock started


|-----------------> Downloading data to pyaging_data/altumage.pt


|-----------> ✅ Load clock finished [2.2961s]


|-----------> ⚙️ Check features in adata started


|-----------------> All features are present in adata.var_names.


|-----------------> Added prepared input matrix to adata.obsm[X_altumage]


|-----------> ✅ Check features in adata finished [1.9134s]


|-----------> ⚙️ Predict ages with model started


|-----------------> The preprocessing method is scale


|-----------------> There is no postprocessing necessary


|-----------------> in progress: 100.0000%


|-----------> ✅ Predict ages with model finished [0.0127s]


|-----------> ⚙️ Add predicted ages and clock metadata to adata started


|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0008s]


|-----> 🕒 Processing clock: pcgrimage


|-----------> ⚙️ Load clock started


|-----------------> Downloading data to pyaging_data/pcgrimage.pt


|-----------> ✅ Load clock finished [158.8097s]


|-----------> ⚙️ Check features in adata started


|-----------------> All features are present in adata.var_names.


|-----------------> Added prepared input matrix to adata.obsm[X_pcgrimage]


|-----------> ✅ Check features in adata finished [9.0392s]


|-----------> ⚙️ Predict ages with model started


|-----------------> There is no preprocessing necessary


|-----------------> There is no postprocessing necessary


|-----------------> in progress: 100.0000%


|-----------> ✅ Predict ages with model finished [0.1944s]


|-----------> ⚙️ Add predicted ages and clock metadata to adata started


|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0006s]


|-----> 🕒 Processing clock: grimage2


|-----------> ⚙️ Load clock started


|-----------------> Downloading data to pyaging_data/grimage2.pt


|-----------> ✅ Load clock finished [1.1021s]


|-----------> ⚙️ Check features in adata started


|-----------------> All features are present in adata.var_names.


|-----------------> Added prepared input matrix to adata.obsm[X_grimage2]


|-----------> ✅ Check features in adata finished [0.1319s]


|-----------> ⚙️ Predict ages with model started


|-----------------> There is no preprocessing necessary


|-----------------> The postprocessing method is cox_to_years


|-----------------> in progress: 100.0000%


|-----------> ✅ Predict ages with model finished [0.0032s]


|-----------> ⚙️ Add predicted ages and clock metadata to adata started


|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0012s]


|-----> 🕒 Processing clock: dunedinpace


|-----------> ⚙️ Load clock started


|-----------------> Downloading data to pyaging_data/dunedinpace.pt


|-----------> ✅ Load clock finished [1.4808s]


|-----------> ⚙️ Check features in adata started


|-----------------> All features are present in adata.var_names.


|-----------------> Added prepared input matrix to adata.obsm[X_dunedinpace]


|-----------> ✅ Check features in adata finished [3.2894s]


|-----------> ⚙️ Predict ages with model started


|-----------------> The preprocessing method is quantile_normalization_with_gold_standard


|-----------------> There is no postprocessing necessary


|-----------------> in progress: 100.0000%


|-----------> ✅ Predict ages with model finished [0.0795s]


|-----------> ⚙️ Add predicted ages and clock metadata to adata started


|-----------> ✅ Add predicted ages and clock metadata to adata finished [0.0006s]


|-----> 🎉 Done! [179.6900s]


In [None]:
adata.obs.head()

Out of curiosity, we can also check whether the predictions of these clocks correlate with one another.

In [None]:
adata.obs.iloc[:, 3:].corr('pearson')
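
The positional slice `iloc[:, 3:]` depends on the three metadata columns coming first. As a sketch of a more order-independent alternative, the numeric columns can be selected by dtype before computing the Pearson correlation. The DataFrame below is a toy stand-in for `adata.obs`; its column names and values are invented for illustration.

```python
import pandas as pd

# Toy stand-in for adata.obs; every value below is invented.
obs = pd.DataFrame(
    {
        "gender": ["M", "M", "M", "M"],
        "tissue_type": ["sperm"] * 4,
        "dataset": ["toy"] * 4,
        "clock_a": [30.0, 40.0, 50.0, 60.0],
        "clock_b": [32.0, 41.0, 49.0, 62.0],
        "pace": [1.10, 0.90, 1.00, 1.20],
    },
    index=["s1", "s2", "s3", "s4"],
)

# Keep only numeric columns, so the metadata strings are skipped
# regardless of where they sit in the frame.
predictions = obs.select_dtypes(include="number")
corr = predictions.corr(method="pearson")
print(corr.round(3))
```

This only works as intended when every numeric column in the frame is a clock output; any numeric metadata would need to be dropped explicitly.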

Having so much information printed can be overwhelming, particularly when running several clocks at once. In such cases, just set `verbose=False`.

In [None]:
pya.data.download_example_data('GSE139307', verbose=False)
df = pd.read_pickle('pyaging_data/GSE139307.pkl')
df['female'] = (df['gender'] == 'F').astype(int)
adata = pya.preprocess.df_to_adata(df, metadata_cols=['gender', 'tissue_type', 'dataset'], imputer_strategy='mean', verbose=False)
pya.pred.predict_age(adata, ['Horvath2013', 'AltumAge', 'PCGrimAge', 'GrimAge2', 'DunedinPACE'], verbose=False)
adata.obs.head()

After age prediction, the clock predictions are added to `adata.obs`. Moreover, the percentage of missing values for each clock and other metadata are included in `adata.uns`.

In [None]:
adata

We can also look at which features seem to be missing from each clock (if there are any).

In [None]:
adata.uns['dunedinpace_missing_features']
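
Conceptually, a list like this is the set of features a clock expects that are absent from the data. The sketch below illustrates that idea in plain Python; it is not `pyaging`'s implementation, the `missing_features` helper is hypothetical, and the probe `cg99999999` is made up.

```python
def missing_features(clock_features, data_features):
    """Return the clock features absent from the data (in clock order) and the percent missing."""
    available = set(data_features)
    missing = [f for f in clock_features if f not in available]
    percent = 100.0 * len(missing) / len(clock_features) if clock_features else 0.0
    return missing, percent


# Toy feature lists; cg99999999 is an invented probe name.
clock_cpgs = ["cg00000029", "cg00000108", "cg99999999"]
data_cpgs = ["cg00000029", "cg00000108", "cg00000109"]

missing, percent = missing_features(clock_cpgs, data_cpgs)
print(missing, f"{percent:.2f}%")
```

A high percentage of missing features is a reason to treat a clock's predictions on that dataset with caution.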

## Get citation

The DOI, citation, and some metadata are automatically added to the AnnData object under `adata.uns[CLOCKNAME_metadata]`, where `CLOCKNAME` is the lower-cased clock name.

In [None]:
adata.uns['horvath2013_metadata']

In [None]:
adata.uns['altumage_metadata']

In [None]:
adata.uns['pcgrimage_metadata']

In [None]:
adata.uns['grimage2_metadata']

In [None]:
adata.uns['dunedinpace_metadata']
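
Because the keys in the cells above follow the pattern of a lower-cased clock name plus `_metadata`, the five lookups could also be written as a loop. The sketch below uses a plain dict as a stand-in for `adata.uns`; the stored values are placeholders, not real clock metadata.

```python
# Plain dict standing in for adata.uns; the values are placeholders.
uns = {
    "horvath2013_metadata": {"placeholder": 1},
    "altumage_metadata": {"placeholder": 2},
    "pcgrimage_metadata": {"placeholder": 3},
    "grimage2_metadata": {"placeholder": 4},
    "dunedinpace_metadata": {"placeholder": 5},
    "imputer_strategy": "mean",
}

clocks = ["Horvath2013", "AltumAge", "PCGrimAge", "GrimAge2", "DunedinPACE"]

# Build each key from the lower-cased clock name.
metadata = {name: uns[f"{name.lower()}_metadata"] for name in clocks}

for name, entry in metadata.items():
    print(name, "->", entry)
```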