[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/rsinghlab/pyaging/blob/main/tutorials/tutorial_utils.ipynb) [![Open In nbviewer](https://img.shields.io/badge/View%20in-nbviewer-orange)](https://nbviewer.jupyter.org/github/rsinghlab/pyaging/blob/main/tutorials/tutorial_utils.ipynb)

# Search, cite, get metadata and clock parameters

This tutorial demonstrates several `pyaging` helper functions for searching for clocks, citing them, and inspecting their metadata and parameters.

In [1]:
import pyaging as pya

## Search

There are two main ways to search for a clock in `pyaging`. The first is by the DOI of the paper in which the clock was developed.

In [2]:
pya.utils.find_clock_by_doi('https://doi.org/10.1038/s43587-022-00248-2')

|-----> 🏗️ Starting find_clock_by_doi function
|-----> ⚙️ Load all clock metadata started
|-----------> Data found in pyaging_data/all_clock_metadata.pt
|-----> ✅ Load all clock metadata finished [0.4988s]
|-----> ⚙️ Searching for clock based on DOI started
|-----------> in progress: 100.0000%
|-----------> Clocks with DOI https://doi.org/10.1038/s43587-022-00248-2: pchorvath2013, pcphenoage, pcgrimage, pchannum, pcdnamtl, hrsinchphenoage, pcskinandblood
|-----> ✅ Searching for clock based on DOI finished [0.0485s]
|-----> 🎉 Done! [0.5502s]
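Conceptually, the lookup above filters clock metadata by its DOI field. The sketch below illustrates that idea on a small hand-written dictionary whose entries are copied from outputs in this tutorial. `TOY_METADATA`, `normalize_doi`, and `clocks_with_doi` are hypothetical names used for illustration only; they are not part of `pyaging`, and the real metadata file may be organized differently.

```python
# Illustrative only: a toy metadata table built from outputs in this tutorial.
# The layout of pyaging's own metadata file is an assumption here.
TOY_METADATA = {
    "altumage": {"doi": "https://doi.org/10.1038/s41514-022-00085-y"},
    "pchorvath2013": {"doi": "https://doi.org/10.1038/s43587-022-00248-2"},
    "pcphenoage": {"doi": "https://doi.org/10.1038/s43587-022-00248-2"},
}


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and drop a leading resolver prefix, if present."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def clocks_with_doi(metadata: dict, doi: str) -> list:
    """Return the names of all clocks whose DOI matches, ignoring prefix and case."""
    target = normalize_doi(doi)
    return [
        name
        for name, entry in metadata.items()
        if normalize_doi(entry["doi"]) == target
    ]


matches = clocks_with_doi(TOY_METADATA, "10.1038/S43587-022-00248-2")
print(matches)
```

Normalizing both sides before comparing means a bare DOI and a full `https://doi.org/...` URL find the same clocks. Whether `find_clock_by_doi` itself accepts a bare DOI is not shown in this tutorial.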


The second is to list the names of all available clocks.

In [3]:
pya.utils.show_all_clocks()

|-----> 🏗️ Starting show_all_clocks function
|-----> ⚙️ Load all clock metadata started
|-----------> Data found in pyaging_data/all_clock_metadata.pt
|-----> ✅ Load all clock metadata finished [0.4589s]
|-----> ⚙️ Showing all available clock names started
|-----------> altumage
|-----------> bitage
|-----------> camilloh3k27ac
|-----------> camilloh3k27me3
|-----------> camilloh3k36me3
|-----------> camilloh3k4me1
|-----------> camilloh3k4me3
|-----------> camilloh3k9ac
|-----------> camilloh3k9me3
|-----------> camillopanhistone
|-----------> dnamphenoage
|-----------> dnamtl
|-----------> dunedinpace
|-----------> encen100
|-----------> encen40
|-----------> grimage
|-----------> grimage2
|-----------> han
|-----------> hannum
|-----------> horvath2013
|-----------> hrsinchphenoage
|-----------> knight
|-----------> leecontrol
|-----------> leerefinedrobust
|-----------> leerobust
|-----------> lin
|-----------> mammalian1
|-----------> mammalian2
|-----------> mammalian3
|---------

## Cite

`pyaging` also provides citations for all available clocks.

In [4]:
pya.utils.cite_clock('AltumAge')

|-----> 🏗️ Starting cite_clock function
|-----> ⚙️ Load all clock metadata started
|-----------> Data found in pyaging_data/all_clock_metadata.pt
|-----> ✅ Load all clock metadata finished [0.5150s]
|-----> ⚙️ Searching for citation of clock altumage started
|-----------> Citation for altumage:
|-----------> de Lima Camillo, Lucas Paulo, Louis R. Lapierre, and Ritambhara Singh. "A pan-tissue DNA-methylation epigenetic clock based on deep learning." npj Aging 8.1 (2022): 4.
|-----------> Please also consider citing pyaging :)
|-----------> de Lima Camillo, Lucas Paulo. "pyaging: a Python-based compendium of GPU-optimized aging clocks." bioRxiv (2023): 2023-11.
|-----> ✅ Searching for citation of clock altumage finished [0.0024s]
|-----> 🎉 Done! [0.5205s]


## Get metadata

To get all of the metadata for a clock, including its citation and DOI, run the following.

In [5]:
pya.utils.get_clock_metadata('AltumAge')

|-----> 🏗️ Starting get_clock_metadata function
|-----> ⚙️ Load all clock metadata started
|-----------> Data found in pyaging_data/all_clock_metadata.pt
|-----> ✅ Load all clock metadata finished [0.5505s]
|-----> ⚙️ Showing altumage metadata started
|-----------> clock_name: altumage
|-----------> data_type: methylation
|-----------> species: Homo sapiens
|-----------> year: 2022
|-----------> approved_by_author: ✅
|-----------> citation: de Lima Camillo, Lucas Paulo, Louis R. Lapierre, and Ritambhara Singh. "A pan-tissue DNA-methylation epigenetic clock based on deep learning." npj Aging 8.1 (2022): 4.
|-----------> doi: https://doi.org/10.1038/s41514-022-00085-y
|-----------> notes: None
|-----------> version: None
|-----------> reference_values: True
|-----------> preprocess: scale
|-----> ✅ Showing altumage metadata finished [0.0062s]
|-----> 🎉 Done! [0.5622s]


## Get clock parameters

To inspect the weights and features of a particular clock, load it with `load_clock`:

In [6]:
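logger = pya.logger.Logger('test_logger')
device = 'cpu'  # device to load the clock onto
data_dir = 'pyaging_data'  # folder with the clock files; renamed to avoid shadowing the built-in dir()
indent_level = 1  # indentation of the log messages

clock = pya.pred.load_clock('AltumAge', device, data_dir, logger, indent_level=indent_level)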

|-----> ⚙️ Load clock started
|-----------> Data found in pyaging_data/altumage.pt
|-----> ✅ Load clock finished [0.5409s]


In [7]:
clock

AltumAge(
  (base_model): AltumAgeNeuralNetwork(
    (linear1): Linear(in_features=20318, out_features=32, bias=True)
    (linear2): Linear(in_features=32, out_features=32, bias=True)
    (linear3): Linear(in_features=32, out_features=32, bias=True)
    (linear4): Linear(in_features=32, out_features=32, bias=True)
    (linear5): Linear(in_features=32, out_features=32, bias=True)
    (linear6): Linear(in_features=32, out_features=1, bias=True)
    (bn1): BatchNorm1d(20318, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
    (bn2): BatchNorm1d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
    (bn3): BatchNorm1d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
    (bn4): BatchNorm1d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
    (bn5): BatchNorm1d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True)
    (bn6): BatchNorm1d(32, eps=0.001, momentum=0.99, affine=True, track_running_stats=True
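The layer sizes in the printed model are enough to estimate how many learnable parameters it holds. The following is a back-of-the-envelope count in plain Python, using only the layers visible above. The printed output is cut off after `bn6`, so anything beyond that point is not included.

```python
# Layer sizes read off the printed model above. The printed output is cut off
# after bn6, so any layers beyond that point are not counted here.
linear_layers = [(20318, 32), (32, 32), (32, 32), (32, 32), (32, 32), (32, 1)]
batchnorm_sizes = [20318, 32, 32, 32, 32, 32]

# A Linear layer with bias=True holds in*out weights plus out biases.
linear_params = sum(n_in * n_out + n_out for n_in, n_out in linear_layers)

# A BatchNorm1d layer with affine=True holds one scale and one shift per feature
# (running statistics are buffers, not learnable parameters).
batchnorm_params = sum(2 * n for n in batchnorm_sizes)

total_params = linear_params + batchnorm_params
print(linear_params, batchnorm_params, total_params)
```

Almost all of the parameters sit in `linear1`, which maps the 20318 input features to 32 hidden units. Assuming `clock` behaves like a standard `torch.nn.Module`, as the printed representation suggests, `sum(p.numel() for p in clock.parameters())` should give the corresponding count on the loaded object.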

Let's check the weights of the first linear layer for AltumAge.

In [8]:
clock.base_model.linear1.weight

Parameter containing:
tensor([[ 1.2465e-05, -2.4719e-04,  5.4308e-02,  ..., -2.5304e-02,
          5.2822e-02,  8.9800e-02],
        [ 3.5401e-04, -3.0528e-03,  2.8799e-02,  ...,  6.8214e-03,
          6.9691e-02,  1.2179e-01],
        [ 1.6119e-04, -6.7272e-06, -4.6887e-02,  ...,  1.3132e-02,
          9.2417e-02, -4.2074e-02],
        ...,
        [ 1.9902e-04,  9.0495e-04, -8.5197e-03,  ..., -9.6892e-02,
          2.9396e-02,  5.9170e-02],
        [-1.2038e-04,  3.7530e-04,  1.7924e-01,  ..., -4.9997e-02,
         -1.2819e-02,  2.8045e-02],
        [ 1.1584e-04,  2.2752e-04, -3.0746e-02,  ...,  1.7930e-02,
          8.3116e-03, -2.0979e-02]], dtype=torch.float64, requires_grad=True)

Here are the first ten features:

In [9]:
list(clock.features[0:10])

['cg00000292',
 'cg00002426',
 'cg00003994',
 'cg00007981',
 'cg00008493',
 'cg00008713',
 'cg00009407',
 'cg00011459',
 'cg00012199',
 'cg00012386']
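One way to combine the first-layer weights with the feature names is to rank features by their mean absolute weight. The sketch below does this in plain Python on a small excerpt of the values printed above (three rows and three columns of a 32 x 20318 matrix), so the result describes the excerpt only. It assumes that column `i` of `linear1.weight` corresponds to `clock.features[i]`, which this tutorial does not state explicitly.

```python
# Excerpt of the printed tensor: first three rows (hidden units) and first
# three columns (input features). The real matrix is 32 x 20318.
weight_excerpt = [
    [1.2465e-05, -2.4719e-04, 5.4308e-02],
    [3.5401e-04, -3.0528e-03, 2.8799e-02],
    [1.6119e-04, -6.7272e-06, -4.6887e-02],
]
# Assumption: column i of the weight matrix corresponds to clock.features[i].
feature_names = ["cg00000292", "cg00002426", "cg00003994"]

# Mean absolute weight per input feature (that is, per column).
n_rows = len(weight_excerpt)
mean_abs = [
    sum(abs(row[j]) for row in weight_excerpt) / n_rows
    for j in range(len(feature_names))
]

# Sort features from largest to smallest mean absolute weight.
ranking = sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.6f}")
```

Because the model also contains a batch-normalization layer over the 20318 inputs (`bn1`) and several further layers, first-layer weight magnitude is at best a rough indication of a feature's influence.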

And the first ten reference values:

In [10]:
list(clock.reference_values[0:10])

[0.7598633952352156,
 0.7863788078967272,
 0.06324422321924528,
 0.029943418029386736,
 0.9363471225552753,
 0.05054944899168823,
 0.0351571456459043,
 0.9114132733331861,
 0.037064057665286136,
 0.039170308280475935]
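This tutorial does not spell out how the reference values are used. One plausible use, assumed here purely for illustration, is as fallback values for features that are missing from a sample. The sketch below pairs the first three features with their reference values (assuming both lists share the same order) and fills a gap in a hypothetical sample.

```python
# First three feature names and reference values, copied from the outputs above.
features = ["cg00000292", "cg00002426", "cg00003994"]
reference_values = [0.7598633952352156, 0.7863788078967272, 0.06324422321924528]

# Hypothetical sample in which one feature (cg00002426) was not measured.
sample = {"cg00000292": 0.81, "cg00003994": 0.05}

# Assumption: a missing feature is replaced by its reference value.
reference = dict(zip(features, reference_values))
filled = [sample.get(name, reference[name]) for name in features]
print(filled)
```

The measured values pass through unchanged, and only the missing entry takes its reference value.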

We can also get the metadata directly from the clock object:

In [11]:
clock.metadata

{'clock_name': 'altumage',
 'data_type': 'methylation',
 'species': 'Homo sapiens',
 'year': 2022,
 'approved_by_author': '✅',
 'citation': 'de Lima Camillo, Lucas Paulo, Louis R. Lapierre, and Ritambhara Singh. "A pan-tissue DNA-methylation epigenetic clock based on deep learning." npj Aging 8.1 (2022): 4.',
 'doi': 'https://doi.org/10.1038/s41514-022-00085-y',
 'notes': None,
 'version': None}
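Since the metadata is an ordinary dictionary, it can be handled with standard Python. As a small sketch, the snippet below rebuilds part of the dictionary shown above as a literal (so that it runs without loading a clock), formats a one-line summary, and lists the fields that are unset.

```python
# A subset of the metadata printed above, written out as a literal.
metadata = {
    "clock_name": "altumage",
    "data_type": "methylation",
    "species": "Homo sapiens",
    "year": 2022,
    "doi": "https://doi.org/10.1038/s41514-022-00085-y",
    "notes": None,
    "version": None,
}

# One-line summary built from a few fields.
summary = (
    f"{metadata['clock_name']} "
    f"({metadata['species']}, {metadata['data_type']}, {metadata['year']})"
)

# Fields that carry no value for this clock.
unset_fields = [key for key, value in metadata.items() if value is None]

print(summary)
print(unset_fields)
```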

For a more in-depth look at how the clock was set up, including the model type and the source of the weights, see our [clocks notebook folder](https://github.com/rsinghlab/pyaging/tree/main/clocks/notebooks) on GitHub.