# OAK enrichment command

This notebook is intended as a supplement to the [main OAK CLI docs](https://incatools.github.io/ontology-access-kit/cli.html).

This notebook provides examples for the `enrichment` command, which produces a summary of the ontology classes that are enriched in the associations for an input set of entities.
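
To make "enriched" concrete: one common way to score over-representation of a class is a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of annotated entities in the sample by chance. The sketch below is a generic illustration using only the Python standard library. It is not taken from OAK's source, and the function name `hypergeom_sf` and the toy counts are assumptions made for this example.

```python
from math import comb


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """Return P(X >= k) for a hypergeometric draw.

    M: number of entities in the background
    n: number of background entities annotated to the class
    N: number of entities in the input sample
    k: number of sample entities annotated to the class
    """
    upper = min(n, N)
    total = comb(M, N)
    # Sum the probability mass for every outcome at least as extreme as k.
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


# Toy counts (assumed, not real data): 20 background entities, 5 annotated
# to the class, and a sample of 4 entities of which 3 are annotated.
p_value = hypergeom_sf(k=3, M=20, n=5, N=4)
print(f"p-value: {p_value:.4f}")
print("enriched at 0.05:", p_value < 0.05)
```

A class would be reported only if its p-value falls below the chosen threshold. In practice a multiple-testing correction is usually applied as well, since many classes are tested at once.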

## Help Option

You can get help on any OAK command using `--help`:

In [1]:
!runoak enrichment --help

Usage: runoak enrichment [OPTIONS] [TERMS]...

  Run class enrichment analysis.

  Note: currently this is slow

Options:
  -o, --output FILENAME           Output file, e.g. obo file
  -p, --predicates TEXT           A comma-separated list of predicates
  --autolabel / --no-autolabel    If set, results will automatically have
                                  labels assigned  [default: autolabel]
  -O, --output-type TEXT          Desired output type
  -o, --output FILENAME           Output file, e.g. obo file
  --if-absent [absent-only|present-only]
                                  determines behavior when the value is not
                                  present or is empty.
  -S, --set-value TEXT            the value to set for all terms for the given
                                  property.
  --cutoff FLOAT                  The cutoff for the p-value  [default: 0.05]
  -S, --sample-file FILENAME      file containing input list of entity IDs
                 

## Download example file and setup

We will use the HPO association file (HPOA):

In [2]:
!curl -L -s http://purl.obolibrary.org/obo/hp/hpoa/phenotype.hpoa > input/hpoa.tsv

Next, we will set up aliases for HPO and Mondo:

In [2]:
alias hp runoak -i sqlite:obo:hp

In [3]:
alias mondo runoak -i sqlite:obo:mondo

Test this out by querying for associations for a particular Orphanet disease.

We need to pass in the association file we downloaded, as well as specify the file type (with `-G`):

In [4]:
hp -G hpoa -g input/hpoa.tsv associations -Q subject ORPHA:1899 -O csv | head

subject	subject_label	predicate	object	object_label	property_values
ORPHA:1899	None	None	HP:0000963	Thin skin	[]
ORPHA:1899	None	None	HP:0000974	Hyperextensible skin	[]
ORPHA:1899	None	None	HP:0001001	Abnormality of subcutaneous fat tissue	[]
ORPHA:1899	None	None	HP:0001252	Hypotonia	[]
ORPHA:1899	None	None	HP:0001373	Joint dislocation	[]
ORPHA:1899	None	None	HP:0001385	Hip dysplasia	[]
ORPHA:1899	None	None	HP:0001387	Joint stiffness	[]
ORPHA:1899	None	None	HP:0002300	Mutism	[]
ORPHA:1899	None	None	HP:0002381	Aphasia	[]
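
For readers who want to inspect the association file outside of OAK, the following is a minimal sketch of grouping HPO terms by disease with the standard `csv` module. It assumes the file is tab-separated, that comment lines start with `#`, and that the header names the disease and term columns `database_id` and `hpo_id`; check these against your downloaded copy. The in-memory `SAMPLE` is an illustrative stand-in built from IDs shown in the output above, not the real file, and `load_associations` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Illustrative stand-in for input/hpoa.tsv (assumed layout, trimmed columns).
SAMPLE = (
    "#description: illustrative excerpt, not the real file\n"
    "database_id\tdisease_name\tqualifier\thpo_id\n"
    "ORPHA:1899\t\t\tHP:0000963\n"
    "ORPHA:1899\t\t\tHP:0000974\n"
    "ORPHA:1899\t\t\tHP:0001252\n"
)


def load_associations(handle):
    """Map each disease ID to the set of HPO term IDs annotated to it."""
    # Skip comment lines so the first remaining line is the header row.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    by_subject = defaultdict(set)
    for row in reader:
        by_subject[row["database_id"]].add(row["hpo_id"])
    return by_subject


assocs = load_associations(io.StringIO(SAMPLE))
print(sorted(assocs["ORPHA:1899"]))
```

To run this against the downloaded file, pass `open("input/hpoa.tsv", encoding="utf-8")` instead of the `io.StringIO` wrapper.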


## Rollup

Next we will roll up annotations. We choose two representations of the same Ehlers-Danlos syndrome (EDS) concept, one from Orphanet and one from OMIM (note that we can provide as many diseases as we like).

We will use HPO terms roughly inspired by https://www.omim.org/clinicalSynopsis/130060
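
As a rough illustration of what rolling up means here, the sketch below propagates each direct annotation to all of its ancestors, so that an entity annotated to a specific term also counts toward the more general terms above it. The hierarchy is a hypothetical toy with made-up `EX:` identifiers, not the real HPO structure, and the helper names `ancestors` and `roll_up` are assumptions for this example rather than OAK functions.

```python
# Hypothetical toy hierarchy: child -> direct parents (not real HPO structure).
PARENTS = {
    "EX:thin_skin": ["EX:skin_abnormality"],
    "EX:hyperextensible_skin": ["EX:skin_abnormality"],
    "EX:skin_abnormality": ["EX:phenotypic_abnormality"],
    "EX:hypotonia": ["EX:phenotypic_abnormality"],
}


def ancestors(term, parents):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def roll_up(direct_terms, parents):
    """Close a set of direct annotations under the ancestor relation."""
    closed = set()
    for term in direct_terms:
        closed |= ancestors(term, parents)
    return closed


rolled = roll_up({"EX:thin_skin", "EX:hypotonia"}, PARENTS)
print(sorted(rolled))
```

After rolling up, counts for general terms include every entity annotated to any of their descendants, which is what makes it possible for a general term to be enriched.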

In [8]:
mondo labels .parents//p=RO:0004003 [ .desc//p=i EDS ] -O csv > output/EDS-genes.tsv

In [6]:
!runoak -i translator: normalize -M NCBIGene [ .parents//p=RO:0004003 [ .desc//p=i EDS ] ]

NotImplementedError
