# Sandbox

## Intro

**To experiment with an executable version of this notebook, [load it in Google Colaboratory](https://colab.research.google.com/github/colleenXu/biothings_explorer/blob/relay/jupyter%20notebooks/CX_WIPs/TranslatorUseCase_COVIDproxies_HAdrugs.ipynb).**

## Step 0: Load BTE modules, notebook functions

In [None]:
%%capture
## for Google Colab (the %%capture cell magic has to be the first line of the cell)
!pip install git+https://github.com/colleenXu/biothings_explorer@relay#egg=biothings_explorer

In [1]:
## CX: allows multiple lines of code to print from one code block
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# import modules from biothings_explorer
from biothings_explorer.hint import Hint
from biothings_explorer.user_query_dispatcher import FindConnection

## show time that this notebook was executed 
from datetime import datetime

## packages to work with objects 
import re

## to get around bugs
import nest_asyncio
nest_asyncio.apply()

In [2]:
## functions to add to modules?
def hint_display(query, hint_result):
    """
    show the type, name, number of IDs for all results returned by the query
    
    :param: query: string used in hint query
    :param: hint_result: object returned from hint query, a dictionary of lists of dictionaries
    
    Returns: None
    """
    ## function needs to be rewritten if it's going to give the exact index of each object within its type 
    display = ['type', 'name']  ## replace with the parts of the BioThings object you want to see
    concise_results = []
    for BT_type, result in hint_result.items():
        if result:  ## basically if it's not empty
            for items in result:
                ## number of identifiers per object: number of keys - 4 (name, primary, display, type)
                temp = len(items) - 4
                concise_results.append((items[display[0]], items[display[1]], 
                                         str(temp)))
                    
    print('There are {total} BioThings objects returned for {ht}:'.format(
                total = len(concise_results), ht = query))
    for display_info in concise_results:
        print('{0}, {1}, num of IDs: {2}'.format(display_info[0], display_info[1], display_info[2]))
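
A quick check of the "number of keys - 4" rule used in `hint_display`, using the `cytomegalovirus infection` object that is printed further down in this notebook. The `count_ids` helper and the `NON_ID_KEYS` name are hypothetical, not part of BTE; they only sketch an alternative that counts identifier keys by name, assuming the four non-identifier keys are exactly the ones listed in the comment above.

```python
# Keys that describe the object rather than identify it
# (taken from the comment in hint_display; NON_ID_KEYS is a made-up name).
NON_ID_KEYS = {"name", "primary", "display", "type"}

# Copy of the 'cytomegalovirus infection' hint object shown in this notebook.
example_obj = {
    "MONDO": "MONDO:0005132",
    "UMLS": "C0010823",
    "name": "cytomegalovirus infection",
    "MESH": "D003586",
    "OMOP": "440032",
    "primary": {"identifier": "MONDO", "cls": "Disease", "value": "MONDO:0005132"},
    "display": "MONDO(MONDO:0005132) UMLS(C0010823) MESH(D003586) "
               "name(cytomegalovirus infection) OMOP(440032)",
    "type": "Disease",
}


def count_ids(obj):
    """Hypothetical helper: count only the keys that are identifiers."""
    return len([key for key in obj if key not in NON_ID_KEYS])


num_ids_by_subtraction = len(example_obj) - 4   # what hint_display does
num_ids_by_name = count_ids(example_obj)        # explicit alternative
print(num_ids_by_subtraction, num_ids_by_name)  # 4 4
```

Both approaches agree here (4 IDs, matching the `hint_display` output for this object). They would only differ for an object that is missing one of the four descriptive keys.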

In [3]:
def filter_table(df):
    """
    use _source and _method columns to remove rows (paths) from the dataframe
    :param: pandas dataframe containing results from BTE FindConnection module, in table form
    
    Returns: filtered dataframe
    """
    ## note: still needs checking with EXPLAIN queries
    ## key is the string to match to column, value is a list of strings to match to column values
    filter_out = {'_source': ['SEMMED', 'CTD', 'ctd', 'omia']   
#                   '_method': []  ## currently no method stuff I want to filter out
                 }
    ## SEMMED: text mining results wrong for PhenotypicFeature -> Gene
    ## CTD/ctd: results odd for MSUD -> ChemicalSubstance
    ## omia: results wrong or discontinued gene IDs for PhenotypicFeature -> Gene
    
    
    df_temp = df.copy()  ## so the original df isn't modified in-place
    for key,val in filter_out.items():
        ## find columns that match the key string
        columns = [i for i in df_temp.columns if key in i]
        ## iterate through each column
        for col in columns:
            ## iterate through each value to take out, check if string CONTAINS match. 
            ## only keep rows that don't contain the value
            for i in val:
                df_temp = df_temp[~ df_temp[col].str.contains(i, na = False)]
    return df_temp
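
To see what the substring filter in `filter_table` does, here is a small sketch on a toy table. The rows and the source names `drugbank` and `chembl` are invented for illustration; only the blocked strings come from the function above. One deliberate difference: the sketch passes `regex=False` so each blocked string is treated as a literal substring, which the original function does not do.

```python
import pandas as pd

# Toy stand-in for a BTE results table; all values are invented.
toy_paths = pd.DataFrame({
    "pred1_source": ["SEMMED", "drugbank", None, "CTD"],
    "pred2_source": ["chembl", "omia", "chembl", "chembl"],
    "output_name": ["A", "B", "C", "D"],
})

blocked = ["SEMMED", "CTD", "ctd", "omia"]  # same strings as filter_out['_source']

kept = toy_paths.copy()
for col in [c for c in kept.columns if "_source" in c]:
    for term in blocked:
        # na=False: a missing source counts as "no match", so the row is kept
        # regex=False: match the term literally instead of as a regex
        kept = kept[~kept[col].str.contains(term, na=False, regex=False)]

print(kept["output_name"].tolist())  # ['C']
```

Row A is dropped because of `pred1_source`, row B because of `pred2_source`, and row D because of `CTD`. Row C survives even though its first source is missing.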

In [4]:
def scoring_output(df, q_type):
    """
    score results based on whether query was Predict or Explain type, number of 
        intermediate nodes 
    :param: pandas dataframe containing results from BTE FindConnection module
    :param: string describing type of query (Predict or Explain)
    
    May flatten some edges, because score only counts one edge per 
        unique predicate / API / method (ignoring source and pubmed col)
    
    Predict queries: score each output node by counting # of paths
        from input nodes to it. Normalize by dividing by maximum
        possible # of paths
    Explain two-hop (one intermediate) queries: score each intermediate node by 
        counting # of paths (between input and output nodes) that include it. 
        Normalize by dividing by maximum possible # of paths    

    Explain one-hop (direct) queries: no need to score, prints message
    Other Explain queries (many-hops): currently not able to score, prints message     
    
    Returns: pandas dataframe with 'score' and 'rank' columns, indexed by
             output_name (Predict) or node1_name (two-hop Explain),
             or None (one-hop or many-hop Explain query)
    """
    df_temp = df.copy()  ## so no chance to mutate this   
    flag_direct = False  ## one-hop query or not
    ## use df_col to look quicker into columns
    df_col = set(df_temp.columns)
    
    ## ignore source and pubmed col in looking at unique edges 
    columns_drop = [col for col in df_col if (('_source' in col) or ('_pubmed' in col))]
    df_temp.drop(columns = columns_drop, inplace = True)    
    df_temp.drop_duplicates(inplace = True)
    
    ## check if query is one-hop or not
    if "node1_name" not in df_col:    ## name for first intermediate node layer
        flag_direct = True  
    
    if q_type == 'Explain':
        if flag_direct:   # one hop / no intermediates
            print('No valid node scoring for one-hop (direct) Explain queries.')
            return None
        ## if there are many-hops/intermediate layers
        elif "node2_name" in df_col:  ## name for 2nd intermed. node layer
            print('Cannot currently score many-hop Explain queries.')
            return None
        else:   ## two-hop / 1 intermediate layer
            ## count multi-edges to results (the intermediate node1 col)
            scores = df_temp.node1_name.value_counts() 
            ## to find the maximum-possible number of edges, look at non-result cols
            columns_drop = [col for col in df_col if 'node1' in col]
            df_temp.drop(columns = columns_drop, inplace = True)
            ## now look at number of unique combos for input, edge info, output
            df_temp.drop_duplicates(inplace = True)
            max_paths = df_temp.shape[0]            
            ## normalize scores by dividing each by max number of paths
            scores = scores / max_paths

    else:  ## Predict type query
        ## count multi-edges to results (the output col)
        scores = df_temp.output_name.value_counts()
        ## to find the maximum number of multi-edges, look at non-output col
        columns_drop = [col for col in df_temp.columns if 'output' in col]
        df_temp.drop(columns = columns_drop, inplace = True)
        ## now look at number of unique paths possible
        df_temp.drop_duplicates(inplace = True)
        max_paths = df_temp.shape[0]
        ## normalize scores by dividing each by max number of paths
        scores = scores / max_paths
            
    ## return scores as pandas dataframe, with rank
    scores = scores.to_frame(name = 'score') 
    scores['rank'] = scores['score'].rank(method = 'dense', ascending = False)
    return scores
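
A worked example of the Predict branch of `scoring_output`, on a toy table with invented values. It repeats the same steps (drop source/pubmed columns, de-duplicate, count paths per output, divide by the number of unique non-output paths) so the normalization is easier to follow. This is a sketch of the logic above, not a replacement for the function.

```python
import pandas as pd

# Invented one-hop Predict-style table: one input, two kinds of edge, two outputs.
toy = pd.DataFrame({
    "input": ["X", "X", "X", "X"],
    "pred1": ["treats", "affects", "treats", "treats"],
    "pred1_source": ["s1", "s2", "s3", "s1"],
    "output_name": ["A", "A", "B", "A"],
})

# 1. ignore source/pubmed columns, then collapse identical edges
edges = toy.drop(columns=[c for c in toy.columns if "_source" in c or "_pubmed" in c])
edges = edges.drop_duplicates()  # 3 unique paths remain

# 2. number of paths reaching each output node
counts = edges["output_name"].value_counts()  # A: 2, B: 1

# 3. maximum possible paths = unique combinations of the non-output columns
max_paths = edges.drop(columns=["output_name"]).drop_duplicates().shape[0]  # 2

# 4. normalize and rank
scores = (counts / max_paths).to_frame(name="score")
scores["rank"] = scores["score"].rank(method="dense", ascending=False)
print(scores)
```

Output A is reached through both edge types, so it scores 2/2 = 1.0 and ranks first; B is reached through one, so it scores 1/2 = 0.5.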

In [5]:
## record when cell blocks are executed
print('The time that this notebook was executed is...')
print('Local time (PST, West Coast USA): ')
print(datetime.now())
print('UTC time: ')
print(datetime.utcnow())

The time that this notebook was executed is...
Local time (PST, West Coast USA): 
2021-02-01 09:41:53.741924
UTC time: 
2021-02-01 17:41:53.742040


## Playing with Hint and 1-hop queries

BTE performs the **query path planning** and **query path execution** by deconstructing the query into individual API calls, executing those API calls, and then assembling the results.

The code block below takes ~4 seconds to run. 

Note that this question does not work when the drug is ecallantide, Cinryze, Berinert, Ruconest, or lanadelumab: BTE can find these objects with the Hint module, but the specific drug -> disease query returns nothing. Is the issue in the data sources, or in the identifier mapping inside the Hint module (not enough IDs)?

Trying to see which ChemicalSubstances are even returned for hereditary angioedema.

In [6]:
ht = Hint()  ## neater way to call this BTE module

## the human user gives this input
this_start = "cytomegalovirus"
this_hint = ht.query(this_start)

hint_display(this_start, this_hint)

There are 13 BioThings objects returned for cytomegalovirus:
ChemicalSubstance, Human cytomegalovirus immune globulin, num of IDs: 1
Disease, cytomegalovirus pneumonia, num of IDs: 2
Disease, cytomegalovirus retinitis, num of IDs: 4
Disease, fetal cytomegalovirus syndrome, num of IDs: 3
Disease, cytomegalovirus infection, num of IDs: 4
Disease, TORCH syndrome, num of IDs: 2
PhenotypicFeature, Severe cytomegalovirus infection, num of IDs: 1
PhenotypicFeature, Renal tubular cytomegalovirus inclusions, num of IDs: 1
MolecularActivity, ubiquitin protein ligase activity, num of IDs: 2
CellularComponent, Cytomegalovirus inclusion body, num of IDs: 0
Pathway, Human cytomegalovirus infection - Homo sapiens (human), num of IDs: 1
Pathway, human cytomegalovirus and map kinase pathways, num of IDs: 0
Pathway, Human cytomegalovirus infection - Mus musculus (mouse), num of IDs: 1


Based on the information above, we'll pick the `Disease` choice at index 3 (cytomegalovirus infection) for our query. We can look at the identifier mappings inside this BioThings object.

Note that the query didn't work when picking the top `Disease` choice (essential hypertension). 

In [10]:
## the human user makes this choice, gives this input
this_choice_type = 'Disease'
this_choice_idx = 3

this_hint_obj = this_hint[this_choice_type][this_choice_idx]  
this_hint_obj

{'MONDO': 'MONDO:0005132',
 'UMLS': 'C0010823',
 'name': 'cytomegalovirus infection',
 'MESH': 'D003586',
 'OMOP': '440032',
 'primary': {'identifier': 'MONDO',
  'cls': 'Disease',
  'value': 'MONDO:0005132'},
 'display': 'MONDO(MONDO:0005132) UMLS(C0010823) MESH(D003586) name(cytomegalovirus infection) OMOP(440032)',
 'type': 'Disease'}
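
Picking a hint object by position is easy to get wrong when the result list changes. The sketch below looks the object up by name instead. `mock_hint` is a cut-down mock of the hint result (only the `name` field, in the order printed by `hint_display`), and `pick_by_name` is a hypothetical helper, not a BTE function.

```python
# Mock of the hint result structure: a dictionary (keyed by type) of lists of
# dictionaries. Names and order are copied from the hint_display output above.
mock_hint = {
    "Disease": [
        {"name": "cytomegalovirus pneumonia"},
        {"name": "cytomegalovirus retinitis"},
        {"name": "fetal cytomegalovirus syndrome"},
        {"name": "cytomegalovirus infection"},
        {"name": "TORCH syndrome"},
    ],
    "PhenotypicFeature": [
        {"name": "Severe cytomegalovirus infection"},
        {"name": "Renal tubular cytomegalovirus inclusions"},
    ],
}


def pick_by_name(hint_result, obj_type, name):
    """Hypothetical helper: return (index, object) for an exact name match."""
    for idx, obj in enumerate(hint_result.get(obj_type, [])):
        if obj.get("name") == name:
            return idx, obj
    raise KeyError(f"no {obj_type} named {name!r}")


idx, obj = pick_by_name(mock_hint, "Disease", "cytomegalovirus infection")
print(idx, obj["name"])  # 3 cytomegalovirus infection
```

With the real result, the same call would be `pick_by_name(this_hint, 'Disease', 'cytomegalovirus infection')`, assuming the structure matches the mock.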

In [29]:
## the human user gives this input
other_start = "pregnancy"
other_hint = ht.query(other_start)

hint_display(other_start, other_hint)

There are 25 BioThings objects returned for pregnancy:
Gene, pregnancy specific beta-1-glycoprotein 3, num of IDs: 6
Gene, pregnancy specific beta-1-glycoprotein 6, num of IDs: 6
Gene, pregnancy specific beta-1-glycoprotein 5, num of IDs: 6
Gene, pregnancy specific beta-1-glycoprotein 4, num of IDs: 6
Gene, pregnancy specific beta-1-glycoprotein 9, num of IDs: 6
Disease, pregnancy disorder, num of IDs: 3
Disease, abdominal ectopic pregnancy, num of IDs: 2
Disease, hypertension, pregnancy-induced, num of IDs: 3
Disease, ovarian ectopic pregnancy, num of IDs: 2
Disease, ectopic pregnancy, num of IDs: 4
PhenotypicFeature, Ameliorated by pregnancy, num of IDs: 1
PhenotypicFeature, Pregnancy exposure, num of IDs: 1
PhenotypicFeature, Postterm pregnancy, num of IDs: 1
PhenotypicFeature, Triggered by pregnancy, num of IDs: 1
PhenotypicFeature, Maternal fever in pregnancy, num of IDs: 2
BiologicalProcess, female pregnancy, num of IDs: 1
BiologicalProcess, cellular response to carcinoembryonic 

In [32]:
## the human user makes this choice, gives this input
other_choice_type = 'Disease'
other_choice_idx = 0

other_hint_obj = other_hint[other_choice_type][other_choice_idx]  
other_hint_obj

{'MONDO': 'MONDO:0024575',
 'UMLS': 'C0032962',
 'name': 'pregnancy disorder',
 'MESH': 'D011248',
 'primary': {'identifier': 'MONDO',
  'cls': 'Disease',
  'value': 'MONDO:0024575'},
 'display': 'MONDO(MONDO:0024575) UMLS(C0032962) MESH(D011248) name(pregnancy disorder)',
 'type': 'Disease'}

In [33]:
## the human user gives this input
q4_intermediate = 'ChemicalSubstance'

q4 = FindConnection(input_obj = this_hint_obj,
                    output_obj = other_hint_obj,
                    intermediate_nodes = q4_intermediate)
q4.connect(verbose = True)


BTE will find paths that join 'cytomegalovirus infection' and 'pregnancy disorder'. Paths will have 1 intermediate node.

Intermediate node #1 will have these type constraints: ChemicalSubstance



==== Step #1: Query path planning ====

Because cytomegalovirus infection is of type 'Disease', BTE will query our meta-KG for APIs that can take 'Disease' as input and 'ChemicalSubstance' as output

BTE found 7 apis:

API 1. scibite(1 API call)
API 2. scigraph(1 API call)
API 3. hmdb(1 API call)
API 4. semmed_disease(15 API calls)
API 5. pharos(1 API call)
API 6. mydisease(1 API call)
API 7. mychem(2 API calls)


==== Step #2: Query path execution ====
NOTE: API requests are dispatched in parallel, so the list of APIs below is ordered by query time.

API 6.1: https://mydisease.info/v1/query?fields=ctd.chemical_related_to_disease (POST -d q=D003586&scopes=mondo.xrefs.mesh, disgenet.xrefs.mesh)
API 3.11: https://biothings.ncats.io/semmed/query?fields=positively_regulates (POST -d q=C0010823&
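
The two stages in the log above (planning, then execution and assembly) can be pictured with a toy model. Everything in this sketch is hypothetical: `META_KG`, `plan`, `fake_call`, and `execute` are invented names, the third registry entry is made up, and no real API is contacted. It is not how BTE is implemented, only an illustration of "find the APIs that fit the types, call each one, merge the results".

```python
# Hypothetical miniature meta-KG: which API maps which input type to which
# output type. The first two API names appear in the log above; the rest
# (structure, third entry, stub results) is invented.
META_KG = [
    {"api": "mydisease", "input": "Disease", "output": "ChemicalSubstance"},
    {"api": "mychem", "input": "Disease", "output": "ChemicalSubstance"},
    {"api": "example_gene_api", "input": "Gene", "output": "Disease"},
]


def plan(input_type, output_type):
    """Step 1 (planning): keep the APIs whose types fit the query."""
    return [entry["api"] for entry in META_KG
            if entry["input"] == input_type and entry["output"] == output_type]


def fake_call(api, input_id):
    """Stand-in for a real HTTP request; returns one invented record."""
    return [{"api": api, "input": input_id, "output": f"{api}-result"}]


def execute(apis, input_id):
    """Step 2 (execution): run each planned call, assemble one flat list."""
    results = []
    for api in apis:
        results.extend(fake_call(api, input_id))
    return results


apis = plan("Disease", "ChemicalSubstance")
rows = execute(apis, "MONDO:0005132")
print(apis)       # ['mydisease', 'mychem']
print(len(rows))  # 2
```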

In [34]:
q4_r_paths_table = q4.display_table_view()

## pull the class name (Predict or Explain) out of the type string
q4_type = re.findall(r"dispatcher\.([a-zA-Z]+)'", str(type(q4.fc)))
q4_type = "".join(q4_type)  ## convert list to string

q4 = None  ## clear memory
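
A small standalone check of the regex used to get the query type. The example type strings are an assumption about what `str(type(q4.fc))` looks like (module `user_query_dispatcher`, class `Explain` or `Predict`); they were not copied from a real run. `extract_query_type` is a hypothetical wrapper name.

```python
import re

# Assumed shape of str(type(q4.fc)); not taken from actual BTE output.
examples = [
    "<class 'biothings_explorer.user_query_dispatcher.Explain'>",
    "<class 'biothings_explorer.user_query_dispatcher.Predict'>",
]

# The dot is escaped so it only matches a literal "."
PATTERN = re.compile(r"dispatcher\.([a-zA-Z]+)'")


def extract_query_type(type_string):
    """Return the class name after 'dispatcher.', or '' if nothing matches."""
    match = PATTERN.search(type_string)
    return match.group(1) if match else ""


for text in examples:
    print(extract_query_type(text))
```

Returning an empty string on no match mirrors what `"".join(re.findall(...))` gives in the cell above.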

In [35]:
q4_r_paths_table

Unnamed: 0,input,input_type,pred1,pred1_source,pred1_api,pred1_pubmed,pred1_method,node1_type,node1_name,node1_id,pred2,pred2_source,pred2_api,pred2_pubmed,pred2_method,output_type,output_name,output_id
0,CMV INFECTION,Disease,caused_by,SEMMED,SEMMED Disease API,22205740,,ChemicalSubstance,C0597177,UMLS:C0597177,treated_by,SEMMED,SEMMED Disease API,18484419,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
1,CMV INFECTION,Disease,treated_by,SEMMED,SEMMED Disease API,8547414,,ChemicalSubstance,ANTIBIOTIC,name:ANTIBIOTIC,prevented_by,SEMMED,SEMMED Disease API,11359314,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
2,CMV INFECTION,Disease,caused_by,SEMMED,SEMMED Disease API,6261220,,ChemicalSubstance,ANTIBIOTIC,name:ANTIBIOTIC,prevented_by,SEMMED,SEMMED Disease API,11359314,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
3,CMV INFECTION,Disease,treated_by,SEMMED,SEMMED Disease API,9686761,,ChemicalSubstance,2-(ACETYLOXY)BENZOIC ACID,name:2-(ACETYLOXY)BENZOIC ACID,prevented_by,SEMMED,SEMMED Disease API,"17393011,19608566,20034330,20943705,23528915,2...",,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
4,CMV INFECTION,Disease,treated_by,SEMMED,SEMMED Disease API,9686761,,ChemicalSubstance,2-(ACETYLOXY)BENZOIC ACID,name:2-(ACETYLOXY)BENZOIC ACID,treated_by,SEMMED,SEMMED Disease API,24901243273349778532271,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
5,CMV INFECTION,Disease,treated_by,SEMMED,SEMMED Disease API,"10974369,11063338,11704769,11789257,12834313,1...",,ChemicalSubstance,PHARMACEUTICAL PREPARATIONS,name:PHARMACEUTICAL PREPARATIONS,prevented_by,SEMMED,SEMMED Disease API,24698194,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
6,CMV INFECTION,Disease,prevented_by,SEMMED,SEMMED Disease API,10320043112709372401099325637709,,ChemicalSubstance,PHARMACEUTICAL PREPARATIONS,name:PHARMACEUTICAL PREPARATIONS,prevented_by,SEMMED,SEMMED Disease API,24698194,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
7,CMV INFECTION,Disease,affected_by,SEMMED,SEMMED Disease API,8856026,,ChemicalSubstance,PHARMACEUTICAL PREPARATIONS,name:PHARMACEUTICAL PREPARATIONS,prevented_by,SEMMED,SEMMED Disease API,24698194,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
8,CMV INFECTION,Disease,treated_by,SEMMED,SEMMED Disease API,"10974369,11063338,11704769,11789257,12834313,1...",,ChemicalSubstance,PHARMACEUTICAL PREPARATIONS,name:PHARMACEUTICAL PREPARATIONS,treated_by,SEMMED,SEMMED Disease API,243923758180351,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575
9,CMV INFECTION,Disease,prevented_by,SEMMED,SEMMED Disease API,10320043112709372401099325637709,,ChemicalSubstance,PHARMACEUTICAL PREPARATIONS,name:PHARMACEUTICAL PREPARATIONS,treated_by,SEMMED,SEMMED Disease API,243923758180351,,ChemicalSubstance,COMPLICATION OF PREGNANCY OR CHILDBIRTH,MONDO:MONDO:0024575


In [None]:
q4_r_paths_table[q4_r_paths_table['pred1_api'].str.contains('Scigraph')]

## Playing with DisGeNET files

In [2]:
import pathlib
import pandas as pd

folder = pathlib.Path.home().joinpath('Desktop', 'ScrippsJob', 'DisGeNET')

### individual, all gene-disease-pmid

In [3]:
allgene_pmid_path = folder.joinpath('all_gene_disease_pmid_associations.tsv')
allgene_pmid = pd.read_table(allgene_pmid_path)

In [4]:
allgene_pmid.shape
allgene_pmid.columns

(3241576, 15)

Index(['geneId', 'geneSymbol', 'DSI', 'DPI', 'diseaseId', 'diseaseName',
       'diseaseType', 'diseaseClass', 'diseaseSemanticType', 'score', 'EI',
       'YearInitial', 'YearFinal', 'pmid', 'source'],
      dtype='object')

In [6]:
allgene_pmid.diseaseId.value_counts()

C0027651    159694
C0006826     89348
C1306459     75101
C0006142     58287
C0678222     57200
             ...  
C1302778         1
C0600039         1
C1855905         1
C3889617         1
C0751868         1
Name: diseaseId, Length: 30170, dtype: int64

In [None]:
allgene_pmid.source.unique()
allgene_pmid.diseaseSemanticType.unique()
allgene_pmid.diseaseType.unique()

In [None]:
allgene_pmid.count()

In [None]:
allgene_pmid[(allgene_pmid['source'] == 'C')].geneSymbol.unique()[0:10]

In [None]:
allgene_pmid[(allgene_pmid['geneSymbol'].str.contains('MIR'))].source.unique()

In [None]:
allgene_pmid['EI'].describe()

In [None]:
allgene_pmid[(allgene_pmid['DPI'] > 0.9)].geneSymbol.unique()

In [None]:
allgene_pmid[allgene_pmid['geneSymbol'] == 'TNF'].shape

### individual variant-disease-pmid

In [7]:
variant_pmid_path = folder.joinpath('all_variant_disease_pmid_associations.tsv')
variant_pmid = pd.read_table(variant_pmid_path)

In [8]:
variant_pmid.shape
variant_pmid.columns

(739842, 16)

Index(['snpId', 'chromosome', 'position', 'DSI', 'DPI', 'diseaseId',
       'diseaseName', 'diseaseType', 'diseaseClass', 'diseaseSemanticType',
       'score', 'EI', 'YearInitial', 'YearFinal', 'pmid', 'source'],
      dtype='object')

In [9]:
variant_pmid.snpId.value_counts()

rs113488022     4068
rs121913377     3911
rs1217691063    3313
rs77375493      1782
rs1800562       1592
                ... 
rs7519348          1
rs1393578          1
rs645601           1
rs28133            1
rs3809272          1
Name: snpId, Length: 194515, dtype: int64

In [None]:
variant_pmid['chromosome'].count()

In [None]:
variant_pmid.chromosome.dtype

In [None]:
variant_pmid['score'].describe()

In [None]:
variant_pmid[['source']].value_counts()

In [None]:
variant_pmid.source.unique()
variant_pmid.count()

In [None]:
variant_pmid.head()

### compiled, all gene-disease

In [None]:
allgene_compiled_path = folder.joinpath('dropping', 'all_gene_disease_associations.tsv')
allgene_compiled = pd.read_table(allgene_compiled_path)

In [None]:
allgene_compiled['diseaseSemanticType'].value_counts()

In [None]:
allgene_compiled.shape
allgene_compiled.columns
# allgene_compiled.source.unique()
allgene_compiled['NofSnps'].describe()

In [None]:
allgene_compiled[(allgene_compiled['geneSymbol'] == 'APP') &
                (allgene_compiled['diseaseId'] == 'C0002395')].source.to_list()

In [None]:
allgene_compiled['DPI'].describe()

In [None]:
allgene_compiled[allgene_compiled['DSI'] == 0.231]

### compiled, all variant-disease

In [None]:
allvariant_path = folder.joinpath('dropping', 'all_variant_disease_associations.tsv')
allvariant = pd.read_table(allvariant_path)

In [None]:
allvariant['diseaseSemanticType'].value_counts()

In [None]:
allvariant.source.unique() ## TODO: this is bad, the inferred entries need to be pulled out
allvariant.count()

In [None]:
allvariant[(allvariant['snpId'] == 'rs75932628') &
                (allvariant['diseaseId'] == 'C0002395')].source.to_list()

In [None]:
allvariant['NofPmids'].describe()

### Other files

In [None]:
curated_path = folder.joinpath('DisGeNETv7_nonBeeFree', 'browser_source_summary_gda_ANIMAL_MODELS.tsv')


In [None]:
curated_data = pd.read_table(curated_path)

In [None]:
curated_data.columns
curated_data.shape
curated_data.count()  ## notice that the last 3 columns have some missing values...

In [None]:
curated_data.Original_DB.unique()

In [None]:
curated_data[curated_data['Original_DB']=='MGD'].Association_Type.unique()

In [None]:
curated_data['Association_Type'].unique()

In [None]:
curated_data.head(1)

In [None]:
curated_data[curated_data['Sentence'].isna()]

In [None]:
dd_curated_path = folder.joinpath('disease_to_disease_CURATED.tsv')
dd_curated = pd.read_table(dd_curated_path)

In [None]:
dd_curated.shape
dd_curated.columns

In [None]:
dd_curated[(dd_curated['diseaseId2'] == 'C0684249') &
           (dd_curated['diseaseId1'] == 'C186190')]

In [None]:
dd_curated.head()

In [None]:
dd_curated.source.unique()

## Playing with Danish data

In [None]:
Danish_path = pathlib.Path.home().joinpath('Downloads', '41467_2020_18682_MOESM1_ESM.tsv')

In [None]:
Danish = pd.read_table(Danish_path)

In [None]:
Danish.shape  ## matches the paper saying that there are "77,294 significant diagnosis pairs"
Danish.columns


In [None]:
Danish.head()

In [None]:
Danish.count()
Danish.D1.nunique()

In [None]:
Danish.death_counts.describe()
## huh, 0 is included as a value. Not sure why this column has more missing values.
## this is death within 5 years of diagnosis D2.

In [None]:
Danish.female_counts.describe()
## huh, so the min. count of females can be 0. There are more missing values too...not clear why.

In [None]:
1302/365.25

In [None]:
Danish.CODE_DIFF_DAYS.describe()

## huh. The cutoff was 5 years between diagnosis D1 and D2, but the max here is ~3.6 years
## maybe because it's an average? 
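
The days-to-years arithmetic behind the note above, written out. This assumes the 1302 in the earlier cell is the maximum of `CODE_DIFF_DAYS` (the cell output is not shown here) and uses the same 365.25 days per year. `days_to_years` is a hypothetical helper name.

```python
DAYS_PER_YEAR = 365.25  # same convention as the division in the earlier cell


def days_to_years(days):
    """Convert a duration in days to years."""
    return days / DAYS_PER_YEAR


max_diff_years = days_to_years(1302)  # assumed max of CODE_DIFF_DAYS
cutoff_days = 5 * DAYS_PER_YEAR       # the 5-year cutoff expressed in days

print(round(max_diff_years, 2))  # 3.56
print(cutoff_days)               # 1826.25
```

So the largest value sits roughly 524 days under the 5-year cutoff, which fits the guess that the column holds an average rather than individual gaps.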

In [None]:
Danish.counts.describe()
## notice min is 20 and # missing values matches AGE_AT column. So perhaps 
## the diff from 40711 to 39321 is because those 1390 pairs have <20 people diagnosed 
## with both D1 and D2

Danish[~ Danish.counts.isna()]['direction_yes_no'].unique()
## these values are only given for pairs with a significant direction

Danish.AGE_AT_DISEASE.describe()
## average age of diagnosis of D1, in days 
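
One way to test the missingness guess in the notes above is to compare null patterns column by column. The sketch uses a toy frame with invented values and the column names from this dataset; on the real data the same expressions would run on `Danish`.

```python
import pandas as pd

# Invented rows: three pairs with a significant direction, one of which has
# no counts (standing in for the "<20 people" case guessed at in the notes).
toy = pd.DataFrame({
    "direction_yes_no": [1, 1, 1, 0, 0],
    "counts": [25.0, 40.0, None, None, None],
    "AGE_AT_DISEASE": [12000.0, 15000.0, None, None, None],
})

# Are counts and AGE_AT_DISEASE missing in exactly the same rows?
same_missing = bool((toy["counts"].isna() == toy["AGE_AT_DISEASE"].isna()).all())

# How many pairs have a significant direction but no counts?
directed = toy["direction_yes_no"] == 1
directed_without_counts = int((directed & toy["counts"].isna()).sum())

# The gap quoted in the notes
gap = 40711 - 39321

print(same_missing, directed_without_counts, gap)  # True 1 1390
```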

In [None]:
## only has values if there was a significant direction / aka direction_yes_no == 1
Danish['p.value.direction'].describe()

In [None]:
Danish.direction_yes_no.value_counts()
## the authors first filtered on this. Not sure how this relates to the p.value.direction col...
## so 40,711 had a significant direction. This matches Fig. 1 in the paper

In [None]:
Danish.p5years.describe()  
## so all are less than 1.21e-9, which was their cutoff for significant disease pair
## so the filtering was already done

In [None]:
## notice that the relative risk varies a lot. 
## They used a cutoff of RR > 1. Why? What does this mean?

## they filter on this at the end to remove RR <1. 
Danish.RR.describe()
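
On the "what does RR > 1 mean" question: under the usual textbook definition, relative risk is the risk of the outcome in the exposed group divided by the risk in the unexposed group, so RR > 1 means the outcome (here D2) is more common after the exposure (here D1) than without it. The sketch below uses that generic definition with invented numbers; the paper may estimate RR differently, and `relative_risk` is a hypothetical helper.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Textbook relative risk: risk in exposed / risk in unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Invented example: 200 of 1000 people with D1 later get D2,
# versus 100 of 1000 people without D1.
rr = relative_risk(200, 1000, 100, 1000)
print(round(rr, 2))  # 2.0

# Same risk in both groups gives RR = 1 (no association in either direction).
rr_equal = relative_risk(100, 1000, 100, 1000)
print(round(rr_equal, 2))  # 1.0
```

Under this reading, dropping pairs with RR < 1 keeps only pairs where D1 is associated with a higher, not lower, chance of D2.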