# Map Metathesaurus relations to Semrep relations

## Semrep relations

Semrep relations are mostly based on the UMLS Semantic Network. They are:

**ADMINISTERED_TO**: Given to an entity, when no assertion is made that the substance or procedure is being given as treatment. 
	EXAMPLE:
		{Patients} with single brain lesion <received> an extra 3 Gy x 5 {radiotherapy} ...
	OUTPUT PREDICATE:
		C0034618: Radiation therapy (Therapeutic or Preventive Procedure)-ADMINISTERED_TO-C0030705: Patients (Human) 

**AFFECTS**: Produces a direct effect on. Implied here is the altering or influencing of an existing condition, state, situation, or entity. 
	
**ASSOCIATED_WITH**: Has a relationship to (gene-disease relation). 
	
**AUGMENTS**: Expands or stimulates a process.
	EXAMPLE:
		{Nicotine} <induces> {conditioned place preferences} over a large range of doses in rats.
	OUTPUT PREDICATE:
		C0028040: Nicotine (Organic Chemical)-AUGMENTS-C0815102: place preference learning (Mental Process) 

**CAUSES**: Brings about a condition or an effect. Implied here is that an agent, such as a pharmacologic substance or an organism, has brought about the effect.

		
**COEXISTS_WITH**: Occurs together with, or jointly. 

**CONVERTS_TO**: Changes from one form to another (both substances). 
		
**COMPLICATES**: Causes to become more severe or complex, or results in adverse effects. 
	EXAMPLE:
		{Infections} can trigger GBS and <exacerbate> {CIDP}.
	OUTPUT PREDICATE:
		C0021311: Infection (Disease or Syndrome)-COMPLICATES-C0393819: Polyradiculoneuropathy, Chronic Inflammatory Demyelinating (Disease or Syndrome)
		
**DIAGNOSES**: Distinguishes or identifies the nature or characteristics of. 
		
**DISRUPTS**: Alters or influences an already existing condition, state, or situation. Produces a negative effect on. 
		
**INHIBITS**: Decreases, limits, or blocks the action or function of (substance interaction).  

**INTERACTS_WITH**: Substance interaction. 

**ISA**: The basic hierarchical link in the UMLS Semantic Network. If one item isa another item then the first item is more specific in meaning than the second item. 

**LOCATION_OF**: The position, site, or region of an entity or the site of a process. 

**MANIFESTATION_OF**: That part of a phenomenon which is directly observable or concretely or visibly expressed, or which gives evidence to the underlying process. 
		
**METHOD_OF**: The manner and sequence of events in performing an act or procedure.

**OCCURS_IN**: Has incidence in a group or population. 
	EXAMPLE:
		{Older populations} are more <prone> to {bone loss} with weight loss ...
	OUTPUT PREDICATE:
		C0599877: loss; bone (Pathologic Function)-OCCURS_IN-C1518563: Older Population (Human)

**PART_OF**: Composes, with one or more other physical units, some larger whole. This includes component of, division of, portion of, fragment of, section of, and layer of. 

**PRECEDES**: Occurs earlier in time. This includes antedates, comes before, is in advance of, predates, and is prior to.

**PREDISPOSES**: To be a risk to a disorder, pathology, or condition. 
	EXAMPLE:
		... high {ghrelin} levels <contribute to> {obesity} in Prader-Willi syndrome (PWS) ...
	OUTPUT PREDICATE:
		C0911014: ghrelin (Amino Acid, Peptide, or Protein)-PREDISPOSES-C0028754: Obesity (Disease or Syndrome)

**PREVENTS**: Stops, hinders or eliminates an action or condition. 

**PROCESS_OF**: Disorder occurs in (higher) organism. 

**PRODUCES**: Brings forth, generates or creates. This includes yields, secretes, emits, biosynthesizes, generates, releases, discharges, and creates.

**STIMULATES**: Increases or facilitates the action or function of (substance interaction). 

**TREATS**: Applies a remedy with the object of effecting a cure or managing a condition. 

**USES**: Employs in the carrying out of some activity. This includes applies, utilizes, employs, and avails.

**COMPARED_WITH**: Comparative predicate.

**HIGHER_THAN**: Comparative predicate.

**LOWER_THAN**: Comparative predicate.

**SAME_AS**: Comparative predicate.
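
The OUTPUT PREDICATE lines in the examples above share one textual shape: `CUI: name (semantic type)-PREDICATE-CUI: name (semantic type)`. As a minimal sketch, assuming that shape holds for every line, such a string can be split into its parts with a regular expression. The `Predication` type, the pattern, and `parse_predication` are hypothetical helpers written for this illustration; they are not part of Semrep.

```python
import re
from typing import NamedTuple, Optional


class Predication(NamedTuple):
    """One parsed OUTPUT PREDICATE line (hypothetical container)."""
    subject_cui: str
    subject_name: str
    subject_type: str
    predicate: str
    object_cui: str
    object_name: str
    object_type: str


# Assumed shape: "CUI: name (semantic type)-PREDICATE-CUI: name (semantic type)".
# Optional whitespace before the predicate is tolerated.
_PATTERN = re.compile(
    r"^(C\d{7}): (.+) \(([^()]+)\)-\s*([A-Z_]+)-(C\d{7}): (.+) \(([^()]+)\)$"
)


def parse_predication(line: str) -> Optional[Predication]:
    """Return the parsed predication, or None if the line does not fit the shape."""
    match = _PATTERN.match(line.strip())
    return Predication(*match.groups()) if match else None


# The AUGMENTS example from the list above.
example = (
    "C0028040: Nicotine (Organic Chemical)-AUGMENTS-"
    "C0815102: place preference learning (Mental Process)"
)
parsed = parse_predication(example)
print(parsed.subject_cui, parsed.predicate, parsed.object_cui)
```

Reducing each predication to `(subject_cui, predicate, object_cui)` gives the same triple form that the Metathesaurus relations are mapped to later in this notebook.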
    

    
## UMLS Metathesaurus relations
    
 
For the five semantic types we are interested in (Disease, Gene, PharmaSub, Chemical, and Symptom), the UMLS Metathesaurus provides the following relations between pairs of types:

- chemical & pharmaSub: use, mapped_to, isa, mapped_from, has_structural_class
- disease & pharmaSub: contraindicated_with_disease, may_treat, causative_agent_of, related_to, may_prevent, associated_with, contraindicated_class_of
- gene & pharmaSub: isa, gene_product_encoded_by_gene, mapped_to, has_parent, related_to_genetic_biomarker, has_target, chemical_or_drug_metabolism_is_associated_with_allele, has_member, phenotype_of
- symptom & pharmaSub: contraindicated_with_disease, may_treat, causative_agent_of, may_prevent, induces, may_diagnose, related_to, use
- symptom & disease: isa, inverse_isa, mapped_from, related_to, has_manifestation, mapped_to, has_definitional_manifestation, disease_may_have_finding, classifies, clinically_similar, classified_as, associated_morphology_of, see, has_nichd_parent, nichd_parent_of, disease_may_have_associated_disease, possibly_equivalent_to, pathological_process_of, same_as, has_associated_finding, use, disease_has_finding, associated_with, has_associated_morphology, inverse_was_a, was_a, used_for, disease_excludes_finding, has_cdrh_parent, see_from, occurs_after, replaced_by, alias_of, interprets, definitional_manifestation_of, alternative_of, cause_of, consider, refers_to, due_to, has_causative_agent, temporally_followed_by, occurs_before, replaces, associated_finding_of
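
Many relation names recur across the type pairs. The sketch below shows one way to hold the lists per pair and derive the flat set of distinct names from them. The dictionary name `relations_by_pair` and its key format are assumptions made for this illustration, and the long symptom and disease list is left out only to keep the example short.

```python
# Relation names per semantic-type pair, copied from the list above.
# The symptom & disease pair is omitted here purely for brevity.
relations_by_pair = {
    ("chemical", "pharmaSub"): {
        "use", "mapped_to", "isa", "mapped_from", "has_structural_class",
    },
    ("disease", "pharmaSub"): {
        "contraindicated_with_disease", "may_treat", "causative_agent_of",
        "related_to", "may_prevent", "associated_with",
        "contraindicated_class_of",
    },
    ("gene", "pharmaSub"): {
        "isa", "gene_product_encoded_by_gene", "mapped_to", "has_parent",
        "related_to_genetic_biomarker", "has_target",
        "chemical_or_drug_metabolism_is_associated_with_allele",
        "has_member", "phenotype_of",
    },
    ("symptom", "pharmaSub"): {
        "contraindicated_with_disease", "may_treat", "causative_agent_of",
        "may_prevent", "induces", "may_diagnose", "related_to", "use",
    },
}

# Distinct relation names across the four pairs.
all_relations = set().union(*relations_by_pair.values())

# Names shared by the disease and symptom pairs with pharmaSub.
shared = (
    relations_by_pair[("disease", "pharmaSub")]
    & relations_by_pair[("symptom", "pharmaSub")]
)
print(len(all_relations), sorted(shared))
```

Keeping the pair as the key makes it easy to see which relation names need a mapping only because of one particular type pair.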


In [1]:
import os
import json
from collections import namedtuple

## Mapping between Metathesaurus relations and Semrep Predicates
    


In [2]:
relations = set(["use", "mapped_to", "isa", "mapped_from", "has_structural_class", 
                     "contraindicated_with_disease", "may_treat", "causative_agent_of", 
                     "related_to", "may_prevent", "associated_with", "isa", "gene_product_encoded_by_gene", 
                     "mapped_to", "has_parent", "related_to_genetic_biomarker", "has_target", 
                     "chemical_or_drug_metabolism_is_associated_with_allele", "has_member", "phenotype_of",
                     "contraindicated_with_disease", "may_treat", "causative_agent_of", "may_prevent",
                     "induces", "may_diagnose", "related_to", "use", "isa", "inverse_isa", "mapped_from", 
                     "related_to", "has_manifestation", "mapped_to", "has_definitional_manifestation", 
                     "disease_may_have_finding", "classifies", "clinically_similar", "classified_as", 
                     "associated_morphology_of", "see", "has_nichd_parent", "nichd_parent_of", 
                     "disease_may_have_associated_disease", "possibly_equivalent_to", "pathological_process_of", 
                     "same_as", "has_associated_finding", "use", "disease_has_finding", "associated_with", 
                     "has_associated_morphology", "inverse_was_a", "was_a", "used_for", 
                     "disease_excludes_finding", "has_cdrh_parent", "see_from", "occurs_after", "replaced_by", 
                     "alias_of", "interprets", "definitional_manifestation_of", "alternative_of", "cause_of",
                     "consider", "refers_to", "due_to", "has_causative_agent", "temporally_followed_by", 
                     "occurs_before", "replaces", "associated_finding_of","contraindicated_class_of"])

In [3]:

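# rela: the target Semrep predicate.
# PASS: True when the UMLS relation reads in the opposite direction, so subject and object are swapped.
SEMREP_RELA = namedtuple("SEMREP_RELA", ['rela', 'PASS'])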

In [10]:
mapping = {'alias_of': SEMREP_RELA('SAME_AS', False), 
             'alternative_of': SEMREP_RELA('SAME_AS', False), 
             'associated_finding_of': SEMREP_RELA('ASSOCIATED_WITH', False),
             'associated_morphology_of': SEMREP_RELA('ASSOCIATED_WITH', False),
             'associated_with': SEMREP_RELA('ASSOCIATED_WITH', False),
             'causative_agent_of': SEMREP_RELA('CAUSES', False),
             'cause_of': SEMREP_RELA('CAUSES', False),
             'chemical_or_drug_metabolism_is_associated_with_allele': SEMREP_RELA('ASSOCIATED_WITH', False),
             'classified_as': SEMREP_RELA('ISA', False),
             'classifies': SEMREP_RELA('ISA', True),
             'clinically_similar': SEMREP_RELA('COMPARED_WITH', False),
             'consider': SEMREP_RELA('OCCURS_IN', False),
             'contraindicated_with_disease': SEMREP_RELA('COMPLICATES', False),
             'contraindicated_class_of': SEMREP_RELA('COMPLICATES', False),
             'definitional_manifestation_of': SEMREP_RELA('MANIFESTATION_OF', False),
             'disease_excludes_finding': SEMREP_RELA('NEG_MANIFESTATION_OF', True),
             'disease_has_finding': SEMREP_RELA('MANIFESTATION_OF', True),
             'disease_may_have_associated_disease': SEMREP_RELA('ASSOCIATED_WITH', False),
             'disease_may_have_finding': SEMREP_RELA('MANIFESTATION_OF', True),
             'due_to': SEMREP_RELA('CAUSES', True),
             'gene_product_encoded_by_gene': SEMREP_RELA('USES', True),
             'gene_product_malfunction_associated_with_disease': SEMREP_RELA('ASSOCIATED_WITH', False),
             'has_associated_finding': SEMREP_RELA('ASSOCIATED_WITH', False),
             'has_associated_morphology': SEMREP_RELA('ASSOCIATED_WITH', False),
             'has_causative_agent': SEMREP_RELA('CAUSES', True),
             'has_cdrh_parent': SEMREP_RELA('ISA', False),
             'has_definitional_manifestation': SEMREP_RELA('MANIFESTATION_OF', True),
             'has_manifestation': SEMREP_RELA('MANIFESTATION_OF', True),
             'has_member': SEMREP_RELA('PART_OF', True),
             'has_nichd_parent': SEMREP_RELA('ISA', False),
             'has_parent': SEMREP_RELA('ISA', False),
             'has_structural_class': SEMREP_RELA('ISA', False),
             'has_target': SEMREP_RELA('INTERACTS_WITH', False),
             'induces': SEMREP_RELA('CAUSES', False),
             'interprets': SEMREP_RELA('CAUSES', False),
             'inverse_isa': SEMREP_RELA('ISA', True),
             'inverse_was_a': SEMREP_RELA('ISA', True),
             'isa': SEMREP_RELA('ISA', False),
             'mapped_from': SEMREP_RELA('PRODUCES', True),
             'mapped_to': SEMREP_RELA('PRODUCES', False),
             'may_diagnose': SEMREP_RELA('DIAGNOSES', False),
             'may_prevent': SEMREP_RELA('PREVENTS', False),
             'may_treat': SEMREP_RELA('TREATS', False),
             'nichd_parent_of': SEMREP_RELA('ISA', True),
             'occurs_after': SEMREP_RELA('PRECEDES', True),
             'occurs_before': SEMREP_RELA('PRECEDES', False),
             'pathological_process_of': SEMREP_RELA('PROCESS_OF', False),
             'phenotype_of': SEMREP_RELA('MANIFESTATION_OF', False),
             'possibly_equivalent_to': SEMREP_RELA('SAME_AS', False),
             'refers_to': SEMREP_RELA('ISA', False),
             'related_to': SEMREP_RELA('ASSOCIATED_WITH', False),
             'related_to_genetic_biomarker': SEMREP_RELA('ASSOCIATED_WITH', False),
             'replaced_by': SEMREP_RELA('COMPARED_WITH', False),
             'replaces': SEMREP_RELA('COMPARED_WITH', False),
             'same_as': SEMREP_RELA('SAME_AS', False),
             'see': SEMREP_RELA('ISA', False),
             'see_from': SEMREP_RELA('ISA', False),
             'temporally_followed_by': SEMREP_RELA('PRECEDES', False),
             'use': SEMREP_RELA('USES', False),
             'used_for': SEMREP_RELA('USES', True),
             'used_by': SEMREP_RELA('USES', True),
             'was_a': SEMREP_RELA('ISA', False),
             'finding_site_of': SEMREP_RELA('LOCATION_OF', False)
            }
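
With this many hand-written entries, a misspelled predicate is easy to miss. The sketch below shows one possible sanity check: it compares target predicates against the names listed in the Semrep relations section. The function `unknown_predicates` is a hypothetical helper, the toy input is made up for the illustration, and accepting a `NEG_` prefix for negated predicates is an assumption.

```python
# Predicate names taken from the "Semrep relations" section above.
SEMREP_PREDICATES = {
    "ADMINISTERED_TO", "AFFECTS", "ASSOCIATED_WITH", "AUGMENTS", "CAUSES",
    "COEXISTS_WITH", "CONVERTS_TO", "COMPLICATES", "DIAGNOSES", "DISRUPTS",
    "INHIBITS", "INTERACTS_WITH", "ISA", "LOCATION_OF", "MANIFESTATION_OF",
    "METHOD_OF", "OCCURS_IN", "PART_OF", "PRECEDES", "PREDISPOSES",
    "PREVENTS", "PROCESS_OF", "PRODUCES", "STIMULATES", "TREATS", "USES",
    "COMPARED_WITH", "HIGHER_THAN", "LOWER_THAN", "SAME_AS",
}


def unknown_predicates(targets):
    """Return {relation name: predicate} for predicates missing from the list.

    `targets` maps a UMLS relation name to a predicate string.
    A leading "NEG_" is stripped before the lookup (assumption).
    """
    unknown = {}
    for name, predicate in targets.items():
        base = predicate[4:] if predicate.startswith("NEG_") else predicate
        if base not in SEMREP_PREDICATES:
            unknown[name] = predicate
    return unknown


# Toy input: the last entry is a deliberately misspelled, made-up example.
toy_targets = {
    "may_treat": "TREATS",
    "disease_excludes_finding": "NEG_MANIFESTATION_OF",
    "hypothetical_relation": "TREAT",
}
print(unknown_predicates(toy_targets))
```

For the dictionary in this notebook, the input could be built as `{name: entry.rela for name, entry in mapping.items()}`.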
             


In [11]:
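def umls2semrep(umls_rela):
    """Convert a UMLS triple (cui1, relation, cui2) into a Semrep triple (subject, predicate, object)."""
    semrep_rela = mapping[umls_rela[1]]  # raises KeyError for an unmapped relation name
    if semrep_rela.PASS:
        # Inverse relation: swap the two concepts.
        return (umls_rela[2], semrep_rela.rela, umls_rela[0])
    else:
        return (umls_rela[0], semrep_rela.rela, umls_rela[2])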

In [7]:
umls2semrep(('C0149931', 'has_nichd_parent', 'C0018681'))

('C0149931', 'ISA', 'C0018681')
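
The call above exercises only the non-inverted branch. As a small self-contained sketch of the other branch, the code below repeats the function with two entries copied from the mapping and applies it to a relation whose `PASS` flag is `True`. `DISEASE_CUI` and `FINDING_CUI` are placeholders, not real concept identifiers, and the `mapping` default argument is added here only so the example runs on its own.

```python
from collections import namedtuple

SEMREP_RELA = namedtuple("SEMREP_RELA", ["rela", "PASS"])

# Two entries copied from the mapping above.
mini_mapping = {
    "has_nichd_parent": SEMREP_RELA("ISA", False),
    "has_manifestation": SEMREP_RELA("MANIFESTATION_OF", True),
}


def umls2semrep(umls_rela, mapping=mini_mapping):
    """Map (cui1, relation, cui2) to (subject, predicate, object)."""
    head, name, tail = umls_rela
    semrep_rela = mapping[name]
    if semrep_rela.PASS:
        # Inverse relation: the second concept becomes the subject.
        return (tail, semrep_rela.rela, head)
    return (head, semrep_rela.rela, tail)


# Placeholder identifiers, used only to make the swap visible.
swapped = umls2semrep(("DISEASE_CUI", "has_manifestation", "FINDING_CUI"))
print(swapped)  # → ('FINDING_CUI', 'MANIFESTATION_OF', 'DISEASE_CUI')
```

Read as a sentence, "disease has_manifestation finding" becomes "finding MANIFESTATION_OF disease", which is why the two concepts trade places.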

## Read relations from UMLS Metathesaurus

In [8]:
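import csv

input_folder = "output_relations/umls"
# Note: this rebinds `relations` from the set of relation names above
# to a set of (cui1, relation, cui2) triples read from the CSV files.
relations = set()
for path in os.listdir(input_folder):
    if path.endswith('.csv'):
        with open(os.path.join(input_folder, path), encoding='utf-8', newline='') as file:
            for row in csv.reader(file):
                if row:  # skip blank lines
                    relations.add(tuple(field.strip() for field in row))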

In [12]:
# Map every UMLS triple to a Semrep triple; the set removes duplicates.
mapped_relations = {umls2semrep(relation) for relation in relations}
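
A relation name that is missing from the mapping stops this cell with a `KeyError`. One possible alternative, sketched below under the assumption that unmapped rows should be skipped and reported rather than abort the run, collects the unknown names while mapping the rest. `map_relations` is a hypothetical helper, and the single-letter concept names are placeholders.

```python
from collections import Counter


def map_relations(triples, lookup):
    """Map UMLS triples to Semrep triples, skipping unknown relation names.

    `lookup` maps a relation name to a (predicate, swap) pair; the
    SEMREP_RELA namedtuples used in this notebook unpack the same way.
    Returns (set of mapped triples, Counter of skipped relation names).
    """
    mapped, skipped = set(), Counter()
    for head, name, tail in triples:
        if name not in lookup:
            skipped[name] += 1
            continue
        predicate, swap = lookup[name]
        mapped.add((tail, predicate, head) if swap else (head, predicate, tail))
    return mapped, skipped


# Toy data: "A" and "B" are placeholders, "hypothetical_relation" is made up.
toy_lookup = {"may_treat": ("TREATS", False), "due_to": ("CAUSES", True)}
toy_triples = [
    ("A", "may_treat", "B"),
    ("A", "due_to", "B"),
    ("A", "hypothetical_relation", "B"),
]
toy_mapped, toy_skipped = map_relations(toy_triples, toy_lookup)
print(sorted(toy_mapped), dict(toy_skipped))
```

The counter of skipped names doubles as a to-do list of relations that still need an entry in the mapping.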
    

In [17]:
mapped_relations

{('C1273036', 'COMPLICATES', 'C0040264'),
 ('C0002335', 'TREATS', 'C0243000'),
 ('C0040822', 'MANIFESTATION_OF', 'C4747621'),
 ('C0063076', 'TREATS', 'C0036271'),
 ('C0010137', 'CAUSES', 'C0571604'),
 ('C0013227', 'CAUSES', 'C0338717'),
 ('C0282213', 'TREATS', 'C0034734'),
 ('C0013227', 'CAUSES', 'C0570742'),
 ('C0009806', 'MANIFESTATION_OF', 'C1855849'),
 ('C0029995', 'TREATS', 'C0013911'),
 ('C0682880', 'CAUSES', 'C0393717'),
 ('C0009053', 'TREATS', 'C0037299'),
 ('C0002502', 'COMPLICATES', 'C0003460'),
 ('C0025624', 'TREATS', 'C0022104'),
 ('C1834708', 'PRODUCES', 'C0026821'),
 ('C0013404', 'MANIFESTATION_OF', 'C1970456'),
 ('C0018318', 'COMPLICATES', 'C0018801'),
 ('C0022177', 'TREATS', 'C0004096'),
 ('C0234132', 'MANIFESTATION_OF', 'C4479236'),
 ('C0178601', 'CAUSES', 'C0570734'),
 ('C0012547', 'TREATS', 'C0012546'),
 ('C0028053', 'CAUSES', 'C0572001'),
 ('C2697942', 'TREATS', 'C0028754'),
 ('C0282040', 'PREVENTS', 'C0019360'),
 ('C0236048', 'ISA', 'C0587047'),
 ('C0040485', 'MANI... [output truncated]

## Output relations as CSV

In [15]:
import pandas as pd

In [21]:
df = pd.DataFrame(list(mapped_relations), columns=['subject', 'predicate', 'object'])
df.to_csv("output_relations/solidated_umls", index=False)
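
Before relying on the exported file, it can help to look at how the triples spread across predicates. The sketch below counts predicates with the standard library, using five triples copied from the output shown earlier as sample data. The choice of this particular check is an assumption, not a step taken in the original notebook.

```python
from collections import Counter

# Five triples copied from the mapped_relations output shown earlier.
sample = [
    ("C1273036", "COMPLICATES", "C0040264"),
    ("C0002335", "TREATS", "C0243000"),
    ("C0063076", "TREATS", "C0036271"),
    ("C0010137", "CAUSES", "C0571604"),
    ("C0236048", "ISA", "C0587047"),
]

# Count how often each predicate occurs.
predicate_counts = Counter(predicate for _, predicate, _ in sample)
for predicate, count in predicate_counts.most_common():
    print(predicate, count)
```

Running the same count over the full `mapped_relations` set would show at a glance whether one predicate, such as `ASSOCIATED_WITH`, dominates the output.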