# Map Medical Subject Headings (MeSH) to UMLS

This example demonstrates the typical workflow for populating a MESH2UMLS database
table, which relates every MeSH term in the input database to its associated UMLS concepts.

The following backends are supported for storing the results:
* MySQL
* SQLite

### Set Up

In [1]:
using SQLite
using MySQL
using BioMedQuery.DBUtils
using BioMedQuery.Processes
using BioServices.UMLS

Credentials are read from environment variables (e.g., set in your `.juliarc.jl`).

In [2]:
umls_user = ENV["UMLS_USER"];
umls_pswd = ENV["UMLS_PSSWD"];

results_dir = ".";
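If a credential variable is missing, the `ENV[...]` lookup fails with a bare `KeyError`. The sketch below shows one way to get a clearer message. `require_env` is a hypothetical helper written for this illustration, not part of BioMedQuery or BioServices, and the `demo_env` dictionary only stands in for the real environment.

```julia
# Hypothetical helper (not provided by BioMedQuery): look up a required
# variable and fail with a descriptive message if it is absent.
function require_env(key::AbstractString, env=ENV)
    haskey(env, key) || error("Environment variable $key is not set")
    return env[key]
end

# Demonstration with a stand-in dictionary. In the notebook you would call
# require_env("UMLS_USER") and require_env("UMLS_PSSWD") against the real ENV.
demo_env = Dict("UMLS_USER" => "alice")
demo_user = require_env("UMLS_USER", demo_env)
```

Passing the environment as an optional argument keeps the helper easy to try without touching real credentials.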

### Using MySQL as a backend

*Note: this example reuses the MySQL DB from the PubMed Search and Save example.*

Create MySQL DB connection

In [3]:
host = "127.0.0.1";
mysql_usr = "root";
mysql_pswd = "";
dbname = "pubmed_obesity_2010_2012";

db_mysql = MySQL.connect(host, mysql_usr, mysql_pswd, db = dbname);

Map MeSH to UMLS

In [4]:
@time map_mesh_to_umls_async!(db_mysql, umls_user, umls_pswd; append_results=false, timeout=3);

----------Matching MESH to UMLS-----------
String["Adult", "Aged", "Aged, 80 and over", "Analysis of Variance", "Body Weight", "C-Reactive Protein", "Child", "Cross-Sectional Studies", "Fatigue", "Female", "Fibromyalgia", "Germany", "Health Status", "Humans", "Japan", "Male", "Middle Aged", "Nutrition Surveys", "Obesity", "Pain", "Pain Measurement", "Physical Fitness", "Prognosis", "Quality of Life", "Surveys and Questionnaires", "Reference Values", "Risk Factors", "ROC Curve", "Severity of Illness Index", "Sports", "Television", "Thyrotropin", "Biomarkers", "Weight Gain", "Exercise", "Body Mass Index", "Incidence", "Prevalence", "Logistic Models", "Odds Ratio", "Case-Control Studies", "Age Distribution", "Sex Distribution", "Sleep Apnea, Obstructive", "Metabolic Syndrome", "Overweight", "Waist Circumference", "Young Adult", "Obesity, Abdominal", "Republic of Korea", "Sedentary Lifestyle", "Pediatric Obesity"]
INFO: UTS: Requesting new TGT
INFO: Descriptor 8 out of 52: Cross-Sectional 

#### Explore the output table

In [5]:
db_query(db_mysql, "SELECT * FROM mesh2umls")

| | mesh | umls |
|---|---|---|
| 1 | Adult | Age Group |
| 2 | Age Distribution | Quantitative Concept |
| 3 | Aged | Organism Attribute |
| 4 | Aged, 80 and over | Age Group |
| 5 | Analysis of Variance | Quantitative Concept |
| 6 | Biomarkers | Clinical Attribute |
| 7 | Body Mass Index | Diagnostic Procedure |
| 8 | Body Weight | Organism Attribute |
| 9 | C-Reactive Protein | Amino Acid, Peptide, or Protein |
| 10 | C-Reactive Protein | Immunologic Factor |
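The table is one-to-many: a single MeSH term can appear on several rows, as "C-Reactive Protein" does above. As a minimal sketch of how to collect those rows per term, the snippet below copies the ten rows shown into a plain vector (an assumption made to keep the example self-contained; in the notebook they would come from the `db_query` result) and groups them in a dictionary.

```julia
# The ten (mesh, umls) rows shown above, copied by hand for illustration.
rows = [
    ("Adult", "Age Group"),
    ("Age Distribution", "Quantitative Concept"),
    ("Aged", "Organism Attribute"),
    ("Aged, 80 and over", "Age Group"),
    ("Analysis of Variance", "Quantitative Concept"),
    ("Biomarkers", "Clinical Attribute"),
    ("Body Mass Index", "Diagnostic Procedure"),
    ("Body Weight", "Organism Attribute"),
    ("C-Reactive Protein", "Amino Acid, Peptide, or Protein"),
    ("C-Reactive Protein", "Immunologic Factor"),
]

# Group the UMLS entries by MeSH term, preserving row order within each term.
by_mesh = Dict{String,Vector{String}}()
for (mesh, umls) in rows
    push!(get!(by_mesh, mesh, String[]), umls)
end

# Terms that map to more than one UMLS entry.
multi = [mesh for (mesh, entries) in by_mesh if length(entries) > 1]
```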


### Using SQLite as a backend

*Note: this example reuses the SQLite DB from the PubMed Search and Save example.*

Create SQLite DB connection

In [6]:
db_path = "$(results_dir)/pubmed_obesity_2010_2012.db";
db_sqlite = SQLite.DB(db_path);
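Because `SQLite.DB` creates a new, empty database file when the path does not exist, a mistyped path can go unnoticed until the mapping step finds no MeSH terms. The sketch below shows an optional check to run before opening the connection. It is a suggestion rather than part of the original workflow, and it uses only Julia's standard `joinpath` and `isfile`.

```julia
results_dir = "."
db_path = joinpath(results_dir, "pubmed_obesity_2010_2012.db")

# Check for the file before connecting, since opening a missing path
# would silently create an empty database.
db_exists = isfile(db_path)
db_exists || println("No database file at $db_path; run the PubMed Search and Save example first.")
```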

Map MeSH to UMLS

In [7]:
@time map_mesh_to_umls_async!(db_sqlite, umls_user, umls_pswd; append_results=false, timeout=3);

----------Matching MESH to UMLS-----------
Union{Missings.Missing, String}["Reference Values", "Republic of Korea", "ROC Curve", "Fatigue", "Obesity", "Risk Factors", "Logistic Models", "Severity of Illness Index", "Male", "Case-Control Studies", "Analysis of Variance", "Sedentary Lifestyle", "Prevalence", "Quality of Life", "Odds Ratio", "Exercise", "Body Mass Index", "Aged", "Child", "Sex Distribution", "Adult", "Germany", "Sports", "Thyrotropin", "Pediatric Obesity", "Humans", "Japan", "Cross-Sectional Studies", "Weight Gain", "Middle Aged", "Surveys and Questionnaires", "Health Status", "Young Adult", "Incidence", "Prognosis", "Body Weight", "Pain Measurement", "Waist Circumference", "Metabolic Syndrome", "Pain", "Nutrition Surveys", "Fibromyalgia", "Sleep Apnea, Obstructive", "Television", "Age Distribution", "Overweight", "Physical Fitness", "Female", "Biomarkers", "Obesity, Abdominal", "C-Reactive Protein", "Aged, 80 and over"]
INFO: UTS: Reading TGT from file
INFO: Descriptor 1

#### Explore the output table

In [8]:
db_query(db_sqlite, "SELECT * FROM mesh2umls;")

| | mesh | umls |
|---|---|---|
| 1 | Sedentary Lifestyle | Finding |
| 2 | Analysis of Variance | Quantitative Concept |
| 3 | Germany | Geographic Area |
| 4 | Weight Gain | Finding |
| 5 | Child | Age Group |
| 6 | Case-Control Studies | Research Activity |
| 7 | Logistic Models | Intellectual Product |
| 8 | Logistic Models | Quantitative Concept |
| 9 | Pediatric Obesity | Disease or Syndrome |
| 10 | Body Weight | Organism Attribute |


*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*