# BioMedQuery.UMLS


Utilities to search the Unified Medical Language System (UMLS). This module is a Julia interface to the UMLS [REST API](https://documentation.uts.nlm.nih.gov/rest/home.html).

Searching the UMLS requires approved credentials. You can sign up [here](https://uts.nlm.nih.gov//license.html).

The following utilities are currently available:

* verify credentials / issue UMLS tickets
* search_umls
* get the best matching CUI from a query
* get the semantic type of a CUI

## 1. Set Up

In [56]:
using BioMedQuery.UMLS
user = ENV["UMLS_USER"];
psswd = ENV["UMLS_PSSWD"];
credentials = Credentials(user, psswd)
query = Dict("string"=>"asthma", "searchType"=>"exact")

Dict{String,String} with 2 entries:
  "string"     => "asthma"
  "searchType" => "exact"

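The setup cell reads the credentials from the environment variables `UMLS_USER` and `UMLS_PSSWD`; indexing `ENV` directly throws a `KeyError` when a variable is missing. A minimal sketch of a more explicit check is shown below. `require_env` is a hypothetical helper, not part of BioMedQuery, and its `env` argument exists only so that the function can be exercised with a plain `Dict`.

```julia
# Hypothetical helper (not part of BioMedQuery): fetch a required setting
# and fail with a readable message when it is missing or empty.
function require_env(name, env=ENV)
    value = get(env, name, "")
    if isempty(value)
        error("Environment variable $name is not set")
    end
    return value
end

# Exercise the helper with a plain Dict standing in for ENV.
fake_env = Dict("UMLS_USER" => "someone")
user = require_env("UMLS_USER", fake_env)
println(user)
```

In a real session the call would simply be `require_env("UMLS_USER")`, which falls back to the process environment.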
## 2. Get a ticket and submit a query

In [57]:
tgt = get_tgt(credentials)
all_results = search_umls(tgt, query)

Requesting new TGT


1-element Array{Any,1}:
 Dict{String,Any}(Pair{String,Any}("pageSize",25),Pair{String,Any}("pageNumber",1),Pair{String,Any}("result",Dict{String,Any}(Pair{String,Any}("classType","searchResults"),Pair{String,Any}("results",Any[Dict{String,Any}(Pair{String,Any}("name","Asthma"),Pair{String,Any}("uri","https://uts-ws.nlm.nih.gov/rest/content/2016AB/CUI/C0004096"),Pair{String,Any}("ui","C0004096"),Pair{String,Any}("rootSource","MTH")),Dict{String,Any}(Pair{String,Any}("name","Asthma Pathway"),Pair{String,Any}("uri","https://uts-ws.nlm.nih.gov/rest/content/2016AB/CUI/C2984299"),Pair{String,Any}("ui","C2984299"),Pair{String,Any}("rootSource","MTH"))]))))
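The result printed above is an array of pages, each holding a nested `"result"` dictionary whose `"results"` entry lists the matched concepts. The sketch below walks that structure and collects the `"ui"` and `"name"` of every match. It runs on a hand-built copy of the output shown above rather than on a live query, and `list_concepts` is a hypothetical helper, not a BioMedQuery function.

```julia
# Hand-built stand-in for the search output, copied from the result printed above.
sample_page = Dict{String,Any}(
    "pageSize" => 25,
    "pageNumber" => 1,
    "result" => Dict{String,Any}(
        "classType" => "searchResults",
        "results" => Any[
            Dict{String,Any}(
                "name" => "Asthma",
                "uri" => "https://uts-ws.nlm.nih.gov/rest/content/2016AB/CUI/C0004096",
                "ui" => "C0004096",
                "rootSource" => "MTH"),
            Dict{String,Any}(
                "name" => "Asthma Pathway",
                "uri" => "https://uts-ws.nlm.nih.gov/rest/content/2016AB/CUI/C2984299",
                "ui" => "C2984299",
                "rootSource" => "MTH")
        ]
    )
)
sample_results = Any[sample_page]

# Hypothetical helper: flatten all pages into (ui, name) pairs.
function list_concepts(pages)
    concepts = Tuple{String,String}[]
    for page in pages
        for match in page["result"]["results"]
            push!(concepts, (match["ui"], match["name"]))
        end
    end
    return concepts
end

concepts = list_concepts(sample_results)
for (ui, name) in concepts
    println(ui, "  ", name)
end
```

With the two matches above, the loop prints one line per concept, starting with `C0004096  Asthma`.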

## 3. Get the best matching CUI and its semantic type

In [58]:
cui = best_match_cui(all_results)
display(cui)
sm = get_semantic_type(tgt, cui)
display(sm)

"C0004096"

1-element Array{String,1}:
 "Disease or Syndrome"

### Processes available for UMLS

a. Get all UMLS semantic types for all MeSH descriptors stored in a database corresponding to the results of an Entrez query

b. For all articles in the 'results' database, filter all MeSH descriptors associated with a specific semantic type

In [59]:
using BioMedQuery.Processes
using MySQL

db_host = "127.0.0.1"
mysql_usr = "root"
mysql_pswd = "bcbi123"
dbname = "biomed_query_test"

db = mysql_connect(db_host, mysql_usr, mysql_pswd, dbname)

map_mesh_to_umls_async!(db, credentials)


----------Matching MESH to UMLS-----------
Requesting new TGT
.........................................................................................
--------------------------------------------------


([4.27517,4.27482,2.95808,4.22348,4.2667,4.27372,4.26674,4.26474,3.82469,4.29467  …  1.54334,1.60487,1.54553,1.54278,1.60451,1.40106,1.54265,1.69459,1.49306,1.54475],[200.0,200.0,200.0,200.0,200.0,200.0,200.0,200.0,200.0,200.0  …  200.0,200.0,200.0,200.0,200.0,200.0,200.0,200.0,200.0,200.0])

In [60]:
tables = mysql_execute(db, "show tables;")
display(tables)

|   | Tables_in_biomed_query_test |
|---|---|
| 1 | article |
| 2 | author |
| 3 | author2article |
| 4 | mesh2umls |
| 5 | mesh_descriptor |
| 6 | mesh_heading |
| 7 | mesh_qualifier |


In [61]:
mesh2umls = mysql_execute(db, "select * from mesh2umls")
display(mesh2umls)

|   | mesh | umls |
|---|---|---|
| 1 | adaptation, psychological | Individual Behavior |
| 2 | adolescent | Age Group |
| 3 | adult | Age Group |
| 4 | aged | Organism Attribute |
| 5 | agriculture | Occupation or Discipline |
| 6 | agrochemicals | Chemical Viewed Functionally |
| 7 | allergens | Immunologic Factor |
| 8 | animals | Animal |
| 9 | antigens, dermatophagoides | Amino Acid, Peptide, or Protein |
| 10 | antigens, dermatophagoides | Immunologic Factor |
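In the `mesh2umls` rows shown above, one MeSH descriptor can map to more than one semantic type ("antigens, dermatophagoides" appears twice). The sketch below groups such rows into one list of types per descriptor. It works on a few rows copied by hand from the table, not on a live database query, and the variable names are illustrative only.

```julia
# A few (mesh, umls) rows copied from the table above.
rows = [
    ("allergens", "Immunologic Factor"),
    ("animals", "Animal"),
    ("antigens, dermatophagoides", "Amino Acid, Peptide, or Protein"),
    ("antigens, dermatophagoides", "Immunologic Factor")
]

# Group the semantic types by MeSH descriptor.
types_by_mesh = Dict{String,Vector{String}}()
for (mesh, umls) in rows
    # get! returns the existing list, or stores and returns a new empty one.
    push!(get!(types_by_mesh, mesh, String[]), umls)
end

# Print the groups in alphabetical order of the descriptor.
for mesh in sort(collect(keys(types_by_mesh)))
    println(mesh, " => ", join(types_by_mesh[mesh], "; "))
end
```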


### b. Filter by semantic type

In [62]:
labels2ind, occur = umls_semantic_occurrences(db, "Disease or Syndrome")

println("-------------------------------------------------------------")
println("Output Descritor to Index Dictionary")
display(labels2ind)
println("-------------------------------------------------------------")

println("-------------------------------------------------------------")
println("Output Data Matrix")
display(full(occur))
println("-------------------------------------------------------------")


Dict{Any,Any} with 9 entries:
  "rhinitis, allergic"            => 1
  "asthma, exercise-induced"      => 2
  "otitis media"                  => 3
  "rhinitis, allergic, seasonal"  => 4
  "obesity"                       => 5
  "occupational diseases"         => 6
  "asthma"                        => 7
  "respiratory tract infections"  => 8
  "rhinitis, allergic, perennial" => 9

9×10 Array{Float64,2}:
 0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0
 1.0  0.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0  0.0

-------------------------------------------------------------
Found 9 MESH decriptor related to  Disease or Syndrome
Set(AbstractString["rhinitis, allergic","asthma, exercise-induced","otitis media","rhinitis, allergic, seasonal","obesity","occupational diseases","asthma","respiratory tract infections","rhinitis, allergic, perennial"])
-------------------------------------------------------------
-------------------------------------------------------------
Found 10 articles with valid descriptors
-------------------------------------------------------------
-------------------------------------------------------------
Output Descritor to Index Dictionary
-------------------------------------------------------------
-------------------------------------------------------------
Output Data Matrix
-------------------------------------------------------------
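The two outputs can be read together: the dictionary gives a row index for each descriptor, and the matrix appears to hold one row per descriptor and one column per article (9 descriptors, 10 articles). Under that assumption, the sketch below lists the descriptors attached to a single article. The data are copied from the output above, and `descriptors_for_article` is a hypothetical helper, not part of BioMedQuery.

```julia
# Copied from the output above.
labels2ind = Dict(
    "rhinitis, allergic" => 1,
    "asthma, exercise-induced" => 2,
    "otitis media" => 3,
    "rhinitis, allergic, seasonal" => 4,
    "obesity" => 5,
    "occupational diseases" => 6,
    "asthma" => 7,
    "respiratory tract infections" => 8,
    "rhinitis, allergic, perennial" => 9
)

occur = [
    0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0
    0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
    1.0 0.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0
]

# Invert the dictionary so that a row index maps back to its descriptor.
ind2label = Dict(ind => label for (label, ind) in labels2ind)

# Hypothetical helper: descriptors marked as present in one article (column).
function descriptors_for_article(occur, ind2label, article)
    return [ind2label[i] for i in 1:size(occur, 1) if occur[i, article] > 0]
end

println(descriptors_for_article(occur, ind2label, 3))
```

For the third column this yields "rhinitis, allergic" and "asthma", the two rows that hold a 1.0 in that column.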


### Plot conditional probabilities as a histogram

In [64]:
collect(keys(labels2ind))

9-element Array{Any,1}:
 "rhinitis, allergic"           
 "asthma, exercise-induced"     
 "otitis media"                 
 "rhinitis, allergic, seasonal" 
 "obesity"                      
 "occupational diseases"        
 "asthma"                       
 "respiratory tract infections" 
 "rhinitis, allergic, perennial"

In [73]:
using PlotlyJS

trace1 = bar(;x=collect(keys(labels2ind)),
            y=sum(occur, 2)[:]./10,
            marker=attr(color="rgba(50, 171, 96, 0.7)",
                        line=attr(color="rgba(50, 171, 96, 1.0)", width=2)))

data = [trace1]
layout = Layout(;margin_b = 100,
                 margin_r = 100)

plot(data, layout)
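Two details of the plotting cell may need care. The iteration order of a Julia `Dict` is unspecified, so `collect(keys(labels2ind))` is not guaranteed to follow the row order of `occur`, and the divisor `10` is the number of articles reported above. The sketch below derives both from the data instead. It uses small toy values rather than the real output, assumes Julia 1.x (where the reduction is written `sum(occur, dims=2)`), and assumes that rows of `occur` follow the indices stored in `labels2ind`.

```julia
# Toy stand-ins (hypothetical values) for the outputs of umls_semantic_occurrences.
labels2ind = Dict("asthma" => 2, "obesity" => 1, "otitis media" => 3)
occur = [
    1.0 0.0 0.0 1.0
    1.0 1.0 1.0 0.0
    0.0 0.0 0.0 1.0
]

# Order the labels by their row index instead of relying on Dict iteration order.
ind2label = Dict(ind => label for (label, ind) in labels2ind)
labels = [ind2label[i] for i in 1:length(ind2label)]

# Fraction of articles containing each descriptor; the number of articles
# is taken from the matrix (columns) rather than hard-coded.
fractions = vec(sum(occur, dims=2)) ./ size(occur, 2)

for (label, fraction) in zip(labels, fractions)
    println(label, ": ", fraction)
end
```

The resulting `labels` and `fractions` vectors could then be passed as `x` and `y` to the `bar` trace above.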