# Entrez Utilities (eutils)


BioMedQuery.Entrez provides an interface to some of the functionality in the [Entrez Utility API](https://www.ncbi.nlm.nih.gov/books/NBK25501/).

The following E-utils functions have been implemented:

- ESearch
- EFetch
- ELink
- ESummary

The following utility functions are available to handle and store NCBI responses:

- EParse - Convert XML response to Julia Dict
- Saving NCBI Responses to XML
- Saving EFetch to a SQLite database
- Saving EFetch to a MySQL database

The following utility functions are available to query the database:

- All PMIDs
- All MeSH descriptors for an article


## Import the Module and Environment Variables


In [None]:
using BioMedQuery.Entrez
email = ENV["NCBI_EMAIL"];
umls_user = ENV["UMLS_USER"];
umls_psswd = ENV["UMLS_PSSWD"];

## 1. esearch

`esearch(search_dict)` requests a list of UIDs matching a query. The input is a dictionary holding the required parameters described in the Entrez documentation: [NCBI Entrez: ESearch](http://www.ncbi.nlm.nih.gov/books/NBK25499/#chapter4.ESearch).

For instance, let's request 10 PMIDs for papers matching the query: `(asthma[MeSH Terms]) AND ("2001/01/29"[Date - Publication] : "2010"[Date - Publication])`

In [None]:
search_term = """(asthma[MeSH Terms]) AND ("2001/01/29"[Date - Publication] : "2010"[Date - Publication])"""
search_dic = Dict("db" => "pubmed", "term" => search_term,
                  "retstart" => 0, "retmax" => 10,
                  "email" => email)
esearch_response = esearch(search_dic)
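
The dictionary above maps onto the query parameters of an ESearch request. As a rough illustration of what such a request looks like on the wire, the sketch below builds the URL by hand. It is not how BioMedQuery necessarily assembles the request: `percent_encode` and `esearch_url` are hypothetical helpers written for this example, and the `email` parameter is left out for brevity. Only the ESearch endpoint URL and the parameter names come from the Entrez documentation.

```julia
# Endpoint documented by NCBI for ESearch.
const ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical helper: percent-encode every byte that is not an
# RFC 3986 unreserved character (letters, digits, '-', '_', '.', '~').
function percent_encode(s::AbstractString)
    io = IOBuffer()
    for b in codeunits(s)
        c = Char(b)
        if b < 0x80 && (isletter(c) || isdigit(c) || c in ('-', '_', '.', '~'))
            print(io, c)
        else
            print(io, '%', uppercase(string(b, base = 16, pad = 2)))
        end
    end
    return String(take!(io))
end

# Hypothetical helper: turn a parameter dictionary into an ESearch URL.
# Keys are sorted so the resulting URL is deterministic.
function esearch_url(params::Dict)
    sorted_params = sort(collect(params), by = first)
    query = join(("$(percent_encode(string(k)))=$(percent_encode(string(v)))"
                  for (k, v) in sorted_params), "&")
    return string(ESEARCH_URL, "?", query)
end

params = Dict("db" => "pubmed", "term" => "asthma[MeSH Terms]",
              "retstart" => 0, "retmax" => 10)
url = esearch_url(params)
println(url)
```

Sorting the keys is only a convenience for reproducible output; the E-utilities accept the parameters in any order.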

### Save the response to file

In [None]:
using XMLconvert
xmlASCII2file(esearch_response, "./esearch.xml");

### Convert to a Julia (Multi) Dictionary

In [None]:
esearch_dict = eparse(esearch_response)
println("Type of esearch_dict: ", typeof(esearch_dict))
show_key_structure(esearch_dict)

### Flatten into Dictionary for easy access

In [None]:
flat_esearch_dict = flatten(esearch_dict)
display(flat_esearch_dict)

### Get all pmids returned by esearch

In [None]:
ids = Array{Int64,1}(flat_esearch_dict["IdList-Id"])
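
The key `"IdList-Id"` used above suggests that flattening joins the names of nested keys with a hyphen. The sketch below illustrates that idea on a small hand-made dictionary. It is an assumption about the behavior, not the library's implementation: `flatten_keys` is a hypothetical function, and the IDs are made-up placeholders rather than real PMIDs.

```julia
# Hypothetical helper: collapse a nested dictionary into a single level,
# joining the key path with "-" (assumed convention, see lead-in).
function flatten_keys(d::AbstractDict, prefix::String = "")
    flat = Dict{String,Any}()
    for (k, v) in d
        key = isempty(prefix) ? string(k) : string(prefix, "-", k)
        if v isa AbstractDict
            # Recurse into nested dictionaries, carrying the path so far.
            merge!(flat, flatten_keys(v, key))
        else
            flat[key] = v
        end
    end
    return flat
end

# Made-up stand-in for a parsed ESearch response.
nested = Dict("Count" => "2",
              "IdList" => Dict("Id" => ["123", "456"]))

flat = flatten_keys(nested)

# If the IDs arrive as strings, they can be parsed element-wise.
sample_ids = parse.(Int, flat["IdList-Id"])
println(sample_ids)
```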

## 2. efetch

`efetch(fetch_dic, ids)` retrieves the records for a list of UIDs. As with esearch, the input dictionary holds the parameters described in the Entrez documentation: [NCBI Entrez: EFetch](http://www.ncbi.nlm.nih.gov/books/NBK25499/#chapter4.EFetch).

In [None]:
# define the fetch dictionary
fetch_dic = Dict("db" => "pubmed", "tool" => "BioJulia",
                 "email" => email, "retmode" => "xml", "rettype" => "null")

# fetch
efetch_response = efetch(fetch_dic, ids)

### Convert the XML response to a (Multi) Dictionary

In [None]:
efetch_dict = eparse(efetch_response)
show_key_structure(efetch_dict)

## 3. Save to MySQL

In [None]:
db_config = Dict(:host=>"127.0.0.1",
                 :dbname=>"biomed_query_test",
                 :username=>"root",
                 :pswd=>"bcbi123",
                 :overwrite=>true)

db = save_efetch_mysql(efetch_dict, db_config)

### Explore the MySQL Results Database

In [None]:
using MySQL
tables = mysql_execute(db, "show tables;")
display(tables)
articles = mysql_execute(db, "select * from article limit 10")
display(articles)
authors = mysql_execute(db, "select * from author limit 10")
display(authors)

## 4. Save as citations

In [None]:
citation_config = Dict(:type => "bibtex",
                       :output_file => "citations_test.bib",
                       :overwrite => true)
save_article_citations(efetch_dict, citation_config);

# BioMedQuery.Processes

The library comes with a series of "pre-assembled" workflows. For instance, we often need to call esearch and efetch and then save the results to a database as a single pipeline.

In [None]:
using BioMedQuery.Processes

### esearch, efetch, mysql_save in one line of code

In [None]:
db = pubmed_search_and_save(email, search_term, 10,
    save_efetch_mysql, db_config);

### esearch, efetch, save citations in one line of code

In [None]:
pubmed_search_and_save(email, search_term, 10,
    save_article_citations, citation_config);
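
Both calls above pass the save function and its configuration as the last two arguments, so the same search-and-fetch steps can feed different back ends. The sketch below shows that pattern in isolation. It is not the library's code: `fake_search`, `fake_fetch`, `save_to_memory`, and `search_and_save` are hypothetical stand-ins that touch neither the network nor a database.

```julia
# Hypothetical stand-in for the search step: pretend UIDs 1..max_articles.
fake_search(term, max_articles) = collect(1:max_articles)

# Hypothetical stand-in for the fetch step: one placeholder record per UID.
fake_fetch(ids) = Dict(id => "record $id" for id in ids)

# Hypothetical save function: writes records into a dictionary
# supplied through the configuration.
function save_to_memory(records, config)
    store = config[:store]
    for (id, rec) in records
        store[id] = rec
    end
    return store
end

# The pipeline only knows that `save_func(records, save_config)` exists,
# so any function with that shape can be plugged in.
function search_and_save(term, max_articles, save_func, save_config)
    ids = fake_search(term, max_articles)
    records = fake_fetch(ids)
    return save_func(records, save_config)
end

config = Dict(:store => Dict{Int,String}())
store = search_and_save("asthma", 3, save_to_memory, config)
println(sort(collect(keys(store))))
```

Swapping `save_to_memory` for another function with the same two-argument shape changes where the records end up without touching the rest of the pipeline.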