MTI MeSH Extraction Example
---

### Purpose
This notebook demonstrates how to call MTI (Medical Text Indexer) from Python through Pyjnius to extract MeSH terms from arbitrary text.

### Pyjnius Setup

In [None]:
%%capture
!pip install pyjnius

In [None]:
# Required for running in Colab
!mkdir -p /usr/lib/jvm/java-1.11.0-openjdk-amd64/jre/lib/amd64/server/
!ln -s /usr/lib/jvm/java-1.11.0-openjdk-amd64/lib/server/libjvm.so /usr/lib/jvm/java-1.11.0-openjdk-amd64/jre/lib/amd64/server/libjvm.so

In [None]:
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64"
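Pyjnius has to load `libjvm.so` from the JDK that `JAVA_HOME` points to, so checking for it before importing `jnius` can surface a wrong path early. The following is a minimal sketch; `find_libjvm` is a hypothetical helper introduced here for illustration and is not part of Pyjnius.

```python
import os
from pathlib import Path

def find_libjvm(java_home):
    """Return the paths of every libjvm.so found below java_home (empty list if none)."""
    if not java_home:
        return []
    root = Path(java_home)
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.rglob("libjvm.so"))

# Report whether the configured JDK exposes a JVM shared library.
java_home = os.environ.get("JAVA_HOME", "")
matches = find_libjvm(java_home)
if matches:
    print("Found JVM library:", matches[0])
else:
    print("No libjvm.so under JAVA_HOME =", repr(java_home))
```

If nothing is found, the JDK is probably installed under a different directory than the one set above.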

### MeSH extraction using WebAPI

#### Download JAR files

The JAR files come from the [Web API](https://ii.nlm.nih.gov/Web_API/index.shtml) page and can be downloaded from [here](https://ii.nlm.nih.gov/Web_API/SKR_Web_API_V2_3.jar); see also the [source code](https://github.com/ziy/skr-webapi).

`GenericBatchNew` is implemented following the example [Java code](https://github.com/ziy/skr-webapi/blob/master/src/example/java/GenericBatchNew.java) provided on the website.


In [None]:
!git clone https://gist.github.com/94e946a6916e7d2676b6efb9cc00db05.git lib

Cloning into 'lib'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects:  90% (9/10)

#### MeSH Extraction

The JAR files in the `lib` directory are added to the classpath.

The `getMeSH` function sends the credentials and the text to the MTI Web API and returns the extracted MeSH terms exactly as the service returns them.

Each column of the output is explained in the [MTI Output Help Information](https://ii.nlm.nih.gov/resource/MTI_output_help_info.html).
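Since the result comes back as raw text, splitting it into fields makes the columns easier to inspect. The sketch below assumes the output is one record per line with fields separated by `|`; that layout is an assumption to confirm against the help page linked above. `parse_mti_output` is a hypothetical helper, and the sample line is made up for illustration, not real MTI output.

```python
def parse_mti_output(raw):
    """Split pipe-delimited output into one list of fields per non-empty line."""
    rows = []
    for line in str(raw).splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        rows.append([field.strip() for field in line.split("|")])
    return rows

# Made-up sample line for illustration only (assumed layout, not real MTI output).
sample = "00000000|Example Term|C0000000|1000|MH\n"
for fields in parse_mti_output(sample):
    print(fields)
```

The helper deliberately keeps fields as positional lists rather than naming them, so the meaning of each position can be taken from the official column description.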

In [None]:
import jnius_config
jnius_config.add_classpath("./lib/*")

In [None]:
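import getpass
from jnius import autoclass

def getMeSH(text, username, email_id, password):
  # Load the Java class from the JARs that were added to the classpath.
  GenericBatchNew = autoclass("GenericBatchNew")
  batch = GenericBatchNew()
  # The processor is given a file path, so write the text to disk first.
  tmp_filepath = "/tmp/abstract_mti.txt"
  with open(tmp_filepath, "wb") as input_file:
    # Raises UnicodeEncodeError if the text contains non-ASCII characters.
    input_file.write(text.encode('ascii'))
  result = batch.processor(["--email", email_id, tmp_filepath], username, password)
  return result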
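`getMeSH` writes to a fixed path in `/tmp` and fails on non-ASCII input. A possible variation, sketched below, writes to a unique temporary file and lets the caller choose how non-ASCII characters are handled. `write_ascii_tempfile` is a hypothetical helper, not part of the original notebook or the Web API, and replacing characters with `?` is only one option that may not suit every text.

```python
import os
import tempfile

def write_ascii_tempfile(text, errors="strict"):
    """Write text to a unique temporary file as ASCII and return the file path."""
    # Encode first so that no file is created if encoding fails.
    data = text.encode("ascii", errors=errors)
    fd, path = tempfile.mkstemp(prefix="abstract_mti_", suffix=".txt")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path

# Non-ASCII characters become "?" when errors="replace" is chosen.
path = write_ascii_tempfile("Caf\u00e9 au lait", errors="replace")
with open(path, "rb") as handle:
    print(handle.read())  # b'Caf? au lait'
os.remove(path)  # the caller is responsible for cleanup
```

The returned path could be passed to `batch.processor` in place of `tmp_filepath`, with the file removed once the call returns.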

In [None]:
print("If you don't have an account, register at https://uts.nlm.nih.gov/license.html")
email_id = input("Please enter Email address : ")
username = input("Please enter Username : ")
password = getpass.getpass(prompt='Please enter Password : ') 
text = input("Enter text to process : ")

print("\n\nMeSH Terms\n")
print(getMeSH(text,username,email_id,password))

If you don't have an account, register at https://uts.nlm.nih.gov/license.html
Please enter Email address : pritishaw0103@gmail.com
Please enter Username : PritiShaw
Please enter Password : ··········
Enter text to process : R-HSA-164843    2-LTR circle formation  The formation of 2-LTR circles requires the action of the cellular non-homologous DNA end-joining pathway. Specifically the cellular Ku, XRCC4 and ligase IV proteins are needed. Evidence for this is provided by the observation that cells mutant in these functions do not support detectable formation of 2-LTR circles, though integration and formation of 1-LTR circles are mostly normal. The reaction takes place in the nucleus, and formation of 2-LTR circles has been used as a surrogate assay for nuclear transport. It has also been suggested that the NHEJ system affects the toxicity of retroviral infection.  R-HSA-73843 5-Phosphoribose 1-diphosphate biosynthesis  5-Phospho-alpha-D-ribose 1-diphosphate (PRPP) is a key intermediate in 