[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/emorynlp/elit/blob/dev/docs/tutorial.ipynb)

ELIT can be installed with pip. It is not officially on PyPI yet, so install it directly from the `dev` branch on GitHub.

In [None]:
!pip install -U git+https://github.com/emorynlp/elit.git@dev


Collecting git+https://github.com/emorynlp/elit.git@dev
  Cloning https://github.com/emorynlp/elit.git (to revision dev) to /tmp/pip-req-build-1i85sscw
  Running command git clone -q https://github.com/emorynlp/elit.git /tmp/pip-req-build-1i85sscw
  Running command git checkout -b dev --track origin/dev
  Switched to a new branch 'dev'
  Branch 'dev' set up to track remote branch 'dev' from 'origin'.
Building wheels for collected packages: elit
  Building wheel for elit (setup.py) ... done
  Created wheel for elit: filename=elit-2.0.0a0-cp36-none-any.whl size=450825 sha256=aed0090b6dcda16e832337291a8e68994b15d1d31d3864e7028ea7842be29b50
  Stored in directory: /tmp/pip-ephem-wheel-cache-3jp631f7/wheels/6a/27/ba/27dafb4d248c33ad7841e480c3413bf938c38ff71e97b7cacb
Successfully built elit
Installing collected packages: elit
  Found existing installation: elit 2.0.0a0
    Uninstalling elit-2.0.0a0:
      Successfully uninstalled elit-2.0.0a0
Successfully installed elit-2.0.0a0


The common workflow in ELIT is to load a model and then call it as a function. Models are represented by string identifiers, which are grouped by task. For example, let's list all the models in ELIT.

In [2]:
import elit
elit.pretrained.ALL

{'DOC_COREF_SPANBERT_BASE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/doc_coref_spanbert_base_20201222.zip',
 'DOC_COREF_SPANBERT_LARGE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/doc_coref_spanbert_large_20201222.zip',
 'LEM_POS_NER_DEP_SDP_CON_AMR_ELECTRA_BASE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_electra_base_20201222.zip',
 'LEM_POS_NER_DEP_SDP_CON_AMR_ROBERTA_BASE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip',
 'ONLINE_COREF_SPANBERT_BASE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/online_coref_spanbert_base_20201222.zip',
 'ONLINE_COREF_SPANBERT_LARGE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/online_coref_spanbert_large_20201222.zip'}

List all the multi-task learning (MTL) models:

In [3]:
elit.pretrained.mtl.ALL

{'LEM_POS_NER_DEP_SDP_CON_AMR_ELECTRA_BASE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_electra_base_20201222.zip',
 'LEM_POS_NER_DEP_SDP_CON_AMR_ROBERTA_BASE_EN': 'https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip'}
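
Because these registries print as plain mappings from identifier to download URL, ordinary dict operations are enough to narrow them down. The sketch below works on a hand-copied subset of the output above rather than on the live registry, and the helper name `select_models` is hypothetical, not part of ELIT.

```python
# Subset of the identifier -> URL mapping, copied by hand from the output above.
MODELS = {
    "DOC_COREF_SPANBERT_BASE_EN": "https://elit-models.s3-us-west-2.amazonaws.com/v2/doc_coref_spanbert_base_20201222.zip",
    "LEM_POS_NER_DEP_SDP_CON_AMR_ELECTRA_BASE_EN": "https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_electra_base_20201222.zip",
    "LEM_POS_NER_DEP_SDP_CON_AMR_ROBERTA_BASE_EN": "https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip",
    "ONLINE_COREF_SPANBERT_BASE_EN": "https://elit-models.s3-us-west-2.amazonaws.com/v2/online_coref_spanbert_base_20201222.zip",
}


def select_models(models, *tags):
    """Return the sorted identifiers whose underscore-separated parts include every tag.

    Hypothetical helper for illustration only; it is not an ELIT function.
    """
    wanted = {tag.upper() for tag in tags}
    return sorted(name for name in models if wanted <= set(name.split("_")))


print(select_models(MODELS, "coref"))
print(select_models(MODELS, "ner", "roberta"))
```

With the real registry, the same call would presumably take `elit.pretrained.ALL` in place of `MODELS`, assuming it behaves like the dict it prints as.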

ELIT offers several models for the same task with different settings. For example, the `LEM_POS_NER_DEP_SDP_CON_AMR_ROBERTA_BASE_EN` model is fine-tuned from RoBERTa-base. Let's load it and see what it can do.

In [4]:
mtl = elit.load(elit.pretrained.mtl.LEM_POS_NER_DEP_SDP_CON_AMR_ROBERTA_BASE_EN)

Downloading https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip to /root/.elit/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip
100.00%, 488.6 MB/488.6 MB, 3.1 MB/s, ETA 0 s      
Extracting /root/.elit/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip to /root/.elit
Downloading https://od.hankcs.com/research/amr2020/amr_3.0_utils.tgz to /root/.elit/thirdparty/od.hankcs.com/research/amr2020/amr_3.0_utils.tgz
100.00%, 3.8 MB/3.8 MB, 4.5 MB/s, ETA 0 s      
Extracting /root/.elit/thirdparty/od.hankcs.com/research/amr2020/amr_3.0_utils.tgz to /root/.elit/thirdparty/od.hankcs.com/research/amr2020
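
Judging only from the log above, this run saved the archive under `/root/.elit` using the file name from the URL. The sketch below reproduces that path so you can check whether a model is already on disk. Both the `~/.elit` location and the helper `guess_cached_archive` are assumptions inferred from this log, not a documented ELIT API.

```python
from pathlib import Path
from urllib.parse import urlparse


def guess_cached_archive(url, root=None):
    """Guess where the archive for `url` was saved: <root>/<file name from the URL>.

    Assumption: the cache root defaults to ~/.elit, as the log of this run suggests.
    """
    root = Path(root) if root is not None else Path.home() / ".elit"
    return root / Path(urlparse(url).path).name


url = "https://elit-models.s3-us-west-2.amazonaws.com/v2/en_pos_ner_srl_dep_con_amr_roberta_base_20201219.zip"
path = guess_cached_archive(url)
print(path, "(present)" if path.exists() else "(not downloaded yet)")
```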

Once you call `load` on a model, ELIT downloads it and loads it into main memory, or onto the GPU if you have one. The loaded model behaves like a function: pass it a list of tokenized sentences and it returns the annotations.
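
The model takes each sentence as a list of token strings, so raw text has to be tokenized first. The regex tokenizer below is a deliberately naive stand-in written for illustration. `naive_tokenize` is a hypothetical helper, not ELIT's tokenizer, and it splits abbreviations such as "D." into two tokens.

```python
import re

# A word is a run of word characters; every other non-space character is its own token.
_TOKEN = re.compile(r"\w+|[^\w\s]")


def naive_tokenize(sentence):
    """Split one sentence into tokens (hypothetical helper, illustration only)."""
    return _TOKEN.findall(sentence)


sentences = ["Emory NLP is a research lab in Atlanta."]
batch = [naive_tokenize(s) for s in sentences]
print(batch)
```

The resulting `batch` has the list-of-token-lists shape that the loaded model is called with.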

In [9]:
doc = mtl([
     ["Emory", "NLP", "is", "a", "research", "lab", "in", "Atlanta", "."],
     ["It", "is", "founded", "by", "Jinho", "D.", "Choi", "in", "2014", ".", "Dr.", "Choi", "is", "a", "professor", "at", "Emory", "University", "."]
])
doc

{'amr': [<AMRGraph object (top=c0) at 140422118106392>,
  <AMRGraph object (top=c0) at 140422118108128>],
 'con': [['TOP', [['S', [['NP', [['NNP', ['Emory']], ['NNP', ['NLP']]]], ['VP', [['VBZ', ['is']], ['NP', [['NP', [['DT', ['a']], ['NN', ['research']], ['NN', ['lab']]]], ['PP', [['IN', ['in']], ['NP', [['NNP', ['Atlanta']]]]]]]]]], ['.', ['.']]]]]],
  ['TOP', [['S', [['S', [['NP', [['PRP', ['It']]]], ['VP', [['VBZ', ['is']], ['VP', [['VBN', ['founded']], ['PP', [['IN', ['by']], ['NP', [['NNP', ['Jinho']], ['NNP', ['D.']], ['NNP', ['Choi']]]]]], ['PP', [['IN', ['in']], ['NP', [['CD', ['2014']]]]]]]]]]]], ['.', ['.']], ['NP', [['NNP', ['Dr.']], ['NNP', ['Choi']]]], ['VP', [['VBZ', ['is']], ['NP', [['NP', [['DT', ['a']], ['NN', ['professor']]]], ['PP', [['IN', ['at']], ['NP', [['NNP', ['Emory']], ['NNP', ['University']]]]]]]]]], ['.', ['.']]]]]]],
 'dep': [[(2, 'com'),
   (6, 'nsbj'),
   (6, 'cop'),
   (6, 'det'),
   (6, 'com'),
   (0, 'root'),
   (8, 'case'),
   (6, 'ppmod'),
   (6, 

As you can see, the returned `doc` is a Python dict that maps each task name (such as `amr`, `con`, and `dep`) to that task's output, with one entry per input sentence. Refer to our GitHub docs for its format and guidelines.
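
As a rough illustration of reading these structures, the sketch below uses values copied by hand from the printed output instead of a live `doc`. It assumes that dependency heads are 1-based token positions with 0 marking the root, and that each constituency node is a `[label, children]` pair, which is what the output suggests. The helpers `describe_arcs` and `to_bracketed` are hypothetical.

```python
tokens = ["Emory", "NLP", "is", "a", "research", "lab", "in", "Atlanta"]
# First eight (head, relation) pairs of doc['dep'][0], copied from the output above.
# The printed output is cut off after these, so the final "." token is left out.
dep = [(2, "com"), (6, "nsbj"), (6, "cop"), (6, "det"),
       (6, "com"), (0, "root"), (8, "case"), (6, "ppmod")]


def describe_arcs(tokens, arcs):
    """Pair each token with its relation and head word (assumes 1-based heads, 0 = root)."""
    rows = []
    for token, (head, rel) in zip(tokens, arcs):
        head_word = "ROOT" if head == 0 else tokens[head - 1]
        rows.append((token, rel, head_word))
    return rows


def to_bracketed(node):
    """Render a [label, children] constituency node as a bracketed string."""
    if isinstance(node, str):  # a leaf is the token itself
        return node
    label, children = node
    return "(" + label + " " + " ".join(to_bracketed(c) for c in children) + ")"


for row in describe_arcs(tokens, dep):
    print(row)

# One subtree of doc['con'][0], copied from the output above.
np_subtree = ["NP", [["NNP", ["Emory"]], ["NNP", ["NLP"]]]]
print(to_bracketed(np_subtree))
```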