# NLP Techniques Demo

Examples like the ones below are available at https://demo.allennlp.org/

Choose the task you'd like to perform there and see the code example under 'Model usage'.

In the lab, you can activate the following conda environment to run these examples:

`conda activate /usr/shared/CMPT/big-data/condaenv/allenv`

In [1]:
# install AllenNLP (PyTorch is pulled in as a dependency)
!pip install allennlp
!pip install allennlp-models
!pip install nltk
!pip install pyyaml
# pinned-version alternative:
# !pip install allennlp==2.1.0 allennlp-models==2.1.0
# !pip install nltk
# !pip install pyyaml

In [2]:
from allennlp.predictors.predictor import Predictor

In [None]:
predictor_cp = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/elmo-constituency-parser-2020.02.10.tar.gz")

In [4]:
predictor_cp.predict(
    sentence="If you bring $10 with you tomorrow, can you pay for me to eat too?."
)

Your label namespace was 'pos'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary.  See documentation for `non_padded_namespaces` parameter in Vocabulary.


{'class_probabilities': [[0.9999599456787109,
   2.9692400881770054e-12,
   1.5278754891584652e-14,
   1.970977336895885e-08,
   1.128296379206139e-18,
   4.1600816182445124e-16,
   6.99259372627381e-10,
   1.188018998732332e-07,
   4.227027172731823e-09,
   5.943435610333836e-08,
   8.072943118305701e-15,
   3.496096370302765e-13,
   1.940796634514186e-09,
   1.8695089920583996e-10,
   1.1191585436165497e-08,
   3.471339732641354e-05,
   1.3602249684138527e-14,
   1.1606218373524935e-12,
   7.137027635550908e-10,
   4.514462761950888e-11,
   4.997533498141138e-10,
   2.5343595666527108e-08,
   1.7528495988017312e-08,
   6.339585106873713e-11,
   2.9859347705496475e-06,
   1.6750734133097467e-09,
   2.6518904260031118e-12,
   7.229919560813869e-07,
   2.396788713679432e-12,
   1.8009447266820189e-09,
   2.153571543317412e-08,
   5.221314491876683e-09,
   1.1788463325501652e-06,
   6.286915432696105e-11,
   8.054441835714243e-11,
   3.540515791655707e-08,
   1.9490691283152728e-09,
   7

#### Constituency Parsing

In [5]:
x = predictor_cp.predict(sentence="Scotiabank raises interest rates")
print(x['hierplane_tree'])

{'linkNameToLabel': {'.': 'pos', ',': 'pos', '-LRB-': 'pos', '-RRB-': 'pos', '``': 'pos', '""': 'pos', "''": 'pos', ':': 'pos', '$': 'pos', '#': 'pos', 'AFX': 'pos', 'CC': 'pos', 'CD': 'pos', 'DT': 'pos', 'EX': 'pos', 'FW': 'pos', 'HYPH': 'pos', 'IN': 'pos', 'JJ': 'pos', 'JJR': 'pos', 'JJS': 'pos', 'LS': 'pos', 'MD': 'pos', 'NIL': 'pos', 'NN': 'pos', 'NNP': 'pos', 'NNPS': 'pos', 'NNS': 'pos', 'PDT': 'pos', 'POS': 'pos', 'PRP': 'pos', 'PRP$': 'pos', 'RB': 'pos', 'RBR': 'pos', 'RBS': 'pos', 'RP': 'pos', 'SP': 'pos', 'SYM': 'pos', 'TO': 'pos', 'UH': 'pos', 'VB': 'pos', 'VBD': 'pos', 'VBG': 'pos', 'VBN': 'pos', 'VBP': 'pos', 'VBZ': 'pos', 'WDT': 'pos', 'WP': 'pos', 'WP$': 'pos', 'WRB': 'pos', 'ADD': 'pos', 'NFP': 'pos', 'GW': 'pos', 'XX': 'pos', 'BES': 'pos', 'HVS': 'pos', '_SP': 'pos'}, 'nodeTypeToStyle': {'.': ['color0'], ',': ['color0'], '-LRB-': ['color0'], '-RRB-': ['color0'], '``': ['color0'], '""': ['color0'], "''": ['color0'], ':': ['color0'], '$': ['color0'], '#': ['color0'], 'AFX

#### Dependency Parsing

In [6]:
predictor_dp = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/biaffine-dependency-parser-ptb-2020.04.06.tar.gz")

In [7]:
x = predictor_dp.predict(sentence="Scotiabank raises interest rates")

import yaml
print(yaml.dump(x['hierplane_tree']['root']))
# see also https://allenai.github.io/hierplane/

attributes:
- VERB
children:
- attributes:
  - NOUN
  link: nsubj
  nodeType: nsubj
  spans:
  - end: 11
    start: 0
  word: Scotiabank
- attributes:
  - NOUN
  children:
  - attributes:
    - NOUN
    link: nn
    nodeType: nn
    spans:
    - end: 27
      start: 18
    word: interest
  link: dobj
  nodeType: dobj
  spans:
  - end: 33
    start: 27
  word: rates
link: root
nodeType: root
spans:
- end: 18
  start: 11
word: raises
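The nested tree above can be flattened into (head, relation, dependent) triples, which are often easier to work with than the hierarchy. The following is a minimal sketch: the `tree` dictionary is hand-copied from the YAML output above (with the `spans` entries left out), and `dependency_triples` is a hypothetical helper, not part of AllenNLP.

```python
# Hand-copied from the YAML output above; 'spans' omitted for brevity.
tree = {
    "word": "raises", "nodeType": "root", "link": "root", "attributes": ["VERB"],
    "children": [
        {"word": "Scotiabank", "nodeType": "nsubj", "link": "nsubj",
         "attributes": ["NOUN"]},
        {"word": "rates", "nodeType": "dobj", "link": "dobj",
         "attributes": ["NOUN"],
         "children": [
             {"word": "interest", "nodeType": "nn", "link": "nn",
              "attributes": ["NOUN"]},
         ]},
    ],
}


def dependency_triples(node):
    """Return (head, relation, dependent) triples in depth-first order."""
    triples = []
    for child in node.get("children", []):
        # The 'link' of a child names its relation to the parent node.
        triples.append((node["word"], child["link"], child["word"]))
        triples.extend(dependency_triples(child))
    return triples


triples = dependency_triples(tree)
for head, relation, dependent in triples:
    print(f"{head} --{relation}--> {dependent}")
```

In the notebook, the same helper could presumably be applied directly to `x['hierplane_tree']['root']` instead of the hand-copied dictionary.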



#### Entity Recognition

In [8]:
predictor_ner = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/ner-model-2020.02.10.tar.gz")



In [9]:
x = predictor_ner.predict(sentence="Barack Obama went to Paris")
print(list(zip(x['tags'], x['words'])))

[('B-PER', 'Barack'), ('L-PER', 'Obama'), ('O', 'went'), ('O', 'to'), ('U-LOC', 'Paris')]
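The tags in this output follow the BIOUL scheme: `B` begins a multi-token entity, `I` continues it, `L` ends it, `U` marks a single-token entity, and `O` is outside any entity. A minimal sketch of merging the per-token tags into whole entities is shown below; `bioul_to_entities` is a hypothetical helper written for this illustration, and the input lists are copied from the output above.

```python
def bioul_to_entities(tags, words):
    """Merge BIOUL-tagged tokens into (entity_text, label) pairs.

    Simplification: the label is taken from the closing tag, and the
    labels inside one B...L run are not checked for consistency.
    """
    entities = []
    current = []  # tokens of the entity currently being built
    for tag, word in zip(tags, words):
        if tag == "O":
            current = []
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "U":            # single-token entity
            entities.append((word, label))
            current = []
        elif prefix == "B":          # start of a multi-token entity
            current = [word]
        elif prefix == "I" and current:
            current.append(word)
        elif prefix == "L" and current:  # end of a multi-token entity
            current.append(word)
            entities.append((" ".join(current), label))
            current = []
    return entities


# Values copied from the predictor output above.
tags = ["B-PER", "L-PER", "O", "O", "U-LOC"]
words = ["Barack", "Obama", "went", "to", "Paris"]
entities = bioul_to_entities(tags, words)
print(entities)
```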


#### Sentiment Analysis

In [10]:
import nltk
nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer
sid = SentimentIntensityAnalyzer()
sid.polarity_scores("I am happy today")

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /home/sbergner/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


{'neg': 0.0, 'neu': 0.351, 'pos': 0.649, 'compound': 0.5719}
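The `compound` value is a normalized overall score between -1 and 1, and it is commonly turned into a discrete label by thresholding. The sketch below assumes the cutoff of ±0.05 suggested in the VADER project's documentation; the function name `label_from_compound` is hypothetical, and the threshold may need adjusting for your data.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a coarse sentiment label.

    The default threshold of 0.05 is an assumed convention, not a
    value fixed by NLTK.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Scores copied from the output above.
scores = {'neg': 0.0, 'neu': 0.351, 'pos': 0.649, 'compound': 0.5719}
label = label_from_compound(scores["compound"])
print(label)
```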

#### POS Tagging

In [11]:
import nltk
from nltk import word_tokenize
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
text = word_tokenize("He would not accept anything of value from those he was writing about")
nltk.pos_tag(text)

[nltk_data] Downloading package punkt to /home/sbergner/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/sbergner/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


[('He', 'PRP'),
 ('would', 'MD'),
 ('not', 'RB'),
 ('accept', 'VB'),
 ('anything', 'NN'),
 ('of', 'IN'),
 ('value', 'NN'),
 ('from', 'IN'),
 ('those', 'DT'),
 ('he', 'PRP'),
 ('was', 'VBD'),
 ('writing', 'VBG'),
 ('about', 'IN')]
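The tagger returns a plain list of `(word, tag)` pairs, so ordinary Python is enough to summarize it. As a small sketch, the code below counts the tags and picks out nouns and verbs by their Penn Treebank prefixes (`NN` and `VB`). The `tagged` list is copied from the output above rather than recomputed, so the example runs without NLTK.

```python
from collections import Counter

# Copied from the nltk.pos_tag output above.
tagged = [('He', 'PRP'), ('would', 'MD'), ('not', 'RB'), ('accept', 'VB'),
          ('anything', 'NN'), ('of', 'IN'), ('value', 'NN'), ('from', 'IN'),
          ('those', 'DT'), ('he', 'PRP'), ('was', 'VBD'), ('writing', 'VBG'),
          ('about', 'IN')]

# How often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in tagged)

# Penn Treebank noun tags start with 'NN' and verb tags with 'VB';
# the modal 'would' is tagged 'MD' and is therefore not included.
nouns = [word for word, tag in tagged if tag.startswith("NN")]
verbs = [word for word, tag in tagged if tag.startswith("VB")]

print(tag_counts.most_common(3))
print("nouns:", nouns)
print("verbs:", verbs)
```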