## Hello!
Welcome to Demo-4! This notebook shows what you can do with 1stDayKit's NLP tools. Specifically, we will look at the following 1stDayKit submodules (all based on Hugging Face's transformers library):
* Text Generator
* Text Summarizer
* Text Sentiment Analysis
* Text Question-Answering
* Text Translation (certain language-pairs only)

**Warning!**: The following demo notebook will trigger automatic downloading of heavy pretrained model weights.

Have fun!

---------------------

### 0. Importing Packages & Dependencies

In [1]:
#Import libs
from src.core.text_gen import TextGen_Base as TGL
from src.core.qa import QuesAns
from src.core.summarize import Summarizer
from src.core.sentiment import SentimentAnalyzer
from src.core.translate import Translator_M
from src.core.utils import utils

from PIL import Image
from pprint import pprint

import os
import matplotlib.pyplot as plt
import numpy

### 1. Simple Look at 1stDayKit NLP-Models

#### 1. Looking at Text Generation
Feel free to play around with all four Text-Generator variants provided in 1stDayKit. In ascending order of computational demand, they are:
* TextGen_Lite
* TextGen_Base
* TextGen_Large
* TextGen_XL

**Warning!**: If your machine does not meet the minimum compute requirements of the larger models, running them may crash it!

In [None]:
#Initialization
textgen = TGL(name="Text Generator",max_length=16,num_return_sequences=3)

In [23]:
#Infer & Visualize
output = textgen.predict("Let me say")
textgen.visualize(output)

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


[{'generated_text': 'Let me say as an atheist, I do believe that our religion '
                    'is a real'},
 {'generated_text': "Let me say this: in the last week I've found myself at "
                    'the bottom'},
 {'generated_text': 'Let me say thank you one more time – because of God we '
                    'will not be'}]
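
Each result repeats the prompt at the start of `generated_text`. If only the continuation is needed, it can be sliced off. The following is a minimal sketch, assuming the prediction output is a list of dicts shaped like the printout above; `strip_prompt` is a hypothetical helper, not part of 1stDayKit, and the sample data is copied from the output shown.

```python
def strip_prompt(results, prompt):
    """Return only the newly generated text for each result."""
    continuations = []
    for result in results:
        text = result["generated_text"]
        # Drop the prompt only if the text really starts with it.
        if text.startswith(prompt):
            text = text[len(prompt):]
        continuations.append(text.strip())
    return continuations


prompt = "Let me say"

# Sample results copied from the output above (assumed output format).
sample_output = [
    {"generated_text": "Let me say as an atheist, I do believe that our religion is a real"},
    {"generated_text": "Let me say this: in the last week I've found myself at the bottom"},
    {"generated_text": "Let me say thank you one more time – because of God we will not be"},
]

continuations = strip_prompt(sample_output, prompt)
for continuation in continuations:
    print(continuation)
```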


**Note**: <br>
Want to find out more about what GPT-2 (the model underlying 1stDayKit's TextGen modules) can do? Check out this cool blog post on *poetry* generation with GPT-2, which includes some sweet examples! <br>
* Link: https://www.gwern.net/GPT-2

#### 2. Looking at Question & Answering

In [3]:
#Initialization
QA = QuesAns()

In [8]:
#Set up the question and its context
context = """ Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the `run_squad.py`."""

question = "What is extractive question answering?"

question_answer = {'question':question,'context':context}

In [10]:
#Infer and visualize
output = QA.predict(question_answer)
QA.visualize(output)

{'answer': 'the task of extracting an answer from a text given a question.',
 'end': 96,
 'score': 0.6185597764655871,
 'start': 34}
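
The `start` and `end` values in the output are character offsets into `context`, so slicing the context with them recovers the answer span. The sketch below checks this with plain Python; the `result` dict is copied from the output above, and the context string is rebuilt piece by piece so the block runs on its own.

```python
# Same context as in the cell above, rebuilt so this block is self-contained.
context = (
    " Extractive Question Answering is the task of extracting an answer "
    "from a text given a question. An example of a\n"
    "question answering dataset is the SQuAD dataset, which is entirely "
    "based on that task. If you would like to fine-tune\n"
    "a model on a SQuAD task, you may leverage the `run_squad.py`."
)

# Result copied from the output above.
result = {
    "answer": "the task of extracting an answer from a text given a question.",
    "start": 34,
    "end": 96,
    "score": 0.6185597764655871,
}

# `start` is inclusive and `end` is exclusive, like a Python slice.
span = context[result["start"]:result["end"]]
print(span)
print(span == result["answer"])
```

This is handy when you want to highlight the answer inside the original passage rather than show it in isolation.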


#### 3. Looking at Text Summarizer

In [2]:
#Initialize
SM = Summarizer()

Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-large-cnn and are newly initialized: ['final_logits_bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [3]:
#Setup text to summarize
main_text_to_summarize = """ New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York.
A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband.
Only 18 days after that marriage, she got hitched yet again. Then, Barrientos declared "I do" five more times, sometimes only within two weeks of each other.
In 2010, she married once more, this time in the Bronx. In an application for a marriage license, she stated it was her "first and only" marriage.
Barrientos, now 39, is facing two criminal counts of "offering a false instrument for filing in the first degree," referring to her false statements on the
2010 marriage license application, according to court documents.
Prosecutors said the marriages were part of an immigration scam.
On Friday, she pleaded not guilty at State Supreme Court in the Bronx, according to her attorney, Christopher Wright, who declined to comment further.
After leaving court, Barrientos was arrested and charged with theft of service and criminal trespass for allegedly sneaking into the New York subway through an emergency exit, said Detective
Annette Markowski, a police spokeswoman. In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002.
All occurred either in Westchester County, Long Island, New Jersey or the Bronx. She is believed to still be married to four men, and at one time, she was married to eight men at once, prosecutors say.
Prosecutors said the immigration scam involved some of her husbands, who filed for permanent residence status shortly after the marriages.
Any divorces happened only after such filings were approved. It was unclear whether any of the men will be prosecuted.
The case was referred to the Bronx District Attorney\'s Office by Immigration and Customs Enforcement and the Department of Homeland Security\'s
Investigation Division. Seven of the men are from so-called "red-flagged" countries, including Egypt, Turkey, Georgia, Pakistan and Mali.
Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation by the Joint Terrorism Task Force.
If convicted, Barrientos faces up to four years in prison.  Her next court appearance is scheduled for May 18.
"""

In [4]:
#Infer
output = SM.predict(main_text_to_summarize)
SM.visualize(main_text_to_summarize,output)

{'Raw': ' New York (CNN)When Liana Barrientos was 23 years old, she got '
        'married in Westchester County, New York.\n'
        'A year later, she got married again in Westchester County, but to a '
        'different man and without divorcing her first husband.\n'
        'Only 18 days after that marriage, she got hitched yet again. Then, '
        'Barrientos declared "I do" five more times, sometimes only within two '
        'weeks of each other.\n'
        'In 2010, she married once more, this time in the Bronx. In an '
        'application for a marriage license, she stated it was her "first and '
        'only" marriage.\n'
        'Barrientos, now 39, is facing two criminal counts of "offering a '
        'false instrument for filing in the first degree," referring to her '
        'false statements on the\n'
        '2010 marriage license application, according to court documents.\n'
        'Prosecutors said the marriages were part of an immigration scam.\n'
        'O

**Note**:
* Please note that the summarizer is not perfect (as with all ML models)! Notice that the model has wrongly concluded that Liana Barrientos was charged, whereas in fact the ruling on said charges was not yet available when the main text was written.
* Even so, a summarizer like this would still be useful in many real-world applications, and further training would likely make it more accurate.

#### 4. Looking at Text Sentiment Analyzer 

In [2]:
#Initialize
ST = SentimentAnalyzer()

In [8]:
#Setup texts. Let's try a bunch of them.
main_text_to_analyze = ["The food is not too hot, which makes it just right.",
                        "The weather is not looking too good today",
                        "The sky is looking a bit gloomy, time to catch a nap!",
                        "War is what it is",
                        "Superheroes are mere child fantasies"]

In [9]:
#Infer
output = ST.predict(main_text_to_analyze)
ST.visualize(main_text_to_analyze,output)

{'Confidence': 0.9998283386230469,
 'Raw Text': 'The food is not too hot, which makes it just right.',
 'Sentiment': 'POSITIVE'}
{'Confidence': 0.9997212290763855,
 'Raw Text': 'The weather is not looking too good today',
 'Sentiment': 'NEGATIVE'}
{'Confidence': 0.9982035756111145,
 'Raw Text': 'The sky is looking a bit gloomy, time to catch a nap!',
 'Sentiment': 'NEGATIVE'}
{'Confidence': 0.9993534684181213,
 'Raw Text': 'War is what it is',
 'Sentiment': 'NEGATIVE'}
{'Confidence': 0.8927153944969177,
 'Raw Text': 'Superheroes are mere child fantasies',
 'Sentiment': 'NEGATIVE'}


**Note**: Interesting! Notice that the language model still has gaps when it comes to tricky statements.
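
One simple way to surface borderline cases is to flag predictions whose confidence falls below a threshold. The sketch below assumes the prediction records look like the dicts printed above; `flag_low_confidence` and the 0.95 threshold are illustrative choices, not part of 1stDayKit.

```python
def flag_low_confidence(predictions, threshold=0.95):
    """Return the predictions whose confidence is below the threshold."""
    return [p for p in predictions if p["Confidence"] < threshold]


# Records copied from the output above (assumed output format).
predictions = [
    {"Confidence": 0.9998283386230469,
     "Raw Text": "The food is not too hot, which makes it just right.",
     "Sentiment": "POSITIVE"},
    {"Confidence": 0.9997212290763855,
     "Raw Text": "The weather is not looking too good today",
     "Sentiment": "NEGATIVE"},
    {"Confidence": 0.9982035756111145,
     "Raw Text": "The sky is looking a bit gloomy, time to catch a nap!",
     "Sentiment": "NEGATIVE"},
    {"Confidence": 0.9993534684181213,
     "Raw Text": "War is what it is",
     "Sentiment": "NEGATIVE"},
    {"Confidence": 0.8927153944969177,
     "Raw Text": "Superheroes are mere child fantasies",
     "Sentiment": "NEGATIVE"},
]

uncertain = flag_low_confidence(predictions)
for p in uncertain:
    print(f"{p['Raw Text']!r}: {p['Sentiment']} ({p['Confidence']:.3f})")
```

Keep in mind that a threshold only catches predictions the model itself is unsure about. A confidently wrong label will pass straight through, so manual spot checks remain useful.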

#### 5. Looking at Text Translator

We will be using the **MarianMT** series of pre-trained language models available on HuggingFace. More info and documentation can be found at https://huggingface.co/transformers/model_doc/marian.html.

In [4]:
#Initialize
Trans = Translator_M(task='Helsinki-NLP/opus-mt-en-ROMANCE')

In [7]:
#Setup texts
text_to_translate = ['>>fr<< this is a sentence in english that we want to translate to french',
                     '>>pt<< This should go to portuguese',
                     '>>es<< And this to Spanish']

In [8]:
#Infer
output = Trans.predict(text_to_translate)
output

["c'est une phrase en anglais que nous voulons traduire en français",
 'Isto deve ir para o português.',
 'Y esto al español']

In [10]:
Trans.visualize(text_to_translate,output)

{'Raw Text': ['>>fr<< this is a sentence in english that we want to translate '
              'to french',
              '>>pt<< This should go to portuguese',
              '>>es<< And this to Spanish'],
 'Task': 'opus-mt-en-ROMANCE',
 'Translation': ["c'est une phrase en anglais que nous voulons traduire en "
                 'français',
                 'Isto deve ir para o português.',
                 'Y esto al español']}
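
The `>>fr<<`, `>>pt<<` and `>>es<<` prefixes tell a multilingual MarianMT model which target language to produce. When translating many sentences, it can be convenient to add the prefix programmatically. The sketch below shows one way to do that; `tag_for_target` is a hypothetical helper, not part of 1stDayKit or transformers.

```python
def tag_for_target(text, lang_code):
    """Prefix text with a MarianMT-style target-language token such as '>>fr<<'."""
    return f">>{lang_code}<< {text}"


# (target language code, English sentence) pairs taken from the cell above.
pairs = [
    ("fr", "this is a sentence in english that we want to translate to french"),
    ("pt", "This should go to portuguese"),
    ("es", "And this to Spanish"),
]

text_to_translate = [tag_for_target(text, lang) for lang, text in pairs]
for line in text_to_translate:
    print(line)
```

The resulting list has the same form as the hand-written `text_to_translate` above, so it should be usable as input to `Trans.predict` in the same way.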


In [11]:
#Set up a longer text!
text_to_translate = ['>>fr<< Liana Barrientos, 39, is charged with two counts of "offering a false instrument for filing in the first degree" In total, she has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men.']

In [12]:
#Infer
output = Trans.predict(text_to_translate)
output

["Liana Barrientos, 39 ans, est accusée de deux chefs d'accusation pour « offrir un faux instrument pour le dépôt au premier degré » Au total, elle a été mariée 10 fois, neuf de ses mariages se produisant entre 1999 et 2002. On pense qu'elle est toujours mariée à quatre hommes."]

**Google Translate from French to English**<br>
Liana Barrientos, 39, is charged with two counts of 'offering a false instrument for first degree deposition' In total, she has been married 10 times, with nine of her marriages occurring between 1999 and 2002. One thinks she's still married to four men.

Not bad.


______________

### Thank You!