# Application of NLP & Previously Done Text Mining in Practice



**Objectives of this Project**

Using Hugging Face Transformers, you'll build a real-time web translator app.

In this class activity you'll learn how to:
1. Install Gradio (a Python UI package) and Transformers *(so you can download a pre-trained translation model; both are open source and free)*
2. Translate text using a SOTA deep learning translation pipeline (the Transformers `pipeline` API)
3. Build a lightweight ML app using Python and Gradio (we will wrap the pipeline in a function and serve it as a web app)

**Hugging Face** is one of the leading startups in the NLP space and builds open-source NLP libraries. Companies such as Apple, Monzo, and Bing use its library in production. More recently, Hugging Face released over 1,000 pre-trained translation models from the University of Helsinki.




---


**Machine Translation with Transformers**

Hugging Face has done an incredible job of making SOTA (state-of-the-art) models available through a simple Python API that even copy-and-paste coders can use. A single `pipeline` call downloads the required model and translates source text into target text.



---


More resources: https://towardsdatascience.com/build-your-own-machine-translation-service-with-transformers-d0709df0791b

Learn more about Gradio: https://gradio.app/

Links

PyTorch Installation: https://pytorch.org/get-started/locally/

Hugging Face Transformers Pipelines: https://huggingface.co/transformers/main_classes/pipelines.html#transformers.TranslationPipeline

Hugging Face Models: https://huggingface.co/models

## 1. Let's Install Dependencies


1.   We are going to install and import **PyTorch**, **Transformers** & **Gradio** *(PyTorch is a key dependency for Transformers)*

2.   Then we download the pre-trained language translation model


In [1]:
!pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html 
# Command taken from pytorch.org (Get Started): choose LTS 1.8.1, your OS, pip, Python, and CPU or CUDA
# CPU-only alternative:
#pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio===0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
# CUDA 11.1 alternative:
#pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
# Before running one of these in a notebook, replace `pip3` with `!pip`


Looking in links: https://download.pytorch.org/whl/lts/1.8/torch_lts.html
Collecting torch==1.8.1+cu111
  Downloading https://download.pytorch.org/whl/lts/1.8/cu111/torch-1.8.1%2Bcu111-cp37-cp37m-linux_x86_64.whl (1982.2 MB)

In [2]:
!pip install transformers ipywidgets gradio  # install the remaining dependencies
# ipywidgets is useful for first-timers: it shows a progress bar while the model downloads
# If the packages are already installed, append --upgrade to update them (this mainly matters for the widgets); on a first install you can leave it off

Collecting transformers
  Downloading transformers-4.9.0-py3-none-any.whl (2.6 MB)
Collecting gradio
  Downloading gradio-2.2.3-py3-none-any.whl (2.4 MB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.45-py3-none-any.whl (895 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl (636 kB)
Collecting huggingface-hub==0.0.12
  Downloading huggingface_hub-0.0.12-py3-none-any.whl (37 kB)
Collecting paramiko
  Downloading paramiko-2.7.2-py2.py3-none-any.whl (206 kB)

In [3]:
import gradio as gr                   # UI library
from transformers import pipeline     # Transformers pipeline
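
Before loading the pipeline, it can help to confirm which versions of the three packages actually got installed. The following is a minimal sketch, not part of the original notebook: the helper name `installed_versions` is hypothetical, and it assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Report on the three packages this notebook depends on.
report = installed_versions(["torch", "transformers", "gradio"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If any package shows as not installed, rerun the matching install cell above before continuing.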

## 2. Load Pipeline

In [4]:
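translation_pipeline = pipeline('translation_en_to_de')  # English-to-German; with no model named, Transformers downloads its default model for this task
# Pipeline docs: https://huggingface.co/transformers/main_classes/pipelines.html#transformers.TranslationPipeline
# Open-source models: https://huggingface.co/models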
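
The cell above relies on the default English-to-German model. To try one of the Helsinki models mentioned in the introduction, you can pass a model name to `pipeline`. The sketch below is an illustration under assumptions: the helper names `opus_mt_model_name` and `load_translator` are hypothetical, the `Helsinki-NLP/opus-mt-{src}-{tgt}` naming pattern does not exist for every language pair (check https://huggingface.co/models first), and these models may also need the `sentencepiece` package installed.

```python
def opus_mt_model_name(src, tgt):
    """Build a Helsinki-NLP model id; not every language pair is published."""
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


def load_translator(src, tgt, pipeline_factory=None):
    """Create a translation pipeline for the given language codes.

    pipeline_factory is a hypothetical hook for testing; by default the
    real transformers.pipeline function is imported and used.
    """
    if pipeline_factory is None:
        from transformers import pipeline as pipeline_factory
    task = f"translation_{src}_to_{tgt}"
    return pipeline_factory(task, model=opus_mt_model_name(src, tgt))


# Example (downloads the model on first use, so it is left commented out):
# translator = load_translator("en", "fr")
# print(translator("The weather is cold")[0]["translation_text"])
```

Keeping the model id in one small function makes it easy to swap language pairs later without touching the rest of the app.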


In [5]:
results = translation_pipeline('The weather is cold')  # store the output in a results variable

In [6]:
results[0]['translation_text']  # the pipeline returns a list of dicts; take the 'translation_text' value from the first one

'Das Wetter ist kalt'
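
The indexing above works because the pipeline returns a list with one dictionary per input, each holding a `translation_text` key. The sketch below shows that shape using a hand-written sample that mirrors the output above, so it runs without downloading a model; the helper name `extract_translations` is hypothetical.

```python
def extract_translations(results):
    """Collect the translated strings from a translation pipeline result."""
    return [item["translation_text"] for item in results]


# Hand-written sample in the same shape as the pipeline output shown above.
sample_results = [{"translation_text": "Das Wetter ist kalt"}]

print(extract_translations(sample_results))     # ['Das Wetter ist kalt']
print(sample_results[0]["translation_text"])    # Das Wetter ist kalt
```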

## 3. Create Gradio Function and Interface

Create a function that wraps the translation code, then build a UI around it with Gradio.

In [7]:
def translate_transformers(from_text):         # our function; from_text is the source text to translate
    results = translation_pipeline(from_text)  # run the text through the translation pipeline
    return results[0]['translation_text']      # return the extracted translation

In [9]:
translate_transformers('My name is Steph')

'Mein Name ist Steph'
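
In a web app, users may submit an empty text box, so it can be worth guarding the wrapper before handing it to Gradio. This is a minimal sketch of one way to do that, not the notebook's own code: `make_translate_fn` and `fake_translator` are hypothetical names, and the fake stands in for `translation_pipeline` so the example runs without a model.

```python
def make_translate_fn(translator):
    """Wrap a translation pipeline in a function that tolerates empty input."""
    def translate(from_text):
        text = (from_text or "").strip()
        if not text:
            # Nothing to translate, so skip the model call entirely.
            return ""
        results = translator(text)
        return results[0]["translation_text"]
    return translate


def fake_translator(text):
    """Stand-in for translation_pipeline: upper-cases instead of translating."""
    return [{"translation_text": text.upper()}]


translate = make_translate_fn(fake_translator)
print(translate("  hello "))    # HELLO
print(repr(translate("   ")))   # ''
```

In the notebook you would pass the real pipeline instead, for example `make_translate_fn(translation_pipeline)`, and give the returned function to `gr.Interface`.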

In [10]:
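# Build the interface around our function (fn).
# gr.inputs.Textbox matches the Gradio 2.x release installed above; newer Gradio releases expose this component as gr.Textbox.
interface = gr.Interface(fn=translate_transformers,
                         inputs=gr.inputs.Textbox(lines=2, placeholder='Text to translate'),
                         outputs='text')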

In [11]:
interface.launch()  # run the web app and open it in the browser

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
This share link will expire in 24 hours. If you need a permanent link, visit: https://gradio.app/introducing-hosted (NEW!)
Running on External URL: https://27836.gradio.app
Interface loading below...


(<Flask 'gradio.networking'>,
 'http://127.0.0.1:7860/',
 'https://27836.gradio.app')