<a href="https://colab.research.google.com/github/mrm8488/shared_colab_notebooks/blob/master/Huggingface_pipelines_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Demo of [Huggingface Transformers](https://github.com/huggingface/transformers) pipelines

New in version `v2.3`: `Pipeline`s are high-level objects that automatically handle tokenization, run your data through a Transformers model,
and return the result in a structured object.

You can create `Pipeline` objects for the following down-stream tasks:

 - `feature-extraction`: Generates a tensor representation of the input sequence.
 - `ner`: Generates a named-entity mapping for each word in the input sequence.
 - `sentiment-analysis`: Gives the polarity (positive / negative) of the whole input sequence.
 - `question-answering`: Given some context and a question referring to that context, extracts the answer to the question
 from the context.
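
The three stages named above (tokenization, model call, structured output) can be pictured with a toy stand-in. This is only a conceptual sketch: `ToyPipeline`, `toy_model`, `toy_postprocess` and the word list are hypothetical names invented for illustration, and nothing here reflects how the `transformers` library is actually implemented.

```python
# Conceptual sketch only: ToyPipeline and its helpers are hypothetical names,
# not part of the transformers library.
from typing import Callable, Dict, List


class ToyPipeline:
    """Chains the three stages a pipeline hides: tokenize, run a model, format."""

    def __init__(
        self,
        tokenize: Callable[[str], List[str]],
        model: Callable[[List[str]], float],
        postprocess: Callable[[float], Dict[str, object]],
    ) -> None:
        self.tokenize = tokenize
        self.model = model
        self.postprocess = postprocess

    def __call__(self, text: str) -> List[Dict[str, object]]:
        tokens = self.tokenize(text)      # 1. text -> tokens
        raw = self.model(tokens)          # 2. tokens -> raw model output
        return [self.postprocess(raw)]    # 3. raw output -> structured result


# Stand-in "model": fraction of tokens found in a tiny positive-word list.
POSITIVE_WORDS = {"happy", "great", "good"}


def toy_model(tokens: List[str]) -> float:
    if not tokens:
        return 0.0
    return sum(t in POSITIVE_WORDS for t in tokens) / len(tokens)


def toy_postprocess(score: float) -> Dict[str, object]:
    label = "POSITIVE" if score > 0 else "NEGATIVE"
    return {"label": label, "score": score}


toy_nlp = ToyPipeline(lambda s: s.lower().split(), toy_model, toy_postprocess)
result = toy_nlp("We are very happy")
print(result)
```

The real pipelines replace each stand-in with a pretrained tokenizer and model, but the calling pattern (one callable object, text in, list of dictionaries out) is the same one used in the cells that follow.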

 > Colab creator: [Manuel Romero](https://twitter.com/mrm8488)

In [1]:
%tensorflow_version 2.x

TensorFlow 2.x selected.


In [0]:
!pip install -q transformers

In [0]:
from transformers import pipeline

## 1. Sentiment Analysis

In [5]:
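# Build a pipeline with the default model and tokenizer for this task
nlp_sentiment_analysis = pipeline("sentiment-analysis")
text_sentiment = "We are very happy to include pipeline into the transformers repository"
# Returns a list with one {'label', 'score'} dict per input sequence
nlp_sentiment_analysis(text_sentiment)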

[{'label': 'POSITIVE', 'score': 0.99687505}]
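
The result is a plain list of dictionaries, so it can be post-processed with ordinary Python. The sketch below copies the printed output into a variable instead of calling the model; the `confident_label` helper and its `0.9` threshold are assumptions made for illustration, not something the library provides.

```python
# `outputs` copies the result printed above; in the notebook it would come
# from nlp_sentiment_analysis(text_sentiment).
outputs = [{'label': 'POSITIVE', 'score': 0.99687505}]


def confident_label(prediction, threshold=0.9):
    """Return the label if its score clears `threshold`, else 'UNCERTAIN'.

    Hypothetical helper; the threshold value is an arbitrary choice.
    """
    if prediction['score'] >= threshold:
        return prediction['label']
    return 'UNCERTAIN'


labels = [confident_label(p) for p in outputs]
print(labels)
```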

## 2. Question Answering

In [6]:
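# Build a pipeline with the default question-answering model
nlp_qa = pipeline("question-answering")
context = "Pipeline have been included in the huggingface/transformers repository"
question = "What is the name of the repository?"
# The pipeline takes a dict holding the question and the context to search
nlp_qa({
    'question': question,
    'context': context
})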

Converting examples to features: 100%|██████████| 1/1 [00:00<00:00, 181.11it/s]


{'answer': 'huggingface/transformers',
 'end': 59,
 'score': 0.2875603414669605,
 'start': 35}
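
In this output, `start` and `end` behave as character offsets into `context`, with `end` exclusive. A quick check, using the values printed above rather than a live model call:

```python
# Values copied from the cell output above.
context = "Pipeline have been included in the huggingface/transformers repository"
result = {
    'answer': 'huggingface/transformers',
    'end': 59,
    'score': 0.2875603414669605,
    'start': 35,
}

# Slicing the context with the reported offsets recovers the answer text.
span = context[result['start']:result['end']]
print(span)
```

This makes it easy to highlight the answer in the original text. Whether the offsets are character-based in every version of the library is an assumption; the check only confirms it for this output.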

## 3. NER
Not working as of 01/31/2020

In [0]:
nlp_ner = pipeline("ner")
text_ner = "We are very happy to include pipeline into the transformers repository"
nlp_ner(text_ner)

## 4. Feature Extraction

In [0]:
nlp_fe = pipeline("feature-extraction")
text_fe = "We are very happy to include pipeline into the transformers repository"
nlp_fe(text_fe)
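
A common next step is to collapse the per-token vectors into a single sentence vector by averaging them. The sketch below assumes the pipeline returns a nested list shaped `[sequence][token][hidden dimension]`; the small `features` list is made-up dummy data standing in for `nlp_fe(text_fe)`, and `mean_pool` is a hypothetical helper.

```python
# Dummy stand-in for nlp_fe(text_fe): 1 sequence, 3 tokens, 4 hidden dimensions.
# Real outputs are much larger; the nested-list shape is an assumption.
features = [[
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]]


def mean_pool(token_vectors):
    """Average the token vectors dimension by dimension (hypothetical helper)."""
    n = len(token_vectors)
    return [sum(dim) / n for dim in zip(*token_vectors)]


sentence_vector = mean_pool(features[0])
print(sentence_vector)
```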

## 5. Bonus Forms (sentiment-analysis, ner, feature-extraction)

In [9]:
#@title Choose a pipeline and write a text { run: "auto" }
task = 'sentiment-analysis' #@param ["sentiment-analysis", "ner", "feature-extraction"]
text = 'We are very happy to include pipeline into the transformers repository.' #@param {type:"string"}
nlp = pipeline(task)
nlp(text)

[{'label': 'POSITIVE', 'score': 0.99781936}]

## question-answering

In [14]:
#@title Write a context and a question { run: "auto" }
context = 'Bitcoin[a] (\u20BF) is a cryptocurrency. It is a decentralized digital currency without a central bank or single administrator that can be sent from user to user on the peer-to-peer bitcoin network without the need for intermediaries.[8]  Transactions are verified by network nodes through cryptography and recorded in a public distributed ledger called a blockchain. Bitcoin was invented in 2008 by an unknown person or group of people using the name Satoshi Nakamoto[15] and started in 2009[16] when its source code was released as open-source software.[7]:ch. 1 Bitcoins are created as a reward for a process known as mining. They can be exchanged for other currencies, products, and services.[17] Research produced by University of Cambridge estimates that in 2017, there were 2.9 to 5.8 million unique users using a cryptocurrency wallet, most of them using bitcoin.[18]  Bitcoin has been criticized for its use in illegal transactions, its high electricity consumption, price volatility, and thefts from exchanges. Some economists, including several Nobel laureates, have characterized it as a speculative bubble. Bitcoin has also been used as an investment, although several regulatory agencies have issued investor alerts about bitcoin.[19][20]' #@param {type:"string"}
question = 'How are Bitcoins created?' #@param {type:"string"}
nlp_qa = pipeline("question-answering")
nlp_qa({
    'question': question,
    'context': context
})

Converting examples to features: 100%|██████████| 1/1 [00:00<00:00, 110.54it/s]


{'answer': 'as a reward for a process known as mining.',
 'end': 623,
 'score': 0.5793336972202461,
 'start': 581}

## 6. BONUS (II) Mask filling with **umBERTo**, an Italian 🇮🇹 Language Model trained with Whole Word Masking.

(Not working as of 01/31/2020. Source code: https://github.com/musixmatchresearch/umberto)

In [0]:
nlp_fill_mask_ita = pipeline(
	"fill-mask",
	model="Musixmatch/umberto-commoncrawl-cased-v1",
	tokenizer="Musixmatch/umberto-commoncrawl-cased-v1"
)

nlp_fill_mask_ita("Umberto Eco è <mask> un grande scrittore")