## Variety File Using Various Transformers & HuggingFace API

#### Import Necessary Libraries

In [1]:
import pandas as pd
from transformers import pipeline

### Text Used for Summarization (Source: https://www.bbc.com/news/business-62446937)

In [2]:
text_to_use = '''Two months later and Easyjet has confirmed that her luggage has been permanently lost. As headlines and social media posts around the world have shown in recent months, Ms Loucas's case is not unique, with some  calling it "the summer of lost luggage". The situation has been blamed on staff shortages both at the carriers, the airport security staff that have to scan all the checked-in luggage, and the ground handling firms that are typically employed to get all these suitcases and bags onto planes and then back to carousels.
With many of these teams seeing redundancies during the pandemic, they now can't cope with the pent-up demand to go abroad on holiday again. It has led to images of hundreds of missing suitcases piled up in warehouses.
And one insurance firm, Spain's Mapfre, said the number of passengers reporting missing luggage this summer was 30% higher than in 2019, the last year of normal travel before the pandemic. While no global estimates are yet available for the volume of delayed or lost luggage so far this year, data for 2019 shows that the problem has always existed.
That year 19 million bags and suitcases were late arriving around the world, and 1.3 million were never seen again, according to an annual report by SITA, a provider of baggage software. To try to keep tabs on their items of luggage, a growing number of passengers are turning to technology. Apple has reportedly seen a rise in sales of its AirTag tracking device. The AirTag works by sending out a secure Bluetooth signal that can be detected by nearby devices in the Find My network. These devices send the AirTag's location to iCloud, allowing the user to go to the Find My app and see it on a map.
In other words, you can see exactly where your missing suitcase is, via your smartphone or computer. Other travellers are attaching trackers that use GPS to their luggage. Yet while such tagging devices may give a passenger peace of mind, travel industry expert Eric Leopold says they don't solve the core issue - stopping the backlogs that prevent bags from catching the same flights as their owners.
"Tracking the bags is helpful when 99% arrive on time and 1% are mishandled, but when thousands of bags are stuck in London or elsewhere, the tags are not helping move the piles of bags," says Mr Leopold, who is the founder of Threedot.
New Tech Economy is a series exploring how technological innovation is set to shape the new emerging economic landscape. SeeTrue is one company that hopes to help airports and airlines get luggage onto planes more efficiently in the first place. The Israeli firm makes software that can do the security scans on check-in luggage much faster than human security staff.
"SeeTrue uses artificial intelligence and computer vision algorithms to discover prohibited items in bags," says executive Assaf Frenkel. "It connects to the existing X-ray and CT scanners, and detects in real-time, faster and more accurately than most human eyes, always on, and never getting tired or distracted.
"As a result, baggage is delivered on time to planes and not left behind." For UK tech firm AirPortr, its approach to tackling the problem is to remove the need for passengers to have to queue up at the airport to check in their luggage before their flight.
Instead passengers can use its app and website to arrange for their luggage to be taken door-to-door. Currently available for British Airways and Swiss International Air Lines flights between London and Geneva, an AirPortr worker will pick up a person's suitcase from their home. This driver will then take it to the departure airport's luggage area in the bowels of the terminal building for check-in, rather than going into the departure lounge.
Then at the destination airport, one of AirPortr's transportation partners will pick up the suitcases and deliver them to the person's destination address. Fees start from around £40 for one item of luggage, one way, if you don't mind your suitcase being picked up the day before you fly. But prices can be more than double that if you want your luggage collected during a specific one-hour slot on the day. The cost also rises the further you are from the airport. AirPortr's chief executive Randel Darby set up the firm in 2013, saying he was so frustrated that baggage was "travelling in the same way we have done for almost a century of commercial aviation". His aim is to expand the service around the world, and rather than just aiming it at business travellers, he hopes for it to ultimately become a "utility" service used by all types of holidaymakers.
Mr Darby even believes that airlines and airport operators will start to subside people's use of AirPortr, because it is "more cost effective than handling passengers checking in their luggage on-airport". Yet despite such technical solutions, passengers also want airlines to employ a few more customer care workers.'''

### Masked Language Modeling

#### Example 1

In [3]:
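# Load a fill-mask pipeline backed by BERT; [MASK] marks the token to predict.
unmasker = pipeline('fill-mask', model="bert-base-uncased")
result = unmasker("This lady works as a [MASK].")
print("This lady works as a ____.", end=' ')
# Each result is a dict; token_str holds the predicted word (top 5 by default).
print([r["token_str"] for r in result])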

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


This lady works as a ____. ['maid', 'nurse', 'teacher', 'waitress', 'prostitute']


#### Example 2

In [4]:
result = unmasker("This gentleman works as a [MASK].")
print("This gentleman works as a ____.", end=' ')
print([r["token_str"] for r in result])

This gentleman works as a ____. ['carpenter', 'farmer', 'tailor', 'merchant', 'waiter']
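
The two prompts in Examples 1 and 2 differ only in the subject noun, so a quick way to compare them is to check whether their top-5 completions overlap. The sketch below hardcodes the two lists printed above instead of calling the model again; the variable names `lady_tokens`, `gentleman_tokens`, and `shared` are illustrative and do not appear in the notebook.

```python
# Top-5 completions copied from the outputs of Example 1 and Example 2 above.
lady_tokens = ['maid', 'nurse', 'teacher', 'waitress', 'prostitute']
gentleman_tokens = ['carpenter', 'farmer', 'tailor', 'merchant', 'waiter']

# Tokens the model predicted for both prompts (set intersection, sorted for display).
shared = sorted(set(lady_tokens) & set(gentleman_tokens))
print("Shared completions:", shared)
```

In a live session the two lists could be built with `[r["token_str"] for r in result]`, captured after each call, since `result` is reassigned in Example 2.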


#### Example 3

In [5]:
result2 = unmasker("The [MASK] are going to win the Super Bowl this year.")
print("The ___ are going to win the Super Bowl this year.", end=' ')
print([r["token_str"] for r in result2])

The ___ are going to win the Super Bowl this year. ['patriots', 'steelers', 'cowboys', 'redskins', 'bills']


### Named Entity Recognition

In [6]:
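# 'simple' aggregation groups adjacent tokens that share an entity label into one span.
ner_tagger = pipeline("ner", aggregation_strategy='simple')
outputs = ner_tagger(text_to_use)
# One row per detected entity, with character offsets into text_to_use.
pd.DataFrame(outputs)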

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


| | entity_group | score | word | start | end |
|---|---|---|---|---|---|
| 0 | ORG | 0.997742 | Easyjet | 21 | 28 |
| 1 | PER | 0.99633 | Loucas | 172 | 178 |
| 2 | LOC | 0.999744 | Spain | 774 | 779 |
| 3 | ORG | 0.980514 | Mapfre | 782 | 788 |
| 4 | ORG | 0.998437 | SITA | 1249 | 1253 |
| 5 | ORG | 0.996588 | Apple | 1392 | 1397 |
| 6 | MISC | 0.842625 | AirTag | 1441 | 1447 |
| 7 | MISC | 0.970572 | Air | 1469 | 1472 |
| 8 | ORG | 0.490607 | ##T | 1472 | 1473 |
| 9 | MISC | 0.948844 | ##ag | 1473 | 1475 |
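
In rows 7 to 9 of the table above, the second mention of "AirTag" comes back as three fragments (`Air`, `##T`, `##ag`), because the middle word piece received a different label and `'simple'` aggregation only groups neighbours with the same label. The sketch below is one possible post-processing step, not part of the notebook: `merge_wordpieces` is a hypothetical helper that glues a `##` continuation onto the preceding entity when their character offsets touch. Keeping the first piece's label and the lowest score are assumptions made here for illustration.

```python
# Rows 6-9 copied from the NER table above (aggregation_strategy='simple').
rows = [
    {"entity_group": "MISC", "score": 0.842625, "word": "AirTag", "start": 1441, "end": 1447},
    {"entity_group": "MISC", "score": 0.970572, "word": "Air", "start": 1469, "end": 1472},
    {"entity_group": "ORG", "score": 0.490607, "word": "##T", "start": 1472, "end": 1473},
    {"entity_group": "MISC", "score": 0.948844, "word": "##ag", "start": 1473, "end": 1475},
]


def merge_wordpieces(entities):
    """Hypothetical helper: attach '##' continuation pieces to the previous entity."""
    merged = []
    for ent in entities:
        is_continuation = (
            bool(merged)
            and ent["word"].startswith("##")
            and ent["start"] == merged[-1]["end"]  # pieces must be adjacent in the text
        )
        if is_continuation:
            prev = merged[-1]
            prev["word"] += ent["word"][2:]  # drop the '##' prefix
            prev["end"] = ent["end"]
            # Assumption: report the weakest piece's score for the merged span.
            prev["score"] = min(prev["score"], ent["score"])
        else:
            merged.append(dict(ent))  # copy so the input rows stay untouched
    return merged


merged = merge_wordpieces(rows)
for ent in merged:
    print(ent["entity_group"], ent["word"], ent["start"], ent["end"])
```

The pipeline documentation also lists word-level values for `aggregation_strategy`, such as `"first"`, which may avoid the split at the source; that was not tried in this notebook.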


### Text Classification

In [7]:
classifier = pipeline("text-classification")

clf_text = "The sun will come out tomorrow!"

output = classifier(clf_text)
pd.DataFrame(output)

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


| | label | score |
|---|---|---|
| 0 | POSITIVE | 0.999357 |


### Text Summarization

In [8]:
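summarizer = pipeline("summarization")
# max_length caps the generated summary length in tokens, not characters.
summary = summarizer(text_to_use, max_length=75, clean_up_tokenization_spaces=True)
# The pipeline returns a list with one dict per input, keyed by 'summary_text'.
print("The Summary is: ", summary)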

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


The Summary is:  [{'summary_text': ' Staff shortages blamed on staff shortages at airports and ground handling firms. Airportr app and website AirPortr allows passengers to arrange for luggage to be picked up door-to-door. SeeTrue is one company that hopes to help airports and airlines get luggage onto planes more efficiently.'}]
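
The printed value above is the raw return object: a list holding one dict, with the text under `'summary_text'` and a leading space. A minimal sketch of pulling out just the string follows; it hardcodes the output shown above so it runs without downloading the model, and `summary_text` is a name introduced here for illustration.

```python
# Output copied from the summarization cell above: one dict per input text.
summary = [{'summary_text': ' Staff shortages blamed on staff shortages at airports and ground handling firms. Airportr app and website AirPortr allows passengers to arrange for luggage to be picked up door-to-door. SeeTrue is one company that hopes to help airports and airlines get luggage onto planes more efficiently.'}]

# Take the first (only) result and trim the leading space.
summary_text = summary[0]["summary_text"].strip()
print("The Summary is:", summary_text)
```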
