In [30]:
%pip install transformers



Fill Mask

In [31]:
from transformers import pipeline

In [32]:
unmasker = pipeline('fill-mask', model='bert-base-cased')

Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [33]:
unmasker("Hello, My name is [MASK].")

[{'score': 0.007879078388214111,
  'sequence': 'Hello, My name is David.',
  'token': 1681,
  'token_str': 'David'},
 {'score': 0.00730734970420599,
  'sequence': 'Hello, My name is Kate.',
  'token': 5036,
  'token_str': 'Kate'},
 {'score': 0.007054022513329983,
  'sequence': 'Hello, My name is Sam.',
  'token': 2687,
  'token_str': 'Sam'},
 {'score': 0.006197034846991301,
  'sequence': 'Hello, My name is James.',
  'token': 1600,
  'token_str': 'James'},
 {'score': 0.006146736443042755,
  'sequence': 'Hello, My name is Charlie.',
  'token': 4117,
  'token_str': 'Charlie'}]
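
The scores above are all below 1%, which suggests the model has no strong preference among first names. As a minimal sketch, the snippet below copies the five predictions from the output above (keeping only `score` and `token_str`) and computes the best candidate and the combined probability of the top five. It works on the copied values and does not call the model.

```python
# Predictions copied from the fill-mask output above (other fields omitted).
predictions = [
    {"score": 0.007879078388214111, "token_str": "David"},
    {"score": 0.00730734970420599, "token_str": "Kate"},
    {"score": 0.007054022513329983, "token_str": "Sam"},
    {"score": 0.006197034846991301, "token_str": "James"},
    {"score": 0.006146736443042755, "token_str": "Charlie"},
]

# Highest-scoring candidate for the [MASK] position.
best = max(predictions, key=lambda p: p["score"])

# Total probability covered by the five candidates shown.
top5_mass = sum(p["score"] for p in predictions)

print(f"best guess: {best['token_str']} ({best['score']:.2%})")
print(f"top-5 probability mass: {top5_mass:.2%}")
```

Together the five candidates cover only about 3.5% of the probability mass, so the ranking among them should not be read as a confident prediction.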

Summarization

In [34]:
summarizer = pipeline("summarization")

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 (https://huggingface.co/sshleifer/distilbart-cnn-12-6)


In [35]:
ARTICLE = """The Apollo program, also known as Project Apollo, was the third United States human spaceflight program carried out by the National Aeronautics and Space Administration (NASA), which accomplished landing the first humans on the Moon from 1969 to 1972.
First conceived during Dwight D. Eisenhower's administration as a three-man spacecraft to follow the one-man Project Mercury which put the first Americans in space,
Apollo was later dedicated to President John F. Kennedy's national goal of "landing a man on the Moon and returning him safely to the Earth" by the end of the 1960s, which he proposed in a May 25, 1961, address to Congress.
Project Mercury was followed by the two-man Project Gemini (1962-66).
The first manned flight of Apollo was in 1968.
Apollo ran from 1961 to 1972, and was supported by the two-man Gemini program which ran concurrently with it from 1962 to 1966.
Gemini missions developed some of the space travel techniques that were necessary for the success of the Apollo missions.
Apollo used Saturn family rockets as launch vehicles.
Apollo/Saturn vehicles were also used for an Apollo Applications Program, which consisted of Skylab, a space station that supported three manned missions in 1973-74, and the Apollo-Soyuz Test Project, a joint Earth orbit mission with the Soviet Union in 1975.
"""

In [36]:
summary = summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False)[0]

In [37]:
print(summary['summary_text'])

 The Apollo program, also known as Project Apollo, was the third U.S. human spaceflight program . The first manned flight of Apollo was in 1968 . It was followed by the two-man Project Gemini (1962-66) which ran concurrently with it .
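
The summary text above has a leading space and a space before each sentence-ending period. A small post-processing step can tidy this up. The sketch below is one possible cleanup, applied to the summary string copied from the output above; the helper name `tidy_summary` is hypothetical and not part of `transformers`.

```python
import re

# Summary text copied verbatim from the output above.
raw_summary = (
    " The Apollo program, also known as Project Apollo, was the third U.S. "
    "human spaceflight program . The first manned flight of Apollo was in 1968 . "
    "It was followed by the two-man Project Gemini (1962-66) which ran "
    "concurrently with it ."
)


def tidy_summary(text):
    """Remove whitespace before punctuation and trim the ends (hypothetical helper)."""
    return re.sub(r"\s+([.,;:!?])", r"\1", text).strip()


clean_summary = tidy_summary(raw_summary)
print(clean_summary)
```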


Zero-shot classification

In [38]:
classifier_zsl = pipeline("zero-shot-classification")

No model was supplied, defaulted to facebook/bart-large-mnli (https://huggingface.co/facebook/bart-large-mnli)


In [39]:
sequence_to_classify = "Bill gates founded a company called Microsoft in the year 1975"
candidate_labels = ["Europe", "Sports", "Leadership", "business", "politics", "startup"]

In [40]:
classifier_zsl(sequence_to_classify, candidate_labels)

{'labels': ['business',
  'startup',
  'Leadership',
  'Europe',
  'Sports',
  'politics'],
 'scores': [0.6144781708717346,
  0.18745461106300354,
  0.18227888643741608,
  0.006684558000415564,
  0.006318550556898117,
  0.002785284770652652],
 'sequence': 'Bill gates founded a company called Microsoft in the year 1975'}
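
The result holds two parallel lists: `labels` sorted from most to least likely, and `scores` in the same order. With the default settings the scores are normalized across the candidate labels, so they add up to roughly 1. The sketch below pairs them up using the values copied from the output above; the 0.15 cut-off is an arbitrary assumption chosen for illustration, not a recommended value.

```python
# Labels and scores copied from the zero-shot output above.
result = {
    "labels": ["business", "startup", "Leadership", "Europe", "Sports", "politics"],
    "scores": [
        0.6144781708717346,
        0.18745461106300354,
        0.18227888643741608,
        0.006684558000415564,
        0.006318550556898117,
        0.002785284770652652,
    ],
}

# Pair each label with its score; the pipeline already returns them sorted.
ranked = list(zip(result["labels"], result["scores"]))
for label, score in ranked:
    print(f"{label:<12}{score:.3f}")

# Keep only labels above a hypothetical threshold (assumption for illustration).
THRESHOLD = 0.15
confident = [label for label, score in ranked if score >= THRESHOLD]
total = sum(result["scores"])
```

Here `confident` keeps `business`, `startup`, and `Leadership`, while the three unrelated labels fall well below the cut-off.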

Translation

In [41]:
# English to German
translator_ger = pipeline("translation_en_to_de")

No model was supplied, defaulted to t5-base (https://huggingface.co/t5-base)


In [42]:
print("German: ", translator_ger("Joe Biden became the 46th president of U.S.A.", max_length=40)[0]['translation_text'])

German:  Joe Biden wurde der 46. Präsident der USA.


In [43]:
# English to French
translator_fr = pipeline('translation_en_to_fr')

No model was supplied, defaulted to t5-base (https://huggingface.co/t5-base)


In [44]:
print("French: ", translator_fr("Joe Biden became the 46th president of U.S.A", max_length=40)[0]['translation_text'])

French:  Joe Biden est devenu le 46e président des États-Unis


Question-Answering

In [45]:
nlp = pipeline("question-answering")
context = r"""
Microsoft was founded by Bill Gates and Paul Allen in 1975.
The property of being prime (or not) is called primality.
A simple but slow method of verifying the primality of a given number n is known as trial division.
It consists of testing whether n is a multiple of any integer between 2 and itself.
Algorithms much more efficient than trial division have been devised to test the primality of large numbers.
These include the Miller-Rabin primality test, which is fast but has a small probability of error, and the AKS primality test, which always produces the correct answer in polynomial time but is too slow to be practical.
Particularly fast methods are available for numbers of special forms, such as Mersenne numbers.
As of January 2016, the largest known prime number has 22,338,618 decimal digits.
"""

No model was supplied, defaulted to distilbert-base-cased-distilled-squad (https://huggingface.co/distilbert-base-cased-distilled-squad)


In [46]:
result = nlp(question="What is a simple method to verify primality?", context=context)
print(f"Answer 1: '{result['answer']}'")

Answer 1: 'trial division'


In [47]:
result = nlp(question="When did Bill gates founded Microsoft?", context=context)
print(f"Answer 2: '{result['answer']}'")

Answer 2: '1975'
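
The question-answering pipeline is extractive: the answer is a span copied out of the context. Besides `answer`, the result dictionary also carries `score`, `start`, and `end`, where `start` and `end` are character offsets into the context. Those values were not printed in this notebook, so the sketch below recomputes the offsets with `str.find` on a shortened copy of the context. It is an illustration of how the offsets relate to the answer, not the model's actual output, and `highlight` is a hypothetical helper.

```python
# Shortened copy of the context used above.
context = (
    "Microsoft was founded by Bill Gates and Paul Allen in 1975. "
    "A simple but slow method of verifying the primality of a given number n "
    "is known as trial division."
)

# Answer string taken from the first question above.
answer = "trial division"

# Recompute character offsets (assumption: stands in for result['start'] / result['end']).
start = context.find(answer)
end = start + len(answer)


def highlight(text, start, end, width=30):
    """Return the answer span in brackets with some surrounding text (hypothetical helper)."""
    left = text[max(0, start - width):start]
    right = text[end:end + width]
    return f"...{left}[{text[start:end]}]{right}..."


snippet = highlight(context, start, end)
print(snippet)
```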
