# Hugging Face and Transformers
- Transformers are not action figures but a deep learning architecture
- Hugging Face is a company that publishes ready-made, pretrained deep learning models as open source
- To use them, we need to install deep learning libraries locally
- There are two important ones: PyTorch (Facebook) and TensorFlow (Google)
- More: https://huggingface.co/transformers/task_summary.html
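
Before running the install cells, it can help to see which of the two backends is already present. The following is a minimal sketch using only the standard library; the helper name `installed_backends` is made up for this illustration and is not part of any library.

```python
import importlib.util


def installed_backends(candidates=("torch", "tensorflow")):
    """Return the candidate package names that can be found on this machine."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


print("Available backends:", installed_backends())
```

If the list is empty, at least one of the two libraries has to be installed before `transformers` can load a model.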

In [3]:
#!pip install transformers

In [4]:
#!pip install torch torchvision

## Sentiment detection, now more than ever!

In [5]:
from transformers import pipeline

# Allocate a pipeline for sentiment-analysis
classifier = pipeline('sentiment-analysis')
classifier('We are very happy to include pipeline into the transformers repository.')

[{'label': 'POSITIVE', 'score': 0.9978193640708923}]

In [6]:
classifier('I am really sad that things had to be so terrible with this lockdown.')

[{'label': 'NEGATIVE', 'score': 0.9989656209945679}]

In [7]:
classifier('Sun is shining out of my ass.')

[{'label': 'POSITIVE', 'score': 0.9989978671073914}]
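
The `score` in these outputs is the probability the classifier assigns to the label it reports. For a two-class model this is typically a softmax over two raw output values (logits). The sketch below reproduces that step in plain Python; the logit values are invented for illustration and are not taken from the model.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["NEGATIVE", "POSITIVE"]
logits = [-3.1, 3.0]  # hypothetical values, chosen only for this example

probs = softmax(logits)
best = max(range(len(labels)), key=probs.__getitem__)
result = {"label": labels[best], "score": probs[best]}
print(result)
```

A score close to 1 only says that the model is confident, not that it is right: the sarcastic sentence in cell 7 is labelled POSITIVE with a score above 0.99.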

## Question answering - pretty cool!
So-called extractive question answering: the answer is a span copied from the given context.

In [9]:
from transformers import pipeline

# Allocate a pipeline for question-answering
question_answerer = pipeline('question-answering')
question_answerer({'question': 'What is the name of the repository ?',
                   'context': 'Pipeline have been included in the huggingface/transformers repository'
                  })

{'score': 0.513595461845398,
 'start': 35,
 'end': 59,
 'answer': 'huggingface/transformers'}

In [10]:
question_answerer({'question': 'Where has the tower been built?',
                   'context': 'The Eiffel tower has been built in the 18th century and stands in Paris. '
                  })

{'score': 0.9417340159416199, 'start': 66, 'end': 72, 'answer': 'Paris.'}

In [11]:
question_answerer({'question': 'How hot is the sun?',
                   'context': 'The Eiffel tower has been built in the 18th century and stands in Paris. '
                  })

{'score': 0.5862139463424683, 'start': 66, 'end': 72, 'answer': 'Paris.'}
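
"Extractive" means that `start` and `end` are character offsets into the context, so the answer can be recovered by slicing. The sketch below checks this against the two results shown above; the helper name `extract_span` is made up for this illustration.

```python
context = "The Eiffel tower has been built in the 18th century and stands in Paris. "

# Results copied from the two cells above.
results = {
    "Where has the tower been built?": {
        "score": 0.9417340159416199, "start": 66, "end": 72, "answer": "Paris."},
    "How hot is the sun?": {
        "score": 0.5862139463424683, "start": 66, "end": 72, "answer": "Paris."},
}


def extract_span(context, result):
    """Slice the answer out of the context using the reported offsets."""
    return context[result["start"]:result["end"]]


for question, result in results.items():
    span = extract_span(context, result)
    print(f"{question!r} -> {span!r} (score {result['score']:.2f})")
```

Because the model can only point at a span, it still returns 'Paris.' for a question the context cannot answer. The score is not a reliable filter either: the wrong answer here (0.59) scores higher than the correct repository name in cell 9 (0.51).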

## Filling in the blanks
So-called masked language modeling.

In [12]:
from transformers import pipeline
nlp = pipeline("fill-mask")

Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at distilroberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [13]:
from pprint import pprint
pprint(nlp(f"HuggingFace is creating a {nlp.tokenizer.mask_token} that the community uses to solve NLP tasks."))

[{'score': 0.17927402257919312,
  'sequence': '<s>HuggingFace is creating a tool that the community uses to '
              'solve NLP tasks.</s>',
  'token': 3944,
  'token_str': 'Ġtool'},
 {'score': 0.11349397897720337,
  'sequence': '<s>HuggingFace is creating a framework that the community uses '
              'to solve NLP tasks.</s>',
  'token': 7208,
  'token_str': 'Ġframework'},
 {'score': 0.052435580641031265,
  'sequence': '<s>HuggingFace is creating a library that the community uses to '
              'solve NLP tasks.</s>',
  'token': 5560,
  'token_str': 'Ġlibrary'},
 {'score': 0.034935325384140015,
  'sequence': '<s>HuggingFace is creating a database that the community uses '
              'to solve NLP tasks.</s>',
  'token': 8503,
  'token_str': 'Ġdatabase'},
 {'score': 0.028602493926882744,
  'sequence': '<s>HuggingFace is creating a prototype that the community uses '
              'to solve NLP tasks.</s>',
  'token': 17715,
  'token_str': 'Ġprototype'}]
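
The `Ġ` at the start of each `token_str` is not a typo. In the byte-level BPE vocabulary used by RoBERTa-style models it marks a token that begins with a space. A small sketch for printing the predictions in readable form, using the scores from the output above (the helper name `readable_token` is made up for this illustration):

```python
# Scores and tokens copied from the output above.
predictions = [
    {"score": 0.17927402257919312, "token_str": "Ġtool"},
    {"score": 0.11349397897720337, "token_str": "Ġframework"},
    {"score": 0.052435580641031265, "token_str": "Ġlibrary"},
    {"score": 0.034935325384140015, "token_str": "Ġdatabase"},
    {"score": 0.028602493926882744, "token_str": "Ġprototype"},
]


def readable_token(token_str):
    """Replace the space marker and trim the resulting whitespace."""
    return token_str.replace("Ġ", " ").strip()


for prediction in predictions:
    print(f"{readable_token(prediction['token_str']):<10} {prediction['score']:.3f}")
```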


In [14]:
pprint(nlp(f"The man went skiing with his {nlp.tokenizer.mask_token} not knowing that he would have an accident."))

[{'score': 0.15306979417800903,
  'sequence': '<s>The man went skiing with his girlfriend not knowing that he '
              'would have an accident.</s>',
  'token': 6096,
  'token_str': 'Ġgirlfriend'},
 {'score': 0.07086747139692307,
  'sequence': '<s>The man went skiing with his daughter not knowing that he '
              'would have an accident.</s>',
  'token': 1354,
  'token_str': 'Ġdaughter'},
 {'score': 0.06778453290462494,
  'sequence': '<s>The man went skiing with his wife not knowing that he would '
              'have an accident.</s>',
  'token': 1141,
  'token_str': 'Ġwife'},
 {'score': 0.06104987487196922,
  'sequence': '<s>The man went skiing with his friends not knowing that he '
              'would have an accident.</s>',
  'token': 964,
  'token_str': 'Ġfriends'},
 {'score': 0.0447111539542675,
  'sequence': '<s>The man went skiing with his son not knowing that he would '
              'have an accident.</s>',
  'token': 979,
  'token_str': 'Ġson'}]


In [16]:
pprint(nlp(f"Its always the {nlp.tokenizer.mask_token} that repair things."))

[{'score': 0.032353002578020096,
  'sequence': '<s>Its always the robots that repair things.</s>',
  'token': 12129,
  'token_str': 'Ġrobots'},
 {'score': 0.03077397681772709,
  'sequence': '<s>Its always the ones that repair things.</s>',
  'token': 1980,
  'token_str': 'Ġones'},
 {'score': 0.028871040791273117,
  'sequence': '<s>Its always the tools that repair things.</s>',
  'token': 3270,
  'token_str': 'Ġtools'},
 {'score': 0.02700759470462799,
  'sequence': '<s>Its always the screws that repair things.</s>',
  'token': 34242,
  'token_str': 'Ġscrews'},
 {'score': 0.025950074195861816,
  'sequence': '<s>Its always the humans that repair things.</s>',
  'token': 5868,
  'token_str': 'Ġhumans'}]
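
The scores for this last prompt are much flatter than for the first one. One rough way to compare them is to add up the probability of the five suggestions. This is an ad-hoc heuristic for this notebook, not something the library provides.

```python
# Scores copied from the outputs of cells 13 and 16.
first_prompt_scores = [0.17927402257919312, 0.11349397897720337,
                       0.052435580641031265, 0.034935325384140015,
                       0.028602493926882744]
third_prompt_scores = [0.032353002578020096, 0.03077397681772709,
                       0.028871040791273117, 0.02700759470462799,
                       0.025950074195861816]


def top_k_mass(scores):
    """Total probability covered by the listed suggestions."""
    return sum(scores)


print(f"HuggingFace prompt: {top_k_mass(first_prompt_scores):.2f}")
print(f"Repair prompt:      {top_k_mass(third_prompt_scores):.2f}")
```

The five suggestions cover about 41% of the probability for the first prompt but only about 14% for the last one, so the model has no clear favourite there.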


## Guessing the next word
So-called causal language modeling.

In [21]:
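from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from torch.nn import functional as F

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead
model = AutoModelForCausalLM.from_pretrained("gpt2", return_dict=True)
sequence = "I could not believe that the answer is"
input_ids = tokenizer.encode(sequence, return_tensors="pt")
# logits for the token that follows the last input position
with torch.no_grad():
    next_token_logits = model(input_ids).logits[:, -1, :]
# top-k filtering: keep the 50 most likely tokens and mask the rest with -inf
# (plain torch is used here because top_k_top_p_filtering is not available in recent transformers releases)
top_k_values, _ = torch.topk(next_token_logits, k=50)
filtered_next_token_logits = next_token_logits.masked_fill(
    next_token_logits < top_k_values[:, -1:], float("-inf")
)
# sample the next token from the remaining candidates
probs = F.softmax(filtered_next_token_logits, dim=-1)
next_token = torch.multinomial(probs, num_samples=1)
generated = torch.cat([input_ids, next_token], dim=-1)
resulting_string = tokenizer.decode(generated.tolist()[0])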

In [22]:
print(resulting_string)

I could not believe that the answer is so
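
The cell above filters the logits to the 50 most likely tokens, converts them to probabilities, and samples one token. The same three steps can be sketched without any deep learning library. The vocabulary and logits below are invented for illustration, and `top_k_filter` is a made-up helper name.

```python
import math
import random


def top_k_filter(logits, k):
    """Keep the k largest logits and set all others to -inf (ties may keep more)."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]


def softmax(logits):
    """Convert logits to probabilities; -inf entries end up with probability 0."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


vocab = ["so", "simple", "wrong", "yes", "banana"]  # hypothetical vocabulary
logits = [2.0, 1.5, 1.0, 0.5, -3.0]                 # hypothetical scores

filtered = top_k_filter(logits, k=3)
probs = softmax(filtered)

random.seed(0)  # fixed seed so repeated runs give the same draw
next_word = random.choices(vocab, weights=probs, k=1)[0]
print(next_word)
```

Only the three highest-scoring words can be drawn. Which one appears depends on the random draw, which is also why rerunning the GPT-2 cell can give a different continuation than "so".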


## Generating a whole text

In [25]:
from transformers import pipeline

text_generator = pipeline("text-generation")
print(text_generator("The corona lockdown was a full success.", max_length=50, do_sample=False))

Some weights of GPT2Model were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'The corona lockdown was a full success.\n\n"We\'re still working on it," said Dr. David L. L. Litt, a professor of medicine at the University of California, San Francisco. "We\'re still working on it'}]
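
The generated text repeats "We're still working on it". With `do_sample=False` the pipeline picks the most likely token at every step (greedy decoding, assuming the default of a single beam), and that strategy tends to fall into loops. The toy sketch below shows the effect with a hand-made next-word table. The table is hypothetical and, unlike GPT-2, only looks at the previous word.

```python
# Hypothetical next-word scores, invented for this illustration.
next_word_scores = {
    "we": {"are": 0.6, "will": 0.4},
    "are": {"still": 0.7, "done": 0.3},
    "still": {"working": 0.9, "here": 0.1},
    "working": {"we": 0.5, "hard": 0.45, ".": 0.05},
}


def greedy_decode(start, steps):
    """Always append the highest-scoring next word."""
    words = [start]
    for _ in range(steps):
        options = next_word_scores.get(words[-1])
        if not options:  # no known continuation: stop early
            break
        words.append(max(options, key=options.get))
    return words


print(" ".join(greedy_decode("we", 8)))
```

Because "we" narrowly beats "hard" after "working", the greedy path returns to its starting word and cycles. Sampling, as in the previous section, is one common way to break such loops.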


## Summarizing text automatically

In [29]:
from transformers import pipeline
summarizer = pipeline("summarization")
ARTICLE = """ Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, “and what is the use of a book,” thought Alice “without pictures or conversations?”

So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid), whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.

There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, “Oh dear! Oh dear! I shall be late!” (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-pocket, and looked at it, and then hurried on, Alice started to her feet, for it flashed across her mind that she had never before seen a rabbit with either a waistcoat-pocket, or a watch to take out of it, and burning with curiosity, she ran across the field after it, and fortunately was just in time to see it pop down a large rabbit-hole under the hedge.

In another moment down went Alice after it, never once considering how in the world she was to get out again.

The rabbit-hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well.

Either the well was very deep, or she fell very slowly, for she had plenty of time as she went down to look about her and to wonder what was going to happen next. First, she tried to look down and make out what she was coming to, but it was too dark to see anything; then she looked at the sides of the well, and noticed that they were filled with cupboards and book-shelves; here and there she saw maps and pictures hung upon pegs. She took down a jar from one of the shelves as she passed; it was labelled “ORANGE MARMALADE”, but to her great disappointment it was empty: she did not like to drop the jar for fear of killing somebody underneath, so managed to put it into one of the cupboards as she fell past it.

“Well!” thought Alice to herself, “after such a fall as this, I shall think nothing of tumbling down stairs! How brave they’ll all think me at home! Why, I wouldn’t say anything about it, even if I fell off the top of the house!” (Which was very likely true.)

Down, down, down. Would the fall never come to an end? “I wonder how many miles I’ve fallen by this time?” she said aloud. “I must be getting somewhere near the centre of the earth. Let me see: that would be four thousand miles down, I think—” (for, you see, Alice had learnt several things of this sort in her lessons in the schoolroom, and though this was not a very good opportunity for showing off her knowledge, as there was no one to listen to her, still it was good practice to say it over) “—yes, that’s about the right distance—but then I wonder what Latitude or Longitude I’ve got to?” (Alice had no idea what Latitude was, or Longitude either, but thought they were nice grand words to say.)

Presently she began again. “I wonder if I shall fall right through the earth! How funny it’ll seem to come out among the people that walk with their heads downward! The Antipathies, I think—” (she was rather glad there was no one listening, this time, as it didn’t sound at all the right word) “—but I shall have to ask them what the name of the country is, you know. Please, Ma’am, is this New Zealand or Australia?” (and she tried to curtsey as she spoke—fancy curtseying as you’re falling through the air! Do you think you could manage it?) “And what an ignorant little girl she’ll think me for asking! No, it’ll never do to ask: perhaps I shall see it written up somewhere.”

"""

In [30]:
print(summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False))

[{'summary_text': ' Alice was beginning to get tired of sitting by her sister on the bank, and of having nothing to do, when suddenly a White Rabbit with pink eyes ran close by her . She ran across the field after it, and fortunately was just in time to see it pop down a large rabbit-hole under the hedge . Alice had not a moment to think about stopping herself before she found herself falling down a very deep well .'}]


## Summary

![alt text](huggin.jpeg "How I felt!")

## And now even in German!

In [25]:
ger_summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

In [26]:
ARTICLE = """Wenn er mit seinem Gilet daherkam und seiner Ledertasche, ein bisschen die Beine schleifen ließ, den Kopf ein wenig nach vorne gebeugt und dabei immer dieses Lachen im Gesicht, da hatte er etwas von einem älter gewordenen Jungen, der viel jünger wirkte, als er war. Und jetzt diese Nachricht: David Graeber ist gestorben, in Venedig. Er wurde 59 Jahre alt.      

Es gibt ja nicht viele Linksradikale, die den Linksradikalismus nicht zur Clownerie verkommen lassen, sondern ernst meinen, und die zugleich zu globalen Superstars und Bestsellerautoren werden. "Anarchist" nannte er sich – oder wurde er genannt –, aber ob er wirklich einer war, das kann man diskutieren. Er war einfach der Meinung, dass Menschen ein solidarisches Miteinander pflegen und aufeinander achtgeben würden, wenn sie nicht in repressiven Strukturen eingepfercht wären, und er war überzeugt, "dass Macht korrumpiert". Anderen linken Strömungen oder gar Parteien fühlte er sich nicht richtig zugehörig, so war er vielleicht eher ein Anarchist mangels besserer Alternative.

Graeber besaß noch etwas von dem Habitus früherer Revolutionäre und Intellektueller, die wichtigtuerische Aufgeblasenheit mancher akademischer Linker war nicht seine Sache, er war da viel bescheidener. Vielleicht hat das auch mit seiner Herkunft zu tun. Graeber wuchs in einer linken, jüdischen Arbeiterklassenfamilie auf, sein Vater kämpfte im Spanischen Bürgerkrieg.

Weltberühmt und zu einer Figur der internationalen Linken machte den US-amerikanischen Anthropologen, der seit Jahren in London lehrte, die Finanzkrise vor zehn Jahren.

Sein Buch Schulden. Die ersten 5.000 Jahre war ein Ereignis, natürlich auch deswegen, weil zockende Banken, weil Kredite, Budgetdefizite von Staaten das Thema der Stunde waren. Graeber zerlegte ein paar Mythen und blickte auf scheinbare Selbstverständlichkeiten mit einem neuen scharfen Blick. Sozialverhältnisse, die von Geld bestimmt werden, produzierten Gewalt, Entmenschlichung, Sklaverei, schrieb er. Zahlungsverhältnisse etablierten hierarchische Verhältnisse von Macht und Ohnmacht. Wo alle verschuldet seien, rennen viele nur mehr um das Überleben.

Ursprungsmythen, wie sie die Ökonomie so liebt, wie das Märchen, dass frühere Gesellschaften einfach Gebrauchswerte tauschten, wischte Graeber vom Tisch: Solche Gesellschaften gab es nie. Immer schon nutzten Menschen Äquivalente zur Vereinfachung des Tausches. Geld aber, in Form von Banknoten und Münzen, war bis in die frühe Neuzeit sehr selten. Man brauchte nicht viel davon, wenn alle anschreiben ließen und höchstens zweimal im Jahr abrechneten.

"""

In [27]:
print(ger_summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False))

[{'summary_text': 'David Graeber ist gestorben, in Venedig. "Anarchist" nannte er sich – oder wurde er genannt –. Er war einfach der Meinung, dass Menschen ein solidarisches Miteinander pflegen und aufeinander achtgeben würden.'}]
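
`facebook/bart-large-cnn` was fine-tuned on English news articles, so it is worth checking what it actually does with German input. The sketch below tests whether each sentence of the summary is copied word for word from the article. It uses a shortened excerpt of the article, and the helper name `copied_fragments` is made up for this illustration.

```python
# Shortened excerpt: only the article sentences relevant to the summary.
article_excerpt = " ".join([
    "Und jetzt diese Nachricht: David Graeber ist gestorben, in Venedig.",
    "Er wurde 59 Jahre alt.",
    '"Anarchist" nannte er sich – oder wurde er genannt –, aber ob er '
    "wirklich einer war, das kann man diskutieren.",
    "Er war einfach der Meinung, dass Menschen ein solidarisches Miteinander "
    "pflegen und aufeinander achtgeben würden, wenn sie nicht in repressiven "
    "Strukturen eingepfercht wären",
])

# Summary copied from the output above.
summary = (
    'David Graeber ist gestorben, in Venedig. "Anarchist" nannte er sich – '
    "oder wurde er genannt –. Er war einfach der Meinung, dass Menschen ein "
    "solidarisches Miteinander pflegen und aufeinander achtgeben würden."
)


def copied_fragments(summary, source):
    """Map each summary sentence (without its final period) to whether it occurs in the source."""
    fragments = [part.strip().rstrip(".") for part in summary.split(". ")]
    return {fragment: fragment in source for fragment in fragments if fragment}


for fragment, copied in copied_fragments(summary, article_excerpt).items():
    print(copied, "-", fragment[:50])
```

All three sentences are truncated copies of article sentences. In this run the model selects and cuts text rather than rephrasing it in German.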
