<a href="https://colab.research.google.com/github/kokchun/Deep-learning-AI21/blob/main/Lectures/Lec8-Transformers.ipynb" target="_parent"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> &nbsp; for interacting with the code

---
# Lecture notes - Transformers
---

These are the lecture notes on **transformers**.

<p class = "alert alert-info" role="alert"><b>Note</b> that this lecture note gives a brief introduction to transformers. I encourage you to read further about transformers. </p>

Read more:

- [Attention is all you need - Vaswani et al. (2017)](https://arxiv.org/pdf/1706.03762.pdf)
- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - Devlin et al. (2019)](https://arxiv.org/pdf/1810.04805.pdf)
- [spaCy](https://spacy.io/usage/models)
- [Hugging Face](https://huggingface.co/)
- [swedish-gpt - birgermoell Hugging Face](https://huggingface.co/birgermoell/swedish-gpt?text=grattis+p%C3%A5+f%C3%B6delsedagen)
- [GPT-2 - OpenAI team Hugging Face](https://huggingface.co/gpt2)
- [GPT-2 paper - Radford et al. (2019)](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
- [GPT-2 - wikipedia](https://en.wikipedia.org/wiki/GPT-2)
- [Named Entity Recognition (NER) - wikipedia](https://en.wikipedia.org/wiki/Named-entity_recognition)
- [spaCy English](https://spacy.io/models/en)
---

## Named entity recognition (NER)

NER is an information extraction task in which we locate named entities in text and classify them into predefined categories such as persons, organisations and locations.
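To make the task concrete, here is a minimal sketch of what an NER system returns: labelled spans with character offsets. It uses a plain lookup table, which is not how spaCy or a transformer model finds entities (they learn it from annotated data). The names `GAZETTEER`, `find_entities` and `sample` are made up for this illustration.

```python
import re

# Hypothetical lookup table (a "gazetteer"); real NER models learn this from data.
GAZETTEER = {
    "European Union": "ORG",
    "United States": "GPE",
    "2018": "DATE",
}


def find_entities(text):
    """Return (start, end, surface_text, label) tuples sorted by position."""
    found = []
    for phrase, label in GAZETTEER.items():
        # \b stops the phrase from matching inside a longer word or number
        for match in re.finditer(rf"\b{re.escape(phrase)}\b", text):
            found.append((match.start(), match.end(), match.group(), label))
    return sorted(found)


sample = "The European Union began enforcing GDPR in 2018, unlike the United States."
for start, end, surface, label in find_entities(sample):
    print(f"{start:>3}-{end:<3} {label:<5} {surface}")
```

A lookup table only finds what it already knows, which is why trained models are used in practice: they can label entities they have never seen by looking at the context.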

In [4]:
import spacy
from spacy import displacy

# download the model first: python3 -m spacy download en_core_web_md

# note: en_core_web_md is not a transformer-based model
nlp_en_md = spacy.load("en_core_web_md")

# text from here 
# https://en.wikipedia.org/wiki/Explainable_artificial_intelligence
text_sample = """As regulators, official bodies, and general users come to depend on AI-based dynamic systems, clearer accountability will be required for automated decision-making processes to ensure trust and transparency. Evidence of this requirement gaining more momentum can be seen with the launch of the first global conference exclusively dedicated to this emerging discipline, the International Joint Conference on Artificial Intelligence: Workshop on Explainable Artificial Intelligence (XAI).[63]

The European Union introduced a right to explanation in the General Data Protection Right (GDPR) as an attempt to deal with the potential problems stemming from the rising importance of algorithms. The implementation of the regulation began in 2018. However, the right to explanation in GDPR covers only the local aspect of interpretability. In the United States, insurance companies are required to be able to explain their rate and coverage decisions.[64]
"""

doc = nlp_en_md(text_sample)
print(type(doc))

displacy.render(doc, style="ent")

<class 'spacy.tokens.doc.Doc'>


### NER with transformers

In [3]:
# download the transformers model 

# python3 -m spacy download en_core_web_trf
nlp_en_trf = spacy.load("en_core_web_trf")
doc = nlp_en_trf(text_sample)
displacy.render(doc, style="ent")

# note that it is much more accurate than the medium model 

In [7]:
# extract the entities 
entities = {entity.text: entity.label_ for entity in doc.ents}
entities

{'AI': 'ORG',
 'first': 'ORDINAL',
 'the International Joint Conference on Artificial Intelligence': 'ORG',
 'The European Union': 'ORG',
 'the General Data Protection Right': 'ORG',
 'GDPR': 'ORG',
 '2018': 'DATE',
 'the United States': 'GPE',
 'decisions.[64': 'ORG'}
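A common next step is to group the entities by label, for example to list all organisations. The sketch below copies the dictionary printed above so it runs without spaCy; with a real `doc` you would loop over `doc.ents` instead. The name `by_label` is just chosen for this example.

```python
from collections import defaultdict

# The dictionary printed above, copied as-is.
entities = {
    "AI": "ORG",
    "first": "ORDINAL",
    "the International Joint Conference on Artificial Intelligence": "ORG",
    "The European Union": "ORG",
    "the General Data Protection Right": "ORG",
    "GDPR": "ORG",
    "2018": "DATE",
    "the United States": "GPE",
    "decisions.[64": "ORG",
}

# Collect the entity texts under their label.
by_label = defaultdict(list)
for text, label in entities.items():
    by_label[label].append(text)

for label, texts in by_label.items():
    print(f"{label} ({len(texts)}): {texts}")
```

Grouping like this also makes questionable predictions easier to spot, such as `decisions.[64` ending up among the organisations.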

### Swedish

In [14]:
# python3 -m spacy download sv_core_news_sm
nlp_swe = spacy.load("sv_core_news_sm")

# text from here
# https://www.svt.se/nyheter/utrikes/klimatkrisen-gar-att-losa-har-ar-sex-tekniker-som-visar-pa-vagen-framat
text_sample_swe = """
Grannlandet Norge har kommit långt med att elektrifiera sin bilflotta. Om ett år kommer nybilsförsäljningen i Norge vara uppe i 100 procent bilar med sladd. Min kollega , techkorrespondenten Alexander Norén berättar att det som förbluffade honom när han åkte till Norge för att få förklaringen till elbilsboomen där var hur starka de ekonomiska incitamenten är, att det för många är en plånboksfråga att dumpa fossilbilen. 
"""

doc_swe = nlp_swe(text_sample_swe)
displacy.render(doc_swe, style="ent")


---
## Hugging Face

Hugging Face hosts a large collection of pretrained language models. At the time of writing, many of them are based on some variant of the transformer architecture. Choose a model that fits your specific task.

### Sentiment analysis

Sentiment analysis is a classification task: each text is assigned a sentiment label, e.g. positive, neutral or negative.

- [bert-base-swedish-cased-sentiment - marma Hugging Face](https://huggingface.co/models?pipeline_tag=text-classification&sort=downloads&search=sentiment+swe)

In [15]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "marma/bert-base-swedish-cased-sentiment"

# load the tokenizer and the classifier once and hand both to the pipeline
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

sentiment = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

In [17]:
sentiment("cool pryl där")

[{'label': 'POSITIVE', 'score': 0.9981237053871155}]
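As far as I understand the text-classification pipeline, the `score` is the probability of the winning class after a softmax over the model's raw outputs (logits). The sketch below shows that last step in plain Python. The logit values and the label order in `id2label` are made-up assumptions, not values read from the model above.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values: invented logits and an assumed label order.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
logits = [-3.1, 3.2]

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
result = {"label": id2label[best], "score": probs[best]}
print(result)
```

Note that a two-class softmax always picks one of the two labels, so a model trained only on positive and negative examples has no way to answer "neutral".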

In [34]:
sentences = [
    "Jag älskar dig så mycket",
    "Skit, vad jag gillar dig",
    "Skitbra eller skitdåligt?",
    "Boken är OK",
    "AI är väl okej coolt, I guess",
    "Den här boken är sådär",
    "svår"
]

for sentence in sentences:
    result = sentiment(sentence)[0]  # run the model once per sentence
    label, score = result["label"], result["score"]
    print(f"{sentence}: {label}, {score:.3f}")


Jag älskar dig så mycket: POSITIVE, 0.999
Skit, vad jag gillar dig: POSITIVE, 0.999
Skitbra eller skitdåligt?: NEGATIVE, 0.997
Boken är OK: POSITIVE, 0.993
AI är väl okej coolt, I guess: NEGATIVE, 0.972
Den här boken är sådär: NEGATIVE, 0.995
svår: NEGATIVE, 0.995


### Generative Pre-trained Transformer 2 (GPT-2)

GPT-2 was created by OpenAI and can be used for text generation, translation, question answering, summarization and more.
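Text generation with a model like GPT-2 is autoregressive: the model predicts a distribution over the next token, one token is picked, it is appended to the text, and the step repeats until an end token or a length limit is reached. The toy sketch below imitates that loop with a hand-written next-word table instead of a neural network. `NEXT_WORD`, `generate` and the probabilities are invented for illustration, and it works on whole words where GPT-2 works on sub-word tokens.

```python
import random

# Hypothetical next-word table standing in for a trained language model.
NEXT_WORD = {
    "welcome": {"to": 1.0},
    "to": {"school": 0.6, "class": 0.4},
    "school": {"today": 0.7, "<eos>": 0.3},
    "class": {"today": 1.0},
    "today": {"<eos>": 1.0},
}


def generate(prompt, max_length=10, seed=0):
    """Extend the prompt word by word until <eos> or max_length words in total."""
    rng = random.Random(seed)  # a fixed seed makes the sampling repeatable
    words = prompt.lower().split()
    if not words:
        return ""
    while len(words) < max_length:
        candidates = NEXT_WORD.get(words[-1])
        if not candidates:
            break  # the toy table has no continuation for this word
        next_word = rng.choices(list(candidates), weights=list(candidates.values()))[0]
        if next_word == "<eos>":
            break
        words.append(next_word)
    return " ".join(words)


print(generate("Welcome to"))
print(generate("Welcome to", max_length=3))
```

Because the next word is sampled, different seeds can give different continuations, which is also why the GPT-2 outputs in these notes will differ from run to run. In this sketch `max_length` counts the prompt words as well, which, as I understand it, mirrors how `max_length` counts prompt tokens in the pipeline.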

In [35]:
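from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_name = "gpt2"

# load the tokenizer and the language model once and hand both to the pipeline
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

gpt2 = pipeline("text-generation", model=model, tokenizer=tokenizer)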

gpt2("Welcome to IT-högskolan, we are a school specialised in IT", max_length = 100)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Welcome to IT-högskolan, we are a school specialised in IT management and data processing related to the IT industry. Our goal is to keep you up to date in the field of data management, IT and IT networking with a focus on IT performance, data consumption and collaboration.\n\nFor more information visit the portal at:\n\nwww.heik.com>\n\nFollow us on Facebook: https://www.facebook.com/heik-hockel'}]

In [36]:
gpt2(
    "Welcome to IT-högskolan, we are a school specialised in IT. We are a school with 500 students.",
    max_length=100,
)[0]["generated_text"]


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


"Welcome to IT-högskolan, we are a school specialised in IT. We are a school with 500 students. In addition to this we also have our own IT community of its own, and we have very high expectations for every IT worker! We can tell you the truth about IT that comes from a teacher who comes from a world that doesn't fit neatly into that circle of expectations of success and integrity. How could he do it, given all these limitations, if he did"

In [37]:
meme = gpt2("Backend, frontend, weekend :)")
print(meme[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Backend, frontend, weekend :)


How You can access the API

The frontend is based on http://localhost:2580/

The backend is based on http://localhost:2500/

You can access the


In [38]:
bella_text = gpt2("Bella is a cute rabbit", max_length = 100)
print(bella_text[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Bella is a cute rabbit named Bella who loves to sleep with her friend, and sometimes when Bella falls asleep Bella's ears aren't as large but they are still pretty darn big (they grow like they're eating peas every night). For instance if Bella is in a different room from her bunny friend instead of upstairs or back by herself, or Bella is in the house with her boyfriend it's impossible to tell if her ears are growing bigger. Even Bella's ears look larger but those ears are still


In [43]:
print(gpt2("Congratulations on your graduation")[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Congratulations on your graduation. Have fun at work, play golf with friends/family, check out any of your hobbies, or hang out with your best friends/fiance, or just look at you over the shoulder every day from noon until 2:


### Generating text in Swedish

In [45]:
model_name = "birgermoell/swedish-gpt"

# tokenizer and model must come from the same checkpoint as the pipeline
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

generator_swe = pipeline("text-generation", model=model, tokenizer=tokenizer)

print(generator_swe("grattis på födelsedagen")[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


grattis på födelsedagen och hoppas så klart att du tar med dig en eller två gubbar hit till Falun! Jag hoppas dock att de har det så lite trångt... Men, en dag får vi väl se om det sker nåt


In [48]:
print(generator_swe("Grattis tille examenen!", max_length = 100)[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Grattis tille examenen! :D Håller också hårt på min bok (läs: jag kan inte läsa ett piss för resten av mitt liv!!!!) och kommer att göra massor av andra saker resten av mitt liv men det får bli någon annan gång då jag har mycket med jobb och annat!! Nu ska jag och min underbara vän ta oss en tur till skogen iaf! :-DSvar: Ja, det var en helt annan värld ;) Men det
