# Exercise 1 

In this exercise, we will familiarize ourselves with text-generation models via `transformers`.

### Exercise 1(a) (2 points)

Load `pipeline` from `transformers`. Then, load the `gpt2` model.

In [11]:
from transformers import pipeline

# Load the text-generator pipeline
generator = pipeline("text-generation", model="gpt2")

All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.
Device set to use 0


### Exercise 1(b) (3 points)

Using the pre-trained `gpt2` model, write two sentences and have `gpt2` generate the text that follows them.

In [12]:
text = "this is my first time using the gpt2 model. I'm really excited to see what it can do."

generator(text, max_length=50, num_return_sequences=3)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


[{'generated_text': "this is my first time using the gpt2 model. I'm really excited to see what it can do.\n\n\nA big thanks to everyone who sent feedback and suggestions. I really did not expect the gpt2 models so soon. I"},
 {'generated_text': "this is my first time using the gpt2 model. I'm really excited to see what it can do.\n\nMore interesting is this:\n\nWhat's happening with a 2.4GHz CPU? It's not all going according to"},
 {'generated_text': "this is my first time using the gpt2 model. I'm really excited to see what it can do.\n\nAfter working with the gpt2 for 15 months I finally decided to put the sensor on my iPhone which is a 4."}]
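
Each `generated_text` above repeats the prompt before the new text. As a minimal sketch, the continuation can be separated from the prompt with plain string handling. The helper name `continuation` is hypothetical, and the strings are copied from the output above. The text-generation pipeline also appears to accept `return_full_text=False` for the same purpose, but that call is not run here.

```python
# The prompt and the three generated texts, copied from the cell output above.
prompt = "this is my first time using the gpt2 model. I'm really excited to see what it can do."

outputs = [
    {"generated_text": prompt + "\n\n\nA big thanks to everyone who sent feedback and suggestions. I really did not expect the gpt2 models so soon. I"},
    {"generated_text": prompt + "\n\nMore interesting is this:\n\nWhat's happening with a 2.4GHz CPU? It's not all going according to"},
    {"generated_text": prompt + "\n\nAfter working with the gpt2 for 15 months I finally decided to put the sensor on my iPhone which is a 4."},
]


def continuation(prompt, generated_text):
    """Return only the text the model added after the prompt (hypothetical helper)."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if the prompt is not a prefix.
    return generated_text.strip()


continuations = [continuation(prompt, o["generated_text"]) for o in outputs]

for i, text in enumerate(continuations, start=1):
    print(f"Continuation {i}: {text!r}")
```

Because sampling is random, rerunning the generation cell will usually produce different continuations than the three shown here.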

# Exercise 2 

In this exercise, we will familiarize ourselves with named-entity-recognition models via `transformers`.

### Exercise 2(a) (2 points)

Using the `pipeline`, load `dslim/bert-base-NER` with `task="ner"`.

In [13]:
ner = pipeline(task="ner", model="dslim/bert-base-NER")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
All PyTorch model weights were used when initializing TFBertForTokenClassification.

All the weights of TFBertForTokenClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForTokenClassification for predictions without further training.
Device set to use 0


### Exercise 2(b) (3 points)

Use the pre-trained model from part (a) to identify entities in a text.

In [14]:
text = "Oscar Aguilar is an awesome data professor at Grand View University, a small college in Iowa."

for entity in ner(text):
    print(f"Entity: {entity['word']}, Score: {entity['score']}, Type: {entity['entity']}")

Entity: Oscar, Score: 0.9996612071990967, Type: B-PER
Entity: A, Score: 0.9996281862258911, Type: I-PER
Entity: ##gu, Score: 0.999373733997345, Type: I-PER
Entity: ##ila, Score: 0.9955099821090698, Type: I-PER
Entity: ##r, Score: 0.9710258841514587, Type: I-PER
Entity: Grand, Score: 0.9955690503120422, Type: B-ORG
Entity: View, Score: 0.9942438006401062, Type: I-ORG
Entity: University, Score: 0.9948624968528748, Type: I-ORG
Entity: Iowa, Score: 0.9984671473503113, Type: B-LOC
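
The output above is per token: "Aguilar" is split into the word pieces `A`, `##gu`, `##ila`, `##r`, and the `B-`/`I-` prefixes mark the beginning and the inside of an entity. A minimal sketch of merging these pieces back into whole entities follows. The helper `merge_wordpieces` is hypothetical, the token list is copied from the output above with scores omitted, and the sketch assumes that consecutive `I-` tokens of the same type are adjacent in the text. The pipeline also appears to offer an `aggregation_strategy="simple"` argument that groups tokens for you, but that call is not run here.

```python
# Tokens and tags copied from the printed output above (scores omitted).
tokens = [
    {"word": "Oscar", "entity": "B-PER"},
    {"word": "A", "entity": "I-PER"},
    {"word": "##gu", "entity": "I-PER"},
    {"word": "##ila", "entity": "I-PER"},
    {"word": "##r", "entity": "I-PER"},
    {"word": "Grand", "entity": "B-ORG"},
    {"word": "View", "entity": "I-ORG"},
    {"word": "University", "entity": "I-ORG"},
    {"word": "Iowa", "entity": "B-LOC"},
]


def merge_wordpieces(tokens):
    """Group B-/I- tagged word pieces into whole entities (hypothetical helper)."""
    entities = []
    for tok in tokens:
        prefix, _, label = tok["entity"].partition("-")
        word = tok["word"]
        is_piece = word.startswith("##")
        piece = word[2:] if is_piece else word

        starts_new = (
            prefix == "B"
            or not entities
            or entities[-1]["type"] != label
        )
        if starts_new:
            entities.append({"text": piece, "type": label})
        elif is_piece:
            # A "##" piece continues the previous word, so no space is added.
            entities[-1]["text"] += piece
        else:
            # A new word inside the same entity is joined with a space.
            entities[-1]["text"] += " " + piece
    return entities


entities = merge_wordpieces(tokens)
for ent in entities:
    print(f"{ent['type']}: {ent['text']}")
```

This simple merge does not use character offsets, so two separate entities of the same type that are both tagged only with `I-` would be joined incorrectly.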


# Exercise 3 

In this exercise, we will familiarize ourselves with summarization models via `transformers`.

### Exercise 3(a) (2 points)

Using the `pipeline`, load `facebook/bart-large-cnn` with `task="summarization"`.

In [15]:
# loading the summarizer
summarizer = pipeline(task="summarization", model="facebook/bart-large-cnn")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
All PyTorch model weights were used when initializing TFBartForConditionalGeneration.

All the weights of TFBartForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBartForConditionalGeneration for predictions without further training.
Device set to use 0


### Exercise 3(b) (3 points)

Given the text below:

```
The deep learning field has been experiencing a seismic shift, thanks to the emergence and rapid evolution of Transformer models.
These groundbreaking architectures have not just redefined the standards in Natural Language Processing (NLP) but have broadened their horizons to revolutionize numerous facets of artificial intelligence.
Characterized by their unique attention mechanisms and parallel processing abilities, Transformer models stand as a testament to the innovative leaps in understanding and generating human language with an accuracy and efficiency previously unattainable.
First appeared in 2017 in the “Attention is all you need” article by Google, the transformer architecture is at the heart of groundbreaking models like ChatGPT, sparking a new wave of excitement in the AI community. They've been instrumental in OpenAI's cutting-edge language models and played a key role in DeepMind's AlphaStar.
In this transformative era of AI, the significance of Transformer models for aspiring data scientists and NLP practitioners cannot be overstated.
As one of the core fields for most of the latest technological leap forwards, this article aims to decipher the secrets behind these models.
```

Summarize the text above in at least 25 and at most 50 words.

In [16]:
text = """

The deep learning field has been experiencing a seismic shift, thanks to the emergence and rapid evolution of Transformer models.
These groundbreaking architectures have not just redefined the standards in Natural Language Processing (NLP) but have broadened their horizons to revolutionize numerous facets of artificial intelligence.
Characterized by their unique attention mechanisms and parallel processing abilities, Transformer models stand as a testament to the innovative leaps in understanding and generating human language with an accuracy and efficiency previously unattainable.
First appeared in 2017 in the “Attention is all you need” article by Google, the transformer architecture is at the heart of groundbreaking models like ChatGPT, sparking a new wave of excitement in the AI community. They've been instrumental in OpenAI's cutting-edge language models and played a key role in DeepMind's AlphaStar.
In this transformative era of AI, the significance of Transformer models for aspiring data scientists and NLP practitioners cannot be overstated.
As one of the core fields for most of the latest technological leap forwards, this article aims to decipher the secrets behind these models.

"""

summarizer(text, max_length=50, min_length=25, do_sample=False)

[{'summary_text': "The transformer architecture is at the heart of groundbreaking models like ChatGPT, sparking a new wave of excitement in the AI community. They've been instrumental in OpenAI's cutting-edge language models and played a key role in DeepMind"}]
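
The exercise states its limits in words, while `max_length` and `min_length` count tokens, so the two are not guaranteed to agree. As a minimal sketch, the word count of the summary above can be checked with plain Python. The summary string is copied from the output, splitting on whitespace is an assumed definition of a word, and the variable names are illustrative.

```python
# Summary text copied from the cell output above.
summary = (
    "The transformer architecture is at the heart of groundbreaking models "
    "like ChatGPT, sparking a new wave of excitement in the AI community. "
    "They've been instrumental in OpenAI's cutting-edge language models and "
    "played a key role in DeepMind"
)

# Count whitespace-separated words (an assumed definition of "word").
word_count = len(summary.split())

# Check the limits stated in the exercise: at least 25 and at most 50 words.
within_limits = 25 <= word_count <= 50

# A summary that does not end in sentence punctuation was probably cut off
# when generation reached the token limit.
ends_cleanly = summary.rstrip().endswith((".", "!", "?"))

print(f"Words: {word_count}")
print(f"Within 25-50 words: {within_limits}")
print(f"Ends with sentence punctuation: {ends_cleanly}")
```

The summary ends at "DeepMind" without closing the sentence, which suggests generation stopped at the 50-token cap rather than at a natural ending.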