# [Practice Notebook] AfterWork: Working with Open-Source LLMs

# Pre-requisites

In [None]:
# Install the transformers library for working with LLMs
!pip install transformers



In [None]:
# Install PyTorch; the Hugging Face models used in this notebook run on PyTorch
!pip install torch



In [None]:
# Importing the required library
from transformers import pipeline

# 1. Text translation


Translation with Hugging Face models lets developers convert text from one language to another using transformer-based models such as MarianMT, mBART, and T5. These models are pre-trained on large multilingual datasets, which allows them to produce high-quality translations across a wide range of languages.

Text translation model examples:
* **MarianMT** (`Helsinki-NLP/opus-mt-en-de`): One of a family of compact models, each trained for a single language pair; this checkpoint translates English to German.
* **mBART** (`facebook/mbart-large-50`): A multilingual sequence-to-sequence model covering 50 languages, typically fine-tuned for translation.
* **M2M100** (facebook/m2m100_418M): A multilingual model that supports direct translation between multiple languages without relying on English as an intermediary.
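
The MarianMT checkpoints used in this notebook (`opus-mt-en-de`, `opus-mt-en-fr`) follow a source-target naming pattern. The sketch below builds a model ID from two language codes. The helper name `opus_mt_model_id` is made up for illustration, and a checkpoint is not published for every language pair, so check the Hugging Face Hub before relying on a generated name.

```python
def opus_mt_model_id(source_lang: str, target_lang: str) -> str:
    """Build a Helsinki-NLP MarianMT model ID for one language pair.

    Hypothetical helper: it only formats a string and does not check
    that the checkpoint actually exists on the Hugging Face Hub.
    """
    source = source_lang.strip().lower()
    target = target_lang.strip().lower()
    return f"Helsinki-NLP/opus-mt-{source}-{target}"


# The two checkpoints named in this notebook
print(opus_mt_model_id("en", "fr"))
print(opus_mt_model_id("en", "de"))

# The result can then be passed to the pipeline, for example:
# translator = pipeline("translation", model=opus_mt_model_id("en", "fr"))
```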




### Example 1: English to French

In [None]:
# Load the translation model
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")



In [None]:
# Define the text to translate
text = "The Hugging Face platform provides state-of-the-art machine learning models and tools."

# Perform the translation
translated_text = translator(text, max_length=40)

# Print the translated text

print(translated_text[0]['translation_text'])

La plateforme Hugging Face propose des modèles et des outils d'apprentissage automatique de pointe.
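
The translator returns a list of dictionaries, which is why the cell above indexes `[0]['translation_text']`. Here is a minimal sketch for unpacking that structure when several texts are translated at once. `extract_translations` is a hypothetical helper, and `sample_output` is hand-written to mirror the output shown above rather than produced by a live model call.

```python
from typing import Dict, List


def extract_translations(outputs: List[Dict[str, str]]) -> List[str]:
    """Collect the translated strings from a translation pipeline result."""
    return [item["translation_text"] for item in outputs]


# Hand-written stand-in for the pipeline result shown above
sample_output = [
    {
        "translation_text": (
            "La plateforme Hugging Face propose des modèles et des outils "
            "d'apprentissage automatique de pointe."
        )
    }
]

for line in extract_translations(sample_output):
    print(line)
```

Passing a list of strings to `translator` typically yields one dictionary per input, so the same helper should work for batches.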


### <font color="green">Challenge</font>

You are developing a multilingual application that needs to provide content in various languages. Your task is to use the Hugging Face pipeline to translate a piece of text into a specific language relevant to your target audience.

* Write Python code that loads a Hugging Face translation model to translate text from English to your national language (e.g., Swahili, Igbo, etc.).
* Use the provided text and model to perform the translation.


In [None]:
# Load the translation model for English to your national language (here: Algerian Arabic)
# Your code goes here
translator = pipeline("translation", model="Tritkoman/English2AlgerianArabic")


In [None]:
# Define the text to translate
text = """The advancements in technology have a profound impact on our daily lives,
shaping how we communicate, work, and interact with the world around us."""

# Perform the translation
# Your code goes here

translated_text = translator(text, max_length=60)



# Print the translated text
# Your code goes here

print(translated_text[0]['translation_text'])



تقدم التقنيات يُحدث تأثيراً بالغًا على حياتنا اليومية، عبر تغيير كيفية التواصل، العمل، والتواصل مع العالم حولنا. 


# 2. Question answering


Question answering lets developers build systems that automatically find the answer to a specific question within a given context. Transformer models such as BERT, RoBERTa, and DistilBERT are fine-tuned for this task on datasets such as SQuAD; models fine-tuned on SQuAD 2.0 can also handle unanswerable questions.

Question answering model examples:
* **DistilBERT** (`distilbert-base-cased-distilled-squad`): A distilled version of BERT optimized for question-answering tasks, offering a good balance between performance and efficiency.
* **BERT** (`bert-base-cased`): The original BERT base checkpoint, known for its strong contextual understanding. It is pre-trained only, so it must be fine-tuned on a QA dataset such as SQuAD before it can answer questions.
* **RoBERTa** (`roberta-base`): An optimized variant of BERT trained with better strategies and more data. This base checkpoint also needs QA fine-tuning before use.
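
Extractive QA models like these score every token as a possible answer start and as a possible answer end, then pick the best-scoring span. The sketch below shows that selection step in plain Python. The scores are invented for illustration, `best_span` is a hypothetical helper, and the real pipeline adds further steps (tokenization, probability normalization, mapping tokens back to characters).

```python
from typing import List, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 15,
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring valid span.

    A span is valid when start <= end and it covers at most
    max_answer_len tokens. The span score is start score + end score.
    """
    best = (0, 0, float("-inf"))
    for i, start_score in enumerate(start_scores):
        last = min(i + max_answer_len, len(end_scores))
        for j in range(i, last):
            score = start_score + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Invented per-token scores; a real model would produce these
tokens = ["The", "Eiffel", "Tower", "is", "located", "in", "Paris"]
start_scores = [0.1, 0.3, 0.0, -1.0, -0.5, 0.2, 4.0]
end_scores = [-1.0, 0.0, 0.5, -1.0, -0.5, -0.2, 3.5]

i, j, score = best_span(start_scores, end_scores)
print(" ".join(tokens[i : j + 1]))
```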


### Example 1: Short context

In [None]:
# Load the question-answering model
qa_pipeline = pipeline("question-answering", model="twmkn9/bert-base-uncased-squad2")


Some weights of the model checkpoint at twmkn9/bert-base-uncased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



In [None]:
# Define the context and question
context = "The Eiffel Tower is located in Paris and is one of the most famous landmarks in the world."
question = "Which city is the Eiffel Tower located in?"

# Get the answer
result = qa_pipeline(question=question, context=context)
print(result["answer"])

Paris
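
Besides `answer`, the question-answering result also carries a confidence `score` and the `start` and `end` character offsets of the answer within the context. A small sketch of using those offsets follows. The `result` dictionary is built by hand here, with a placeholder score, so the example runs without loading a model; `highlight` is a hypothetical helper.

```python
context = (
    "The Eiffel Tower is located in Paris and is one of the most "
    "famous landmarks in the world."
)

# Hand-built stand-in for a pipeline result; the score is a placeholder
answer = "Paris"
start = context.find(answer)
result = {
    "score": 0.99,
    "start": start,
    "end": start + len(answer),
    "answer": answer,
}


def highlight(text: str, qa_result: dict, marker: str = "**") -> str:
    """Wrap the answer span in markers using the start/end offsets."""
    s, e = qa_result["start"], qa_result["end"]
    return text[:s] + marker + text[s:e] + marker + text[e:]


print(highlight(context, result))
```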



### Example 2: Longer context

In [None]:
# Define a longer context and a new question
context = """
The Great Wall of China is one of the most remarkable architectural feats in human history.
Stretching over 13,000 miles, the wall was built to protect Chinese states and empires against various nomadic groups from the north.
Construction began as early as the 7th century BC, but most of the existing wall was built during the Ming Dynasty (1368–1644).
The Great Wall is not a single continuous wall but rather a series of walls and fortifications.
It also functioned as a means of border control, allowing the imposition of duties on goods transported along the Silk Road, regulation of immigration, and encouragement of trade.
Though large parts of the wall have deteriorated over time, it remains one of the most popular tourist attractions in the world.
"""

question = "When did the Great Wall of China construction begin?"

# Get the answer directly
result = qa_pipeline(question=question, context=context)
answer = result["answer"]

print("Answer:", answer)

Answer: 7th century BC
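
When a context is too long for the model's input window, the question-answering pipeline splits it into overlapping chunks (controlled, as far as I know, by its `max_seq_len` and `doc_stride` arguments). The sketch below illustrates the overlapping-chunk idea only. It splits on whitespace rather than using the model's tokenizer, and `overlapping_chunks` is a hypothetical helper, not the library's implementation.

```python
from typing import List


def overlapping_chunks(
    tokens: List[str], chunk_size: int, overlap: int
) -> List[List[str]]:
    """Split tokens into chunks of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start : start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk reached the end of the text
        start += step
    return chunks


# Whitespace "tokens" as a simplification of real subword tokens
words = "Construction began as early as the 7th century BC".split()
chunks = overlapping_chunks(words, chunk_size=4, overlap=2)
for chunk in chunks:
    print(chunk)
```

The overlap matters because an answer that straddles a chunk boundary would otherwise be cut in half.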


### <font color="green">Challenge</font>

You are tasked with building a question-answering system using the Hugging Face pipeline for NLP tasks. Specifically, you will use the `twmkn9/bert-base-uncased-squad2` model, which is fine-tuned on SQuAD 2.0. Because SQuAD 2.0 includes unanswerable questions, the model is suited to both answerable and unanswerable questions, and it offers a reasonable balance between speed and accuracy for general-purpose question answering.

* Write Python code that loads the twmkn9/bert-base-uncased-squad2 model using the Hugging Face pipeline.
* Use the sample context and question below.
* Use the model to extract the answer from the context.


In [None]:
# Load the question-answering model
# Your code goes here
qa_pipeline = pipeline("question-answering", model="twmkn9/bert-base-uncased-squad2")


# Define the context and question
context = """
The Great Pyramid of Giza, built as a tomb for the Egyptian Pharaoh Khufu, is one of the Seven Wonders of the Ancient World.
Constructed over 4,500 years ago, it is the only Wonder that remains largely intact. Its exact construction methods remain a mystery to this day.
The pyramid consists of more than 2 million blocks of limestone and granite, some weighing as much as 80 tons.
It stood as the tallest man-made structure in the world for over 3,800 years.
"""

question = "How many blocks of limestone and granite were used in the construction of the Great Pyramid of Giza?"

In [None]:
# Get the answer from the model
# Your code goes here
result = qa_pipeline(question=question, context=context)
answer = result["answer"]



# Print the answer
# Your code goes here

print("Answer:", answer)

Answer: more than 2 million
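
Since this model was fine-tuned on SQuAD 2.0, it is worth deciding what to do when the context does not contain the answer. One simple approach, sketched below, is to treat an empty answer or a low `score` as "no answer". The threshold of 0.3 is an arbitrary assumption to tune on your own data, the sample results are hand-written, and `answer_or_none` is a hypothetical helper. The pipeline also accepts a `handle_impossible_answer=True` argument, which to my knowledge lets it return an empty answer.

```python
from typing import Any, Dict, Optional


def answer_or_none(
    result: Dict[str, Any], min_score: float = 0.3
) -> Optional[str]:
    """Return the answer text, or None if it is empty or low-confidence.

    min_score is an assumed cut-off, not a recommended value.
    """
    answer = str(result.get("answer", "")).strip()
    if not answer or result.get("score", 0.0) < min_score:
        return None
    return answer


# Hand-written results with placeholder scores and offsets
confident = {"score": 0.92, "start": 0, "end": 19, "answer": "more than 2 million"}
unsure = {"score": 0.04, "start": 0, "end": 5, "answer": "Khufu"}
empty = {"score": 0.88, "start": 0, "end": 0, "answer": ""}

for sample in (confident, unsure, empty):
    print(answer_or_none(sample))
```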


# 3. Text generation

Text generation models are a subset of natural language processing (NLP) models designed to create coherent and contextually relevant text based on a given input. These models, often powered by advanced architectures like the Transformer, are capable of producing human-like text across various applications, including content creation, conversational agents, and creative writing.

Text generation model examples:
* **GPT-Neo** (`EleutherAI/gpt-neo-2.7B`): Open-source alternative to GPT-3 with high-quality text generation.
* **GPT-J** (`EleutherAI/gpt-j-6B`): Efficient 6 billion parameter model for high-performance text generation.
* **DistilGPT-2** (`distilgpt2`): Smaller and faster version of GPT-2, balancing performance and efficiency.

In [None]:
# Load the text generation model
text_generator = pipeline("text-generation", model="gpt2")


In [None]:
# Define the prompt
prompt = "In a future where humans and robots coexist, the key to a harmonious society is"

# Generate text
generated_text = text_generator(
    prompt,
    max_length=250,          # Cap on total length in tokens (prompt plus generated text)
    num_return_sequences=1,  # Generate one sequence
    temperature=1,           # 1 leaves token probabilities unscaled; lower values give more predictable text
)

# Print the generated text
print(generated_text[0]['generated_text'])

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a future where humans and robots coexist, the key to a harmonious society is to avoid friction between each other and seek agreement. This need has been proven in several recent social sciences research studies, including those conducted in a population research project. The research indicates that humans and robots contribute in large part to each other for many hours on the job, as well as to economic and human survival. By the same token, our ability to work together contributes to and drives the emergence of more cooperative economies.

The goal of this paper is not to give a specific scientific conclusion but to provide a general point about the evolution of cooperative economies. There is also a possible basis for using science to guide future research, but this may be insufficient. We have already covered cooperative economies here because, unlike other economic systems, their main goal is altruism, which is why they appear to be highly motivated in the majority of experiment
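
To see what the `temperature` argument does, it helps to look at how it reshapes the next-token probabilities. The sketch below applies a temperature-scaled softmax to three made-up logits. It illustrates the general technique and is not the library's internal code.

```python
import math
from typing import List


def softmax_with_temperature(
    logits: List[float], temperature: float = 1.0
) -> List[float]:
    """Turn logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate next tokens
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Temperatures below 1 concentrate probability on the top token, giving more predictable text; temperatures above 1 flatten the distribution, giving more varied text.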

### <font color="green">Challenge</font>

Use the `distilgpt2` model from Hugging Face to generate creative text based on a given prompt. `distilgpt2` is a lighter model that offers a good trade-off between performance and efficiency, making it suitable for scenarios where computational resources are limited.

In [None]:
# Load the text generation model
text_generator = pipeline("text-generation", model="distilgpt2")


In [None]:
# Define the prompt
prompt = "As the sun set over the horizon, the city began to"

# Generate text with adjustments
generated_text = text_generator(
    prompt,
    max_length=150,          # Cap on total length in tokens (prompt plus generated text)
    num_return_sequences=1,  # Generate one sequence
    temperature=0.8,         # Below 1 sharpens the distribution for more focused text
    top_k=50                 # Sample only from the 50 most likely next tokens
)

# Print the generated text
print(generated_text[0]['generated_text'])

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As the sun set over the horizon, the city began to fall back on its feet.



The city has had a long week to prepare for the worst weather in the American history, but now there's a very good chance that it will not be without a couple of storms.
And it's hard to imagine the worst.
As the storm approached, the city was forced into a deep, wet air.
Some residents feared that the city would not be able to provide food or shelter.
On that day, a small group of residents who lived in the area were asked if they could go back to their homes.
They said they would not. They said they would not.
A group of residents were asked if
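
The `top_k` argument used above limits sampling to the most likely next tokens. Below is a minimal sketch of that idea in plain Python, using made-up logits and a seeded random generator. `top_k_filter` and `sample_top_k` are hypothetical helpers that illustrate the technique, not the library's implementation.

```python
import math
import random
from typing import List


def top_k_filter(logits: List[float], k: int) -> List[float]:
    """Keep the k largest logits and set the rest to -inf.

    Ties at the cut-off are all kept, so slightly more than k may survive.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    k = min(k, len(logits))
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]


def sample_top_k(logits: List[float], k: int, rng: random.Random) -> int:
    """Sample one token index from the top-k filtered distribution."""
    filtered = top_k_filter(logits, k)
    peak = max(filtered)
    weights = [math.exp(x - peak) for x in filtered]  # exp(-inf) is 0.0
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Made-up logits for four candidate tokens; only indices 0 and 1 survive k=2
logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)
samples = [sample_top_k(logits, k=2, rng=rng) for _ in range(200)]
print(sorted(set(samples)))
```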
