URL context: https://www.pna.gov.ph/articles/1234579


# Installation



This command installs the latest release of Hugging Face's `transformers` library, which provides implementations of many transformer-based models such as BERT, GPT, and RoBERTa. It is commonly used for tasks such as text classification, question answering, and named entity recognition.

In [41]:
!pip install transformers



This command installs version 4.31.0 of `transformers`. Pinning a specific version is sometimes necessary to keep the code compatible with the other libraries in the environment.

In [42]:
!pip install transformers==4.31.0
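
To confirm which version actually ended up in the environment after pinning, a small check with the standard library can help. This is a minimal sketch; `installed_version` is a hypothetical helper name, not part of any of the libraries installed here.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# 4.31.0 is the version pinned in the cell above.
found = installed_version("transformers")
print("transformers:", found, "(pinned: 4.31.0)")
```

If the printed version differs from the pinned one, a later install command has probably replaced it.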



This command installs the `datasets` library, also developed by Hugging Face, which simplifies downloading, processing, and sharing datasets. It is designed to work with very large datasets and is commonly used alongside `transformers` to prepare data for training and evaluating models.

In [43]:
!pip install datasets



This command installs the pinned `transformers` version together with `datasets` and the `accelerate` library. `accelerate` helps run model training across different hardware setups (CPU, a single GPU, or multiple GPUs), which can shorten training time for large models such as BERT.

In [44]:
!pip install transformers==4.31.0 datasets accelerate



In [45]:
!pip install evaluate==0.4.0

Collecting evaluate==0.4.0
  Downloading evaluate-0.4.0-py3-none-any.whl.metadata (9.4 kB)
Collecting responses<0.19 (from evaluate==0.4.0)
  Downloading responses-0.18.0-py3-none-any.whl.metadata (29 kB)
Downloading evaluate-0.4.0-py3-none-any.whl (81 kB)
Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Installing collected packages: responses, evaluate
Successfully installed evaluate-0.4.0 responses-0.18.0


# Store Data of Context Article & Questions

The `question` variable holds 10 questions about the difficulties and procedures involved in repatriating overseas Filipino workers (OFWs) from Lebanon. Each question targets one aspect of the situation, such as logistical challenges, government oversight, or the treatment of documented and undocumented workers.

The `context` variable holds the text of a news article on the Philippine government's repatriation of OFWs from Lebanon, where recent bombings in Beirut led to cancelled flights. It describes the challenges involved, such as securing landing rights for chartered flights, names the government agencies responsible for the repatriation, and states what assistance returning OFWs will receive.

In [65]:
question = [
    "What caused the difficulty in scheduling repatriation flights from Lebanon?",
    "Which department of the Philippine government is overseeing the repatriation efforts?",
    "How are undocumented Filipino workers being affected by the crisis in Lebanon?",
    "What measures are being considered to expedite the repatriation of OFWs?",
    "Who is responsible for arranging transportation for repatriated OFWs back to their hometowns?",
    "What financial assistance is provided to repatriated OFWs upon their return?",
    "What role does the Philippine embassy in Lebanon play in the repatriation process?",
    "How is the Lebanese airspace situation impacting the repatriation efforts?",
    "How many OFWs have already been repatriated from Lebanon as of the article's date?",
    "What other forms of support, besides transportation, are being offered to OFWs awaiting repatriation?"
]



context = """MANILA – The government is arranging chartered flights for the repatriation of more than 200 overseas Filipino workers in Beirut, Lebanon, the Department of Migrant Workers (DMW) said Wednesday.
“We are trying to provide for chartered flights. We’re talking to airline companies so that the chartered flights would be able to accommodate for example, no less than 300 overseas Filipino workers from Beirut,” DMW Undersecretary Bernard Olalia said in a Palace press briefing.
This was after the scheduled flights of around 15 OFWs on Sept. 25 were cancelled because of the recent bombings in Beirut.
Olalia said around 111 OFWs are staying in four temporary shelters in Beirut and waiting for their repatriation.
An additional 110 OFWs are applying for exit permits from the Lebanese government, Olalia said.
“Apart from the documented OFWs, we have undocumented OFWs who need to secure travel documents and once they’re given travel documents, we will help them in securing also exit visas or exit permits from the Immigration of the Lebanese government,” he said.
Olalia, however, said the Philippine government is facing several challenges, including securing landing rights for chartered flights.
He said land and sea routes are being considered, in case the situation escalates and makes it “impossible” to take the air route.
“The DMW is also studying the possibility of other routes. Apart from air route, we will be assessing the sea and the land route, should the case or the situation there worsen,” Olalia said.
He said the DMW, the Overseas Workers Welfare Administration (OWWA), and other concerned agencies will adopt a “whole-of-government assistance" upon the directive of President Ferdinand R. Marcos Jr.
He said each repatriated OFW will get PHP150,000 in financial assistance from the DMW and OWWA, as well as psychosocial services.
Israel has intensified its airstrikes across the northern border into Lebanon, targeting the Iran-backed militant group Hezbollah.
Iran fired ballistic missiles in Israel on Tuesday night, following the deadly attacks on Gaza and Lebanon and the recent killings of Hamas, Hezbollah, and Islamic Revolutionary Guard Corps leaders.
Olalia said no Filipinos were hurt since the attacks were launched.
“We have men on the ground. They work around the clock. At ‘yung mga staff po natin, dinagdagan na po natin (And we augmented our staff) both in Lebanon at (and) nearby posts to be able to provide safest route, to evacuate and ultimately to facilitate the repatriation of our OFWs both either in Lebanon or in Israel,” he said. (PNA)"""


### Text Preprocess

The preprocessing step uses the `NLTK` library to prepare the text for natural language processing. It starts by installing the library and downloading the resources needed for tokenization, stop-word removal, and lemmatization. The function `preprocess_text` takes the input text, lowercases it, removes punctuation, splits it into word tokens, removes stop words, and lemmatizes each remaining word to its base form. Finally, the same preprocessing is applied to `context` and to each question in the `question` list (the results are stored in `questions`), so that both are in a uniform format for the question-answering steps that follow.

In [66]:
!pip install nltk



In [67]:
import nltk
import re
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

In [68]:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [69]:
def preprocess_text(text):
    # Lowercase the text
    text = text.lower()
    # Remove punctuation (anything that is not a word character or whitespace)
    text = re.sub(r'[^\w\s]', '', text)
    # Tokenize the text
    tokens = nltk.word_tokenize(text)
    # Remove stop words (build the set once instead of once per token)
    stop_words = set(stopwords.words('english'))
    tokens = [token for token in tokens if token not in stop_words]
    # Lemmatize the tokens
    lemmatizer = WordNetLemmatizer()
    tokens = [lemmatizer.lemmatize(token) for token in tokens]
    # Join the tokens back into a string
    preprocessed_text = ' '.join(tokens)
    return preprocessed_text

In [70]:
context = preprocess_text(context)
questions = [preprocess_text(q) for q in question]
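
To see what the lowercasing and punctuation-removal steps do to strings from the article, the sketch below applies just those two steps with the standard `re` module. It needs no NLTK resources; `strip_case_and_punct` is a hypothetical helper that mirrors the first two lines of `preprocess_text`, and the sample strings are taken from the article text.

```python
import re

def strip_case_and_punct(text):
    """Lowercase, then drop every character that is not a word character or whitespace."""
    return re.sub(r'[^\w\s]', '', text.lower())

samples = ["PHP150,000", "whole-of-government", "Sept. 25", "they’re"]
for sample in samples:
    print(repr(sample), "->", repr(strip_case_and_punct(sample)))
```

Numbers and hyphenated terms are fused into single tokens (`php150000`, `wholeofgovernment`), which is worth keeping in mind when reading the answers extracted from the preprocessed context.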

# **Model of Deepset (Bert Base)**

> This code was provided; some changes have been made.



This section implements a question-answering system using a fine-tuned BERT model from the Hugging Face Transformers library.

The first cell imports `pipeline`, `BertForQuestionAnswering`, and `BertTokenizer` from the `transformers` package, along with `textwrap` for formatting text and `SequenceMatcher` for measuring string similarity.

In [None]:
# Import necessary libraries
from transformers import pipeline, BertForQuestionAnswering, BertTokenizer
import textwrap
from difflib import SequenceMatcher

The `calculate_similarity` function uses the `SequenceMatcher` class from Python's `difflib` module to compute a similarity ratio between two input strings. This ratio is used later to compare the model's answer with the part of the context surrounding it.

In [None]:
# Function to calculate similarity between two strings
def calculate_similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()
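
As a quick illustration of how the ratio behaves, the sketch below calls the same function on three toy string pairs. `ratio()` returns 2·M/T, where M is the number of matching characters and T is the total length of both strings; the toy strings are made up for the example.

```python
from difflib import SequenceMatcher

def calculate_similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()

print(calculate_similarity("abcd", "abcd"))  # identical strings -> 1.0
print(calculate_similarity("abcd", "bcde"))  # share the block "bcd" -> 0.75
print(calculate_similarity("abcd", "wxyz"))  # nothing in common -> 0.0
```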

This cell loads a pre-trained BERT model that has been fine-tuned for question answering (`deepset/bert-base-cased-squad2`), together with its matching tokenizer. The tokenizer converts the input text into the form the model expects.

In [None]:
# Load the fine-tuned model
model = BertForQuestionAnswering.from_pretrained('deepset/bert-base-cased-squad2')
tokenizer = BertTokenizer.from_pretrained('deepset/bert-base-cased-squad2')

A question-answering pipeline is built from the loaded model and tokenizer. The pipeline makes it easy to pass a question and a context to the model and get an answer back.


In [None]:
# Create a pipeline for question answering
qna_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)

The reference article that provides the context for the questions is formatted and printed.

In [None]:
# Display the context article
dedented_text = textwrap.dedent(context).strip()
print("Context Article:\n")
print(textwrap.fill(dedented_text, width=120))
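
For reference, this is roughly what the `dedent` and `fill` calls do on a small indented block. The sample text and the width of 30 are made up for the sketch.

```python
import textwrap

raw = """
    line one of an indented block
    line two of the same block
"""

# dedent removes the common leading indentation; strip drops the outer blank lines.
dedented = textwrap.dedent(raw).strip()

# fill re-wraps the text into a single paragraph with lines of at most `width` characters.
wrapped = textwrap.fill(dedented, width=30)
print(wrapped)
```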

This cell sets `correct_count` to zero and `total_count` to the number of questions, so that the share of answers judged correct can be computed at the end.

In [None]:
# Initialize counters for correctness
correct_count = 0
total_count = len(question)

This code loops through each question in the `questions` list:


*   It passes the question and the context to the question-answering pipeline to obtain an answer.
*   It reads the predicted answer and its start and end character indices within the context.
*   It extracts the part of the context surrounding the predicted answer (50 characters on each side) for the similarity check.

In [None]:
# Loop through the questions array
for inquiry in questions:
    # Get the model's answer
    answer = qna_pipeline({'question': inquiry, 'context': context})
    model_answer = answer['answer']

    # Try to extract the context around the predicted answer
    start_idx = answer['start']
    end_idx = answer['end']
    extracted_context = context[max(0, start_idx - 50):end_idx + 50]  # Nearby context; the start is clamped at 0
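
Python treats a negative slice start as counting from the end of the string, so a window that starts at `start_idx - 50` behaves differently when the answer begins within the first 50 characters of the context. The sketch below shows the effect on a made-up string; `text`, `start`, and `end` are illustrative stand-ins for `context`, `answer['start']`, and `answer['end']`.

```python
text = "A" * 10 + "ANSWER" + "B" * 84   # 100 characters, stand-in for the context
start, end = 10, 16                      # the span of "ANSWER" inside text

naive = text[start - 50:end + 50]            # start - 50 == -40, which counts from the end
clamped = text[max(0, start - 50):end + 50]  # the start is held at 0 instead

print(len(naive), "ANSWER" in naive)      # 6 False
print(len(clamped), "ANSWER" in clamped)  # 66 True
```

Clamping the start with `max(0, ...)` keeps the window anchored around the answer in that case.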


For every question, this cell prints the question, the answer produced by the model, the answer's start and end positions, and the prediction score, which is the model's confidence in that answer.

In [None]:
    print(f"Question: {inquiry}")
    print("Answer found: " + model_answer)
    print("At Index:", start_idx, "-", end_idx)
    print("With probability:", answer['score'], "\n")

A similarity score is computed between the model's answer and the surrounding context. If the score exceeds a predefined threshold (0.8 here), the answer is counted as correct and `correct_count` is increased by one. Otherwise, the answer is reported as incorrect or unclear.

In [None]:
    # Check if the model's answer is a reasonable match based on nearby context
    similarity_score = calculate_similarity(model_answer.lower(), extracted_context.lower())

    # If similarity score is high, assume the answer is correct
    if similarity_score > 0.8:  # You can adjust this threshold
        print("The model's answer is likely correct!")
        correct_count += 1
    else:
        print("The model's answer seems incorrect or unclear.")

    print("-" * 80)
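
Because the extracted window contains the answer itself plus up to 50 characters on each side, the similarity ratio depends largely on how long the answer is relative to that window. The sketch below illustrates this with a synthetic window; the filler character, the `window_ratio` helper, and the sample answer are assumptions made for the example.

```python
from difflib import SequenceMatcher

def window_ratio(answer, pad=50):
    """Similarity between an answer and a window holding it plus `pad` filler characters per side."""
    window = "#" * pad + answer + "#" * pad
    return SequenceMatcher(None, answer, window).ratio()

short_answer = "php150000"  # 9 characters
print(round(window_ratio(short_answer), 3))
```

In this synthetic setting the ratio works out to `len(answer) / (len(answer) + pad)`, which suggests that short answers will tend to score well below the threshold whether or not they are right, so the threshold may need tuning for short extractive answers.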

Once all the questions have been processed, the code computes and prints the overall accuracy: the percentage of questions whose answers were judged correct.

In [78]:
# Calculate overall accuracy
accuracy = correct_count / total_count * 100
print(f"Accuracy: {accuracy:.2f}%")

config.json:   0%|          | 0.00/508 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/433M [00:00<?, ?B/s]

Some weights of the model checkpoint at deepset/bert-base-cased-squad2 were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/152 [00:00<?, ?B/s]

Context Article:

manila government arranging chartered flight repatriation 200 overseas filipino worker beirut lebanon department migrant
worker dmw said wednesday trying provide chartered flight talking airline company chartered flight would able
accommodate example le 300 overseas filipino worker beirut dmw undersecretary bernard olalia said palace press briefing
scheduled flight around 15 ofws sept 25 cancelled recent bombing beirut olalia said around 111 ofws staying four
temporary shelter beirut waiting repatriation additional 110 ofws applying exit permit lebanese government olalia said
apart documented ofws undocumented ofws need secure travel document theyre given travel document help securing also exit
visa exit permit immigration lebanese government said olalia however said philippine government facing several challenge
including securing landing right chartered flight said land sea route considered case situation escalates make
impossible take air route dmw also studying po

This setup is useful for checking how correct and dependable a question-answering model's answers are for a specific context.

# Three Fine-Tuned Models

## **Model of Google Bert**

This section sets up a pipeline for fine-tuning a BERT model for question answering on the SQuAD dataset.

The cell below selects the first available GPU for training, which can greatly reduce training time. It then launches a question-answering training script, specifying the BERT model to use, the target dataset (SQuAD), training parameters such as the learning rate and number of epochs, and the output directory where the model is saved. The `accelerate launch` command adapts the run to the available hardware. In this run the launch failed because `run_qa.py` was not found at the given path, as the output shows.

In [None]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0" # Use the first GPU
!accelerate launch ./examples/question-answering/run_qa.py \
    --model_name_or_path bert-large-uncased-whole-word-masking \
    --dataset_name squad \
    --do_train \
    --do_eval \
    --learning_rate 3e-5 \
    --num_train_epochs 2 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir ./models/wwm_unc

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `0`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
/usr/bin/python3: can't open file '/content/./examples/question-answering/run_qa.py': [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1174, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 769, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command

This command clones the Hugging Face `transformers` repository, which contains the example scripts that accompany the library. The output shows that a `transformers` directory already existed, so nothing was cloned this time.

In [46]:
!git clone https://github.com/huggingface/transformers.git

fatal: destination path 'transformers' already exists and is not an empty directory.
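
Since the earlier launch stopped with "No such file or directory", it can help to locate the script inside the cloned repository before calling `accelerate launch`. The sketch below searches a directory tree for a file name; `find_script` is a hypothetical helper. In recent versions of the repository the PyTorch script sits under `examples/pytorch/question-answering/`, though the layout may differ between versions.

```python
from pathlib import Path

def find_script(root, script_name):
    """Return a sorted list of paths under `root` whose file name equals `script_name`."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(root_path.rglob(script_name))

# "transformers" is the directory created by the git clone above.
matches = find_script("transformers", "run_qa.py")
for path in matches:
    print(path)
if not matches:
    print("run_qa.py not found under ./transformers")
```

A path printed here can then be passed to `accelerate launch` in place of the one that failed.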


The libraries needed to build the question-answering system are imported, including the classes for loading models and tokenizers, for training, and for loading datasets.

In [None]:
# Import necessary libraries
from transformers import pipeline, AutoTokenizer, AutoModelForQuestionAnswering, TrainingArguments, Trainer
import textwrap
from difflib import SequenceMatcher
import datasets
from datasets import load_dataset


This function computes a similarity score between two strings, which is used to evaluate the model by comparing its answers with the surrounding context.

In [None]:
# Function to calculate similarity between two strings
def calculate_similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()

This cell loads a BERT model already fine-tuned on SQuAD, along with its tokenizer.

In [None]:
# Load the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-large-uncased-whole-word-masking-finetuned-squad")


The SQuAD dataset is downloaded, and its training and validation splits are assigned to `train_dataset` and `test_dataset` respectively.

In [None]:

# Load and prepare the SQuAD dataset
squad = datasets.load_dataset("rajpurkar/squad")
train_dataset = squad["train"]
test_dataset = squad["validation"]

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Some weights of the model checkpoint at google-bert/bert-large-uncased-whole-word-masking-finetuned-squad were not used when initializing BertForQuestionAnswering: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be e

The `preprocess_function` tokenizes each question together with its context, truncating only the context and padding every input to a fixed length of 384 tokens. It is applied to both the training and the validation datasets.

In [None]:
# Preprocess data
def preprocess_function(examples):
    return tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        padding="max_length",
        max_length=384,
    )

train_dataset = train_dataset.map(preprocess_function, batched=True)
test_dataset = test_dataset.map(preprocess_function, batched=True)
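
The tokenizer call above produces the model inputs, but to compute a training loss a question-answering head also needs `start_positions` and `end_positions` labels, given as token indices. A common approach is to map the answer's character span onto tokens using the offset mapping that fast tokenizers can return with `return_offsets_mapping=True`. The sketch below shows only the mapping logic on a single sequence; whitespace-separated words stand in for real tokenizer offsets, and `char_span_to_token_span` is a hypothetical helper.

```python
import re

def char_span_to_token_span(offsets, answer_start, answer_end):
    """Map a character span [answer_start, answer_end) to (first_token, last_token).

    `offsets` is a list of (start_char, end_char) pairs, one per token.
    Returns None if no token overlaps the span.
    """
    first = last = None
    for index, (tok_start, tok_end) in enumerate(offsets):
        if tok_start == tok_end:
            continue  # skip empty offsets such as (0, 0), often used for special tokens
        if tok_start < answer_end and tok_end > answer_start:
            if first is None:
                first = index
            last = index
    return None if first is None else (first, last)

text = "flights were cancelled because of the recent bombings in Beirut"
answer = "recent bombings"
answer_start = text.index(answer)
answer_end = answer_start + len(answer)

# Whitespace "tokens" stand in for a real tokenizer's offset mapping,
# with an empty (0, 0) entry at each end to mimic special tokens.
offsets = [(0, 0)] + [m.span() for m in re.finditer(r"\S+", text)] + [(0, 0)]

print(char_span_to_token_span(offsets, answer_start, answer_end))  # (7, 8)
```

With a real question-and-context pair, the offsets of the question tokens would also have to be excluded, for example by checking which sequence each token belongs to.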

This cell sets the training configuration: where to save outputs, the batch sizes, the number of training epochs, and how often to evaluate the model (once per epoch).

In [None]:

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    evaluation_strategy="epoch",
)


A `Trainer` is created from the model, the training arguments, and the datasets. This class simplifies training and evaluation.

In [None]:
# Create Trainer and fine-tune
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    tokenizer=tokenizer,  # Lets save_model() write the tokenizer files as well
)

This cell would start the training. The call is commented out because it crashed in this environment, so the model weights remain those of the checkpoint loaded earlier.

In [None]:
# Fine tune the model
# trainer.train() I removed this because it crashed

After this step, the model is saved to disk for later use.

In [None]:
trainer.save_model("./fine_tuned_model")

For inference, the saved model and tokenizer are loaded back from disk.

In [None]:
# Load fine-tuned model and tokenizer
fine_tuned_tokenizer = AutoTokenizer.from_pretrained("./fine_tuned_model")
fine_tuned_model = AutoModelForQuestionAnswering.from_pretrained("./fine_tuned_model")

Here, a question-answering pipeline is constructed to answer the questions against the provided context.

In [None]:
# Create question answering pipeline
qna_pipeline = pipeline('question-answering', model=fine_tuned_model, tokenizer=fine_tuned_tokenizer)


The context article is prepared and displayed in a formatted way for easier and clearer reference.

In [None]:
# Display the context article
dedented_text = textwrap.dedent(context).strip()
print("Context Article:\n")
print(textwrap.fill(dedented_text, width=120))

This cell runs every question in the `questions` list through the question-answering pipeline, prints each answer with its position and score, and compares the answer with the surrounding context to judge whether it is likely correct (here with a threshold of 0.5). Finally, the overall accuracy is computed and printed.

In [None]:
# Initialize counters for correctness
correct_count = 0
total_count = len(questions)

# Loop through the questions array
for i, inquiry in enumerate(questions):

    # Get the model's answer
    answer = qna_pipeline({'question': inquiry, 'context': context})
    model_answer = answer['answer']

    # Try to extract the context around the predicted answer
    start_idx = answer['start']
    end_idx = answer['end']
    extracted_context = context[max(0, start_idx - 50):end_idx + 50]  # Clamp the start at 0

    print(f"Question: {question[i]}")
    print("Answer found: " + model_answer)
    print("At Index:", start_idx, "-", end_idx)
    print("With probability:", answer['score'], "\n")

    # Check if the model's answer is a reasonable match based on nearby context
    similarity_score = calculate_similarity(model_answer.lower(), extracted_context.lower())

    # If similarity score is high, assume the answer is correct
    if similarity_score > 0.5:
        print("The model's answer is likely correct!")
        correct_count += 1
    else:
        print("The model's answer seems incorrect or unclear.")

    print("-" * 80)

# Calculate overall accuracy
accuracy = correct_count / total_count * 100
print(f"Accuracy: {accuracy:.2f}%")

  return torch.load(checkpoint_file, map_location="cpu")
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'


Context Article:

manila government arranging chartered flight repatriation 200 overseas filipino worker beirut lebanon department migrant
worker dmw said wednesday trying provide chartered flight talking airline company chartered flight would able
accommodate example le 300 overseas filipino worker beirut dmw undersecretary bernard olalia said palace press briefing
scheduled flight around 15 ofws sept 25 cancelled recent bombing beirut olalia said around 111 ofws staying four
temporary shelter beirut waiting repatriation additional 110 ofws applying exit permit lebanese government olalia said
apart documented ofws undocumented ofws need secure travel document theyre given travel document help securing also exit
visa exit permit immigration lebanese government said olalia however said philippine government facing several challenge
including securing landing right chartered flight said land sea route considered case situation escalates make
impossible take air route dmw also studying po

## **Model of Deepset (Roberta)**

Each step is the same as in the Google BERT section above; the main difference is the model name.

In [56]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0" # Use the first GPU
!accelerate launch ./examples/question-answering/run_qa.py \
    --model_name_or_path deepset/roberta-base-squad2 \
    --dataset_name squad_v2 \
    --do_train \
    --do_eval \
    --learning_rate 3e-5 \
    --num_train_epochs 2 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir ./models/wwm_unc

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `0`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
/usr/bin/python3: can't open file '/content/./examples/question-answering/run_qa.py': [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1174, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 769, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command

In [57]:
!git clone https://github.com/huggingface/transformers.git

fatal: destination path 'transformers' already exists and is not an empty directory.


In [58]:
# Import necessary libraries
from transformers import pipeline, AutoTokenizer, AutoModelForQuestionAnswering, TrainingArguments, Trainer
import textwrap
from difflib import SequenceMatcher
import datasets
from datasets import load_dataset

# Function to calculate similarity between two strings
def calculate_similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()

# Load the fine-tuned model
model = AutoModelForQuestionAnswering.from_pretrained('deepset/roberta-base-squad2')
tokenizer = AutoTokenizer.from_pretrained('deepset/roberta-base-squad2')

# Load and prepare the SQuAD dataset
squad = datasets.load_dataset("rajpurkar/squad")
train_dataset = squad["train"]
test_dataset = squad["validation"]



In [59]:
# Preprocess data
def preprocess_function(examples):
    # Tokenize the questions and contexts
    tokenized_examples = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        padding="max_length",
        max_length=500,
    )
    return tokenized_examples

train_dataset = train_dataset.map(preprocess_function, batched=True)
test_dataset = test_dataset.map(preprocess_function, batched=True)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    evaluation_strategy="epoch",
)

Map:   0%|          | 0/87599 [00:00<?, ? examples/s]

Exception: Truncation error: Sequence to truncate too short to respect the provided max_length
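
One documented cause of this truncation error is a question with a long run of leading whitespace: the Hugging Face `run_qa.py` example strips questions on the left before tokenizing for this reason. Whether that is the cause in this run is an assumption. A minimal sketch of the stripping step on a hand-made batch (`strip_questions` is a hypothetical helper, and the batch contents are made up):

```python
def strip_questions(examples):
    """Return a copy of the batch with leading whitespace removed from each question."""
    cleaned = dict(examples)
    cleaned["question"] = [q.lstrip() for q in examples["question"]]
    return cleaned

batch = {
    "question": ["      What caused the delay?", "Who is in charge?"],
    "context": ["Flights were cancelled.", "The DMW is in charge."],
}
print(strip_questions(batch)["question"])
```

Inside `preprocess_function`, the stripped list would be passed to the tokenizer in place of `examples["question"]`.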

In [None]:
# Create Trainer and fine-tune
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    tokenizer=tokenizer,  # Lets save_model() write the tokenizer files as well
)

In [None]:
# Fine tune the model
# trainer.train() I removed this because it crashed

trainer.save_model("./fine_tuned_model")

In [None]:
# Load fine-tuned model and tokenizer
fine_tuned_tokenizer = AutoTokenizer.from_pretrained("./fine_tuned_model")
fine_tuned_model = AutoModelForQuestionAnswering.from_pretrained("./fine_tuned_model")

# Create question answering pipeline
qna_pipeline = pipeline('question-answering', model=fine_tuned_model, tokenizer=fine_tuned_tokenizer)

# Display the context article
dedented_text = textwrap.dedent(context).strip()
print("Context Article:\n")
print(textwrap.fill(dedented_text, width=120))

# Initialize counters for correctness
correct_count = 0
total_count = len(questions)

# Loop through the questions array
for i, inquiry in enumerate(questions):

    # Get the model's answer
    answer = qna_pipeline({'question': inquiry, 'context': context})
    model_answer = answer['answer']

    # Try to extract the context around the predicted answer
    start_idx = answer['start']
    end_idx = answer['end']
    extracted_context = context[max(0, start_idx - 50):end_idx + 50]  # Clamp the start at 0

    print(f"Question: {question[i]}")
    print("Answer found: " + model_answer)
    print("At Index:", start_idx, "-", end_idx)
    print("With probability:", answer['score'], "\n")

    # Check if the model's answer is a reasonable match based on nearby context
    similarity_score = calculate_similarity(model_answer.lower(), extracted_context.lower())

    # If similarity score is high, assume the answer is correct
    if similarity_score > 0.5:
        print("The model's answer is likely correct!")
        correct_count += 1
    else:
        print("The model's answer seems incorrect or unclear.")

    print("-" * 80)

# Calculate overall accuracy
accuracy = correct_count / total_count * 100
print(f"Accuracy: {accuracy:.2f}%")

## **Model of Distilbert (Uncased)**

Each step is the same as in the Google BERT section above; the main difference is the model name.

In [60]:
# Import necessary libraries
from transformers import pipeline, AutoTokenizer, AutoModelForQuestionAnswering, TrainingArguments, Trainer
import textwrap
from difflib import SequenceMatcher
import datasets
from datasets import load_dataset

# Function to calculate similarity between two strings
def calculate_similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()

# Load the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased-distilled-squad")
model = AutoModelForQuestionAnswering.from_pretrained("distilbert/distilbert-base-uncased-distilled-squad")

# Load and prepare the SQuAD dataset
squad = datasets.load_dataset("rajpurkar/squad")
train_dataset = squad["train"]
test_dataset = squad["validation"]

In [61]:
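# Preprocess data: tokenize question/context pairs, truncating only the context.
# Note: this does not add the start_positions/end_positions labels that the
# Trainer needs in order to compute a loss for extractive QA.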
def preprocess_function(examples):
    return tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        padding="max_length",
        max_length=384,
    )

train_dataset = train_dataset.map(preprocess_function, batched=True)
test_dataset = test_dataset.map(preprocess_function, batched=True)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=16,  # Increased batch size
    per_device_eval_batch_size=16,  # Increased batch size
    num_train_epochs=3,           # Increased number of epochs
    evaluation_strategy="epoch",
)
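
`preprocess_function` tokenizes questions and contexts but does not produce `start_positions` and `end_positions`, which extractive QA models need to compute a training loss. One common approach is to request character offsets from a fast tokenizer (`return_offsets_mapping=True`) and map each SQuAD answer's character span to token indices. The helper below sketches only that mapping step in plain Python. `find_answer_token_span` and the toy offsets are hypothetical, and wiring it into `preprocess_function` is an assumption about how the notebook could be extended, not something the notebook does.

```python
def find_answer_token_span(offsets, sequence_ids, answer_start, answer_end):
    """Map a character span in the context to (start_token, end_token).

    offsets      -- list of (char_start, char_end) per token
    sequence_ids -- per token: 0 for question, 1 for context, None for special
    Returns (0, 0) when the answer is not fully inside the context tokens,
    for example because the context was truncated.
    """
    context_tokens = [i for i, sid in enumerate(sequence_ids) if sid == 1]
    if not context_tokens:
        return 0, 0
    first, last = context_tokens[0], context_tokens[-1]

    # Answer lies outside the tokens that survived truncation
    if offsets[first][0] > answer_start or offsets[last][1] < answer_end:
        return 0, 0

    # Walk right to the last token that starts at or before the answer start
    start_token = first
    while start_token <= last and offsets[start_token][0] <= answer_start:
        start_token += 1
    start_token -= 1

    # Walk left to the first token that ends at or after the answer end
    end_token = last
    while end_token >= first and offsets[end_token][1] >= answer_end:
        end_token -= 1
    end_token += 1

    return start_token, end_token


# Toy example: [CLS] who ? [SEP] anna lives here [SEP]
toy_sequence_ids = [None, 0, 0, None, 1, 1, 1, None]
toy_offsets = [(0, 0), (0, 3), (3, 4), (0, 0), (0, 4), (5, 10), (11, 15), (0, 0)]

# The answer "lives" covers characters 5..10 of the context "anna lives here"
print(find_answer_token_span(toy_offsets, toy_sequence_ids, 5, 10))
```

In SQuAD, the character start comes from `examples["answers"][i]["answer_start"][0]`, and the end is that value plus the length of the answer text.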

In [62]:
# Create Trainer and fine-tune
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
)

In [63]:
# Fine-tune the model
# trainer.train()  # Disabled because this call crashed

# With training disabled, this saves the unchanged pretrained checkpoint
trainer.save_model("./fine_tuned_model")

In [77]:
# Load the saved model and the matching tokenizer
fine_tuned_tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased-distilled-squad")
fine_tuned_model = AutoModelForQuestionAnswering.from_pretrained("./fine_tuned_model")

# Create question answering pipeline
qna_pipeline = pipeline('question-answering', model=fine_tuned_model, tokenizer=fine_tuned_tokenizer)

# Display the context article
dedented_text = textwrap.dedent(context).strip()
print("Context Article:\n")
print(textwrap.fill(dedented_text, width=120))

# Initialize counters for correctness
correct_count = 0
total_count = len(questions)

# Loop through the questions array
for i, inquiry in enumerate(questions):

    # Get the model's answer
    answer = qna_pipeline({'question': inquiry, 'context': context})
    model_answer = answer['answer']

    # Try to extract the context around the predicted answer
    start_idx = answer['start']
    end_idx = answer['end']
    extracted_context = context[max(0, start_idx - 50):end_idx + 50]  # Nearby context around the answer, clamped at 0

    print(f"Question: {inquiry}")
    print("Answer found: " + model_answer)
    print("At Index:", start_idx, "-", end_idx)
    print("With probability:", answer['score'], "\n")

    # Check if the model's answer is a reasonable match based on nearby context
    similarity_score = calculate_similarity(model_answer.lower(), extracted_context.lower())

    # If similarity score is high, assume the answer is correct
    if similarity_score > 0.5:  # You can adjust this threshold
        print("The model's answer is likely correct!")
        correct_count += 1
    else:
        print("The model's answer seems incorrect or unclear.")

    print("-" * 80)

# Calculate overall accuracy
accuracy = correct_count / total_count * 100
print(f"Accuracy: {accuracy:.2f}%")
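
Because `extracted_context` is sliced from around the answer itself, the answer is always a substring of it. With a full window of 50 extra characters on each side, `SequenceMatcher.ratio()` then works out to `2L / (2L + 100)` for an answer of length `L`, so the score reflects the answer's length rather than whether it is right. A small sketch with synthetic strings makes this visible; the `window_ratio` helper and the filler text are illustrative assumptions, not part of the notebook.

```python
from difflib import SequenceMatcher


def window_ratio(answer, padding=50, filler="-"):
    """Similarity between an answer and a window that contains it verbatim.

    The window mimics context[start-50:end+50] using synthetic filler text.
    """
    window = filler * padding + answer + filler * padding
    return SequenceMatcher(None, answer, window).ratio()


short_ratio = window_ratio("x" * 10)  # 10-character answer
long_ratio = window_ratio("x" * 60)   # 60-character answer

print(f"10-char answer: {short_ratio:.3f}")
print(f"60-char answer: {long_ratio:.3f}")
```

Under these assumptions, any answer shorter than about 50 characters lands below the 0.5 threshold, which is worth keeping in mind when reading the accuracy figure.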

Context Article:

manila government arranging chartered flight repatriation 200 overseas filipino worker beirut lebanon department migrant
worker dmw said wednesday trying provide chartered flight talking airline company chartered flight would able
accommodate example le 300 overseas filipino worker beirut dmw undersecretary bernard olalia said palace press briefing
scheduled flight around 15 ofws sept 25 cancelled recent bombing beirut olalia said around 111 ofws staying four
temporary shelter beirut waiting repatriation additional 110 ofws applying exit permit lebanese government olalia said
apart documented ofws undocumented ofws need secure travel document theyre given travel document help securing also exit
visa exit permit immigration lebanese government said olalia however said philippine government facing several challenge
including securing landing right chartered flight said land sea route considered case situation escalates make
impossible take air route dmw also studying po