<a href="https://colab.research.google.com/github/mmaguero/diploma_fpuna_nlp_ia/blob/master/2025/guarani_wiki_question_answering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [64]:
# Transformers installation
! pip install transformers datasets evaluate accelerate
# To install from source instead of the last release, comment the command above and uncomment the following one.
# ! pip install git+https://github.com/huggingface/transformers.git



# Question answering

In [65]:
#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/ajPx5LwJD-I?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')



Question answering tasks return an answer given a question. If you've ever asked a virtual assistant like Alexa, Siri or Google what the weather is, then you've used a question answering model before. There are two common types of question answering tasks:

- Extractive: extract the answer from the given context.
- Abstractive: generate an answer from the context that correctly answers the question.

This guide will show you how to:

1. Finetune [XLM-V](https://huggingface.co/facebook/xlm-v-base) on the Guarani (`gn`) subset of [alexandrainst/multi-wiki-qa](https://huggingface.co/datasets/alexandrainst/multi-wiki-qa), a SQuAD-style dataset, for extractive question answering.
2. Use your finetuned model for inference.

<Tip>

To see all architectures and checkpoints compatible with this task, we recommend checking the [task-page](https://huggingface.co/tasks/question-answering).

</Tip>

Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install transformers datasets evaluate
```

We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:

In [66]:
from huggingface_hub import notebook_login

notebook_login()

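## Load the dataset

Start by loading the dataset from the 🤗 Datasets library. The original guide uses SQuAD (kept in the commented-out cell below); this notebook uses the Guarani (`gn`) subset of `alexandrainst/multi-wiki-qa` instead, which has the same fields. With about 5,000 examples, it is small enough to experiment with and make sure everything works before spending more time on training.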

In [67]:
#from datasets import load_dataset

#squad = load_dataset("squad")

Load the Guarani (`gn`) configuration. It only comes with a `train` split, which is divided into train, validation, and test sets further down with the [train_test_split](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Dataset.train_test_split) method:

In [122]:
from datasets import load_dataset

# Load the Guarani ("gn") subset; the variable name `squad` is kept from the original guide
squad = load_dataset("alexandrainst/multi-wiki-qa", "gn")

Then take a look at an example:

In [69]:
import random
rnd = random.randrange(len(squad["train"]))  # randint() includes its upper bound and could index out of range
rnd, squad["train"][rnd]

(912,
 {'id': 'https://gn.wikipedia.org/wiki/Ypa%E2%80%99%C5%A9%20Umbu',
  'title': 'Ypa’ũ Umbu',
  'context': "Umbu peteĩ ypa’ũ michĩva oĩva tetãvore Ñe'ẽmbukúpe, opyta 12 km ko tetã’i tavusúgui. Ojekuaa Cuenca lechera Ñe'ẽmbuku ramo.\n\nTáva \n\nKo távape oiko 320 yvypóra, ha’ekuéra hory ha ojokuaapa, upévare ojoayhu hikuái.\nUmbúgui ou Pilar–pe 3000lts kamby. ko ypa’ũme oñeñangareko gueteri umi tava’i ymagua rehe. \n\nHeñói ary 1860–pe, karai Carlos Antonio López ohenda umi pyaenda ypy oñemopu’ã hagua Tupão. Ojapoma 150 ary  heñói hague ha ko’ágaite peve oñemomba’e ha ojeguerohory avakuéra mborayhu.\n\nTupão Umbú  megua ha’e mba’e ojeguerohory ha oñembotuichavéva ko távape, ary 1862-pe ojejapova’ekue ha ko’ága meve oguerekóiti estilo colonial siglo XIX pegua, ogyke iñanambusúva, ógahoja karanda’y, takuarilla ha ñay’ũ kaiguégui. Henondetépe oĩ peteĩ kurusu yvyrágui ojejapóva.\n\nIta marangatu mbyt

Add a unique ID: the existing `id` is the page URL, which can be duplicated across examples, so hash the URL together with the question and the first answer instead.

In [70]:
import hashlib

def generate_md5(input_string):
    """Generates an MD5 hash for the given input string."""
    # Encode the input string to bytes
    encoded_string = input_string.encode('utf-8')

    # Compute the MD5 hash
    md5_hash = hashlib.md5(encoded_string)

    # Return the hexadecimal representation of the hash
    return md5_hash.hexdigest()

In [71]:
def add_hashed_id(example):
    # Rename the existing 'id' key to 'url'
    example['url'] = example.pop('id')

    # Extract question
    question = example['question']

    # Safely extract answer text
    answers = example['answers']
    answer_text = ""
    if answers and answers['text']:
        answer_text = answers['text'][0]

    # Concatenate url, question, and answer_text
    combined_string = example['url'] + question + answer_text

    # Generate MD5 hash for the new 'id'
    example['id'] = generate_md5(combined_string)

    return example
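
As a quick sanity check of why the URL alone is not a usable key, the sketch below uses two made-up records (hypothetical data, not taken from the dataset) that share a URL. It shows that hashing URL + question + answer gives each record its own ID. The helper name `make_id` is illustrative and only mirrors the idea of `add_hashed_id`.

```python
import hashlib
from collections import Counter

# Hypothetical records: two questions about the same Wikipedia page.
records = [
    {"url": "https://gn.wikipedia.org/wiki/Example", "question": "Q1?", "answer": "A1"},
    {"url": "https://gn.wikipedia.org/wiki/Example", "question": "Q2?", "answer": "A2"},
]

def make_id(record):
    """Hash url + question + answer, mirroring the idea in add_hashed_id."""
    combined = record["url"] + record["question"] + record["answer"]
    return hashlib.md5(combined.encode("utf-8")).hexdigest()

url_counts = Counter(r["url"] for r in records)
hashed_ids = [make_id(r) for r in records]

print(max(url_counts.values()))  # 2: both records share the same URL
print(len(set(hashed_ids)))      # 2: the hashed IDs are distinct
```

The same `Counter` check can be run on the real `id` column to see how many URLs repeat before deciding whether the hashed ID is needed.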

In [72]:
squad = squad.map(add_hashed_id)
print("Dataset 'squad' updated with new hashed IDs and 'url' fields.")
print(squad)

Dataset 'squad' updated with new hashed IDs and 'url' fields.
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers', 'url'],
        num_rows: 5003
    })
})


Next, create validation and test sets from the single `train` split:

In [125]:
# 1. Split the original 'train' split into:
# 90% for new 'train' and 10% for 'temp_test'
squad_temp = squad["train"].train_test_split(test_size=0.1, seed=42)
print("Initial split (train and temp_test):")
print(squad_temp)

# 2. Carve the validation set out of the remaining 90%
# (6.25% of 90% is about 5.6% of the full dataset)
squad_final = squad_temp["train"].train_test_split(test_size=0.0625, seed=42)
squad_final["validation"] = squad_final.pop("test")
squad_final["test"] = squad_temp["test"]

print("Final dataset splits (train, validation, test):")
print(squad_final)

Initial split (train and temp_test):
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4502
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 501
    })
})
Final dataset splits (train, validation, test):
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4220
    })
    validation: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 282
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 501
    })
})
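
The split sizes printed above can be reproduced by hand. The sketch below is plain arithmetic, not a call into 🤗 Datasets, and it assumes the held-out share is rounded up, which is consistent with the printed row counts.

```python
import math

total = 5003  # rows in the original train split, as printed earlier

# Assumption: the held-out share is rounded up to a whole number of rows.
test = math.ceil(total * 0.1)
remaining = total - test
validation = math.ceil(remaining * 0.0625)
train = remaining - validation

print(train, validation, test)  # 4220 282 501
print(round(validation / total, 4))  # share of the full dataset used for validation
```

Because the second split is taken from the remaining 90%, `test_size=0.0625` corresponds to a little under 6% of the full dataset.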


In [126]:
import random
rnd = random.randrange(len(squad_final["train"]))  # randint() includes its upper bound and could index out of range
rnd, squad_final["train"][rnd]

(912,
 {'id': 'https://gn.wikipedia.org/wiki/Vyret%C3%A1%C3%B1a%20Mburuvi',
  'title': 'Vyretáña Mburuvi',
  'context': "Vyretáña Mburuvi (Ingleñe'ẽme: British Empire) ha'e akue opaite umi yvy pehẽngue, kolóña, tetã mo'ãmbyre ha tekoha ambuéva oĩ Tavetã Joaju poguýpe saro'y XVI guive XX meve. Ha'e akue pe mburuvi tuichavéva oiko Yvýpe ko ára peve.\nAmo saro'y XX ñepyrũ pukukue, oiko Vyretáña Mburuvípe amo 458 sua tapicha ha ijyvy apekue ohupyty amo 35.000.000\xa0km², upe niko he'ise irundýgui peteĩ oikove ko Mburuvípe ha 5-gui peteĩ opaite yvy apekuégui oĩ ipoguýpe. Upéicha kóva niko pe Mburuvi tuichavéva ojehecha Yvy apére.\n\nKo Mburuvi omosarambi oparupi umi mba'e pyahu ojejapóva Ingyatérrape, upéicha avei oñemyasãi hetã terãre pe ingleñe'ẽ, ku oñeñemuháicha ha ku ojejokuaiháicha Tavatã Joajúpe. Upe aja ko Mburuvi añónte heko pu'akaite upéicha Tavetã Joaju tuicha imba'ehetave. Ko'ãga, heta Vyretáña Mburuvi koloñakue oĩ um

There are several important fields here:

- `answers`: the character offset where the answer starts in the `context` (`answer_start`) and the answer text (`text`).
- `context`: background information from which the model needs to extract the answer.
- `question`: the question a model should answer.
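
To make the relationship between these fields concrete, here is a minimal sketch with a made-up English example in the same layout (the record is hypothetical, not taken from the dataset). The answer text is simply the slice of `context` that starts at `answer_start`.

```python
# Hypothetical record using the same field layout as the dataset.
example = {
    "context": "The capital of Paraguay is Asunción.",
    "question": "What is the capital of Paraguay?",
    "answers": {"text": ["Asunción"], "answer_start": [27]},
}

start = example["answers"]["answer_start"][0]
text = example["answers"]["text"][0]
end = start + len(text)

# Slicing the context with the character offsets recovers the answer text.
span = example["context"][start:end]
print(span)  # Asunción
```

The preprocessing functions later in this notebook rely on exactly this `start`/`end` character span when they look for the matching tokens.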

## Preprocess

In [75]:
#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/qgaM0weJHpA?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')

The next step is to load a tokenizer to process the `question` and `context` fields. This notebook uses the XLM-V tokenizer; the alternatives that were tried are left commented out:

In [128]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/xlm-v-base")
#"mmaguero/gn-bert-tiny-cased")
#"distilbert/distilbert-base-uncased")
#"mmaguero/multilingual-bert-gn-base-cased") #


There are a few preprocessing steps particular to question answering tasks you should be aware of:

1. Some examples in a dataset may have a very long `context` that exceeds the maximum input length of the model. To deal with longer sequences, truncate only the `context` by setting `truncation="only_second"`.
2. Next, map the start and end positions of the answer to the original `context` by setting
   `return_offsets_mapping=True`.
3. With the mapping in hand, now you can find the start and end tokens of the answer. Use the [sequence_ids](https://huggingface.co/docs/tokenizers/main/en/api/encoding#tokenizers.Encoding.sequence_ids) method to
   find which part of the offset corresponds to the `question` and which corresponds to the `context`.
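
The third step can be sketched without a tokenizer. The `sequence_ids` and `offsets` below are hand-written stand-ins for what a tokenizer would return (assumed values for illustration, not real tokenizer output); the point is only how a character span is turned into token positions.

```python
# Hand-written stand-ins for tokenizer output (assumed values).
# None marks special tokens, 0 the question, 1 the context.
sequence_ids = [None, 0, 0, None, 1, 1, 1, 1, None]
# Character (start, end) offsets; context offsets are relative to the context string.
offsets = [(0, 0), (0, 3), (4, 8), (0, 0), (0, 3), (4, 11), (12, 14), (15, 23), (0, 0)]

context = "The capital is Asunción"
answer = "Asunción"
start_char = context.index(answer)
end_char = start_char + len(answer)

# Only tokens that belong to the context are candidates.
context_tokens = [i for i, s in enumerate(sequence_ids) if s == 1]

# First context token covering start_char, last context token covering end_char.
start_token = next(i for i in context_tokens if offsets[i][0] <= start_char < offsets[i][1])
end_token = next(i for i in reversed(context_tokens) if offsets[i][0] < end_char <= offsets[i][1])

print(start_token, end_token)  # 7 7
```

With a real tokenizer the answer usually spans several subword tokens, so `start_token` and `end_token` differ.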

Here is how you can create a function to truncate and map the start and end tokens of the `answer` to the `context`:

In [129]:
def preprocess_function(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=384,
        truncation="only_second",
        return_offsets_mapping=True,
        padding="max_length",
    )

    offset_mapping = inputs.pop("offset_mapping")
    answers = examples["answers"]
    start_positions = []
    end_positions = []

    for i, offset in enumerate(offset_mapping):
        answer = answers[i]
        start_char = answer["answer_start"][0]
        end_char = answer["answer_start"][0] + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)

        # Find the start and end of the context
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        # If the answer is not fully inside the context, label it (0, 0)
        if offset[context_start][0] > start_char or offset[context_end][1] < end_char:
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Otherwise it's the start and end token positions
            idx = context_start
            while idx <= context_end and offset[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offset[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

To apply the preprocessing function over the entire dataset, use the 🤗 Datasets [map](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Dataset.map) function. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once. Remove any columns you don't need:

In [130]:
tokenized_squad = squad_final.map(preprocess_function, batched=True, remove_columns=squad_final["train"].column_names)
print("Tokenized dataset structure:")
print(tokenized_squad)

Tokenized dataset structure:
DatasetDict({
    train: Dataset({
        features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
        num_rows: 4220
    })
    validation: Dataset({
        features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
        num_rows: 282
    })
    test: Dataset({
        features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
        num_rows: 501
    })
})


Now create a batch of examples using [DefaultDataCollator](https://huggingface.co/docs/transformers/main/en/main_classes/data_collator#transformers.DefaultDataCollator). Unlike other data collators in 🤗 Transformers, the [DefaultDataCollator](https://huggingface.co/docs/transformers/main/en/main_classes/data_collator#transformers.DefaultDataCollator) does not apply any additional preprocessing such as padding. The cell is left commented out in this notebook: every feature is already padded to `max_length`, and the `Trainer` picks a collator on its own when none is passed.

In [131]:
#from transformers import DefaultDataCollator

#data_collator = DefaultDataCollator()

## Evaluate

Evaluation for question answering requires a significant amount of postprocessing. The original guide skips the evaluation step for that reason and relies on the evaluation loss that the [Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer) calculates during training.

If you're interested in how to evaluate your model for question answering, take a look at the [Question answering](https://huggingface.co/course/chapter7/7?fw=pt#post-processing) chapter from the 🤗 Hugging Face Course!

This notebook follows the post-processing from that [guide](https://huggingface.co/learn/llm-course/chapter7/7?fw=pt#post-processing) to score the predictions with the `squad` metric:

In [132]:
from tqdm.auto import tqdm
import evaluate
import numpy as np
import collections

metric = evaluate.load("squad")
n_best = 20
max_answer_length = 30

def compute_metrics(start_logits, end_logits, features, examples):
    example_to_features = collections.defaultdict(list)
    for idx, feature in enumerate(features):
        example_to_features[feature["example_id"]].append(idx)

    predicted_answers = []
    for example in tqdm(examples):
        example_id = example["id"]
        context = example["context"]
        answers = []

        # Loop through all features associated with that example
        for feature_index in example_to_features[example_id]:
            start_logit = start_logits[feature_index]
            end_logit = end_logits[feature_index]
            offsets = features[feature_index]["offset_mapping"]

            start_indexes = np.argsort(start_logit)[-1 : -n_best - 1 : -1].tolist()
            end_indexes = np.argsort(end_logit)[-1 : -n_best - 1 : -1].tolist()
            for start_index in start_indexes:
                for end_index in end_indexes:
                    # Skip answers that are not fully in the context
                    if offsets[start_index] is None or offsets[end_index] is None:
                        continue
                    # Skip answers with a length that is either < 0 or > max_answer_length
                    if (
                        end_index < start_index
                        or end_index - start_index + 1 > max_answer_length
                    ):
                        continue

                    answer = {
                        "text": context[offsets[start_index][0] : offsets[end_index][1]],
                        "logit_score": start_logit[start_index] + end_logit[end_index],
                    }
                    answers.append(answer)

        # Select the answer with the best score
        if len(answers) > 0:
            best_answer = max(answers, key=lambda x: x["logit_score"])
            predicted_answers.append(
                {"id": example_id, "prediction_text": best_answer["text"]}
            )
        else:
            predicted_answers.append({"id": example_id, "prediction_text": ""})

    theoretical_answers = [{"id": ex["id"], "answers": ex["answers"]} for ex in examples]
    return metric.compute(predictions=predicted_answers, references=theoretical_answers)
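
The core of `compute_metrics` is the search over the `n_best` start and end candidates. The sketch below isolates that step in plain Python with made-up logits and tiny limits (`n_best = 2`, `max_answer_length = 2`); the helper `top_indexes` is illustrative and stands in for the `np.argsort` slicing used above.

```python
# Made-up logits for a four-token input.
start_logits = [0.1, 2.0, 0.3, 1.5]
end_logits = [0.2, 0.1, 1.8, 2.5]
n_best = 2
max_answer_length = 2

def top_indexes(logits, k):
    """Indexes of the k largest logits, best first."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

candidates = []
for s in top_indexes(start_logits, n_best):
    for e in top_indexes(end_logits, n_best):
        # Skip spans that end before they start or that are too long.
        if e < s or e - s + 1 > max_answer_length:
            continue
        candidates.append((start_logits[s] + end_logits[e], s, e))

best = max(candidates)
print(best)  # (4.0, 3, 3)
```

Here the highest start logit (index 1) and the highest end logit (index 3) would form a span of three tokens, which is rejected, so the best valid span is the single token at index 3.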

In [133]:
max_length = 384
stride = 128

def preprocess_validation_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    sample_map = inputs.pop("overflow_to_sample_mapping")
    example_ids = []

    for i in range(len(inputs["input_ids"])):
        sample_idx = sample_map[i]
        example_ids.append(examples["id"][sample_idx])

        sequence_ids = inputs.sequence_ids(i)
        offset = inputs["offset_mapping"][i]
        inputs["offset_mapping"][i] = [
            o if sequence_ids[k] == 1 else None for k, o in enumerate(offset)
        ]

    inputs["example_id"] = example_ids
    return inputs

In [134]:
validation_dataset = squad_final["validation"].map(
    preprocess_validation_examples,
    batched=True,
    remove_columns=squad_final["validation"].column_names,
)
len(squad_final["validation"]), len(validation_dataset)

(282, 1595)
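
The 282 validation examples turn into 1,595 features (between five and six per example) because long contexts are cut into overlapping windows. The sketch below is a rough estimate of how many windows one example produces. The formula is an approximation of the sliding-window behavior, `estimate_num_features` is a hypothetical helper, and `special_tokens=4` is an assumption about how many special tokens the tokenizer adds around a question/context pair.

```python
import math

def estimate_num_features(context_tokens, question_tokens,
                          max_length=384, stride=128, special_tokens=4):
    """Approximate number of overlapping windows for one example."""
    # Tokens left for the context once the question and special tokens are placed.
    room = max_length - question_tokens - special_tokens
    if context_tokens <= room:
        return 1
    # Consecutive windows overlap by `stride` tokens, so each one advances by room - stride.
    step = room - stride
    return 1 + math.ceil((context_tokens - room) / step)

print(estimate_num_features(200, 12))   # 1
print(estimate_num_features(1000, 12))  # 4
```

A larger `stride` means more overlap between windows, which makes it less likely that an answer is cut at a boundary but increases the number of features.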

In [135]:
test_dataset = squad_final["test"].map(
    preprocess_validation_examples,
    batched=True,
    remove_columns=squad_final["test"].column_names,
)
len(squad_final["test"]), len(test_dataset)

(501, 2540)

In [136]:
def preprocess_training_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    offset_mapping = inputs.pop("offset_mapping")
    sample_map = inputs.pop("overflow_to_sample_mapping")
    answers = examples["answers"]
    start_positions = []
    end_positions = []

    for i, offset in enumerate(offset_mapping):
        sample_idx = sample_map[i]
        answer = answers[sample_idx]
        start_char = answer["answer_start"][0]
        end_char = answer["answer_start"][0] + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)

        # Find the start and end of the context
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        # If the answer is not fully inside the context, label is (0, 0)
        if offset[context_start][0] > start_char or offset[context_end][1] < end_char:
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Otherwise it's the start and end token positions
            idx = context_start
            while idx <= context_end and offset[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offset[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

In [137]:
train_dataset = squad_final["train"].map(
    preprocess_training_examples,
    batched=True,
    remove_columns=squad_final["train"].column_names,
)
len(squad_final["train"]), len(train_dataset)

(4220, 20995)

## Train

<Tip>

If you aren't familiar with finetuning a model with the [Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer), take a look at the basic tutorial [here](https://huggingface.co/docs/transformers/main/en/tasks/../training#train-with-pytorch-trainer)!

</Tip>

You're ready to start training your model now! Load XLM-V with [AutoModelForQuestionAnswering](https://huggingface.co/docs/transformers/main/en/model_doc/auto#transformers.AutoModelForQuestionAnswering):

In [138]:
from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer

model = AutoModelForQuestionAnswering.from_pretrained("facebook/xlm-v-base")
#"mmaguero/gn-bert-tiny-cased")
#"mmaguero/multilingual-bert-gn-base-cased")
#"distilbert/distilbert-base-uncased")

Some weights of XLMRobertaForQuestionAnswering were not initialized from the model checkpoint at facebook/xlm-v-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


At this point, only three steps remain:

1. Define your training hyperparameters in [TrainingArguments](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.TrainingArguments). The only required parameter is `output_dir`, which specifies where to save your model. To upload the model to the Hub during training, set `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model); this notebook leaves it at `False` and pushes manually afterwards.
2. Pass the training arguments to [Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer) along with the model, dataset, tokenizer, and data collator.
3. Call [train()](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer.train) to finetune your model.

In [None]:
training_args = TrainingArguments(
    output_dir="multi-wiki-qa-gn-xlm-v-base",
    #output_dir="multi-wiki-qa-gn-bert-tiny-cased",
    #output_dir="multi-wiki-qa-multilingual-bert-gn-base-cased",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    weight_decay=0.01,
    save_total_limit=3,
    #metric_for_best_model="combined",
    push_to_hub=False,
    fp16=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,#tokenized_squad["train"],
    eval_dataset=validation_dataset,#tokenized_squad["validation"],
    processing_class=tokenizer,
    #data_collator=data_collator,
    #compute_metrics=compute_metrics,

)

trainer.train()
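
To get a feel for how long this run is, the number of optimizer steps can be estimated from the 20,995 training features and the settings above. This is a back-of-the-envelope sketch that assumes a single device, no gradient accumulation, and that the last partial batch is kept.

```python
import math

num_features = 20995  # length of train_dataset, as printed earlier
batch_size = 16       # per_device_train_batch_size
epochs = 10           # num_train_epochs

# Assumption: one device, gradient_accumulation_steps == 1, last partial batch kept.
steps_per_epoch = math.ceil(num_features / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 1313 13130
```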

In [None]:
predictions, _, _ = trainer.predict(validation_dataset)
start_logits, end_logits = predictions
compute_metrics(start_logits, end_logits, validation_dataset, squad_final["validation"])

In [None]:
predictions, _, _ = trainer.predict(test_dataset)
start_logits, end_logits = predictions
compute_metrics(start_logits, end_logits, test_dataset, squad_final["test"])
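
`compute_metrics` delegates the scoring to the `squad` metric, which reports exact match and F1. The sketch below is a simplified, pure-Python approximation of those two scores for one prediction and one reference. The function names are illustrative; the real metric also normalizes punctuation and articles, compares against every reference answer, and reports values on a 0 to 100 scale.

```python
from collections import Counter

def exact_match(prediction, reference):
    """1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it occurs in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("ary 1989", "1989"))         # 0.0
print(round(token_f1("ary 1989", "1989"), 3))  # 0.667
```

A prediction that contains the reference plus an extra word gets no credit from exact match but partial credit from F1, which is why both numbers are worth reading together.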

In [None]:
trainer.evaluate(test_dataset)

Once training is completed, share your model to the Hub with the [push_to_hub()](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer.push_to_hub) method so everyone can use your model:

In [93]:
trainer.push_to_hub()

CommitInfo(commit_url='https://huggingface.co/mmaguero/multi-wiki-qa-gn-bert-tiny-cased/commit/c82c21950858ccc045d73fece246ee286f79ec10', commit_message='End of training', commit_description='', oid='c82c21950858ccc045d73fece246ee286f79ec10', pr_url=None, repo_url=RepoUrl('https://huggingface.co/mmaguero/multi-wiki-qa-gn-bert-tiny-cased', endpoint='https://huggingface.co', repo_type='model', repo_id='mmaguero/multi-wiki-qa-gn-bert-tiny-cased'), pr_revision=None, pr_num=None)

<Tip>

For a more in-depth example of how to finetune a model for question answering, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb).

</Tip>

## Inference

Great, now that you've finetuned a model, you can use it for inference!

Come up with a question and some context you'd like the model to answer from:

In [None]:
question = "Mba’e árapepa heñóikuri José Carlos Cabrera?"
context = """José Carlos Cabrera (Sapucai, 1 jasypokõi ary 1989 -pe) ha'e peteĩ artista paraguayo concierto guitarra clásica rehegua.

Mba'apokuaa teéva
Ary 2010 guive oiko Buenos Aires, Argentina-pe, upépe oñemotenonde licenciatura de Música orekóva especialización Guitarra-pe, oñemoarandúvo Javier Bravo ndive Departamento de Artes Musicales y Sonido "Carlos López Buchardo" Universidad Nacional de Artes-pe.

Ko'ãga, ojeguereko ha'eha peteĩva umi omomba'eguasúva Agustín Pío Barrios "Mangoré" purahéi, ha, jepémo imitã, ha'e guitarrista paraguayo omoingevéva Barrios rembiapokue irrepertorio-pe.

Marzo 2010 jave, ojere Europa-pe, ombovy'ávo público-pe umi obra "Mangoré" Francia ha Holanda-pe.

Ome'ê heta concierto Argentina-pe, umíva apytépe Festival Internacional Guitarras del Mundo, 2o Encuentro Internacional Guitarra, ha Festival Internacional TSONAMI de Música Contemporánea, orepresentáva Paraguay orekóva estreno obra contemporánea "Mangoré" compositor paraguayo Nicolás Pérez González. Avei oime kuri Paraguái representante ramo Feria Internacional del Libro Buenos Aires 2011-pe.

Ojekuaa solista invitado ramo heta orquesta ndive: Orquesta Sinfónica Nacional de Argentina, Orquesta Sinfónica Ciudad de Asunción, Camerata Miranda, ha Orquesta Cámara Centro Cultural Paraguayo-Americano, upépe oestrena concierto guitarra, flauta ha orquesta "Homage to Mangoré" Maestro Luis Szarán. Omotenonde director nacional ha internacional, ha'eháicha Diego Sánchez Haase, César Manuel "Lito" Barrios, Miguel A. Gilardi, ha Javier Aquino Maidana, ambue apytépe.

Paraguáipe ombosako’i mokõi programa amplio orekóva Agustín Barrios rembiapo, ojejapóva opáichagua tendáre tetã pukukue. Peteîva umi concierto oiko Mangoré mansión San Juan Bautista-pe, oiporúvo guitarra Morant ha'eva'ekue Agustín Barrios mba'e. Avei omimbi oparticipávo 5o Festival Internacional de Guitarra "Homage a Mangoré", oñemotenondéva Asunción-pe. Ome'ë actuaciones significativas, ha'eháicha concierto omotenondéva estreno mundial único obra guitarra clásica-pe guarã ilustre compositor paraguayo Carlos Lara Bareiro, 22 ary omanóha. Ohupyty peteîha jopói concurso internacional interpretación "Momento Musical Opus 2009" agosto upe arýpe Asunción, Paraguay-pe, ha oime juez ramo upe competencia-pe guarã ambue arýpe. Avei ohupyty mokõiha jopói "Musicampus 2007 Guitarra Clásica Concurso" Córdoba, Argentina-pe.

Oike mundo de la música-pe orekópe 11 ary, música folklórica paraguaya rupive. Orekópe 14 ary, oñepyrũ ijestudio violín rehegua, ha orekópe 15 ary, oiporavo definitivamente guitarra clásica instrumento principal ramo. Oñembokatupyry iñepyrûhápe umi conservatorio ojeguerohorýva mbo'ehára paraguayo ojekuaáva ndive, omohu'ãvo honores orekóva 18 ary orekóva carrera Actuación de Guitarra Clásica ha Teoría de la Música ha Solfège. Ojapo opáichagua curso avanzado guitarrista herakuãitéva ndive, umíva apytépe Pablo Márquez, Eduardo Fernández, José Antonio Escobar, Berta Rojas, ha Víctor Villadangos, ambue apytépe.

...“Rohecha hína peteĩ talento excepcional, peteĩ mitãrusu, además de italento, orekóva virtudes ha’eháicha humildad, seriedad, dedicación ha peteĩ capacidad expresiva ha’éva peteĩ joya ojejuhúva mbovyeterei intérprete-pe” (Berta Rojas).
"""

The simplest way to try out your finetuned model for inference is to use it in a [pipeline()](https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#transformers.pipeline). Instantiate a `pipeline` for question answering with your model, and pass your text to it:

In [None]:
from transformers import pipeline

question_answerer = pipeline("question-answering",
                             #model="mmaguero/multi-wiki-qa-multilingual-bert-gn-base-cased"
                             #model="mmaguero/multi-wiki-qa-gn-bert-tiny-cased"
                             model="mmaguero/multi-wiki-qa-gn-xlm-v-base"
                             )
question_answerer(question=question, context=context)

In [None]:
question = "Moo oiko José Carlos Cabrera?"
question_answerer(question=question, context=context)

In [None]:
question_answerer(question=question, context=context, top_k=3)

In [None]:
question = "Araka'e oñepyrũ ijestudio violín rehegua?"
question_answerer(question=question, context=context, top_k=5)

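You can also manually replicate the results of the `pipeline` if you'd like. Note that the cells below load the smaller `mmaguero/multi-wiki-qa-gn-bert-tiny-cased` checkpoint rather than the XLM-V one used in the pipeline above: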

Tokenize the text and return PyTorch tensors:

In [98]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mmaguero/multi-wiki-qa-gn-bert-tiny-cased")
inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=384)

Pass your inputs to the model and return the `logits`:

In [99]:
import torch
from transformers import AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained("mmaguero/multi-wiki-qa-gn-bert-tiny-cased")
with torch.no_grad():
    outputs = model(**inputs)

Take the positions with the highest scores in the model output for the start and end of the answer:

In [100]:
answer_start_index = outputs.start_logits.argmax()
answer_end_index = outputs.end_logits.argmax()
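
The cell above takes the argmax over raw logits rather than over probabilities. A minimal sketch with made-up logits shows why that is enough: softmax preserves the ordering of the scores, so the position with the largest logit is also the most probable one. The `softmax` helper is written out in plain Python for illustration.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

start_logits = [0.1, 2.0, 0.3, 1.5]  # made-up values
probs = softmax(start_logits)

best_by_logit = max(range(len(start_logits)), key=lambda i: start_logits[i])
best_by_prob = max(range(len(probs)), key=lambda i: probs[i])

print(best_by_logit, best_by_prob)  # 1 1
```

Note that the start and end positions are chosen independently here, so nothing guarantees that the end index comes after the start index.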

Decode the predicted tokens to get the answer:

In [103]:
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
#tokenizer.decode(predict_answer_tokens)
predict_answer_tokens

tensor([2])