## 📚 1. Installing Essential Libraries

Before diving into our NLP task, we need to set up our environment by installing the necessary Python packages. This cell handles the installation of key libraries required for working with Hugging Face models.

- **`transformers`**: This is the main library from Hugging Face. It gives us access to thousands of pre-trained models and the powerful `pipeline` API, which simplifies using these models.
- **`sentencepiece`**: A tokenizer and detokenizer library. Many transformer models depend on it to split raw text into the subword tokens they were trained on.
- **`sacremoses`**: A Python port of the Moses tokenization and normalization tools. Some tokenizers in `transformers` rely on it for text preprocessing.

In [None]:
!pip install -U transformers sentencepiece sacremoses
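
To confirm that the installation worked, the installed versions can be read back with the standard library. This is a minimal sketch; the helper name `installed_versions` is hypothetical and is not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is missing from the current environment.
            found[name] = None
    return found


print(installed_versions(["transformers", "sentencepiece", "sacremoses"]))
```

A `None` value means the package is missing from the environment the notebook kernel is running in.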

## 📂 2. Setting a Custom Cache Directory (Optional)

Hugging Face models can be quite large, and the library downloads them to a local cache folder the first time you use them. This code block is a handy way to control where these large model files are stored.

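By setting the `HF_HOME` environment variable, we tell the Hugging Face libraries to store downloaded files under our specified folder (`X:\...\models`). Set it before importing `transformers`, as this notebook does, because the value is read when the library is imported. This helps manage disk space and keeps project-related files in one place.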

In [None]:
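import os
# Raw string, so the backslashes in the Windows path are not treated as escape sequences.
new_cache_dir = r"X:\AI-learin\courss\Fine-Tuning-LLM-with-HuggingFace-main\models"
# Must run before `transformers` is imported.
os.environ['HF_HOME'] = new_cache_dir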
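
Hard-coded Windows paths are easy to get wrong. As a hedged alternative, the sketch below builds the folder path with `pathlib`, creates the folder if it is missing, and then sets `HF_HOME`. The helper `set_hf_home` and the `models` folder name in the usage line are illustrative assumptions; they are not part of the original notebook or of `transformers`.

```python
import os
from pathlib import Path


def set_hf_home(cache_dir):
    """Create cache_dir if needed and point HF_HOME at it.

    Call this before importing transformers.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(path)
    return path


# Hypothetical location: a "models" folder in the current working directory.
cache_path = set_hf_home(Path.cwd() / "models")
print(cache_path)
```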

## 📦 3. Importing Necessary Modules

Now, we import the specific components we'll use from our installed libraries.

- **`pipeline` from `transformers`**: The high-level API that allows us to perform complex NLP tasks in just a few lines of code. It handles tokenization, model inference, and post-processing for us.
- **`pandas` as `pd`**: A powerful library for data manipulation and analysis. We will use it to display the model's output neatly in a table (DataFrame).

In [None]:
from transformers import pipeline
import pandas as pd

## 🤔 4. Performing Question Answering

In this section, we'll perform **Extractive Question Answering**. The goal is to ask a model a question and have it find the answer within a given piece of text (the context).

1.  **Defining Context and Question**: We first define the `text` (our context) and the `question` we want to ask about it.

2.  **Model Selection**: We choose a model fine-tuned for this task. `"deepset/roberta-base-squad2"` is a RoBERTa model that has been fine-tuned on SQuAD 2.0, an extractive question answering benchmark that also includes questions with no answer in the context.

3.  **Creating the Pipeline**: We initialize the `pipeline` for `"question-answering"`. This specific pipeline is designed to take a `question` and a `context` as input.

4.  **Getting the Answer**: We call the `reader` pipeline, passing our question and context. The model reads the context and extracts the span of text that it believes is the most likely answer.

5.  **Displaying the Result**: The output is a dictionary containing the `answer`, a confidence `score`, and the `start` and `end` character offsets of the answer within the context. We wrap it in a list `[outputs]` and convert it to a Pandas DataFrame for a clean, easy-to-read presentation.

In [None]:

text = """
Dear Amazon, last week I ordered an Optimus Prime action figure from your
online store in India. Unfortunately when I opened the package, I discovered to
my horror that I had been sent an action figure of Megatron instead!
"""


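# Alternative checkpoints; uncomment one to try it instead.
# model = "deepset/gelectra-large-germanquad"      # German-language QA model
# model = "mrm8488/bert-tiny-5-finetuned-squadv2"  # much smaller model
model = "deepset/roberta-base-squad2"

# device="cuda" requires a CUDA-capable GPU; use device="cpu" if none is available.
reader = pipeline("question-answering", model=model, device="cuda")
question = "From where did I place the order?"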

outputs = reader(question=question, context=text)
pd.DataFrame([outputs])
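
The `start` and `end` values in the output can be used to locate the answer inside the context. The sketch below marks the extracted span. `highlight_answer` is a hypothetical helper, and `example_output` is a hand-written dictionary in the shape the pipeline returns, not real model output.

```python
def highlight_answer(context, output, marker="**"):
    """Wrap the span context[start:end] in markers so the answer stands out."""
    start, end = output["start"], output["end"]
    return context[:start] + marker + context[start:end] + marker + context[end:]


example_context = "I ordered the figure from your online store in India."

# Hand-written stand-in for a pipeline result; the score is made up for illustration.
answer = "online store in India"
start = example_context.index(answer)
example_output = {
    "score": 0.5,
    "start": start,
    "end": start + len(answer),
    "answer": answer,
}

print(highlight_answer(example_context, example_output))
```

With the real pipeline result from the cell above, the same helper would be called as `highlight_answer(text, outputs)`.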