# Question & Answer with Hugging Face pipelines
* Notebook by Adam Lang
* Date: 12/3/2024

# Overview
* This notebook demonstrates how to implement a question-answering pipeline using Hugging Face Transformers.

# Install dependencies
* We have to install `sacremoses`, a Python port of the Moses tokenizer, truecaser, and other text normalization tools used in natural language processing (NLP).
* link: https://pypi.org/project/sacremoses/

In [1]:
!pip install -U transformers #upgrades
!pip install -U sentencepiece #upgrades
!pip install -U sacremoses #upgrades

Collecting transformers
  Downloading transformers-4.46.3-py3-none-any.whl.metadata (44 kB)
Downloading transformers-4.46.3-py3-none-any.whl (10.0 MB)
Installing collected packages: transformers
  Attempting uninstall: transformers
    Found existing installation: transformers 4.46.2
    Uninstalling transformers-4.46.2:
      Successfully uninstalled transformers-4.46.2
Successfully installed transformers-4.46.3
Collecting sacremoses
  Downloading sacremoses-0.1.1-py3-none-any.whl.metadata (8.3 kB)
Downloading sacremoses-0.1.1-py3-none-any.whl (897 kB)
Installing collected packages: sacremoses
Successfully installed sacr

In [2]:
## imports
from transformers import pipeline
import pandas as pd

# Question-Answer Pipeline using Hugging Face Transformers
* In this example we use a sample customer service message as the context and ask the model a question about it.

* When no model is specified, this pipeline defaults to `distilbert/distilbert-base-cased-distilled-squad` at revision `564e9b5`.
  * model card: https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad
  * The model is a DistilBERT checkpoint fine-tuned on SQuAD v1.1, the Stanford Question Answering Dataset.

In [5]:
text = """
Dear Amazon, last week I ordered a new pair of alpine skis
from your online store in Seattle. Unfortunately when I opened
the package, I discovered that I had accidentally been sent a Snowboard instead!

"""

## load a Q&A pipeline
reader = pipeline("question-answering")
question = "from where did i place the order?"

## pipeline output
outputs = reader(question=question,
                 context=text)

## output to df
pd.DataFrame([outputs])

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
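
The warnings above recommend pinning a model name and revision and passing a `device` argument. A minimal sketch of how that might look, reusing the model and revision reported in the log. The helper names `pick_device` and `build_reader` are hypothetical, introduced only for illustration, and the GPU check assumes PyTorch is the installed backend.

```python
# Model name and revision copied from the warning message above.
MODEL_NAME = "distilbert/distilbert-base-cased-distilled-squad"
MODEL_REVISION = "564e9b5"


def pick_device():
    """Return 0 (first CUDA GPU) if one is usable, otherwise -1 (CPU)."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_reader(device=None):
    """Hypothetical helper: build the Q&A pipeline with an explicit model, revision, and device."""
    # Imported here so the constants above can be reused without loading the library.
    from transformers import pipeline

    if device is None:
        device = pick_device()
    return pipeline(
        "question-answering",
        model=MODEL_NAME,
        revision=MODEL_REVISION,
        device=device,
    )
```

Calling `reader = build_reader()` would then replace the bare `pipeline("question-answering")` call, and `reader(question=question, context=text)` is used as before.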


|   | score    | start | end | answer       |
|---|----------|-------|-----|--------------|
| 0 | 0.440275 | 70    | 82  | online store |
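
The `start` and `end` values in the output are character offsets into the `context` string, so slicing the context with them recovers the answer. A small check of that relationship, with the result row above hard-coded as a dictionary (no model is loaded here; the values in `outputs` are copied from the output shown above):

```python
text = """
Dear Amazon, last week I ordered a new pair of alpine skis
from your online store in Seattle. Unfortunately when I opened
the package, I discovered that I had accidentally been sent a Snowboard instead!

"""

# Values copied from the pipeline output shown above.
outputs = {"score": 0.440275, "start": 70, "end": 82, "answer": "online store"}

# `start` is inclusive and `end` is exclusive, matching Python slice semantics.
span = text[outputs["start"]:outputs["end"]]
print(span)  # online store
```

The offsets count the leading newline of the triple-quoted string, so they would shift if the context were stripped before being passed to the pipeline.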
