<a href="https://colab.research.google.com/github/rahiakela/transformers-research-and-practice/blob/main/natural-language-processing-with-transformers/05-making-transformers-efficient-in-production/case_study_intent_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##Intent Detection: Case Study

Let’s suppose that we’re trying to build a text-based assistant for our company’s call center so
that customers can request the balance of their account or make bookings without needing to
speak with a human agent. In order to understand the goals of a customer, our assistant will
need to be able to classify a wide variety of natural language text into a set of predefined
actions or intents.

For example, a customer may send a message about an upcoming trip:

```txt
Hey, I’d like to rent a vehicle from Nov 1st to Nov 15th in Paris and I need a 15 passenger van
```

and our intent classifier could automatically categorize this as a Car Rental intent, which then triggers an action and response.

To be robust in a production environment, our classifier will
also need to be able to handle out-of-scope queries.

<img src='images/1.png?raw=1' width='600'/>

In the third case, the text assistant
has been trained to detect out-of-scope queries (usually labelled as a separate class) and informs the customer about which topics it can respond to.
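
To make the link between a predicted intent and the action it triggers more concrete, here is a minimal sketch of a dispatch table with an out-of-scope fallback. The handler functions, their replies, and the `balance` label are hypothetical and chosen only for illustration; only `car_rental` and `oos` appear in this notebook.

```python
from typing import Callable, Dict


# Hypothetical handlers: the names and replies are illustrative only.
def handle_car_rental(query: str) -> str:
    return "Sure, let's set up your car rental."


def handle_balance(query: str) -> str:
    return "Here is your account balance."


def handle_out_of_scope(query: str) -> str:
    return "Sorry, I can only help with car rentals and account balances."


# Map each predicted intent label to the action it should trigger.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "car_rental": handle_car_rental,
    "balance": handle_balance,  # assumed label, not taken from this notebook
    "oos": handle_out_of_scope,
}


def respond(query: str, predicted_intent: str) -> str:
    """Call the handler for the predicted intent.

    Labels without a registered handler get the out-of-scope reply.
    """
    handler = HANDLERS.get(predicted_intent, handle_out_of_scope)
    return handler(query)


print(respond("What's my balance?", "balance"))
print(respond("Who won the game last night?", "oos"))
```

In a real assistant the predicted label would come from the classifier rather than being passed in by hand, and each handler would call into the relevant back-end system.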

As a baseline, we’ve fine-tuned a BERT-base model that achieves around `94%` accuracy on the
`CLINC150` dataset. This dataset includes `22,500` in-scope queries across `150` intents and `10`
domains like banking and travel, and also includes `1,200` out-of-scope queries that belong to an
`oos` intent class. In practice we would also gather our own in-house dataset, but using public
data is a great way to iterate quickly and generate preliminary results.
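
As a quick sanity check on these figures, the short calculation below derives the size of the label space and the share of out-of-scope queries. It uses only the numbers quoted above. The per-intent figure is a simple average and says nothing about how the queries are actually distributed or split.

```python
# Figures quoted in the text above.
in_scope_queries = 22_500
num_intents = 150
out_of_scope_queries = 1_200

# Average number of queries per in-scope intent (assumes nothing about the real split).
queries_per_intent = in_scope_queries / num_intents

# The classifier sees one extra class for out-of-scope queries.
num_classes = num_intents + 1

total_queries = in_scope_queries + out_of_scope_queries
oos_fraction = out_of_scope_queries / total_queries

print(f"Average queries per in-scope intent: {queries_per_intent:.0f}")
print(f"Number of classes including oos: {num_classes}")
print(f"Total queries: {total_queries}")
print(f"Share of out-of-scope queries: {oos_fraction:.1%}")
```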



##Setup

```python
!pip -q install transformers[sentencepiece]
!pip -q install datasets
```

```python
from transformers import (AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline)
```

Now, let’s download our fine-tuned model from the Hugging Face Hub and wrap it in a pipeline for text classification:

```python
# Model path has changed: https://huggingface.co/transformersbook/bert-base-uncased-finetuned-clinc
bert_ckpt = "transformersbook/bert-base-uncased-finetuned-clinc"

# Load the tokenizer and the fine-tuned classifier, and keep the model on the CPU
bert_tokenizer = AutoTokenizer.from_pretrained(bert_ckpt)
bert_model = AutoModelForSequenceClassification.from_pretrained(bert_ckpt).to("cpu")

# Wrap both in a text classification pipeline
bert_pipeline = TextClassificationPipeline(model=bert_model, tokenizer=bert_tokenizer)
```

Here we’ve set the model’s device to `cpu`, since our text assistant will need to operate in an
environment where queries are processed and answered in real time.

Now that we have a pipeline, we can pass a query to get the predicted intent and confidence
score from the model:

```python
query = """Hey, I'd like to rent a vehicle from Nov 1st to Nov 15th in Paris and I need a 15 passenger van"""

bert_pipeline(query)
```

```txt
[{'label': 'car_rental', 'score': 0.5490034818649292}]
```
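
The pipeline returns a list of dictionaries with `label` and `score` keys. As a small sketch of how downstream code might consume that structure, the hypothetical helper below returns the top label only when its score clears a threshold and otherwise falls back to `oos`. Confidence thresholding is not something this notebook does, and the threshold values are arbitrary assumptions.

```python
from typing import Dict, List


def pick_intent(
    predictions: List[Dict], threshold: float = 0.5, fallback: str = "oos"
) -> str:
    """Return the highest-scoring label, or `fallback` if its score is below `threshold`.

    Hypothetical helper: the threshold and fallback label are illustrative choices.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= threshold else fallback


# Same structure as the pipeline output shown above.
predictions = [{"label": "car_rental", "score": 0.5490034818649292}]

print(pick_intent(predictions))                 # car_rental
print(pick_intent(predictions, threshold=0.7))  # oos
```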

Great, the `car_rental` intent makes sense, so let’s now look at creating a benchmark that we
can use to evaluate the performance of our baseline model.

##Performance Benchmark