docs: extend tutorial14 about query classification #3013

Merged
merged 23 commits into from Aug 12, 2022
e352e6f
Merge remote-tracking branch 'origin/master' into extend_tutorial14
anakin87 Aug 9, 2022
dd5cac8
Merge remote-tracking branch 'origin/master' into extend_tutorial14
anakin87 Aug 9, 2022
76816fe
first draft for tutorial extension
anakin87 Aug 9, 2022
480a4c7
forgotten markdown
anakin87 Aug 9, 2022
c248776
Merge remote-tracking branch 'upstream/master' into extend_tutorial14
anakin87 Aug 10, 2022
455396f
improved tutorial
anakin87 Aug 10, 2022
49ddbc5
Apply suggestions from code review
anakin87 Aug 10, 2022
0392b47
Merge remote-tracking branch 'origin/extend_tutorial14' into extend_t…
anakin87 Aug 10, 2022
d9e8d90
add markdown
anakin87 Aug 10, 2022
5588f8a
Merge branch 'deepset-ai:master' into extend_tutorial14
anakin87 Aug 10, 2022
5bec0a8
first draft for tutorial extension
anakin87 Aug 9, 2022
aa290a3
forgotten markdown
anakin87 Aug 9, 2022
44e0fda
improved tutorial
anakin87 Aug 10, 2022
e13f247
Apply suggestions from code review
anakin87 Aug 10, 2022
cd2413c
add markdown
anakin87 Aug 10, 2022
1a99cbd
little corrections
anakin87 Aug 10, 2022
e1bf486
little corrections and add py tutorial
anakin87 Aug 10, 2022
2d25ced
Update tutorials/Tutorial14_Query_Classifier.ipynb
anakin87 Aug 11, 2022
0891fa5
Update tutorials/Tutorial14_Query_Classifier.ipynb
anakin87 Aug 11, 2022
09ca7df
Update tutorials/Tutorial14_Query_Classifier.ipynb
anakin87 Aug 11, 2022
c75ff89
Update tutorials/Tutorial14_Query_Classifier.ipynb
anakin87 Aug 11, 2022
b7114b8
update tutorial webpage
tstadel Aug 11, 2022
55bf7ee
fix typo
tstadel Aug 12, 2022
95 changes: 94 additions & 1 deletion docs/_src/tutorials/tutorials/14.md
@@ -56,7 +56,7 @@ Next we make sure the latest version of Haystack is installed:
!pip install pygraphviz
```

## Logging
### Logging

We configure how logging messages should be displayed and which log level should be used before importing Haystack.
Example log message:
@@ -339,6 +339,99 @@ print(f"\n\n{equal_line}\nSTATEMENT QUERY RESULTS\n{equal_line}")
print_documents(res_2)
```

### Other use cases for Query Classifiers: custom classification models and zero-shot classification

`TransformersQueryClassifier` is flexible and supports other ways of classifying queries, including loading a custom classification model from the Hugging Face Hub or using zero-shot classification.

#### Using custom classification models
We can use a public model available on the Hugging Face Hub. For example, we might be interested in classifying the sentiment of the *queries*, so we choose an appropriate model, such as https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment.


```python
from haystack.nodes import TransformersQueryClassifier

# Remember to compile a list with the exact model labels.
# The first provided label corresponds to output_1, the second label to output_2, and so on.
labels = ["LABEL_0", "LABEL_1", "LABEL_2"]

sentiment_query_classifier = TransformersQueryClassifier(
    model_name_or_path="cardiffnlp/twitter-roberta-base-sentiment",
    use_gpu=True,
    task="text-classification",
    labels=labels,
)
```
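
The ordering rule stated in the comment above can be made explicit with a small helper. This is a minimal sketch in plain Python that does not call Haystack: `label_to_edge` is a hypothetical name introduced here for illustration, and the only assumption is the rule the tutorial itself states, namely that the label at position *n* in the list maps to `output_n`.

```python
labels = ["LABEL_0", "LABEL_1", "LABEL_2"]


def label_to_edge(labels):
    """Map each label to the output edge implied by its 1-based position in the list."""
    return {label: f"output_{position}" for position, label in enumerate(labels, start=1)}


edges = label_to_edge(labels)
for label, edge in edges.items():
    print(f"{label} -> {edge}")
```

Printing this table before building a pipeline is a cheap way to check that each downstream node is attached to the branch you intended.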


```python
queries = [
    "What's the answer?",  # neutral query
    "Would you be so lovely to tell me the answer?",  # positive query
    "Can you give me the damn right answer for once??",  # negative query
]
```


```python
import pandas as pd

sent_results = {"Query": [], "Output Branch": [], "Class": []}

for query in queries:
    result = sentiment_query_classifier.run(query=query)
    sent_results["Query"].append(query)
    sent_results["Output Branch"].append(result[1])
    if result[1] == "output_1":
        sent_results["Class"].append("negative")
    elif result[1] == "output_2":
        sent_results["Class"].append("neutral")
    elif result[1] == "output_3":
        sent_results["Class"].append("positive")

pd.DataFrame.from_dict(sent_results)
```
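
The branch-to-class translation in the loop above could also be written as a dictionary lookup. The sketch below is one possible variant, not part of the tutorial: `EDGE_TO_SENTIMENT`, `describe_edge`, and `fake_results` are hypothetical names, the mapping is copied from the `if`/`elif` chain, and the stand-in results assume only what the loop relies on, that the item at index 1 is the output branch name.

```python
# Mapping copied from the if/elif chain in the tutorial loop.
EDGE_TO_SENTIMENT = {
    "output_1": "negative",
    "output_2": "neutral",
    "output_3": "positive",
}


def describe_edge(edge):
    """Return the sentiment class for an output edge, or 'unknown' if it is not mapped."""
    return EDGE_TO_SENTIMENT.get(edge, "unknown")


# Stand-ins for classifier results (assumption: index 1 holds the edge name;
# the empty dict at index 0 is a placeholder).
fake_results = [({}, "output_2"), ({}, "output_3"), ({}, "output_1")]
classes = [describe_edge(result[1]) for result in fake_results]
print(classes)
```

The explicit fallback matters for the table: with the `if`/`elif` chain, an unexpected branch appends nothing to `"Class"`, the lists end up with different lengths, and `pd.DataFrame.from_dict` raises a `ValueError`.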

#### Using zero-shot classification
It is also possible to perform zero-shot classification by providing a suitable base transformer model and **choosing** the classes that the model should predict.
For example, we may be interested in whether the user query is related to music or cinema.


```python
# in zero-shot-classification, the labels can be freely chosen
labels = ["music", "cinema"]

query_classifier = TransformersQueryClassifier(
    model_name_or_path="typeform/distilbert-base-uncased-mnli",
    use_gpu=True,
    task="zero-shot-classification",
    labels=labels,
)
```
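
For intuition about why the labels can be freely chosen: zero-shot classification with an NLI model (such as the MNLI checkpoint above) is commonly framed as turning each label into a hypothesis sentence and asking how strongly the query entails it. The sketch below illustrates only that framing in plain Python. It loads no model, the scores are made up, and `build_hypotheses` and `pick_label` are hypothetical helpers; the template string is the default used by the Hugging Face zero-shot pipeline.

```python
labels = ["music", "cinema"]

# Default hypothesis template of the Hugging Face zero-shot pipeline.
HYPOTHESIS_TEMPLATE = "This example is {}."


def build_hypotheses(labels, template=HYPOTHESIS_TEMPLATE):
    """Turn each candidate label into a hypothesis sentence for an NLI model."""
    return [template.format(label) for label in labels]


def pick_label(labels, entailment_scores):
    """Pick the label with the highest score and the matching 1-based output edge."""
    best_index = max(range(len(labels)), key=lambda i: entailment_scores[i])
    return labels[best_index], f"output_{best_index + 1}"


hypotheses = build_hypotheses(labels)

# Made-up entailment scores for a query about films (illustration only, not model output).
made_up_scores = [0.12, 0.88]
label, edge = pick_label(labels, made_up_scores)
print(hypotheses)
print(label, edge)
```

Because the labels only ever appear inside the hypothesis sentences, swapping `["music", "cinema"]` for any other list requires no retraining, although the quality of the routing still depends on the underlying model.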


```python
queries = [
    "In which films does John Travolta appear?",  # query about cinema
    "What is the Rolling Stones first album?",  # query about music
    "Who was Sergio Leone?",  # query about cinema
]
```


```python
import pandas as pd

query_classification_results = {"Query": [], "Output Branch": [], "Class": []}

for query in queries:
    result = query_classifier.run(query=query)
    query_classification_results["Query"].append(query)
    query_classification_results["Output Branch"].append(result[1])
    query_classification_results["Class"].append("music" if result[1] == "output_1" else "cinema")

pd.DataFrame.from_dict(query_classification_results)
```

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany.