# Rubrix Cookbook

Yeah, you heard it right! Not a cheatsheet, but a cookbook. A notebook of recipes. 

In this quick guide, we are going to show you how easily Rubrix can be used side by side with some of the most popular AI Python libraries. Rubrix is *agnostic*: it can be used with any library or framework, with no need to implement any interface or modify your existing toolbox and workflows. With these few examples you will be able to start logging and exploring your data for any of these libraries at a glance, and maybe pick up some inspiration if your library of choice is not in this list.

If you miss an AI library in this list, tell us about it at [our GitHub forum](https://github.com/recognai/rubrix/discussions).

## HuggingFace Transformers

HuggingFace has given the NLP community many useful tools, and with HuggingFace Transformers, working with Transformer models is easier than ever. With a few lines of code we can take a Transformer model from their hub, make some predictions and then log them into Rubrix.

### Text Classification

In [1]:
import rubrix as rb
from transformers import pipeline

# We define our HuggingFace Pipeline
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/squeezebert-mnli",
    framework="pt",
)

# Choosing our input
text_input = "I love watching rock climbing competitions!"

# Making the prediction
prediction = classifier(
    text_input,
    candidate_labels=[
        "politics",
        "sports",
        "technology",
    ],
    hypothesis_template="This text is about {}.",
)

# Creating a record object to log into rubrix.
record = rb.TextClassificationRecord(
    inputs={"text": prediction["sequence"]},
    prediction=list(zip(prediction["labels"], prediction["scores"])),
    prediction_agent="https://huggingface.co/typeform/squeezebert-mnli",
)

# Logging into Rubrix
rb.log(records=record, name="zeroshot-topic-classifier")

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


BulkResponse(dataset='zeroshot-topic-classifier', processed=1, failed=0)
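
The same pattern extends to several inputs. The sketch below is only an illustration under stated assumptions: `to_label_scores` is a hypothetical helper, and `sample_outputs` is hand-written placeholder data that merely mimics the `sequence`, `labels` and `scores` keys used above (it is not real model output). It turns each pipeline result into the `(label, score)` pairs passed as `prediction`.

```python
from typing import Dict, List, Tuple


def to_label_scores(output: Dict) -> List[Tuple[str, float]]:
    """Pair each candidate label with its score, as done for `prediction` above."""
    return list(zip(output["labels"], output["scores"]))


# Placeholder data shaped like the zero-shot pipeline output (NOT real model scores).
sample_outputs = [
    {"sequence": "first text", "labels": ["sports", "politics"], "scores": [0.75, 0.25]},
    {"sequence": "second text", "labels": ["technology", "sports"], "scores": [0.5, 0.5]},
]

# One dict of `inputs` and `prediction` per text, ready to be wrapped in a record.
prepared = [
    {"inputs": {"text": out["sequence"]}, "prediction": to_label_scores(out)}
    for out in sample_outputs
]
```

Each entry in `prepared` could then be unpacked into `rb.TextClassificationRecord(**entry)`, and the resulting list handed to `rb.log` in a single call, assuming your Rubrix version accepts a list of records there.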

### Token Classification

In [2]:
import rubrix as rb
from transformers import pipeline

# We define our HuggingFace Pipeline
classifier = pipeline(
    "ner",
    model="elastic/distilbert-base-cased-finetuned-conll03-english",
    framework="pt",
)

# Choosing our input
text_input = "My name is Sarah and I live in London"

# Making the prediction
predictions = classifier(
    text_input,
)

# Creating a record object to log into rubrix.
record = rb.TokenClassificationRecord(
    text=text_input,
    tokens=text_input.split(),
    prediction=[(pred["entity"], pred["start"], pred["end"]) for pred in predictions],
    prediction_agent="https://huggingface.co/elastic/distilbert-base-cased-finetuned-conll03-english",
)

# Logging into Rubrix
rb.log(records=record, name="zeroshot-ner")

BulkResponse(dataset='zeroshot-ner', processed=1, failed=0)
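
Because `prediction` holds character offsets while `tokens` comes from `text_input.split()`, it can be handy to check that the two line up before logging. The following is a minimal sketch, not part of Rubrix or Transformers: `token_offsets` and `is_aligned` are hypothetical helpers, and the example spans are computed by hand rather than taken from the model.

```python
from typing import List, Tuple


def token_offsets(text: str, tokens: List[str]) -> List[Tuple[int, int]]:
    """Locate each token in `text` and return its (start, end) character offsets."""
    offsets = []
    cursor = 0
    for token in tokens:
        start = text.index(token, cursor)  # raises ValueError if the token is missing
        end = start + len(token)
        offsets.append((start, end))
        cursor = end
    return offsets


def is_aligned(span: Tuple[int, int], offsets: List[Tuple[int, int]]) -> bool:
    """True if the span starts at a token start and ends at a token end."""
    starts = {start for start, _ in offsets}
    ends = {end for _, end in offsets}
    return span[0] in starts and span[1] in ends


text_input = "My name is Sarah and I live in London"
offsets = token_offsets(text_input, text_input.split())

# Character spans worked out by hand for illustration (not model output).
sarah_span = (11, 16)    # covers the whole token "Sarah"
partial_span = (11, 14)  # stops in the middle of "Sarah"
```

Here `is_aligned(sarah_span, offsets)` holds while `is_aligned(partial_span, offsets)` does not, which is the kind of mismatch that can appear when a model predicts on sub-word pieces but the record is tokenized on whitespace.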

## spaCy

### Text Classification

### Token Classification

In [1]:
!python -m spacy download fr_core_news_sm

Collecting fr_core_news_sm==2.3.0
  Downloading https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-2.3.0/fr_core_news_sm-2.3.0.tar.gz (14.7 MB)
Building wheels for collected packages: fr-core-news-sm
  Building wheel for fr-core-news-sm (setup.py) ... done
  Created wheel for fr-core-news-sm: filename=fr_core_news_sm-2.3.0-py3-none-any.whl size=14718367 sha256=5eaeec363470c1b2310cc4646840538dcb4f1c6a8c3dbb3058b0de272341c044
  Stored in directory: /private/var/folders/mb/lvj4fyds5757cy_7swmlpt_00000gn/T/pip-ephem-wheel-cache-ec51_xp6/wheels/48/ca/2e/2a3756cab2ba8745ce853319ba0d44b1efb8892a86320e9633
Successfully built fr-core-news-sm
Installing collected packages: fr-core-news-sm
Successfully installed fr-core-news-sm-2.3.0
✔ Download and installation successful
You can now load the model via spacy.load('fr_core_news_sm')


In [12]:
import rubrix as rb
import spacy

input_text = "Paris a un enfant et la forêt a un oiseau ; l’oiseau s’appelle le moineau ; l’enfant s’appelle le gamin"

# Loading spaCy model
nlp = spacy.load("fr_core_news_sm")

# Creating spaCy doc
doc = nlp(input_text)

# Building TokenClassificationRecord
record = rb.TokenClassificationRecord(
    text=input_text,
    tokens=[token.text for token in doc],
    prediction=[(ent.label_, ent.start_char, ent.end_char) for ent in doc.ents],
    prediction_agent="spacy.fr_core_news_sm",
)

# Logging into Rubrix
rb.log(records=record, name="lesmiserables-ner")

BulkResponse(dataset='lesmiserables-ner', processed=1, failed=0)