# Logging an inference run on production data

In this notebook we learn how to log an inference run, demonstrating common flows and errors
If you are new to the dataquality repo, check out the Dataquality-Client-Demo first!

## Setup
In this demo we use the same setup as the Dataquality-Client-Demo.

In [1]:
import os
os.environ['GALILEO_CONSOLE_URL']="http://localhost:8088"

In [2]:
# If you have cloned the dataquality repo and are running this from the docs folder, you can run this
#!pip install -q ../../../../dataquality
import dataquality

Create an admin if one doesn't exist. Set admin credentials as environment variables to automatically login during `dataquality.init()` below.

In [3]:
import requests

pwd = "MyPassword!123"

data={
  "email": "me@rungalileo.io",
  "first_name": "Me",
  "last_name": "Me",
  "username": "Galileo",
  "auth_method": "email",
  "password": pwd
}

# This will silently fail with a requests status code of 400 if admin is already set
r = requests.post(f'{dataquality.config.api_url}/users/admin', json=data)

import os
os.environ["GALILEO_USERNAME"]="me@rungalileo.io"
os.environ["GALILEO_PASSWORD"]=pwd

We create a few helper functions for creating and logging fake data.

In [49]:
from sklearn.datasets import fetch_20newsgroups
import numpy as np
import pandas as pd

def create_dataset():
    newsgroups = fetch_20newsgroups(subset="train", remove=('headers', 'footers', 'quotes'))
    dataset = pd.DataFrame()
    dataset["text"] = newsgroups.data
    label_ind = newsgroups.target_names
    dataset["label"] = [label_ind[i] for i in newsgroups.target]
    return dataset, label_ind

def fetch_dataset(dataset, split, inference_name = None):
    if split == "training":
        return dataset[:100]
    if split == "test":
        return dataset[100:200]

    if split == "inference":
        if inference_name == "cool":
            return dataset[200:300]
        if inference_name == "awesome":
            return dataset[300:400]
        if inference_name == "swag":
            return dataset[400:500]

    raise ValueError("Uh oh something happened")

# Generate fake model outputs
def log_fake_data(dataset_len, log_num: int = 0):
    num_rows = dataset_len // (log_num + 1)

    emb = np.random.rand(num_rows, 800)
    prob = np.random.rand(num_rows, 20)
    for split in ['test','training']:
        epoch = 0
        
        r = range(num_rows*log_num, num_rows*(log_num+1))
        ids = list(r)
        dataquality.log_model_outputs(emb=emb, probs=prob, split=split, epoch=epoch, ids=ids)

## Start with a train / test run

Inference data will usually be logged after training / test runs. We simulate this flow by populating minio with training and test data. 

In [6]:
dataquality.init(task_type="text_classification", project_name="gonzaga", run_name="duke")
base_dataset, labels = create_dataset()
train_dataset = fetch_dataset(base_dataset, "training")
dataquality.log_input_data(text=train_dataset['text'], labels=train_dataset['label'], split="training")
test_dataset = fetch_dataset(base_dataset, "test")
dataquality.log_input_data(text=test_dataset['text'], labels=test_dataset['label'], split="test")

log_fake_data(len(train_dataset), 1)
dataquality.set_labels_for_run(labels)
dataquality.finish()

🔭 Logging you into Galileo

👀 Found auth method email set via env, skipping prompt.
🚀 You're logged in to Galileo as me@rungalileo.io!
📡 Retrieving run from existing project, gonzaga




🛰 Connected to project, gonzaga, and run, duke.
Exporting input data [########################################] 100.00% elapsed time  :     0.01s =  0.0m =  0.0h
Appending input data [########################################] 100.00% elapsed time  :     0.01s =  0.0m =  0.0h
 



☁️ Uploading Data
Combining batches for upload


  0%|          | 0/1 [00:00<?, ?it/s]

training:   0%|          | 0/3 [00:00<?, ?it/s]

Writing data for upload [########################################] 100.00% elapsed time  :     0.04s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.09s =  0.0m =  0.0h  
Writing data for upload [########################################] 100.00% elapsed time  :     0.14s =  0.0m =  0.0h
 Combining batches for upload


  0%|          | 0/1 [00:00<?, ?it/s]

test:   0%|          | 0/3 [00:00<?, ?it/s]

Writing data for upload [########################################] 100.00% elapsed time  :     0.04s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.09s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.14s =  0.0m =  0.0h 
 🧹 Cleaning up
Job default successfully submitted. Results will be available soon at http://127.0.0.1:3000/insights?projectId=18bca69e-ba3e-4b3f-b504-820124538a35&runId=39fa57ff-9c14-4af7-bb35-21efed1cb1a3&split=training&taskType=0&activeDepHigh=1&activeDepLow=0


{'project_id': '18bca69e-ba3e-4b3f-b504-820124538a35',
 'run_id': '39fa57ff-9c14-4af7-bb35-21efed1cb1a3',
 'job_name': 'default',
 'labels': ['alt.atheism',
  'comp.graphics',
  'comp.os.ms-windows.misc',
  'comp.sys.ibm.pc.hardware',
  'comp.sys.mac.hardware',
  'comp.windows.x',
  'misc.forsale',
  'rec.autos',
  'rec.motorcycles',
  'rec.sport.baseball',
  'rec.sport.hockey',
  'sci.crypt',
  'sci.electronics',
  'sci.med',
  'sci.space',
  'soc.religion.christian',
  'talk.politics.guns',
  'talk.politics.mideast',
  'talk.politics.misc',
  'talk.religion.misc'],
 'tasks': None,
 'non_inference_logged': False,
 'message': 'Processing dataquality!',
 'link': 'http://127.0.0.1:3000/insights?projectId=18bca69e-ba3e-4b3f-b504-820124538a35&runId=39fa57ff-9c14-4af7-bb35-21efed1cb1a3&split=training&taskType=0&activeDepHigh=1&activeDepLow=0'}

## Inference run

Now log an inference run. Notice that when we log inference data it is appending to Minio, meaning that existing training / test data is not deleted. 

We can log multiple inference runs with different inference names. 

In [96]:
dataquality.init(task_type="text_classification", project_name="gonzaga", run_name="duke")

📡 Retrieving run from existing project, gonzaga
🛰 Connected to project, gonzaga, and run, duke.




In [97]:
split = "inference"
INFERENCE_NAMES = ["cats", "dogs", "iguanas"]

In [98]:
base_dataset, labels = create_dataset()

In [100]:
cats_dataset = fetch_dataset(base_dataset, split, "cats")
dogs_dataset = fetch_dataset(base_dataset, split, "dogs")
iguanas_dataset = fetch_dataset(base_dataset, split, "iguanas")
datasets = {
    "cats": cats_dataset,
    "dogs": dogs_dataset,
    "iguanas": iguanas_dataset
}
starting_indices = {
    "cats": 200,
    "dogs": 300,
    "iguanas": 400
}

In [101]:
cats_dataset.head()

Unnamed: 0,text,label
300,\nI was at a Cincinnati Cyclones game a year a...,rec.sport.hockey
301,,sci.crypt
302,"Is it possible to do a ""wheelie"" on a motorcyc...",rec.motorcycles
303,"Hello src readers,\n\nAgain the misconception ...",soc.religion.christian
304,\nThere are ALWAYS scalpers with tickets outsi...,rec.sport.hockey


In [102]:
for inference_name in INFERENCE_NAMES:
    starting_index = starting_indices[inference_name]
    ids = list(range(starting_index, starting_index + 100))
    # Inference doesn't expect labels, but does need an inference name
    dataquality.log_input_data(
        text=datasets[inference_name]["text"],
        split=split,
        inference_name=inference_name,
        ids=ids
    )

Exporting input data [########################################] 100.00% elapsed time  :     0.01s =  0.0m =  0.0h
Appending input data [########################################] 100.00% elapsed time  :     0.01s =  0.0m =  0.0h
 

In [103]:
import numpy as np

def get_model_outputs(data, starting_index):
    num_rows = len(data)
    logits = np.random.rand(num_rows, 20) # fake logits
    emb = np.random.rand(num_rows, 768) # fake embeddings
    ids = list(range(starting_index, starting_index + 100))

    return (emb, logits, ids)

In [104]:
for inference_name in INFERENCE_NAMES:
    # Set split takes in an optional inference name
    dataquality.set_split(split, inference_name=inference_name)

    emb, logits, ids = get_model_outputs(datasets[inference_name], starting_indices[inference_name)
    dataquality.log_model_outputs(emb=emb, logits=logits, ids=ids, split="inference")

In [105]:
!tree .galileo/logs/{dataquality.config.current_project_id}/{dataquality.config.current_run_id}

[01;34m.galileo/logs/18bca69e-ba3e-4b3f-b504-820124538a35/39fa57ff-9c14-4af7-bb35-21efed1cb1a3[0m
├── [01;34minference[0m
│   ├── [01;34mcool[0m
│   │   └── [00m3811f729e992.hdf5[0m
│   └── [01;34mswag[0m
│       └── [00m36ab78bbb25e.hdf5[0m
└── [00minput_data.arrow[0m

3 directories, 3 files


In [106]:
# Finish will kickoff job with name "inference"
dataquality.set_labels_for_run(labels)
dataquality.finish()


☁️ Uploading Data
Combining batches for upload


  0%|          | 0/1 [00:00<?, ?it/s]

inference:   0%|          | 0/3 [00:00<?, ?it/s]

Writing data for upload [########################################] 100.00% elapsed time  :     0.05s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.05s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.22s =  0.0m =  0.0h
 Combining batches for upload


  0%|          | 0/1 [00:00<?, ?it/s]

inference:   0%|          | 0/3 [00:00<?, ?it/s]

Writing data for upload [########################################] 100.00% elapsed time  :     0.05s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.05s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.22s =  0.0m =  0.0h
 🧹 Cleaning up
Job inference successfully submitted. Results will be available soon at http://127.0.0.1:3000/insights?projectId=18bca69e-ba3e-4b3f-b504-820124538a35&runId=39fa57ff-9c14-4af7-bb35-21efed1cb1a3&split=training&taskType=0&activeDepHigh=1&activeDepLow=0


{'project_id': '18bca69e-ba3e-4b3f-b504-820124538a35',
 'run_id': '39fa57ff-9c14-4af7-bb35-21efed1cb1a3',
 'job_name': 'inference',
 'labels': ['alt.atheism',
  'comp.graphics',
  'comp.os.ms-windows.misc',
  'comp.sys.ibm.pc.hardware',
  'comp.sys.mac.hardware',
  'comp.windows.x',
  'misc.forsale',
  'rec.autos',
  'rec.motorcycles',
  'rec.sport.baseball',
  'rec.sport.hockey',
  'sci.crypt',
  'sci.electronics',
  'sci.med',
  'sci.space',
  'soc.religion.christian',
  'talk.politics.guns',
  'talk.politics.mideast',
  'talk.politics.misc',
  'talk.religion.misc'],
 'tasks': None,
 'non_inference_logged': False,
 'message': 'Processing dataquality!',
 'link': 'http://127.0.0.1:3000/insights?projectId=18bca69e-ba3e-4b3f-b504-820124538a35&runId=39fa57ff-9c14-4af7-bb35-21efed1cb1a3&split=training&taskType=0&activeDepHigh=1&activeDepLow=0'}

## Log a new training run, inference data is wiped

By default, logging a new training or test run wipes all Minio data. We log a new training run and can confirm that all data is wiped in the Minio bucket.

In [30]:
dataquality.init(task_type="text_classification", project_name="gonzaga", run_name="duke")
base_dataset, labels = create_dataset()
train_dataset = fetch_dataset(base_dataset, "training")
dataquality.log_input_data(text=train_dataset['text'], labels=train_dataset['label'], split="training")
test_dataset = fetch_dataset(base_dataset, "test")
dataquality.log_input_data(text=test_dataset['text'], labels=test_dataset['label'], split="test")

log_fake_data(len(train_dataset), 1)
dataquality.set_labels_for_run(labels)
dataquality.finish()

📡 Retrieving run from existing project, gonzaga
🛰 Connected to project, gonzaga, and run, duke.




Exporting input data [########################################] 100.00% elapsed time  :     0.00s =  0.0m =  0.0h
Appending input data [########################################] 100.00% elapsed time  :     0.00s =  0.0m =  0.0h
 



☁️ Uploading Data
Combining batches for upload


  0%|          | 0/1 [00:00<?, ?it/s]

training:   0%|          | 0/3 [00:00<?, ?it/s]

Writing data for upload [########################################] 100.00% elapsed time  :     0.04s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.10s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.22s =  0.0m =  0.0h
 Combining batches for upload


  0%|          | 0/1 [00:00<?, ?it/s]

test:   0%|          | 0/3 [00:00<?, ?it/s]

Writing data for upload [########################################] 100.00% elapsed time  :     0.05s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.10s =  0.0m =  0.0h
Writing data for upload [########################################] 100.00% elapsed time  :     0.19s =  0.0m =  0.0h
 🧹 Cleaning up
Job default successfully submitted. Results will be available soon at http://127.0.0.1:3000/insights?projectId=18bca69e-ba3e-4b3f-b504-820124538a35&runId=39fa57ff-9c14-4af7-bb35-21efed1cb1a3&split=training&taskType=0&activeDepHigh=1&activeDepLow=0


{'project_id': '18bca69e-ba3e-4b3f-b504-820124538a35',
 'run_id': '39fa57ff-9c14-4af7-bb35-21efed1cb1a3',
 'job_name': 'default',
 'labels': ['alt.atheism',
  'comp.graphics',
  'comp.os.ms-windows.misc',
  'comp.sys.ibm.pc.hardware',
  'comp.sys.mac.hardware',
  'comp.windows.x',
  'misc.forsale',
  'rec.autos',
  'rec.motorcycles',
  'rec.sport.baseball',
  'rec.sport.hockey',
  'sci.crypt',
  'sci.electronics',
  'sci.med',
  'sci.space',
  'soc.religion.christian',
  'talk.politics.guns',
  'talk.politics.mideast',
  'talk.politics.misc',
  'talk.religion.misc'],
 'tasks': None,
 'non_inference_logged': False,
 'message': 'Processing dataquality!',
 'link': 'http://127.0.0.1:3000/insights?projectId=18bca69e-ba3e-4b3f-b504-820124538a35&runId=39fa57ff-9c14-4af7-bb35-21efed1cb1a3&split=training&taskType=0&activeDepHigh=1&activeDepLow=0'}