In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
# setup
import pathlib
base_dir = pathlib.Path("/Users/eyubogln/.meerkat/datasets/rfw/")

# Model Auditing with Meerkat 

In this workshop we’ll use the Racial Faces in the Wild (RFW) dataset to audit AWS’s FaceCompare API. We provide the API’s predictions on this dataset. Your task is to analyze model performance and identify slices where the model performs particularly poorly or particularly well.

## Loading Data

So, let’s import the Python package and get started!

In [3]:
import meerkat as mk

In [4]:
network, register_api = mk.interactive_mode()


> src@0.0.1 dev
> vite dev "--port" "7861"


  VITE v3.0.3  ready in 311 ms

  ➜  Local:   http://localhost:7861/
  ➜  Network: use --host to expose


In [5]:
register_api()

Meerkat provides a [registry](https://meerkat.readthedocs.io/en/dev/datasets/datasets.html) of commonly used datasets, like RFW, which allows us to load the data into memory with one line of code.  We can then merge the dataset with a CSV containing the model predictions. 

In memory, the dataset and model predictions are stored in a [Meerkat DataPanel](https://meerkat.readthedocs.io/en/latest/guide/data_structures.html). A `DataPanel` is in many ways just like a pandas DataFrame: it’s a tabular data structure made up of columns. Unlike a DataFrame, though, the `DataPanel` is designed for unstructured data types like images and audio. As the column listings below show, there’s a column for the image, the image ID, the ethnicity label and, once the predictions are merged in, the false non-match rate (FNMR).

In [6]:
dp = mk.get("rfw")
dp.columns

['image_id', 'identity', 'ethnicity', 'image']

In [8]:
dp = dp.merge(
    mk.DataPanel.from_csv(base_dir / "themis/facecompare_v6_errors.csv"),
    on="image_id",
)
dp.columns

['image_id', 'identity', 'ethnicity', 'image', 'v6_fnmr']
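
The merge above attaches one `v6_fnmr` value to each row that shares an `image_id` with the CSV. As a rough illustration of that idea, here is a minimal sketch in plain Python. It assumes the merge behaves like an inner join (the pandas default); the rows and the `inner_join` helper are made up for illustration and are not part of Meerkat.

```python
# Toy stand-ins for the dataset rows and the per-image prediction errors.
# All values are invented for illustration.
images = [
    {"image_id": "a_0001", "ethnicity": "asian"},
    {"image_id": "b_0001", "ethnicity": "indian"},
    {"image_id": "c_0001", "ethnicity": "african"},
]
errors = [
    {"image_id": "a_0001", "v6_fnmr": 0.0},
    {"image_id": "c_0001", "v6_fnmr": 0.5},
]


def inner_join(left, right, on):
    """Join two lists of row dicts, keeping only keys present in both."""
    right_by_key = {row[on]: row for row in right}
    return [
        {**row, **right_by_key[row[on]]}
        for row in left
        if row[on] in right_by_key
    ]


merged = inner_join(images, errors, on="image_id")
for row in merged:
    print(row)
```

Under this assumption, images without a prediction (here `b_0001`) drop out of the merged table, so it is worth comparing the row count before and after the merge.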

In [9]:
# Shuffle the rows: sample all of them without replacement.
dp = dp.sample(frac=1, replace=False)

## Visualizing Data

We’ll begin by visualizing the images in our dataset. Meerkat allows you to spin up interactive visualizations from within your notebook. These visualizations allow you to efficiently explore large image, audio, and video datasets. 

Note that the visualizations are highly customizable. There are a few different interface types (*e.g.* “gallery”, “table”, “iplot”) that can be customized from within the notebook. See the documentation for a full list of interfaces.

Below we’ll explore our dataset using the table interface:

In [29]:
dp.gui.table()

## Computing global metrics
Next we’ll compute an average metric across the entire dataset to get a sense of how the model performs globally.

In [11]:
global_fnmr = dp["v6_fnmr"].mean()
print(f"Global False Non-Match Rate: {global_fnmr}")

Global False Non-Match Rate: 0.022848628146396724
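
The false non-match rate is the fraction of genuine (same-identity) comparisons that the model rejects. The notebook does not show how the per-image `v6_fnmr` values were derived, so the sketch below assumes each image's value is the fraction of its genuine pairs that were rejected. The decisions and image names are hypothetical.

```python
def false_non_match_rate(decisions):
    """Fraction of genuine (same-identity) comparisons declared a non-match.

    `decisions` holds one boolean per genuine pair: True if the model
    matched the pair, False if it rejected it.
    """
    if not decisions:
        raise ValueError("need at least one genuine comparison")
    return sum(1 for matched in decisions if not matched) / len(decisions)


# Hypothetical outcomes for the genuine pairs of three images.
per_image_decisions = {
    "img_a": [True, True, True, True],
    "img_b": [True, False, True, True],
    "img_c": [False, False, True, True],
}

per_image_fnmr = {
    image: false_non_match_rate(decisions)
    for image, decisions in per_image_decisions.items()
}

# The global number is the mean of the per-image rates, as in the cell above.
global_fnmr = sum(per_image_fnmr.values()) / len(per_image_fnmr)
print(per_image_fnmr)
print(f"Mean of per-image FNMR: {global_fnmr:.3f}")
```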


In [30]:
dp["hello3"] = dp["image"]


## Computing group statistics

RFW provides annotations for a limited set of high-level racial groups. In this section, we’ll see how performance varies when stratifying by these groups. To do so, we’ll use the DataPanel’s `groupby` method.

In [12]:
gb = dp.groupby("ethnicity")
gb["v6_fnmr"].mean()

| | v6_fnmr (NumpyArrayColumn) | ethnicity (PandasSeriesColumn) |
|---|---|---|
| 0 | 0.011853 | african |
| 1 | 0.028925 | asian |
| 2 | 0.028373 | caucasian |
| 3 | 0.022784 | indian |
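
To make the group-by step concrete, here is a minimal sketch in plain Python: a hypothetical `group_mean` helper applied to toy rows, followed by a simple disparity ratio computed from the group means reported above. The helper and the toy rows are illustrative only; they are not Meerkat's implementation.

```python
from collections import defaultdict


def group_mean(rows, by, value):
    """Average `value` within each distinct value of the `by` key."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[by]] += row[value]
        counts[row[by]] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Toy rows (invented) to show the mechanics.
toy_rows = [
    {"ethnicity": "a", "v6_fnmr": 0.0},
    {"ethnicity": "a", "v6_fnmr": 0.5},
    {"ethnicity": "b", "v6_fnmr": 1.0},
]
toy_means = group_mean(toy_rows, by="ethnicity", value="v6_fnmr")
print(toy_means)

# Group means copied from the output above.
group_fnmr = {
    "african": 0.011853,
    "asian": 0.028925,
    "caucasian": 0.028373,
    "indian": 0.022784,
}
worst = max(group_fnmr, key=group_fnmr.get)
best = min(group_fnmr, key=group_fnmr.get)
ratio = group_fnmr[worst] / group_fnmr[best]
print(f"{worst} FNMR is {ratio:.2f}x the {best} FNMR")
```

A ratio like this summarizes the gap between groups but says nothing about group sizes or statistical significance, so treat it as a starting point for further inspection.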


We can also visualize the groups in a GroupBy with the cards interface:

In [13]:
gb.gui.cards(main_column="image", tag_columns=["v6_fnmr"])

In [None]:
dp = mk.embed(dp, input="image", num_workers=0, encoder="clip", device=0)

In [None]:
dp.map(lambda fn: ..., num_workers=10)  # TODO: supply the mapping function

In [14]:
dp = dp.merge(
    mk.DataPanel.read(base_dir / "main/rfw_embedded.mk")["image_id", "clip(image)"],
    on="image_id"
)

In [16]:
dp.gui.table()


In [23]:
dp["sunglasses"] = dp["_match_image_A photo of a person wearing sunglasses. "] > 20

dp.groupby("sunglasses")["v6_fnmr"].mean()

| | v6_fnmr (NumpyArrayColumn) | sunglasses (NumpyArrayColumn) |
|---|---|---|
| 0 | 0.019221 | False |
| 1 | 0.026599 | True |
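
The cell above turns a per-image match score into a boolean slice by thresholding it at 20. The notebook does not show how that score is computed; the sketch below assumes it is a scaled cosine similarity between a text embedding and each image embedding, which is how CLIP-style models typically compare text and images. The embeddings, the scale factor, and the image names are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-d embeddings; real CLIP embeddings have hundreds of dimensions.
text_embedding = [1.0, 0.0, 0.0]
image_embeddings = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.1, 0.9, 0.2],
    "img_c": [0.5, 0.5, 0.5],
}

SCALE = 100.0     # assumed scaling of the similarity, not taken from the notebook
THRESHOLD = 20.0  # same cut-off as in the cell above

scores = {
    image: SCALE * cosine_similarity(text_embedding, embedding)
    for image, embedding in image_embeddings.items()
}
in_slice = {image: score > THRESHOLD for image, score in scores.items()}
print(in_slice)
```

The threshold is a judgment call: a lower value admits more images with weaker matches, so it helps to inspect a sample of images on both sides of the cut-off before trusting the slice.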



## Discovering slices

The subgroup annotations provided in RFW are quite limited, so we’ll use Meerkat to *discover* new slices. We’ll start with `dp.clusterby`, which groups the rows into clusters based on the `by` column, and then use `dp.sliceby`, a method that identifies a set of slices (*i.e.* scalar functions of the `by` column) that explain the variance in the response variable.

In [24]:
cb = dp.clusterby(by="image")
cb.gui.cards(main_column="image", tag_columns=["v6_fnmr"])

In [26]:
cb["v6_fnmr"].mean()

| | v6_fnmr (NumpyArrayColumn) | image (NumpyArrayColumn) |
|---|---|---|
| 0 | 0.038069 | 0 |
| 1 | 0.022183 | 1 |
| 2 | 0.045377 | 2 |
| 3 | 0.007601 | 3 |
| 4 | 0.012937 | 4 |
| 5 | 0.027814 | 5 |
| 6 | 0.019717 | 6 |
| 7 | 0.01583 | 7 |
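
The notebook does not describe what `clusterby` does internally. One plausible reading is that it clusters an embedding of each image and then groups rows by cluster, so that per-cluster metrics can be compared. The sketch below illustrates that idea with plain k-means standing in for whatever algorithm Meerkat actually uses; the 2-d embeddings and error values are hypothetical.

```python
def kmeans(points, k, n_iter=10):
    """Plain k-means; the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments


def mean_by_cluster(values, assignments):
    """Average `values` within each cluster id."""
    totals = {}
    for value, cluster in zip(values, assignments):
        total, count = totals.get(cluster, (0.0, 0))
        totals[cluster] = (total + value, count + 1)
    return {c: total / count for c, (total, count) in totals.items()}


# Hypothetical 2-d image embeddings and per-image error rates.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
errors = [0.0, 0.4, 0.1, 0.5, 0.0, 0.3]

assignments = kmeans(embeddings, k=2)
cluster_means = mean_by_cluster(errors, assignments)
print(assignments)
print(cluster_means)
```

Clusters with a mean error well above the global rate are candidates for closer inspection in the cards view.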

In [None]:
# Column names follow the DataPanel built above: `image_id`, `ethnicity`, `v6_fnmr`.
# dp = dp[dp["ethnicity"] == "indian"]
sb = dp.sliceby(
    by="image",
    response="v6_fnmr",
    encoder="clip",
    method="domino",
)
sb.gui.slice_gallery(
    main="image",
    tags=["image_id", "v6_fnmr"],
    stats=[
        {
            "fn": "mean",
            "columns": ["v6_fnmr"],
        }
    ],
)

## Diving deeper

In practice, the slices discovered in the previous section should serve as inspiration for further exploration. One quick way to keep exploring other slices is the plot interface. Unlike standard plotting interfaces, it lets you manipulate the axes and add labels for columns that don’t yet exist.

In [None]:
dp.gui.plot(suggestions=[sb, "pca", "umap"])

In [None]:
# https://www.loom.com/share/9bde7342c0ad4a6290fffe5e322134f3