# How to use `scivision`

In this notebook, we will:

1. Demonstrate using the scivision [Python API](https://scivision.readthedocs.io/en/latest/api.html) to load a pretrained (ImageNet) model, which we previously added to the scivision catalog with the name "scivision-test-plugin", as per [this guide](https://scivision.readthedocs.io/en/latest/contributing.html#extending-the-scivision-catalog)
2. Use the scivision catalog to find a matching dataset, which the model can be run on
3. Run the model on the data, performing simple model inference

Note: The model repository follows the structure specified in [this template](https://scivision.readthedocs.io/en/latest/model_repository_template.html), including a `scivision` [model config file](https://github.com/alan-turing-institute/scivision-test-plugin/blob/main/.scivision/model.yml).

We first import some things from scivision: `default_catalog` is a scivision **catalog** that will let us discover models and datasets, and `load_pretrained_model` provides a convenient way to load and run a model.

In [None]:
from scivision import default_catalog, load_pretrained_model

## Inspecting our model in the scivision catalog

A scivision catalog is a collection of **models** and **datasources**.

For this example, we want to find datasources compatible with "scivision-test-plugin". But first, let's use the catalog to retrieve the "scivision-test-plugin" repository URL, take a look at the other models in the *default catalog* (the built-in catalog, distributed as part of scivision), and see how this catalog is structured.

In [None]:
# Get the model repo url
models_catalog = default_catalog.models.to_dataframe()
stp_repo = models_catalog[models_catalog.name == "scivision-test-plugin"].url.item()
stp_repo # Why not paste the repo link into your browser and see how it looks?
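The `.item()` call in the cell above only succeeds when the filter matches exactly one row; pandas raises a `ValueError` otherwise. As a rough illustration of that "exactly one match" lookup, here is a plain-Python sketch on a list of dictionaries. The records, names and URLs are hypothetical placeholders, not real catalog entries, and `find_unique_url` is a helper invented for this example rather than part of scivision.

```python
# Hypothetical catalog records; names and URLs are placeholders only.
records = [
    {"name": "example-model-a", "url": "https://example.org/model-a"},
    {"name": "example-model-b", "url": "https://example.org/model-b"},
]


def find_unique_url(records, name):
    """Return the url of the single record called `name`, or raise ValueError."""
    matches = [r["url"] for r in records if r["name"] == name]
    if len(matches) != 1:
        # Mirrors the failure mode of `.item()`: zero or several matches is an error.
        raise ValueError(
            f"expected exactly one entry named {name!r}, found {len(matches)}"
        )
    return matches[0]


url = find_unique_url(records, "example-model-a")
print(url)
```

Failing loudly on zero or duplicate matches is usually preferable here, since silently picking the first match could load the wrong repository.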

In [None]:
# Inspecting models in the default catalog
default_catalog.models.to_dataframe().head()

## Loading the model

Next, let's load the "scivision-test-plugin" model using the scivision Python API, specifically the `load_pretrained_model` function:

In [None]:
model = load_pretrained_model(stp_repo, allow_install=True)

In [None]:
# let's explore the model object
model

Later, we'll use this ImageNet model to make predictions on image data found in the scivision catalog.

## Query the default scivision catalog

Now let's use the `default_catalog` to identify datasources in the catalog that are compatible with the model (based on `tasks`, `format` and `labels_provided`/`labels_required`).

In [None]:
compatible_datasources = default_catalog.compatible_datasources("scivision-test-plugin").to_dataframe()
compatible_datasources
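To make the idea of compatibility more concrete, here is a minimal plain-Python sketch of how a match on `tasks`, `format` and `labels_provided`/`labels_required` could work. The three rules in `is_compatible` are assumptions made for illustration, and the model and datasource records are hypothetical; scivision's own implementation may differ.

```python
def is_compatible(model, datasource):
    """Assumed compatibility rules; not taken from the scivision source."""
    # Assumed rule 1: the model and the datasource share at least one task.
    shares_task = bool(set(model["tasks"]) & set(datasource["tasks"]))
    # Assumed rule 2: the datasource format is the one the model expects.
    same_format = model["format"] == datasource["format"]
    # Assumed rule 3: labels only need to be provided if the model requires them.
    labels_ok = datasource["labels_provided"] or not model["labels_required"]
    return shares_task and same_format and labels_ok


# Hypothetical entries, shaped loosely like catalog rows.
model = {"tasks": ["classification"], "format": "image", "labels_required": False}
datasources = [
    {"name": "ds-labelled", "tasks": ["classification"], "format": "image", "labels_provided": True},
    {"name": "ds-unlabelled", "tasks": ["classification"], "format": "image", "labels_provided": False},
    {"name": "ds-other-task", "tasks": ["segmentation"], "format": "image", "labels_provided": True},
]

compatible = [d["name"] for d in datasources if is_compatible(model, d)]
print(compatible)
```

Under these assumed rules, the segmentation-only datasource is filtered out, while both classification datasources remain because the hypothetical model does not require labels.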

Let's use `data-003`, a single-image dataset (of a baby koala!):

In [None]:
target_datasource = compatible_datasources.loc[compatible_datasources['name'] == 'data-003']
target_datasource

## Load the dataset

Now let's load the dataset using the scivision Python API, specifically the [load_dataset](https://scivision.readthedocs.io/en/latest/api.html#scivision.io.reader.load_dataset) function. It takes as input the URL of the data repository (structured as per [this template](https://scivision.readthedocs.io/en/latest/data_repository_template.html)), which we can get from the target datasource:

In [None]:
from scivision import load_dataset

In [None]:
data_url = target_datasource['url'].item()

The data config object returned by the `load_dataset` function is an "intake catalog". You can read our [documentation](https://scivision.readthedocs.io/en/latest/data_repository_template.html#data-config-file) to understand this better, but for now, let's inspect this config:

In [None]:
data_config = load_dataset(data_url)
data_config

Clicking the `path` link to the location of this data config file online (in the dataset repo) reveals that there is one data source called `test_image`, and that the `intake_xarray.image.ImageSource` driver is being used. We can retrieve the test image data in an image format which the model will accept, like so:

In [None]:
test_image = data_config.test_image().to_dask() # The xarray.DataArray is one format accepted by the ImageNet model
test_image

Let's take a look at the image with `matplotlib`:

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.imshow(test_image)

## Model predictions

Now let's use the loaded model on the test image data we found via the catalog.

In [None]:
model.predict(test_image)

As you can see, the model returns a predicted label for the test image, along with a confidence score. Check out the code in the model repo to see how this was determined!
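As a generic illustration of where such a confidence score can come from, the sketch below applies a softmax to raw class scores and reports the most probable class. This is a common approach for image classifiers, but it is an assumption here: the labels and scores are made up, and the model repo remains the authority on how this particular model computes its output.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels):
    """Return the most probable label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical labels and raw scores, for illustration only.
labels = ["koala", "wombat", "teddy"]
logits = [4.0, 1.0, 0.5]

label, confidence = top_prediction(logits, labels)
print(f"{label}: {confidence:.2%}")
```

Because softmax probabilities always sum to 1, the score only tells you how strongly the model favours one class relative to the others it knows about.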