# Inference

Down the road, you will need to make real-life predictions using the models that you've trained.

Inference is a breeze with AIQC because it persists all of the information that we need to preprocess our new samples and reconstruct our model.

Normally, the challenge with inference is preprocessing your new samples the same way that you processed your training samples. Additionally, if you provide labels with your new data for the purpose of evaluation, then PyTorch requires you to reconstruct parts of your model, like your optimizer, in order to calculate loss.
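
To make the preprocessing challenge concrete, here is a minimal sketch in plain Python. It is not AIQC's implementation: the helper names `fit_standardizer` and `apply_standardizer` and the sample values are hypothetical. The idea is that scaling parameters are learned once from the training samples, stored, and then reused unchanged on new samples.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Learn the mean and standard deviation from training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    # Guard against a constant column, which would otherwise divide by zero.
    return {"mean": mu, "std": sigma if sigma > 0 else 1.0}


def apply_standardizer(params, values):
    """Scale values with previously fitted parameters; never refit here."""
    return [(v - params["mean"]) / params["std"] for v in values]


# Hypothetical sepal lengths seen during training.
train_sepal_length = [5.1, 4.9, 6.3, 5.8, 7.1]
params = fit_standardizer(train_sepal_length)

# New samples at inference time reuse the stored parameters.
new_sepal_length = [6.1, 4.7]
scaled_new = apply_standardizer(params, new_sepal_length)
```

Refitting the scaler on the new samples would shift them onto a different scale than the one the model was trained on, which is why the fitted parameters need to be persisted alongside the model.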

---

In [2]:
import aiqc
from aiqc import datum
from aiqc import tests

Below, we just train a model so that we have an example to work with when making inference-based predictions.

In [3]:
%%capture
queue_multiclass = tests.make_test_queue('keras_multiclass')
queue_multiclass.run_jobs()

## Predictor

Let's say that we have a trained model in the form of a `Predictor`,

In [4]:
predictor = queue_multiclass.jobs[0].predictors[0]

and that we have samples that we want to generate predictions for.

## New Splitset

In [5]:
df = datum.to_pandas('iris.tsv').sample(10)

In [6]:
df[:5]

| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 63 | 6.1 | 2.9 | 4.7 | 1.4 | versicolor |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 101 | 5.8 | 2.7 | 5.1 | 1.9 | virginica |
| 83 | 6.0 | 2.7 | 5.1 | 1.6 | versicolor |
| 74 | 6.4 | 2.9 | 4.3 | 1.3 | versicolor |


Using the high-level API, we'll fashion a new `Splitset` from the samples that we want to predict.

- Leave the `label_column` blank if you are conducting pure inference where you don't know the real Label/target.
- Otherwise, `splitset.label` will be used to generate metrics for your new predictions.

In [7]:
splitset = aiqc.Pipeline.Tabular.make(
    dataFrame_or_filePath = df
    , label_column = 'species'
)

## Run Inference

Then pass that `Splitset` to `Predictor.infer()`.

`infer` validates that the schema of your new Splitset's `Feature` and `Label` matches the schema of the original training Splitset. It also ignores any splits that you make, fetching the entire Feature and Label.

- `Dataset.Tabular` schema includes column ordering and dtype.
- `Dataset.Image` schema includes Pillow size (height/width) and mode (color dimensions).
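
The tabular schema check described above can be pictured with a small sketch. This is an illustration under assumptions, not AIQC's actual validation code: `schema_mismatches` is a hypothetical helper, and each schema is represented here as an ordered list of `(column_name, dtype_name)` pairs.

```python
def schema_mismatches(train_schema, new_schema):
    """Return a list of problems; an empty list means the schemas match.

    Each schema is an ordered list of (column_name, dtype_name) pairs,
    so column order and dtype are both compared position by position.
    """
    problems = []
    if len(train_schema) != len(new_schema):
        problems.append(
            f"expected {len(train_schema)} columns, got {len(new_schema)}"
        )
    for position, (expected, actual) in enumerate(zip(train_schema, new_schema)):
        expected_name, expected_dtype = expected
        actual_name, actual_dtype = actual
        if expected_name != actual_name:
            problems.append(
                f"position {position}: expected column {expected_name!r}, "
                f"got {actual_name!r}"
            )
        elif expected_dtype != actual_dtype:
            problems.append(
                f"column {expected_name!r}: expected dtype {expected_dtype}, "
                f"got {actual_dtype}"
            )
    return problems


# Hypothetical schema of the training features.
train_schema = [
    ("sepal_length", "float64"),
    ("sepal_width", "float64"),
    ("petal_length", "float64"),
    ("petal_width", "float64"),
]

# Same columns, but the first two are swapped.
reordered_schema = [train_schema[1], train_schema[0]] + train_schema[2:]
# Same order, but the first column has a different dtype.
retyped_schema = [("sepal_length", "int64")] + train_schema[1:]

matching = schema_mismatches(train_schema, list(train_schema))
reordered = schema_mismatches(train_schema, reordered_schema)
retyped = schema_mismatches(train_schema, retyped_schema)
```

Comparing by position rather than by name is what makes column ordering part of the schema: a reordered frame fails even though it contains the same columns.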

In [8]:
prediction = predictor.infer(splitset_id=splitset.id)

- The key in each dictionary-based `Prediction` attribute will be equal to `str(splitset.id)`.
- If you trained on encoded Labels, don't worry, the output will be `inverse_transform`'ed.

In [9]:
prediction.predictions

{'8': array(['versicolor', 'setosa', 'virginica', 'versicolor', 'versicolor',
        'setosa', 'virginica', 'setosa', 'virginica', 'setosa'],
       dtype=object)}
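
The shape of this output can be reproduced with a minimal sketch, assuming a simple ordinal encoding of the labels. The `decode_predictions` helper, the class list, and the ID value are hypothetical stand-ins, and AIQC's own decoding may work differently.

```python
def decode_predictions(class_indices, classes):
    """Map integer class indices back to their original string labels."""
    return [classes[index] for index in class_indices]


# Hypothetical ordinal encoding that was fitted on the training labels.
classes = ["setosa", "versicolor", "virginica"]

# Hypothetical raw model output for three new samples.
raw_predictions = [1, 0, 2]

# Results are keyed by the string form of the Splitset's ID.
splitset_id = 8
predictions = {str(splitset_id): decode_predictions(raw_predictions, classes)}
```

Looking up a result then requires the string key, for example `predictions[str(splitset_id)]`, rather than the integer ID.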

For more information on the `Prediction` object, see the [Low-Level API](api_low_level.html) documentation.