# Inference

Colab link [here](https://colab.research.google.com/drive/16g7itVEzx2C6qVLkJu9KtgxzgdIciEJ8?usp=sharing)

<br>

In machine learning, inference is the act of using a fully trained model to make predictions on new inputs. This tutorial covers how to run inference correctly with your models.

Inference differs from training in several ways. It requires no gradients and no parameter updates, only a forward pass. If you've already seen my lesson on **evaluation**, this will look familiar. Before sending any inputs, call `model.eval()` so that layers such as dropout and batch normalization switch to their inference behavior, and run the forward pass inside a `torch.no_grad()` block so that PyTorch does not track gradients.
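To make the effect of these two calls concrete, here is a minimal sketch. The tiny network (a linear layer followed by dropout) is a hypothetical stand-in chosen for illustration, not a model from this lesson.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# hypothetical toy model, used only to illustrate the two calls
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.randn(1, 4)

# training mode: dropout is active and the output is part of the autograd graph
model.train()
train_out = model(x)
print(train_out.requires_grad)  # True

# inference: dropout is disabled and no autograd graph is built
model.eval()
with torch.no_grad():
    out_a = model(x)
    out_b = model(x)

print(out_a.requires_grad)        # False
print(torch.equal(out_a, out_b))  # True
```

The two calls do different jobs: `model.eval()` changes how certain layers behave, while `torch.no_grad()` only turns off gradient tracking. Using one without the other is a common source of subtle mistakes.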

### Important Note

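During training, inputs to the model are usually preprocessed. The same preprocessing steps must be applied during inference, otherwise the model receives inputs unlike the ones it was trained on. Finally, if needed, the preprocessing is reversed on the model's output so that the result is readable.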

For example, if all the training images are scaled to the range [0, 1], then the inference inputs must also be scaled to [0, 1].
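One way to keep the two in sync is to route both training and inference inputs through a single function. The sketch below assumes 8-bit images with pixel values in [0, 255]; the function name `preprocess` and the tensor shapes are illustrative choices.

```python
import torch

def preprocess(images: torch.Tensor) -> torch.Tensor:
    """Scale 8-bit pixel values from [0, 255] to [0, 1]."""
    return images.float() / 255.0

# during training: a batch of 8 fake images
train_batch = torch.randint(0, 256, (8, 3, 50, 50), dtype=torch.uint8)
train_inputs = preprocess(train_batch)

# during inference: the same function, applied to a single fake image
new_image = torch.randint(0, 256, (1, 3, 50, 50), dtype=torch.uint8)
inference_input = preprocess(new_image)

# both sets of inputs now live in the same range
print(train_inputs.min().item(), train_inputs.max().item())
print(inference_input.min().item(), inference_input.max().item())
```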

Another example, from LLMs, is tokenization. Text is split into tokens before it is fed to the model, both during training and during inference. Since humans can't read tokens, the model's output tokens are converted back to words.
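The round trip from text to tokens and back can be sketched with a toy word-level tokenizer. The vocabulary and the `encode`/`decode` helpers below are hypothetical; real LLM tokenizers are learned from data and usually split text into subword pieces.

```python
# toy vocabulary (hypothetical, for illustration only)
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "down": 4}
id_to_token = {i: t for t, i in vocab.items()}

def encode(text):
    """Map each lowercase word to its token ID, or to <unk> if unknown."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def decode(ids):
    """Map token IDs back to words."""
    return " ".join(id_to_token[i] for i in ids)

ids = encode("The cat sat down")
print(ids)          # [1, 2, 3, 4]
print(decode(ids))  # the cat sat down
```

The same `encode` step has to be used for training and inference, and `decode` plays the role of reversing the preprocessing on the model's output.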

Here's a simple example of inference with a model.

In [None]:
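# import necessary modules
import torch
import torch.nn as nn

# fake raw image with pixel values in [0, 255]
image = torch.randint(0, 256, (1, 3, 50, 50)).float() # batch dimension is included even in inference

# preprocess image (placeholder: must match the preprocessing used in training)
def preprocess(x):
    return x / 255.0

processed_image = preprocess(image)

# code for the real model is omitted; this untrained placeholder stands in for it
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 50 * 50, 10))

# inference
model.eval()
with torch.no_grad():
    # forward pass only
    output = model(processed_image)
    prediction = output.argmax(dim=1) # index of the highest-scoring class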
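
The predicted index is usually not the final answer. As a rough sketch of the last step, the snippet below maps indices back to readable labels. The `class_names` list and the hard-coded `output` tensor are made up for illustration; in practice the labels come from the training dataset and the logits come from the model.

```python
import torch

# hypothetical label names, for illustration only
class_names = ["cat", "dog", "bird"]

# stand-in for the model's raw output (logits) for a batch of 2 images
output = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])

# softmax turns logits into probabilities; argmax picks the most likely class
probabilities = torch.softmax(output, dim=1)
prediction = output.argmax(dim=1)

labels = [class_names[i] for i in prediction.tolist()]
print(labels)  # ['cat', 'bird']
```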