# Alternative use of the `transformers` library

- Learning goal: Illustrate a more advanced use of the `transformers` library beyond the simple `pipeline` object.

## Example 1: Image classification model

When you click "Use in transformers" on a Hugging Face model card, e.g. for the ResNet-18 model, you also see a suggestion to use code similar to the following:

In [1]:
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("microsoft/resnet-18")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-18")

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.


It uses more specific classes, `AutoImageProcessor` and `AutoModelForImageClassification`, which bundle the image preprocessing operations and the image classification model, respectively.
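To give a rough idea of what "image processing operations" means here, the sketch below shows the kind of steps an image processor typically performs: rescaling pixel values, normalizing each channel, and arranging the result as a batch in channels-first layout. This is a conceptual illustration only, not the actual implementation behind `AutoImageProcessor`. The function name `preprocess_sketch`, the synthetic input image, and the mean and standard deviation values (the commonly used ImageNet statistics) are assumptions, and resizing and cropping are left out.

```python
import numpy as np

# Assumed ImageNet channel statistics; the real values are stored in the
# processor's own configuration and may differ from model to model.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_sketch(image_hwc: np.ndarray) -> np.ndarray:
    """Turn an (H, W, 3) uint8 image into a (1, 3, H, W) float32 batch."""
    scaled = image_hwc.astype(np.float32) / 255.0          # rescale to [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD   # per-channel normalization
    chw = np.transpose(normalized, (2, 0, 1))              # channels-first layout
    return chw[np.newaxis, ...]                            # add the batch dimension


# Synthetic stand-in for a decoded image (resizing and cropping are omitted).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

pixel_values = preprocess_sketch(fake_image)
print(pixel_values.shape, pixel_values.dtype)
```

The real processor reads its settings from the model repository, so in practice you call it directly rather than reimplementing these steps.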

This is how you would use these classes, e.g. to get the top prediction:

In [2]:
from PIL import Image
import torch

PATH = "/Users/joseantonio.rodriguez15/Downloads/images"
full_path = f"{PATH}/bridge-667997_1280.jpg"
image = Image.open(full_path)

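# Preprocess the image into a batch of PyTorch tensors
model_inputs = processor(image, return_tensors="pt")

# Run inference; gradients are not needed for prediction
with torch.no_grad():
    model_output = model(**model_inputs)

# Pick the index of the highest-scoring class
logits = model_output.logits
predicted_class_idx = torch.argmax(logits, dim=-1).item()

# Map the class index to its human-readable label
label_map = model.config.id2label
predicted_class_label = label_map[predicted_class_idx]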

print(f"Predicted class label: {predicted_class_label}")



Predicted class label: suspension bridge


Through the `model_output` object, we can access the raw outputs of the underlying neural network, such as the logits for each category.
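Because the logits cover every category, they can be used for more than the single top prediction. The sketch below converts logits to probabilities with a softmax and returns the `k` most likely labels. The helper `top_k_predictions`, the toy logits, and the toy label map are hypothetical stand-ins for `model_output.logits` and `model.config.id2label`; they are not values produced by the model.

```python
import torch


def top_k_predictions(logits: torch.Tensor, id2label: dict, k: int = 3):
    """Return the k most probable (label, probability) pairs for one image."""
    probs = torch.softmax(logits, dim=-1)               # logits -> probabilities
    top_probs, top_ids = torch.topk(probs, k=k, dim=-1)  # sorted, highest first
    return [
        (id2label[idx], prob)
        for idx, prob in zip(top_ids[0].tolist(), top_probs[0].tolist())
    ]


# Hypothetical stand-ins for model_output.logits and model.config.id2label.
toy_logits = torch.tensor([[2.0, 0.5, 1.0, -1.0]])
toy_labels = {0: "suspension bridge", 1: "pier", 2: "viaduct", 3: "dam"}

for label, prob in top_k_predictions(toy_logits, toy_labels, k=3):
    print(f"{label}: {prob:.3f}")
```

With the real model, a call such as `top_k_predictions(model_output.logits, model.config.id2label)` should work the same way, assuming a batch containing a single image.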

The `transformers` library offers many other classes that support different use cases for working with deep learning models.

If you are interested in manipulating models beyond just extracting predictions, we recommend learning a bit of PyTorch alongside the `transformers` library. For illustration, this is how to extract features ("embeddings") using a ResNet model:

In [13]:
from transformers import AutoModel, AutoImageProcessor

# Load the backbone (no classification head) and its image processor
model_name = "microsoft/resnet-18"
model = AutoModel.from_pretrained(model_name)
image_processor = AutoImageProcessor.from_pretrained(model_name)

# Preprocess the image loaded in the previous cell
inputs = image_processor(images=image, return_tensors="pt")

# Forward pass; torch.no_grad() is not used here, so gradients are tracked
outputs = model(**inputs)

# Pooled features: pooler_output is the spatially averaged last hidden state
pooled_features = outputs.pooler_output

# detach() drops the gradient history so the tensor can be converted to NumPy
features = pooled_features.detach().numpy().flatten()
print("Feature Map Shape:", features.shape)


Feature Map Shape: (512,)
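A common use of embedding vectors such as the 512-dimensional one above is to compare images by similarity. The sketch below computes the cosine similarity between feature vectors with NumPy. The function `cosine_similarity` and the random vectors are illustrative assumptions; in practice each vector would come from running the model on a different image.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Random stand-ins for 512-dimensional embeddings (not real model outputs).
rng = np.random.default_rng(0)
features_a = rng.normal(size=512)
features_b = features_a + 0.1 * rng.normal(size=512)  # slightly perturbed copy of a
features_c = rng.normal(size=512)                     # unrelated vector

print("a vs b:", cosine_similarity(features_a, features_b))
print("a vs c:", cosine_similarity(features_a, features_c))
```

Values close to 1 indicate vectors pointing in nearly the same direction, so the perturbed copy scores much higher than the unrelated vector.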
