# Hibou Model Usage Example

This notebook showcases the basic usage of the Hibou model. The minimal installation needed to run it is:
```bash
pip install torch torchvision opencv-python matplotlib
```

Note that the HuggingFace Hub example below also imports `transformers`, which the command above does not install.

In [None]:
import torch, torchvision
import matplotlib.pyplot as plt
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

#### Load the test image.

In [None]:
image = Image.open("images/sample.png").convert("RGB")
plt.imshow(image)
plt.axis('off')
plt.show()

## HuggingFace Hub Example

In [None]:
from transformers import AutoImageProcessor, AutoModel

processor = AutoImageProcessor.from_pretrained("histai/hibou-b", trust_remote_code=True)
hf_model = AutoModel.from_pretrained("histai/hibou-b", trust_remote_code=True)

In [None]:
hf_data = processor(images=image, return_tensors="pt").to(device)
hf_model = hf_model.to(device)
hf_model.eval()

with torch.no_grad():
    hf_output = hf_model(**hf_data)

print(hf_output.pooler_output.shape)

## Local Example

Download the model weights from [Google Drive](https://drive.google.com/file/d/12ICd_-yJWMYYo5OskMmc9SHJAQivAtS7/view?usp=sharing) and put them in the root of the hibou directory.

The cell below should work without installing anything, but if you'd like to use the model from anywhere, `cd` to the hibou directory and run:
```bash
pip install -r requirements.txt && pip install -e .
```

In [None]:
from hibou import build_model

model = build_model(weights_path="hibou-b.pth")

print("Total parameters:", sum(p.numel() for p in model.parameters()))

#### Get the features

In [None]:
transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224), interpolation=torchvision.transforms.InterpolationMode.BICUBIC),
    torchvision.transforms.CenterCrop((224, 224)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.7068, 0.5755, 0.7220], std=[0.1950, 0.2316, 0.1816]),
])

data = transforms(image).unsqueeze(0).to(device)
model = model.to(device)
model.eval()

with torch.no_grad():
    output = model(data)

print(output.shape)
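
The `Normalize` step in the transform above standardizes each channel as `(x - mean) / std`. The following is a minimal sketch of that arithmetic in plain PyTorch, using a random tensor as a stand-in for a real image. The helper name `normalize_channels` is hypothetical and is not part of hibou or torchvision; only the mean and std values are taken from the cell above.

```python
import torch

# Per-channel statistics copied from the transform in the cell above.
MEAN = torch.tensor([0.7068, 0.5755, 0.7220])
STD = torch.tensor([0.1950, 0.2316, 0.1816])


def normalize_channels(x: torch.Tensor) -> torch.Tensor:
    """Standardize a (C, H, W) image tensor channel by channel (hypothetical helper)."""
    # Reshape the statistics to (C, 1, 1) so they broadcast over height and width.
    return (x - MEAN[:, None, None]) / STD[:, None, None]


# Stand-in for the output of ToTensor(): a random RGB image with values in [0, 1).
fake_image = torch.rand(3, 224, 224)
normalized = normalize_channels(fake_image)

# A pixel exactly equal to the channel means maps to zero after normalization.
mean_pixel = MEAN[:, None, None]
print(normalized.shape, normalize_channels(mean_pixel).abs().max().item())
```

Because the statistics are fixed constants, the same values have to be used at inference time for the features to be comparable with those from the HuggingFace processor.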

#### Get intermediate features (for example, when building a segmentation model)

In [None]:
with torch.no_grad():
    extended_output = model.forward_features(data, return_intermediate=True)

print(extended_output.keys())
print(f"Total intermediate outputs: {len(extended_output['intermediate'])}")
print(f"The shape of the intermediate output: {extended_output['intermediate'][-1].shape}")
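
A segmentation head usually expects a spatial feature map rather than a flat sequence of tokens. The sketch below shows one way to reshape tokens into such a map. It assumes the intermediate output has shape `(B, N, C)` and contains only patch tokens in row-major order; neither assumption is confirmed by this notebook, so check the shape printed above and drop any class or register tokens first if they are present. The helper `tokens_to_feature_map`, the 16 x 16 grid, and the 768 channels are illustrative, not values read from the model.

```python
import torch


def tokens_to_feature_map(tokens: torch.Tensor, grid_size: tuple[int, int]) -> torch.Tensor:
    """Reshape (B, N, C) patch tokens into a (B, C, H, W) feature map (hypothetical helper)."""
    batch, num_tokens, channels = tokens.shape
    height, width = grid_size
    if num_tokens != height * width:
        raise ValueError(f"expected {height * width} patch tokens, got {num_tokens}")
    # (B, N, C) -> (B, H, W, C) -> (B, C, H, W)
    return tokens.reshape(batch, height, width, channels).permute(0, 3, 1, 2).contiguous()


# Random stand-in for one intermediate output.
# The grid size and channel count are illustrative assumptions.
fake_tokens = torch.randn(1, 16 * 16, 768)
feature_map = tokens_to_feature_map(fake_tokens, grid_size=(16, 16))
print(feature_map.shape)
```

The explicit token-count check is deliberate: if extra tokens are still in the sequence, the reshape fails loudly instead of silently producing a misaligned map.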

#### Compare the HuggingFace Hub and local outputs

If you have run both the HuggingFace Hub example and the local example, run the cell below to check that the two outputs are very close. Any difference should be very small and attributable to rounding errors.

In [None]:
# Take abs() before averaging so that positive and negative differences cannot cancel out.
print((output - hf_output.pooler_output).abs().mean())
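
A single averaged number can hide outliers, so it can help to report the largest absolute difference and a tolerance-based check as well. The sketch below does this on synthetic tensors so it runs without the model. The helper `compare_outputs`, the tolerance `atol=1e-4`, and the 768-dimensional embedding are illustrative assumptions, not values prescribed by hibou.

```python
import torch


def compare_outputs(a: torch.Tensor, b: torch.Tensor, atol: float = 1e-4) -> dict:
    """Summarize how far apart two same-shaped tensors are (hypothetical helper)."""
    diff = (a - b).abs()
    return {
        "max_abs_diff": diff.max().item(),
        "mean_abs_diff": diff.mean().item(),
        "allclose": torch.allclose(a, b, atol=atol),
    }


# Synthetic stand-ins for the two embeddings: identical up to tiny noise.
torch.manual_seed(0)
reference = torch.randn(1, 768)
noisy_copy = reference + 1e-6 * torch.randn(1, 768)
print(compare_outputs(reference, noisy_copy))
```

With the real tensors, the call would be `compare_outputs(output, hf_output.pooler_output)`, provided both cells above have been run and the two tensors have the same shape.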