# Running a neural network using a library

This notebook continues from the first one: we keep using [Hugging Face Transformers](https://github.com/huggingface/transformers). We could also use [Hugging Face Diffusers](https://github.com/huggingface/diffusers), which handles image generation and follows the same workflow.

This time we will look more closely at the code we used.


First we install the library.

In [None]:
!pip install -q transformers

## Pipelines

This library can run many different tasks, which it organizes into what it calls [pipelines](https://huggingface.co/docs/transformers/pipeline_tutorial). Pipelines are defined by task or by input/output format, such as DepthEstimation or TextToAudio ([list](https://huggingface.co/docs/transformers/main_classes/pipelines)).

We can create a pipeline by giving the task name:

```
transcriber = pipeline(task="automatic-speech-recognition")
```
or by selecting a specific model:
```
transcriber = pipeline(model="openai/whisper-large-v2")
```

In [None]:
from transformers import pipeline

captioner = pipeline(model="Salesforce/blip-image-captioning-large")

Once the pipeline is created, we can give it an input; it processes the input and returns an output.

In [None]:
output = captioner("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")

The output format varies with the pipeline.

In [None]:
print(output)
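
For the image captioning pipeline used here, the printed result is typically a list with one dictionary per generated caption, each holding the text under a `generated_text` key. The sketch below shows one way to pull the caption strings out of a result with that shape. The `extract_captions` helper and the `sample_output` value are illustrative assumptions, not part of the library and not real model output, so check what your own pipeline prints before relying on this structure.

```python
def extract_captions(result):
    """Return the caption strings from an image-captioning pipeline result.

    Assumes `result` is a list of dicts with a "generated_text" key.
    """
    return [item["generated_text"].strip() for item in result]


# Hypothetical result, shaped like the printed output (not real model output).
sample_output = [{"generated_text": "two parrots sitting on a branch "}]

captions = extract_captions(sample_output)
print(captions[0])
```

With a real pipeline, the same helper could be called as `extract_captions(output)`. Other pipelines return different keys, so the helper would need to be adapted for them.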

We can also pass parameters to control the processing; the available parameters vary with the pipeline. Here we limit the maximum output length (not very useful in this case).

In [None]:
output = captioner("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", max_new_tokens=8)
print(output)

## Models

We can also search for a specific model and use it directly, without a pipeline. The code is more complex, but it gives us more control.

We can search for a model at https://huggingface.co/models; the model pages usually provide the code needed to run them.

In this example we are going to use the DPT model for Depth Estimation https://huggingface.co/Intel/dpt-large.


In [None]:
from transformers import DPTImageProcessor, DPTForDepthEstimation
import torch
import numpy as np
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = DPTImageProcessor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")

# prepare image for the model
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    predicted_depth = outputs.predicted_depth

# interpolate to original size
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bicubic",
    align_corners=False,
)

# visualize the prediction
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)

We display the output.

In [None]:
display(depth)
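
The last lines of the model cell above rescale the depth values so that the largest one becomes 255, then convert them to 8-bit integers so they can be shown as a grayscale image. The sketch below repeats that step on a small made-up array, so the effect is visible without running the model. The `to_uint8` helper and the sample values are assumptions for illustration only, and the helper assumes non-negative depths with a maximum above zero.

```python
import numpy as np


def to_uint8(depth_map):
    """Scale a non-negative array so its maximum maps to 255, as uint8."""
    depth_map = np.asarray(depth_map, dtype="float32")
    return (depth_map * 255 / np.max(depth_map)).astype("uint8")


# Made-up depth values (not model output): 2 rows x 3 columns.
fake_depth = np.array([[0.0, 5.0, 10.0],
                       [2.5, 7.5, 10.0]])

gray = to_uint8(fake_depth)
print(gray)
```

Note that `astype("uint8")` truncates rather than rounds, so a scaled value of 127.5 is stored as 127. The scaling is relative to each image's own maximum, so brightness is not comparable between two different depth maps.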

## Finalizing

When you finish working, remember to **stop the runtime**: sessions have a time limit, and stopping the runtime avoids wasting resources. To stop it, click Manage Sessions in the Runtime menu and, once the dialog opens, click Terminate on the current runtime.

> But when you stop the runtime everything you have not saved is ⚠ **lost** ⚠, so be sure to **download** everything you want to keep before stopping it.
