# Stream Model To GPU

In this notebook we demonstrate how to read tensors with the RunAI Model Streamer and copy them to GPU memory.

## Prerequisite
Run this notebook on a Linux machine with a GPU.

## Preparation
We will start by downloading an example `.safetensors` file. Feel free to use your own.

In [11]:
import subprocess

url = 'https://huggingface.co/vidore/colpali/resolve/main/adapter_model.safetensors?download=true'
local_filename = 'model.safetensors'

# Download the file to local_filename; check=True raises CalledProcessError if wget fails.
wget_command = ['wget', '--content-disposition', url, '-O', local_filename]
subprocess.run(wget_command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

## Streaming

To load the tensors from the file, we create a `SafetensorsStreamer` instance, issue the read request, and transfer each tensor to GPU memory.

In [None]:
from runai_model_streamer import SafetensorsStreamer

file_path = "model.safetensors"

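with SafetensorsStreamer() as streamer:
    # Issue the read request for the file.
    streamer.stream_file(file_path)
    # Tensors are yielded one at a time as they become available.
    for name, tensor in streamer.get_tensors():
        # Copy the tensor to the first GPU (device strings are lowercase).
        gpu_tensor = tensor.to('cuda:0')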

Each yielded tensor is copied to the GPU while the streamer keeps reading the next tensors in the background, so reading from storage and copying to the GPU overlap instead of running one after the other.
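
The overlap described above follows a producer/consumer pattern. The sketch below illustrates that pattern with the Python standard library only; it is not the streamer's actual implementation. The names `read_chunks`, `consume`, and `run_pipeline` are hypothetical, the `time.sleep` calls stand in for storage reads and GPU copies, and the payload is a placeholder rather than a real tensor.

```python
import queue
import threading
import time


def read_chunks(names, out_queue, read_delay=0.01):
    """Producer: stands in for a background reader (hypothetical)."""
    for name in names:
        time.sleep(read_delay)            # simulated storage latency
        out_queue.put((name, bytes(4)))   # placeholder payload, not a tensor
    out_queue.put(None)                   # sentinel: nothing left to read


def consume(in_queue, copy_delay=0.01):
    """Consumer: stands in for the per-tensor copy to the GPU."""
    copied = []
    while True:
        item = in_queue.get()
        if item is None:                  # producer has finished
            break
        name, _payload = item
        time.sleep(copy_delay)            # simulated host-to-device copy
        copied.append(name)
    return copied


def run_pipeline(names, read_delay=0.01, copy_delay=0.01):
    """Run the reader in a thread while the caller consumes items in order."""
    q = queue.Queue(maxsize=2)            # small buffer bounds memory use
    reader = threading.Thread(target=read_chunks, args=(names, q, read_delay))
    reader.start()
    copied = consume(q, copy_delay)
    reader.join()
    return copied
```

Calling `run_pipeline(["w1", "w2", "w3"])` returns the names in the order they were read. While the consumer is busy with one item, the reader thread is already fetching the next, which is why the total time approaches the slower of the two stages rather than their sum.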