<img src='https://gitlab.eumetsat.int/eumetlab/oceans/ocean-training/tools/frameworks/-/raw/main/img/Standard_banner.png' align='right' width='100%'/>

<font color="#138D75">**Copernicus EUMETSAT**</font> <br>
**Copyright:** 2025 EUMETSAT <br>
**License:** MIT <br>
**Authors:** Hugues Sassier (Thales Alenia Space), Anna-Lena Erdmann (EUMETSAT)

<html>
  <div style="width:100%">
    <div style="float:left"><a href="https://jupyterhub.prod.wekeo2.eu/hub/user-redirect/lab/tree/public/wekeo4data/wekeo-gpu/ollama_image_description.ipynb"><img src="https://img.shields.io/badge/launch-WEKEO-1a4696.svg?style=flat&logo=" alt="Open in WEkEO"></a></div>
    <div style="float:left"><p>&emsp;</p></div>
  </div>    
</html>

<div class="alert alert-block alert-success">
<h3> Use Cases for the WEkEO Workspace GPUs </h3></div>

<div class="alert alert-block alert-warning">
    
<b>PREREQUISITES </b>
    
This notebook has the following prerequisites:
  - **<a href="https://my.wekeo.eu/user-registration" target="_blank">A WEkEO account</a>**
  - **Execution of the notebook under the WEkEO Workspace <a href="https://help.wekeo.eu/en/articles/7945473-which-are-the-computing-resources-of-the-wekeo-jupyterhub" target="_blank"> Machine Learning (GPU) server</a>**
  - minimum of **10 GB of free storage space** on your WEkEO Workspace/device
  

</div>
<hr>

# Ollama image description using the WEkEO Workspace GPUs

### Learning outcomes

At the end of this notebook you will know:

* how to set up an Ollama server
* how to use the WEkEO Workspace GPUs to run the LLaVA 7B multimodal model
* how to use the model to generate image descriptions of EO data


### Outline

This notebook introduces Ollama, a lightweight framework for running large language and vision models locally, and demonstrates the use of LLaVA 7B, a vision-language model designed to interpret both images and text. While LLaVA is commonly used for tasks like visual question answering or image captioning, this notebook explores its potential for analyzing satellite imagery and Earth Observation (EO) data. The example acts as a demonstrator for using the GPU resources provided by WEkEO to run lightweight LLMs and multimodal models efficiently on the JupyterHub.

<div class="alert alert-info" role="alert">

### Contents <a id='totop'></a>

</div>

 0. [Check Available Storage Space](#section00)   
 1. [Set up the Ollama Server](#section0)
 2. [Pull the Language Model](#section1)
 3. [Generate Image descriptions of EO data](#section2)


<hr>

<div class="alert alert-info" role="alert">

## 0. <a id='section00'></a>Check Available Storage Space
[Back to top](#totop)
    
</div>

A prerequisite of this notebook is the availability of 10 GB of free storage space. The cell below checks if you have enough free storage space in your WEkEO JupyterHub to execute this notebook. 

In [1]:
import shutil
import os

# Path to check
path = "/home/jovyan"

# Only perform the check if the path exists
if os.path.exists(path):
    # Get disk usage statistics
    total, used, free = shutil.disk_usage(path)

    # Convert bytes to gigabytes
    free_gb = free / (1024 ** 3)

    # Raise error if less than 10 GB are free
    if free_gb < 10:
        raise RuntimeError("❌ Please free some space. To execute this Notebook you need at least 10 GB of storage space available in your workspace.")
    else:
        print(f"✅ Disk check passed: {free_gb:.2f} GB free.")
else:
    print(f"⚠️ Path '{path}' does not exist. Skipping disk space check.")


⚠️ Path '/home/jovyan' does not exist. Skipping disk space check.
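
The output above shows the check being skipped because the hard-coded path was not found. A minimal sketch of a more portable variant follows, assuming the user's home directory sits on the same volume as the workspace (this may not hold on every system). The helper name `free_space_gb` is hypothetical and not part of the original notebook.

```python
import os
import shutil
from pathlib import Path


def free_space_gb(path):
    """Return the free space, in GiB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / (1024 ** 3)


# Assumption: the home directory is on the same volume as the workspace.
# Fall back to the current working directory if it cannot be found.
home = Path(os.path.expanduser("~"))
workspace = home if home.exists() else Path.cwd()
free_gb = free_space_gb(workspace)

if free_gb < 10:
    print(f"Only {free_gb:.2f} GB free in {workspace}; at least 10 GB are needed.")
else:
    print(f"Disk check passed: {free_gb:.2f} GB free in {workspace}.")
```

Unlike the cell above, this variant only prints a message instead of raising an error, so it can be used as a quick diagnostic without interrupting the notebook.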


<div class="alert alert-info" role="alert">

## 1. <a id='section0'></a>Set up the Ollama Server
[Back to top](#totop)
    
</div>


To use the LLaVA 7B model, we first need to install the **Ollama server**, which allows you to run large multimodal models locally or on a cloud GPU. 

Run the following commands to install Ollama on a Linux system:

```bash
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
tar -C ~/.local -xzf ollama-linux-amd64.tgz
```

> ⚠️ **Note:** The download is approximately **1.6 GB**, so make sure you have sufficient disk space and a stable internet connection.

You can install Ollama directly from the notebook cell using `!`.


In [1]:
!curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz 

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 1621M  100 1621M    0     0   182M      0  0:00:08  0:00:08 --:--:--  209M


In [2]:
!tar -xzf ollama-linux-amd64.tgz
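
Whether the later `ollama` commands work depends on the extracted executable being found on your `PATH`. A small sketch for checking this follows. It assumes the archive unpacks an executable at `bin/ollama` relative to the extraction directory, which is an assumption about the archive layout and not stated in this notebook. `find_ollama` is a hypothetical helper.

```python
import os
import shutil
from pathlib import Path


def find_ollama(extra_dirs=()):
    """Return the path to an `ollama` executable, or None if none is found."""
    # First choice: whatever the shell would resolve via PATH.
    on_path = shutil.which("ollama")
    if on_path:
        return on_path
    # Fallback: look for bin/ollama in the candidate extraction directories.
    for directory in extra_dirs:
        candidate = Path(directory).expanduser() / "bin" / "ollama"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    return None


# Assumed extraction targets: the current directory and ~/.local.
print(find_ollama(extra_dirs=[".", "~/.local"]))
```

If this prints `None`, the executable was not found in any of the checked locations, and commands such as `ollama serve` are unlikely to work until its directory is added to `PATH`.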

Finally, start the Ollama server. On a Linux system, run the following command in the terminal:

```bash
ollama serve
```

You can try to start it from inside the Jupyter Notebook; however, **Jupyter Notebooks do not support background processes**, so it is essential to execute the `ollama serve` command in the terminal.

In [None]:
#!ollama serve 

Once executed in the terminal, the output should state that the GPUs are detected.

<p align="center">
  <img src="img/img-description-ollamaserve.png" alt="Sentinel image description" style="width:90%;">
</p>
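
Before continuing, it can help to confirm from the notebook that the server started in the terminal is reachable. A minimal sketch follows, assuming Ollama's default address `http://localhost:11434` (adjust it if your server listens elsewhere). `ollama_is_running` is a hypothetical helper, not part of the original notebook.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=3.0):
    """Return True if an HTTP server answers with status 200 at `base_url`."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or malformed reply: treat as not running.
        return False


if ollama_is_running():
    print("Ollama server is reachable.")
else:
    print("Ollama server not reachable. Start it with `ollama serve` in a terminal.")
```

The check only confirms that something answers at that address; whether the GPUs were detected is still best read from the terminal output of `ollama serve`.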

Finally, we have to install the Python package `ollama` to run the next parts of the notebook.

In [8]:
%pip install -q ollama

Note: you may need to restart the kernel to use updated packages.


<div class="alert alert-info" role="alert">

## 2. <a id='section1'></a>Pull the Language Model
[Back to top](#totop)
    
</div>

In the next step we decide on the model we want to use for describing the image. In this example, we have chosen the [llava-7b](https://ollama.com/library/llava:7b) model. 

In [9]:
!ollama pull llava:7b

pulling manifest
pulling 170370233dd5: 100% ▕██████████████████▏ 4.1 GB
pulling 72d6f08a42f6: 100% ▕██████████████████▏ 624 MB
pulling 43070e2d4e53: 100% ▕██████████████████▏  11 KB
pulling c43332387573: 100% ▕██████████████████▏   67 B
pulling ed11eda7790d: 100% ▕███████

<div class="alert alert-block alert-success">

### Explore

You can pull many different models from the Ollama model library. The llava-7b model is just one example. As an exercise, you can browse the [model repository of Ollama](https://ollama.com/search) and try pulling different models.

Keep in mind that the GPU on the WEkEO Workspace has approximately 7 GB of memory. A model has to fit into this memory in order to be used.
</div>
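
The memory constraint can be turned into a quick back-of-the-envelope check. The sketch below uses the two large layer sizes reported in the pull output for `llava:7b` (4.1 GB and 624 MB) and the approximate 7 GB of GPU memory. The 1 GB headroom for context and runtime buffers is an assumed value, and `fits_in_gpu` is a hypothetical helper.

```python
def fits_in_gpu(layer_sizes_gb, gpu_memory_gb=7.0, headroom_gb=1.0):
    """Return True if the summed layer sizes plus headroom fit in GPU memory."""
    return sum(layer_sizes_gb) + headroom_gb <= gpu_memory_gb


# Sizes of the two large layers, taken from the pull output above.
llava_7b_layers_gb = [4.1, 0.624]

print(f"llava:7b needs about {sum(llava_7b_layers_gb):.1f} GB")
print("Fits on the GPU:", fits_in_gpu(llava_7b_layers_gb))
```

Actual memory use also grows with the context length, so treat the result as a rough lower bound when choosing other models.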

<div class="alert alert-info" role="alert">

## 3. <a id='section2'></a>Generate Image descriptions of EO data
[Back to top](#totop)
    
</div>



Now we are ready to let the model generate an image description for a satellite image. The satellite image used is from the Copernicus Sentinel-2 satellite. You can learn more about the data in the [WEkEO Data Catalog](https://wekeo.copernicus.eu/data?view=dataset&dataset=EO:ESA:DAT:SENTINEL-2).

<p align="center">
  <img src="img/img-description-sentinel.png" alt="Sentinel image description" style="width:50%;">
</p>

In [11]:
import ollama

# Send the image and the prompt to the model and stream the answer
stream = ollama.chat(
    model="llava:7b",
    messages=[
        {
            'role': 'user',
            'content': 'Describe this image:',
            'images': ['img/img-description-sentinel.png']
        }
    ],
    stream=True,
)

# Print each text fragment as soon as it arrives
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

 The image you've provided appears to be a satellite or aerial view of an area, likely a rural landscape given the presence of fields and what looks like a farmland. The land is divided into sections that are colored differently—orange and green patches stand out. There is text at the bottom of the image that reads "Satellite View Of Some Place." Additionally, there's a small inset map with a red outline of an area within a larger yellow-brown rectangle, suggesting this view might be a section of a larger map or satellite image. The overall quality and resolution are low, indicating that the image may not be high-resolution and could be for illustrative purposes rather than precise navigation or geographic analysis. 
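
If you want to keep the description for later processing rather than only printing it, the streamed fragments can be concatenated. A minimal sketch follows, assuming each chunk can be indexed as `chunk['message']['content']` as in the cell above. `collect_stream` is a hypothetical helper, and the list of dictionaries only stands in for a real stream.

```python
def collect_stream(chunks):
    """Join the text fragments of streamed chat chunks into one string."""
    return "".join(chunk["message"]["content"] for chunk in chunks)


# Stand-in for the iterator returned by ollama.chat(..., stream=True).
fake_stream = [
    {"message": {"content": "A satellite view "}},
    {"message": {"content": "of agricultural fields."}},
]

description = collect_stream(fake_stream)
print(description)
```

With a running server, `collect_stream(stream)` would take the place of the printing loop. A stream can only be consumed once, so choose either printing or collecting for a given request.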

⚠️ **Please note:** Language models, including LLaVA, may sometimes produce **hallucinations**—outputs that sound plausible but are factually incorrect. Always double-check the generated results, especially when used in analytical or operational contexts.

---

## Congratulations!

You have just successfully generated an image description using a language model—powered by **Ollama** and the **LLaVA 7B model**—running on **WEkEO GPUs**. This demonstrates how vision-language models can also be applied to **satellite imagery** and **Earth Observation (EO) data**.

To learn more about using **GPUs in the WEkEO JupyterLab Workspace**, visit the [WEkEO Help Centre article on GPU usage](https://help.wekeo.eu/en/articles/7945473-which-are-the-computing-resources-of-the-wekeo-jupyterhub).
