# EDITO / Copernicus LLM (chat bot) for Data Access and Manipulation

This Jupyter notebook was created as part of the EDITO / Copernicus project. It provides a chatbot interface, backed by an LLM, for accessing and manipulating Copernicus data. It was developed during a hackathon held on [insert specific hackathon date].

## License
This notebook is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). You are free to share and adapt the material as long as you provide appropriate credit, indicate if changes were made, and distribute your contributions under the same license. For details, please see the attached LICENSE file or visit [CC BY-SA 4.0 License](https://creativecommons.org/licenses/by-sa/4.0/).

## Authors
- Eugenio Cutolo - Post-Doc at IMT Atlantique - <eugenio.cutolo@imt-atlantique.fr>
- Francois Courteille - Solutions Architect at NVIDIA - <fcourteille@nvidia.com> 

## Acknowledgments
This work was developed during a hackathon and is supported by the EDITO project. Special thanks to the hackathon organizers and participants for their valuable input and collaboration.


## Required software and libraries installation

> **Important:**  
> To run this notebook, the Ollama server must be installed and running in a separate process.  
> The language model used by the notebook must also be downloaded beforehand.

> Follow the setup instructions in the documentation. This example uses `llama3.2:3b`.
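
Once the server has been installed and started, it can help to confirm that it is reachable before running the rest of the notebook. The following is a minimal sketch, assuming the default address `127.0.0.1:11434` reported by the installer; `ollama_is_running` is a hypothetical helper name and not part of the `ollama` package.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://127.0.0.1:11434", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply: treat as "not running"
        return False


print("Ollama reachable:", ollama_is_running())
```

If the check prints `False`, start the server in a separate terminal (for example with `ollama serve`) and make sure the model has been downloaded (`ollama pull llama3.2:3b`).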


In [1]:
!curl -fsSL https://ollama.com/install.sh | sh

>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


In [3]:
!pip install ollama
!pip install requests
!pip install copernicusmarine



## Libraries Import

In [None]:
import ollama

In [None]:
import xarray as xr
import json
import numpy as np

In [4]:
import copernicusmarine



## Function Definitions 

In [5]:
def subset_dataset(file_path, latitude_variable, longitude_variable, latitude_range, longitude_range):
    """
    Extract a subset of a dataset from a NetCDF file based on latitude and longitude ranges.

    Parameters:
    - file_path: Path to the NetCDF file that contains the dataset.
    - latitude_variable: The name of the latitude variable in the dataset.
    - longitude_variable: The name of the longitude variable in the dataset.
    - latitude_range: A list or tuple containing [min_latitude, max_latitude].
    - longitude_range: A list or tuple containing [min_longitude, max_longitude].

    Returns:
    - A subset of the original dataset limited to the specified latitude and longitude ranges.
    """
    
    # Open the dataset using xarray
    dataset = xr.open_dataset(file_path)
    
    # Subset the dataset by latitude and longitude ranges
    subset = dataset.where(
        (dataset[latitude_variable] >= latitude_range[0]) & 
        (dataset[latitude_variable] <= latitude_range[1]) &
        (dataset[longitude_variable] >= longitude_range[0]) &
        (dataset[longitude_variable] <= longitude_range[1]),
        drop=True  # Drop coordinate labels for which the condition is False everywhere
    )
    
    return subset
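
The masking step inside `subset_dataset` can be tried without a NetCDF file. The sketch below applies the same `where(..., drop=True)` selection to a small synthetic grid; the variable name `sst`, the coordinate names, and the grid values are made up for illustration. It also shows `.sel` with slices, which selects the same box under the assumption that the coordinates are one-dimensional and sorted in ascending order.

```python
import numpy as np
import xarray as xr

# Synthetic 1-degree grid standing in for a real file (illustrative values only)
lat = np.arange(38.0, 45.0)  # 38, 39, ..., 44
lon = np.arange(0.0, 8.0)    # 0, 1, ..., 7
dataset = xr.Dataset(
    {"sst": (("latitude", "longitude"), np.random.rand(lat.size, lon.size))},
    coords={"latitude": lat, "longitude": lon},
)

# Same condition as in subset_dataset, for latitude 40-42 and longitude 1-5
mask = (
    (dataset["latitude"] >= 40) & (dataset["latitude"] <= 42)
    & (dataset["longitude"] >= 1) & (dataset["longitude"] <= 5)
)
subset = dataset.where(mask, drop=True)

# Label-based slicing is inclusive at both ends for sorted 1-D coordinates
subset_sel = dataset.sel(latitude=slice(40, 42), longitude=slice(1, 5))

print(dict(subset.sizes))
print(dict(subset_sel.sizes))
```

Both selections keep three latitude rows and five longitude columns. On curvilinear grids, where latitude and longitude are two-dimensional variables, only the `where` form applies.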

In [30]:
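def find_product_id(region_name, variable_name):
    """
    Ask the LLM which catalog product provides `variable_name` for `region_name`.

    Relies on the global COPERNICUS_CATALOG string, which must be defined before the first call.
    Returns a (user_response, model_response) pair: text to print and a message for the conversation history.
    """
    # Single-turn prompt: the catalog listing followed by the question
    messages = [{
        'role': 'user',
        'content': f'Based on this list {COPERNICUS_CATALOG}. Where can I find the {variable_name} for the {region_name}? Just answer with product_id no further text.',
    }]
    response = ollama.chat(
        model='llama3.2:3b',
        messages=messages,
    )
    output = response['message']['content']
    user_response = 'Product available:' + output
    model_response = {"role": "user", "content": 'Now you can proceed to download the product_id:' + output}
    return user_response, model_response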
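
The model's answer is free-form text: in the test run at the end of this notebook it returns several IDs wrapped in backticks. A possible way to turn such a reply into a clean list is sketched below. The helper name `extract_product_ids` and the regular expression are assumptions for illustration: the pattern simply treats any run of alphanumeric tokens joined by underscores as a candidate ID, and the optional `known_ids` argument discards candidates that are not in a supplied set.

```python
import re

# Assumed shape of a product ID: alphanumeric tokens joined by underscores
PRODUCT_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)+")


def extract_product_ids(llm_output, known_ids=None):
    """Return ID-like tokens from free-form text, in order and without duplicates."""
    found = []
    for candidate in PRODUCT_ID_PATTERN.findall(llm_output):
        if candidate in found:
            continue
        if known_ids is not None and candidate not in known_ids:
            continue  # skip tokens that are not in the supplied catalog IDs
        found.append(candidate)
    return found


reply = "`SST_MED_SST_L3S_NRT_OBSERVATIONS_010_012`, `SST_MED_PHY_SUBSKIN_L4_NRT_010_036`"
print(extract_product_ids(reply))
```

Passing the set of catalog product IDs as `known_ids` is one way to guard against IDs that the model invents.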

In [54]:
def download_data(product_id):
    """Placeholder: report the download request without fetching any data."""
    user_response = 'Downloading:' + product_id
    model_response = {"role": "user", "content": 'Now you can use the product_id for plotting:' + product_id}
    return user_response, model_response

In [55]:
def plot_data(product_id, latitude_range, longitude_range):
    """Placeholder: report the plot request without drawing anything."""
    user_response = f'Plotting {product_id} within latitude {latitude_range} and longitude {longitude_range}'
    return user_response, None

In [56]:
AVAILABLE_TOOLS = [
  {
    "type": "function",
    "function": {
      "name": "find_product_id",
      "description": "Find a product ID based on the geographical region name and variable name.",
      "parameters": {
        "type": "object",
        "properties": {
          "region_name": {
            "type": "string",
            "description": "The name of the geographical region (e.g., 'Arctic', 'Antarctic', 'Baltic Sea')."
          },
          "variable_name": {
            "type": "string",
            "description": "The name of the variable (e.g., 'Sea Ice Extent', 'Sea Surface Temperature')."
          }
        },
        "required": ["region_name", "variable_name"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "download_data",
      "description": "Download dataset based on the provided product ID.",
      "parameters": {
        "type": "object",
        "properties": {
          "product_id": {
            "type": "string",
            "description": "The ID of the product to download (e.g., 'ARCTIC_OMI_SI_extent')."
          }
        },
        "required": ["product_id"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "plot_data",
      "description": "Plot the data for a specific product ID over a given geographical region defined by latitude and longitude range.",
      "parameters": {
        "type": "object",
        "properties": {
          "product_id": {
            "type": "string",
            "description": "The ID of the product to visualize (e.g., 'ARCTIC_OMI_SI_extent')."
          },
          "latitude_range": {
            "type": "array",
            "items": {
              "type": "number"
            },
            "description": "A two-element array defining the minimum and maximum latitudes (e.g., [50, 70])."
          },
          "longitude_range": {
            "type": "array",
            "items": {
              "type": "number"
            },
            "description": "A two-element array defining the minimum and maximum longitudes (e.g., [-40, 10])."
          }
        },
        "required": ["product_id", "latitude_range", "longitude_range"]
      }
    }
  }
]
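
A small model does not always supply every argument listed under `"required"`. The sketch below shows one way to check a tool call against its schema before dispatching it. `missing_required_arguments` is a hypothetical helper, and `PLOT_DATA_TOOL` is a trimmed copy of the `plot_data` entry above (descriptions omitted) so that the example runs on its own.

```python
# Trimmed copy of the plot_data entry, following the layout of AVAILABLE_TOOLS
PLOT_DATA_TOOL = {
    "type": "function",
    "function": {
        "name": "plot_data",
        "parameters": {
            "type": "object",
            "properties": {
                "product_id": {"type": "string"},
                "latitude_range": {"type": "array", "items": {"type": "number"}},
                "longitude_range": {"type": "array", "items": {"type": "number"}},
            },
            "required": ["product_id", "latitude_range", "longitude_range"],
        },
    },
}


def missing_required_arguments(tool, arguments):
    """List the required parameter names that are absent from a tool call's arguments."""
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


# Hypothetical arguments, as a model might return them for an incomplete request
call_arguments = {"product_id": "ARCTIC_OMI_SI_extent"}
print(missing_required_arguments(PLOT_DATA_TOOL, call_arguments))
```

An empty list means the call can be dispatched; otherwise the missing names can be reported back to the user or to the model.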


In [57]:
AVAILABLE_FUNCTIONS = {
    'find_product_id': find_product_id,
    'download_data': download_data,
    'plot_data': plot_data,
}

Download the Copernicus Marine catalog. `find_product_id` reads the resulting `COPERNICUS_CATALOG` string, so this cell must run before the bot is used.

In [58]:
COPERNICUS_CATALOG = copernicusmarine.describe(overwrite_metadata_cache=True)
COPERNICUS_CATALOG = str([p['product_id']+', Content: '+p['title'] for p in COPERNICUS_CATALOG['products']])

Fetching catalog: 100%|██████████| 3/3 [00:06<00:00,  2.06s/it]
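
To make the resulting string concrete, the sketch below rebuilds it from a toy dictionary with the shape the cell above relies on (a `products` list whose items carry `product_id` and `title`). The two entries are invented placeholders, not real catalog products, and other versions of the toolbox may return a different structure. The `filter_entries` helper is a hypothetical addition that shortens the listing before it is placed in a prompt, which may matter for a small model with a limited context window.

```python
# Invented placeholder entries; the real catalog comes from copernicusmarine.describe()
toy_catalog = {
    "products": [
        {"product_id": "EXAMPLE_SST_MED_001", "title": "Example Mediterranean sea surface temperature"},
        {"product_id": "EXAMPLE_ICE_ARC_002", "title": "Example Arctic sea ice extent"},
    ]
}


def catalog_entries(catalog):
    """Build one 'ID, Content: title' line per product, as in the cell above."""
    return [p["product_id"] + ", Content: " + p["title"] for p in catalog["products"]]


def filter_entries(entries, keywords):
    """Keep entries that mention at least one keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [entry for entry in entries if any(k in entry.lower() for k in lowered)]


entries = catalog_entries(toy_catalog)
print(str(entries))
print(filter_entries(entries, ["temperature"]))
```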


## Main Object Definition

In [59]:
class EDITO_BOT:
    def __init__(self, model='llama3.2:3b'):
        """
        Initializes the chat manager with model details.
        :param model: The model to be used for the Ollama chat.
        """
        self.model = model
        self.conversation_history = []

    def chat(self, message: str):
        """
        Sends a message to the Ollama API and processes the response.
        If the response contains tool calls, it executes the matching functions from AVAILABLE_FUNCTIONS.
        Results are printed rather than returned.
        :param message: The message to send to Ollama.
        """
        # Add user's message to conversation history
        self.conversation_history.append({"role": "user", "content": message})

        # Prepare the payload for the API call using ollama.chat
        response = ollama.chat(
            model=self.model,
            messages=self.conversation_history,
            tools=AVAILABLE_TOOLS  # This will be passed to support tool calls
        )

        # Check if the response contains a tool call
        if response['message'].get('tool_calls'):
            
            for tool in response['message']['tool_calls']:
                function_name = tool['function']['name']
                function_args = tool['function']['arguments']
                #print(tool)
                if function_name in AVAILABLE_FUNCTIONS:
                    function_to_call = AVAILABLE_FUNCTIONS[function_name]
                    user_response, model_response = function_to_call(**function_args)
                    # Some tools (e.g. plot_data) return no follow-up message
                    if model_response is not None:
                        self.conversation_history.append(model_response)
                    print(user_response)
                else:
                    print(f"Function {function_name} is not available.")
                
        else:
            # Handle normal message response
            reply = response['message']['content']
            print("EDITO BOT Response:", reply)
            # Store only the assistant's text, not the whole response object
            self.conversation_history.append({"role": "assistant", "content": reply})

    def get_conversation_history(self):
        """
        Returns the full conversation history.
        :return: The list of messages exchanged during the conversation.
        """
        return self.conversation_history

    def clear_conversation_history(self):
        """
        Clears the conversation history.
        """
        self.conversation_history = []

## Bot Test

In [74]:
ebot = EDITO_BOT()

In [75]:
ebot.chat("Can you find the temperature for the Mediterranean Sea?")

Product available:`SST_MED_SST_L3S_NRT_OBSERVATIONS_010_012`, `SST_MED_PHY_SUBSKIN_L4_NRT_010_036`, `SST_MED_SST_L4_NRT_OBSERVATIONS_010_004`, `SST_MED_SST_L4_REP_OBSERVATIONS_010_021`


In [76]:
ebot.chat("Can you download them?")

Product available:`SST_MED_SST_L4_REP_OBSERVATIONS_010_021`
Downloading:SST_MED_SST_L3S_NRT_OBSERVATIONS_010_012
Downloading:SST_MED_PHY_SUBSKIN_L4_NRT_010_036
Downloading:SST_MED_SST_L4_NRT_OBSERVATIONS_010_004
Downloading:SST_MED_SST_L4_REP_OBSERVATIONS_010_021


In [77]:
ebot.chat("Can you plot them within latitude range 40-42 and longitude 1-5?")

Product available:`SST_MED_SST_L3S_NRT_OBSERVATIONS_010_012`, `SST_MED_SST_L4_NRT_OBSERVATIONS_010_004`, `SST_MED_PHY_SUBSKIN_L4_NRT_010_036`, `SST_MED_PHY_L3S_MY_010_042`


TypeError: can only concatenate str (not "list") to str
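
The traceback is truncated, but the message points at a string concatenation that received a list. The most likely source is `download_data`, which builds its messages with `'Downloading:' + product_id` and therefore fails when the model passes several IDs as a list instead of a single string. A minimal sketch of a more tolerant variant follows; `as_product_list` and `describe_download` are hypothetical names, and the code only formats the message without downloading anything.

```python
def as_product_list(product_id):
    """Accept a single ID or a list/tuple of IDs and return a list of strings."""
    if isinstance(product_id, str):
        return [product_id]
    return [str(item) for item in product_id]


def describe_download(product_id):
    """Build the user-facing message for one or several product IDs."""
    return "Downloading: " + ", ".join(as_product_list(product_id))


# Reproduce the failure mode, then show the tolerant variant
try:
    "Downloading:" + ["SST_MED_SST_L4_NRT_OBSERVATIONS_010_004"]
except TypeError as error:
    print("TypeError:", error)

print(describe_download("SST_MED_SST_L4_NRT_OBSERVATIONS_010_004"))
print(describe_download(["SST_MED_SST_L4_NRT_OBSERVATIONS_010_004", "SST_MED_PHY_SUBSKIN_L4_NRT_010_036"]))
```

Tightening the `product_id` description in `AVAILABLE_TOOLS` to state that exactly one ID is expected per call may also reduce how often the model sends a list.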