# Data Analysis Workshop

## Tutorial II: Add/Remove dataset to/from Freva Databrowser

<div style="
  border-left: 6px solid rgb(236, 114, 0);
  background-color:rgb(253, 231, 157);
  color:rgb(19, 19, 18);
  padding: 1em;
  font-size: 110%;
  border-radius: 4px;
  margin: 1em 0;
">
⚠️ <strong>ATTENTION</strong>: If the kernel is not already set, please change it to <code>DA Workshop (python)</code>.
</div>

<div style="border-left: 4px solid #0366d6; padding: 0.5em; background-color: #deecfc;">
  ℹ️ If you want to know more about these topics please refer to:
  <ul>
  <li> <a href='https://freva-org.github.io/freva-nextgen/auth/index.html#' target="_blank">authentication</a></li>
  <li>adding and deleting user data <a href="https://freva-org.github.io/freva-nextgen/databrowser/python-lib.html#freva_client.databrowser.userdata">via <code>databrowser.userdata</code> </a></li>
</ul>  
</div>

<br>
<br>

So far in this tutorial, we've learned how to search for datasets, inspect metadata, and access data directly from Freva's indexed resources. But what if you want to work with data that isn't yet indexed, like your own model simulations or derived statistics?

In this next section, we'll walk through how to register your own dataset with Freva's databrowser:

1. Prepare your data
   We'll use a sample file of extreme‐value statistics (dummy sea surface temperature extremes), but you can point to any local or remote dataset you've generated.

2. Open the dataset  
   Loading the file locally allows Freva to auto-extract core metadata attributes (e.g., dimensions, coordinate variables).

3. Define missing metadata
   For any required fields not present in the file (project, experiment, realm, etc.), we'll set the appropriate Freva attributes.

4. Ingest into Freva
   You'll see how to add the dataset to the databrowser index and, afterward, how to remove it again, demonstrating both the ingestion and deletion workflows.

By the end of this section, you'll be able to make your own simulations and analyses discoverable alongside Freva's curated archives.
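
Step 3 amounts to comparing the attributes a file already carries against the facets you want to be searchable. A minimal sketch of that comparison in plain Python follows; the `REQUIRED_FACETS` tuple simply mirrors the attributes set later in this tutorial and the `file_attrs` values are made up, so treat both as assumptions rather than Freva's official list:

```python
# Facets that this tutorial sets explicitly later on. Treating them as
# "required" is an assumption for this sketch, not Freva's official list.
REQUIRED_FACETS = ("project", "product", "model", "experiment", "realm")


def missing_facets(file_attrs, required=REQUIRED_FACETS):
    """Return the required facets that are absent or blank in ``file_attrs``."""
    return [key for key in required if not str(file_attrs.get(key, "")).strip()]


# Attributes as they might be read from a file's global metadata (made-up values).
file_attrs = {"variable": "sst", "time_frequency": "mon", "project": "userdata"}
print("still to define:", missing_facets(file_attrs))
```

Whatever this check reports as missing is what you would supply by hand before ingesting the file.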

In [None]:
import numpy as np, xarray as xr, os
from getpass import getuser
from freva_client import databrowser, authenticate
from pathlib import Path

Let's start by creating a dummy monthly sea surface temperature (SST) file for the tropical Pacific region:

In [None]:
time = np.arange("2025-01","2026-01",dtype="datetime64[M]")
lat, lon = np.linspace(-30,30,121), np.linspace(120,290,171)
da = (xr.DataArray(28 - 0.006*(lon-230), dims=("lon",), coords={"lon":lon})
      .expand_dims(time=time, lat=lat))
da.name = "sst"; da.attrs.update(long_name="Idealized Pacific SST", units="°C")
da.to_netcdf(f"dummy_sst_{getuser()}.nc")
print(f"check if dummy_sst_{getuser()}.nc exists: {os.path.exists(f'dummy_sst_{getuser()}.nc')}")

In [None]:
import matplotlib.pyplot as plt, cartopy.crs as ccrs, cartopy.feature as cfeature

fig, ax = plt.subplots(figsize=(8,4), subplot_kw=dict(projection=ccrs.PlateCarree(180)))
da.mean("time").plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree(), cmap="coolwarm", add_colorbar=True)

ax.add_feature(cfeature.LAND, facecolor="white", zorder=2); ax.coastlines(zorder=3); ax.add_feature(cfeature.BORDERS, linestyle=":", zorder=3)
plt.title("Pacific SST"); plt.tight_layout(); plt.show()

Now we are going to add this dummy SST data to the Freva databrowser.
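
The ingestion call further down receives the file's absolute path, so it can help to resolve that path once and confirm the file exists before ingesting it. A minimal sketch using only the standard library; the helper name `dummy_sst_path` is hypothetical and only reproduces the `dummy_sst_<user>.nc` naming used above:

```python
from getpass import getuser
from pathlib import Path


def dummy_sst_path(directory=".", user=None):
    """Build the absolute path of the dummy SST file for ``user``.

    Hypothetical helper for this sketch; it only mirrors the
    ``dummy_sst_<user>.nc`` naming pattern used in this tutorial.
    """
    if user is None:
        user = getuser()
    return (Path(directory) / f"dummy_sst_{user}.nc").resolve()


# "alice" is a placeholder; omit ``user`` to fall back to getuser().
path = dummy_sst_path(user="alice")
print(path.name, "| exists:", path.is_file())
```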

To add and remove data, we first need to authenticate so that the server knows who we are.
To do so, please follow these steps:

1. Head over to https://www.gems.dkrz.de and log in to the website.

    <img src="https://github.com/freva-org/Talks/raw/main/talks/DataSearchWorkshop2025/media/token1.png" width=600px>
2. Once you are logged in, click on your username button at the top right and choose `Token Management`.

    <img src="https://github.com/freva-org/Talks/raw/main/talks/DataSearchWorkshop2025/media/token2.png" width=600px>
3. On the token management page, click the `Copy` button.

    <img src="https://github.com/freva-org/Talks/raw/main/talks/DataSearchWorkshop2025/media/token3.png" width=600px>
4. Paste the copied token into the cell below:

👇👇👇**PASTE THE TOKEN BELOW**👇👇👇

In [None]:
# Paste the token content copied from the website between the triple quotes below
token = """ """
_ = (Path.home() / ".freva-access-token.json").write_text(token)

Using the token, we can now authenticate:

In [None]:
token = authenticate(token_file=Path.home() / ".freva-access-token.json")

<br>
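
If authentication fails, an incomplete or empty paste is a likely culprit. The sketch below checks the pasted text before it is written to disk. It assumes the copied token is a JSON object, which the `.json` file name suggests but this tutorial does not state; `check_token_text` and the `access_token` key in the example payload are hypothetical:

```python
import json


def check_token_text(text):
    """Return the parsed token if ``text`` is a non-empty JSON object.

    Raises ValueError otherwise. Assumes the token is JSON, which is an
    assumption of this sketch.
    """
    stripped = text.strip()
    if not stripped:
        raise ValueError("token is empty; paste it between the triple quotes")
    try:
        parsed = json.loads(stripped)
    except json.JSONDecodeError as exc:
        raise ValueError(f"token is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("token JSON should be an object")
    return parsed


# Made-up payload for illustration, not a real token.
example = '{"access_token": "abc123"}'
print("keys found:", sorted(check_token_text(example)))
```

Calling `check_token_text(token)` on the pasted string before writing the token file turns a silent bad paste into an immediate, readable error.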

Now let's add our newly created dummy data to Freva.

Freva can infer some of this file's attributes, such as `time_frequency` and `variable`, from the filename and/or the file metadata. It cannot infer all of them, however, so we define the remaining ones explicitly before indexing the file in the databrowser; otherwise they would be left empty. We do this with the `metadata=` parameter:

In [None]:
global_attributes = {"project": "userdata", "product": "stats", "model": "IFS", "experiment": "ETCCDI", "realm": "atmos"}
databrowser.userdata(
    action="add",
    userdata_items=[f"{os.getcwd()}/dummy_sst_{getuser()}.nc"],
    metadata=global_attributes,
    host="https://www.gems.dkrz.de",
)

The response shows that the data has been successfully added to Freva. We can now query the databrowser to confirm that the data is there:

In [None]:
databrowser.metadata_search(flavour="user", user=getuser())

Since we are sure that the data has been added to Freva, we can remove it via the following command:

In [None]:
global_attributes = {"project": "userdata", "product": "stats", "model": "IFS", "experiment": "ETCCDI", "realm": "atmos"}
databrowser.userdata(
    action="delete",
    metadata=global_attributes,
    host="https://www.gems.dkrz.de",
)

To confirm the deletion, we can run the same search again:

In [None]:
databrowser.metadata_search(flavour="user", user=getuser())