# Combining tides with satellite data

**This guide demonstrates how to combine tide modelling with satellite Earth observation (EO) data using the [`tag_tides`](../../api/#eo_tides.eo.tag_tides) and [`pixel_tides`](../../api/#eo_tides.eo.pixel_tides) functions from [`eo_tides.eo`](../../api/#eo_tides.eo).**

Both functions let you model the height of the tide at the exact moment each satellite image was acquired.
You can then analyse satellite EO data by tidal conditions, for example by filtering your data to imagery collected during specific tidal stages (e.g. low or high tide).

Although both functions serve a similar purpose, they differ in complexity and performance. `tag_tides` assigns a single tide height to each timestep/satellite image, which is fast and efficient, and suitable for small-scale applications where tides are unlikely to vary across your study area. In contrast, `pixel_tides` models tides both through time *and* spatially, returning a tide height for every satellite pixel. This can be critical for producing seamless coastal EO datasets at large scale, but comes at the cost of performance.
<br><br>
> **Table 1.** Comparison of `tag_tides` and `pixel_tides`

| [`tag_tides`](../../api/#eo_tides.eo.tag_tides)                                                                 | [`pixel_tides`](../../api/#eo_tides.eo.pixel_tides)                                                                                              |
|-----------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
| Assigns a single tide height to each timestep/satellite image                         | Assigns a tide height to every individual pixel through time to capture spatial tide dynamics                                               |
| 🔎 Ideal for local or site-scale analysis                                      | 🌏 Ideal for regional to global-scale coastal product generation                                              |
| ✅ Fast, low memory use                                                        | ❌ Slower, higher memory use                                                                                  |
| ❌ Single tide height per image can produce artefacts in complex tidal regions | ✅ Produces spatially seamless results across large extents by applying analyses at the pixel level |

## Getting started
As in the previous example, our first step is to tell `eo-tides` the location of our tide model directory (if you haven't set this up, [refer to the setup instructions here](../../setup)):

In [None]:
directory = "../../tests/data/tide_models/"

## Load satellite data using odc-stac

Now we can load a time-series of satellite data over our area of interest using the Open Data Cube's `odc-stac` package.
This powerful package allows us to load open satellite data (e.g. ESA Sentinel-2 or NASA/USGS Landsat) for any time period and location on the planet, and returns it as a multi-dimensional `xarray.Dataset`.

In this example, we will load **Landsat 8 and 9** satellite data from **2024** over the city of **Broome, Western Australia** - a macrotidal region with extensive intertidal coastal habitats.
We will load this data from the [Digital Earth Australia](https://knowledge.dea.ga.gov.au/guides/setup/gis/stac/) STAC catalogue.

<div class="admonition tip">
    <p class="admonition-title">Tip</p>
    <p>
        For a more detailed guide to using STAC metadata and <code>odc-stac</code> to find and load satellite data, refer to the <a href="https://knowledge.dea.ga.gov.au/guides/setup/gis/stac/">Digital Earth Australia STAC user guide</a>.
    </p>
</div>

In [None]:
import odc.stac
import pystac_client

# Connect to STAC catalog
catalog = pystac_client.Client.open("https://explorer.dea.ga.gov.au/stac")

# Set cloud access defaults
odc.stac.configure_rio(
    cloud_defaults=True,
    aws={"aws_unsigned": True},
)

# Build a query and search the STAC catalog for all matching items
bbox = [122.12, -18.25, 122.43, -17.93]
query = catalog.search(
    bbox=bbox,
    collections=["ga_ls8c_ard_3", "ga_ls9c_ard_3"],
    datetime="2024-01-01/2024-12-31",
    filter="eo:cloud_cover < 5",  # Filter to images with <5% cloud
)

# Load data into xarray format
ds = odc.stac.load(
    items=list(query.items()),
    bands=["nbart_red", "nbart_green", "nbart_blue"],
    crs="utm",
    resolution=30,
    groupby="solar_day",
    bbox=bbox,
    fail_on_error=False,
    chunks={},
)

# Plot the first image
ds.isel(time=0).odc.explore(vmin=50, vmax=3000)

## Using tag_tides

We can pass our satellite dataset `ds` to the `tag_tides` function to model a tide height for each timestep in our dataset.
This can help sort and filter images by tide height, allowing us to learn more about how coastal environments respond to the effect of changing tides.

The `tag_tides` function uses the time and date of acquisition and the geographic centroid of each satellite observation as inputs for the selected tide model (EOT20 by default). 
It returns an `xarray.DataArray` called `tide_height`, with a modelled tide for every timestep in our satellite dataset:

In [None]:
from eo_tides.eo import tag_tides

tides_da = tag_tides(
    data=ds,
    directory=directory,
)

# Print modelled tides
print(tides_da)
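
To make these inputs concrete, the sketch below assembles the kind of information described above: one centroid location plus one timestamp per image. It is only an illustration under stated assumptions: the acquisition times are made up, the midpoint of our bounding box is used as a simple stand-in for the image centroid, and `model_inputs` is a hypothetical name rather than part of `eo-tides`.

```python
import pandas as pd

# Bounding box used earlier in this guide: [min_lon, min_lat, max_lon, max_lat]
bbox = [122.12, -18.25, 122.43, -17.93]

# Assumption: approximate the image centroid with the bounding box midpoint
centroid_lon = (bbox[0] + bbox[2]) / 2
centroid_lat = (bbox[1] + bbox[3]) / 2

# Hypothetical acquisition times (not real Landsat overpasses)
times = pd.to_datetime(["2024-01-05 01:55", "2024-01-21 01:55"])

# One row per image: the same location, a different time
model_inputs = pd.DataFrame({"time": times, "x": centroid_lon, "y": centroid_lat})
print(model_inputs)
```

Every row shares the same `x` and `y`, which is why this approach yields exactly one tide height per timestep.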

We can easily combine these modelled tides with our original satellite data for further analysis.
The code below adds our modelled tides as a new `tide_height` variable under **Data variables**.

In [None]:
ds["tide_height"] = tides_da
print(ds)

<div class="admonition tip">
    <p class="admonition-title">Tip</p>
    <p>
        You could also model tides and insert tide heights into <code>ds</code> in a single step via:<br>
        <code>ds["tide_height"] = tag_tides(ds, ...)</code>
    </p>
</div>

We can plot this new `tide_height` variable over time to inspect the tide heights observed by the satellites in our time series:

In [None]:
ds.tide_height.plot()

### Selecting and analysing satellite data by tide

Having `tide_height` as a variable allows us to select and analyse our satellite data using information about tides.
For example, we could sort by `tide_height`, then plot the lowest and highest tide images in our time series:

In [None]:
# Sort by tide and plot the first and last image
ds_sorted = ds.sortby("tide_height")
ds_sorted.isel(time=[0, -1]).odc.to_rgba(vmin=50, vmax=3000).plot.imshow(col="time")
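
Sorting is one option; another is to keep only the images that fall below (or above) a tide height threshold. The sketch below shows this pattern on a small synthetic dataset so that it runs without any satellite data or tide models. The name `ds_demo`, the tide values and the thresholds are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for a satellite dataset that has been tagged with tides
times = pd.date_range("2024-01-01", periods=6, freq="16D")
rng = np.random.default_rng(0)
ds_demo = xr.Dataset(
    data_vars={
        "nbart_red": (("time", "y", "x"), rng.random((6, 4, 4))),
        "tide_height": ("time", [-2.1, 0.4, 3.0, -0.8, 1.7, -3.2]),
    },
    coords={"time": times},
)

# Keep only images acquired below a fixed tide height (threshold is arbitrary)
low_tide = ds_demo.sel(time=ds_demo.tide_height < -0.5)

# Or derive the threshold from the data, e.g. the lowest 50% of observed tides
threshold = float(ds_demo.tide_height.quantile(0.5))
lowest_half = ds_demo.sel(time=ds_demo.tide_height <= threshold)

print(low_tide.tide_height.values)
print(lowest_half.tide_height.values)
```

The same boolean selection should carry over to a real dataset once `tide_height` has been added as a variable.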

## Using pixel_tides

The previous examples show how to model a single tide height for each satellite image using the centroid of the image as a tide modelling location.
However, in reality tides vary spatially – potentially by several metres in areas of complex tidal dynamics.
This means that an individual satellite image can capture a range of tide conditions.
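
As a rough illustration of why this matters, the sketch below builds a hypothetical tide surface that rises smoothly across a scene and compares it with the single value sampled at the scene centre. The 1 m to 3 m gradient and the grid size are invented for the example and do not describe Broome or any real tide model.

```python
import numpy as np

# Hypothetical scene: tide height rises linearly from 1.0 m on the western
# edge to 3.0 m on the eastern edge (values chosen purely for illustration)
ny, nx = 50, 100
tide_surface = np.tile(np.linspace(1.0, 3.0, nx), (ny, 1))

# A single tide height sampled at the centre of the scene
centre_tide = tide_surface[ny // 2, nx // 2]

# Difference between the per-pixel surface and the single centre value
offset = tide_surface - centre_tide
print(f"Centre tide: {centre_tide:.2f} m")
print(f"Per-pixel offset ranges from {offset.min():.2f} to {offset.max():.2f} m")
```

In this toy case, pixels near the scene edges differ from the centre value by roughly a metre, which is the kind of discrepancy a per-pixel approach is intended to capture.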

We can use the `pixel_tides` function to capture this spatial variability in tide heights. 
For efficient processing, this function first models tides into a low resolution grid surrounding each satellite image in our time series.
This lower resolution data includes a buffer around the extent of our satellite data so that tides can be modelled seamlessly across analysis boundaries.

First, let's reload our satellite data for a fresh start:

In [None]:
# Load data into xarray format
ds = odc.stac.load(
    items=list(query.items()),
    bands=["nbart_red", "nbart_green", "nbart_blue"],
    crs="utm",
    resolution=30,
    groupby="solar_day",
    bbox=bbox,
    fail_on_error=False,
    chunks={},
)

Now run `pixel_tides`, passing our satellite dataset `ds` as an input:

In [None]:
from eo_tides.eo import pixel_tides

# Model tides spatially
tides_lowres = pixel_tides(
    data=ds,
    resample=False,
    directory=directory,
)

# Print output
print(tides_lowres)

If we plot the resulting data, we can see that we now have two-dimensional tide surfaces for each timestep in our data (instead of the single tide height per timestep returned by the `tag_tides` function).

With the `RdBu` colour map used below, red values indicate low tide pixels, while blue indicates high tide pixels. If you look closely, you may see some spatial variability in tide heights within each timestep, with slight variations in tide heights along the north-west side of the study area:

In [None]:
# Plot the first four timesteps in our data
tides_lowres.isel(time=slice(0, 4)).plot.imshow(col="time", vmin=-1, vmax=1, cmap="RdBu")

### Reprojecting into original high-resolution spatial grid

By setting `resample=True`, we can use interpolation to re-project our low resolution tide data back into the resolution of our satellite image, resulting in an individual **tide height for every pixel** in our dataset through time and space:

In [None]:
# Model tides spatially
tides_highres = pixel_tides(
    data=ds,
    resample=True,
    directory=directory,
)

# Plot the first four timesteps in our data
tides_highres.isel(time=slice(0, 4)).plot.imshow(col="time", vmin=-1, vmax=1, cmap="RdBu")
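
Conceptually, this resampling step resembles ordinary interpolation from a coarse grid onto a finer one. The sketch below demonstrates the idea with NumPy only, using a made-up coarse tide surface that extends slightly beyond the fine grid (mimicking the buffer described earlier). It is a simplified illustration and should not be read as the method `pixel_tides` uses internally.

```python
import numpy as np

# Coarse grid coordinates, extending slightly beyond the "satellite" extent
x_coarse = np.linspace(-1.0, 11.0, 5)
y_coarse = np.linspace(-1.0, 11.0, 5)

# Hypothetical low resolution tide surface: a smooth gradient in x and y
xx, yy = np.meshgrid(x_coarse, y_coarse)
tide_coarse = 0.1 * xx + 0.05 * yy  # shape (y, x)

# Fine grid covering the satellite extent only
x_fine = np.linspace(0.0, 10.0, 101)
y_fine = np.linspace(0.0, 10.0, 101)

# Separable linear interpolation: first along x for each coarse row...
rows = np.array([np.interp(x_fine, x_coarse, row) for row in tide_coarse])
# ...then along y for each fine column
tide_fine = np.array([np.interp(y_fine, y_coarse, col) for col in rows.T]).T

print(tide_coarse.shape, "->", tide_fine.shape)
```

Because the coarse grid covers a larger area than the fine grid, every fine pixel is interpolated between surrounding coarse values rather than extrapolated at the edges.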

`tides_highres` will have exactly the same dimensions as `ds`, with a unique tide height for every satellite pixel:

In [None]:
ds.sizes

In [None]:
tides_highres.sizes

Because of this, our stack of tides can be added as an additional 3D variable in our dataset:

In [None]:
ds["tide_height_pixel"] = tides_highres
print(ds)
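
With a tide height available for every pixel, one possible follow-on step is to mask satellite pixels by tide rather than discarding whole images. The sketch below demonstrates this on a small synthetic dataset; `ds_demo`, its values and the 0 m threshold are assumptions made for illustration only.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a dataset with a per-pixel tide height variable
rng = np.random.default_rng(1)
ds_demo = xr.Dataset(
    {
        "nbart_red": (
            ("time", "y", "x"),
            rng.integers(50, 3000, size=(3, 4, 4)).astype("float32"),
        ),
        "tide_height_pixel": (
            ("time", "y", "x"),
            rng.uniform(-3, 3, size=(3, 4, 4)),
        ),
    }
)

# Keep pixels observed below the threshold; all other pixels become NaN
below_threshold = ds_demo.nbart_red.where(ds_demo.tide_height_pixel < 0)

print(int(below_threshold.notnull().sum()), "of", below_threshold.size, "pixels kept")
```

This works because both variables share the same `time`, `y` and `x` dimensions, so the comparison lines up pixel by pixel.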

### Calculating tide height min/max/median/quantiles for each pixel
The minimum, maximum or any other quantile of all tide heights observed over a region can be calculated for each pixel by passing a list of quantiles (values between 0 and 1) to the `calculate_quantiles` parameter.

This calculation is performed on the low resolution modelled tide data before reprojecting to higher resolution, so should be faster than calculating min/max/median tide at high resolution:

In [None]:
# Model tides spatially
tides_highres_quantiles = pixel_tides(
    data=ds,
    calculate_quantiles=(0, 0.5, 1),
    directory=directory,
)

# Plot quantiles
tides_highres_quantiles.plot.imshow(col="quantile")
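
To clarify what the three quantiles above represent, the sketch below reduces a hypothetical stack of tide heights along its time axis using NumPy. The random values and array shape are invented for the example; the point is only that quantile 0 is the per-pixel minimum, 0.5 the median and 1 the maximum.

```python
import numpy as np

# Hypothetical stack of low resolution tide heights with shape (time, y, x)
rng = np.random.default_rng(42)
tides = rng.uniform(-3, 3, size=(20, 5, 5))

# Reduce along the time axis: one 2D surface per requested quantile
quantiles = np.quantile(tides, [0, 0.5, 1], axis=0)

print(quantiles.shape)  # leading axis holds the three quantiles
print(np.allclose(quantiles[0], tides.min(axis=0)))
print(np.allclose(quantiles[2], tides.max(axis=0)))
```

The time axis is replaced by a quantile axis, which corresponds to the `quantile` dimension used for faceting in the plot above.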

### Modelling custom times

Instead of using times contained in the `time` dimension of our dataset, we can also calculate pixel-based tides for a custom set of times:

In [None]:
import pandas as pd

custom_times = pd.date_range(
    start="2022-01-01", 
    end="2022-01-02", 
    freq="6h",
)

# Model tides spatially
tides_highres = pixel_tides(
    data=ds, 
    time=custom_times,
    directory=directory,
)

# Plot custom timesteps
tides_highres.plot.imshow(col="time")
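
It can help to inspect the custom times before modelling, since each timestamp produces one tide surface. The sketch below lists the timestamps generated by a six-hourly range like the one above, written with the lowercase `h` frequency alias that recent pandas versions prefer. The `hourly_times` alternative is a hypothetical variation, not something this guide requires.

```python
import pandas as pd

# Six-hourly timestamps; both the start and end are included
custom_times = pd.date_range(start="2022-01-01", end="2022-01-02", freq="6h")
print(list(custom_times.strftime("%Y-%m-%d %H:%M")))

# Hypothetical alternative: 24 hourly timestamps starting at the same time
hourly_times = pd.date_range(start="2022-01-01", periods=24, freq="1h")
print(len(custom_times), len(hourly_times))
```

Denser time ranges give a more complete picture of the tidal cycle, at the cost of modelling more timesteps.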

## Next steps

Now that we have learnt to combine tide modelling with satellite data, we can learn how to [calculate statistics](../Tide_statistics) describing local tide dynamics, as well as biases caused by interactions between tidal processes and satellite orbits.