# Visualizing 4-dimensional Data with Altair and Plotly

This notebook is one of three that show how the ESA CCI Toolbox can be used with D3.js. Here, we show how CCI Ozone data, which has a fourth dimension beyond time, latitude, and longitude, can be visualised using charts from Altair and Plotly. Several preprocessing steps are applied, partly using operations from the CCI Toolbox.

Install altair and plotly into your Python environment if you haven't already done so:

In [1]:
# ! mamba install --yes altair plotly

In [2]:
import altair as alt
import matplotlib.pyplot as plt
%matplotlib inline
import math
import numpy as np
import pandas as pd

import plotly.graph_objects as go
from plotly.subplots import make_subplots

from esa_climate_toolbox.core import get_op
from esa_climate_toolbox.core import list_ecv_datasets
from esa_climate_toolbox.core import open_data



Once more, we start with listing the datasets that are potentially useful for our purpose.

In [3]:
list_ecv_datasets(ecv="OZONE")

[('esacci.OZONE.day.L3S.TC.multi-sensor.multi-platform.MERGED.fv0100.r1',
  'esa-cci'),
 ('esacci.OZONE.mon.L3.LP.GOMOS.Envisat.GOMOS_ENVISAT.v0001.r1', 'esa-cci'),
 ('esacci.OZONE.mon.L3.LP.MIPAS.Envisat.MIPAS_ENVISAT.v0001.r1', 'esa-cci'),
 ('esacci.OZONE.mon.L3.LP.OSIRIS.ODIN.OSIRIS_ODIN.v0001.r1', 'esa-cci'),
 ('esacci.OZONE.mon.L3.LP.SCIAMACHY.Envisat.SCIAMACHY_ENVISAT.v0001.r1',
  'esa-cci'),
 ('esacci.OZONE.mon.L3.LP.SMR.ODIN.MZM.v0001.r1', 'esa-cci'),
 ('esacci.OZONE.mon.L3.LP.SMR.ODIN.SMR_ODIN.v0001.r1', 'esa-cci'),
 ('esacci.OZONE.mon.L3.NP.multi-sensor.multi-platform.MERGED.fv0002.r1',
  'esa-cci')]
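
Such a listing can also be narrowed down programmatically by splitting the identifiers into their dot-separated parts. The following is a minimal sketch, assuming that the third part encodes the temporal resolution and the sixth part the sensor. This reading of the identifier layout is inferred from the names above, and `datasets` and `monthly_merged` are illustrative names.

```python
# Identifiers copied from the listing above (shortened to three entries).
datasets = [
    ("esacci.OZONE.day.L3S.TC.multi-sensor.multi-platform.MERGED.fv0100.r1", "esa-cci"),
    ("esacci.OZONE.mon.L3.LP.GOMOS.Envisat.GOMOS_ENVISAT.v0001.r1", "esa-cci"),
    ("esacci.OZONE.mon.L3.NP.multi-sensor.multi-platform.MERGED.fv0002.r1", "esa-cci"),
]

monthly_merged = []
for data_id, store_id in datasets:
    parts = data_id.split(".")
    # Assumption: parts[2] is the temporal resolution, parts[5] the sensor.
    if parts[2] == "mon" and parts[5] == "multi-sensor":
        monthly_merged.append(data_id)

print(monthly_merged)
```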

Here, no zarr or kerchunk datasets are available, so we use data from the odp store (`esa-cci`). We pick the multi-sensor multi-platform dataset in monthly resolution. We limit the opened variables to the weighted ozone average, `O3_du`, and will choose a temporal subset afterwards.

In [4]:
ozone_ds, _ = open_data(
    "esacci.OZONE.mon.L3.NP.multi-sensor.multi-platform.MERGED.fv0002.r1", 
    var_names="O3_du",
    data_store_id="esa-cci"
)
ozone_ds

time_bnds,Array,Chunk
Bytes,2.25 kiB,2.25 kiB
Shape,"(144, 2)","(144, 2)"
Dask graph,1 chunks in 2 graph layers,1 chunks in 2 graph layers
Data type,datetime64[ns] numpy.ndarray,datetime64[ns] numpy.ndarray

O3_du,Array,Chunk
Bytes,569.53 MiB,3.96 MiB
Shape,"(144, 16, 180, 360)","(1, 16, 180, 360)"
Dask graph,144 chunks in 2 graph layers,144 chunks in 2 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


The variable `O3_du` has a fourth dimension, `layers`. The dataset still contains a dimension `air_pressure` that is no longer needed, as no remaining variable uses it, so we drop it.

In [5]:
ozone_ds = ozone_ds.drop_dims("air_pressure")
ozone_ds

time_bnds,Array,Chunk
Bytes,2.25 kiB,2.25 kiB
Shape,"(144, 2)","(144, 2)"
Dask graph,1 chunks in 2 graph layers,1 chunks in 2 graph layers
Data type,datetime64[ns] numpy.ndarray,datetime64[ns] numpy.ndarray

O3_du,Array,Chunk
Bytes,569.53 MiB,3.96 MiB
Shape,"(144, 16, 180, 360)","(1, 16, 180, 360)"
Dask graph,144 chunks in 2 graph layers,144 chunks in 2 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray

We now take the temporal subset and keep the last 24 monthly time steps, which cover the years 2007 and 2008.


In [6]:
subset_time_index = get_op("subset_temporal_index")
ozone_sub_ds = subset_time_index(ozone_ds, 120, 144)
ozone_sub_ds

time_bnds,Array,Chunk
Bytes,384 B,384 B
Shape,"(24, 2)","(24, 2)"
Dask graph,1 chunks in 3 graph layers,1 chunks in 3 graph layers
Data type,datetime64[ns] numpy.ndarray,datetime64[ns] numpy.ndarray

O3_du,Array,Chunk
Bytes,94.92 MiB,3.96 MiB
Shape,"(24, 16, 180, 360)","(1, 16, 180, 360)"
Dask graph,24 chunks in 3 graph layers,24 chunks in 3 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray
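
The index arithmetic behind this subset can be checked with pandas alone. The sketch below assumes that the 144 monthly time steps are contiguous, which would put the first one in January 1997. This start date is inferred from the outputs in this notebook, not read from the dataset. The sketch only mirrors the index positions with a plain slice and does not call the toolbox operation.

```python
import pandas as pd

# Assumption: 144 contiguous monthly steps, the last one being December 2008.
months = pd.date_range("1997-01-01", periods=144, freq="MS")

# Positions 120 to 143 are the last 24 steps.
subset = months[120:144]

print(len(subset), subset[0].date(), subset[-1].date())
```

Under this assumption, the slice starts in January 2007 and ends in December 2008.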


We create a time series for a single point, which is passed as a string of the form "longitude, latitude".

In [7]:
tpoint = get_op("tseries_point")
ozone_point_ts = tpoint(ozone_sub_ds, point="10, 53.5")
ozone_point_ts

time_bnds,Array,Chunk
Bytes,384 B,384 B
Shape,"(24, 2)","(24, 2)"
Dask graph,1 chunks in 3 graph layers,1 chunks in 3 graph layers
Data type,datetime64[ns] numpy.ndarray,datetime64[ns] numpy.ndarray

O3_du,Array,Chunk
Bytes,1.50 kiB,64 B
Shape,"(24, 16)","(1, 16)"
Dask graph,24 chunks in 4 graph layers,24 chunks in 4 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray
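
Conceptually, extracting a point time series removes the two spatial dimensions and keeps `time` and `layers`. The following NumPy sketch illustrates this on a dummy array. It assumes a regular 1° grid whose cell edges lie on whole degrees, and it is not necessarily how `tseries_point` locates the grid cell internally.

```python
import math
import numpy as np

# Dummy array with the dimensions (time, layers, lat, lon) of the subset.
data = np.zeros((24, 16, 180, 360), dtype=np.float32)

lon, lat = 10.0, 53.5

# Assumption: 1-degree cells, starting at -180 (lon) and -90 (lat).
lon_idx = math.floor(lon + 180.0)
lat_idx = math.floor(lat + 90.0)

# Centre of the selected cell.
cell_lon = -179.5 + lon_idx
cell_lat = -89.5 + lat_idx

# Indexing both spatial axes leaves a (time, layers) array.
point_series = data[:, :, lat_idx, lon_idx]
print(point_series.shape, cell_lon, cell_lat)
```

Under these assumptions the selected cell centre is (10.5, 53.5), which agrees with the `lon` and `lat` columns of the dataframe created in the next step.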


We now convert this to a dataframe that can be read by the visualisation libraries.

In [8]:
df_op  = get_op("to_dataframe")
ozone_point_ts_df = df_op(ozone_point_ts)
ozone_point_ts_df

time,layers,bnds,O3_du,lat,lon,time_bnds
2007-01-16 12:00:00,1,0,10.509481,53.5,10.5,2007-01-01
2007-01-16 12:00:00,1,1,10.509481,53.5,10.5,2007-02-01
2007-01-16 12:00:00,2,0,11.720570,53.5,10.5,2007-01-01
2007-01-16 12:00:00,2,1,11.720570,53.5,10.5,2007-02-01
2007-01-16 12:00:00,3,0,13.219014,53.5,10.5,2007-01-01
...,...,...,...,...,...,...
2008-12-16 12:00:00,14,1,0.235282,53.5,10.5,2009-01-01
2008-12-16 12:00:00,15,0,0.109375,53.5,10.5,2008-12-01
2008-12-16 12:00:00,15,1,0.109375,53.5,10.5,2009-01-01
2008-12-16 12:00:00,16,0,0.007546,53.5,10.5,2008-12-01


The resulting dataframe has a multi-index that is inconvenient to use, so we reset it. This turns the index levels `time`, `layers`, and `bnds` into regular columns that can be addressed by name.

In [9]:
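# Turn the index levels (time, layers, bnds) into regular columns.
# No rename is needed: the levels are named, so no 'index' column is created.
ozone_point_ts_df = ozone_point_ts_df.reset_index()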
ozone_point_ts_df

,time,layers,bnds,O3_du,lat,lon,time_bnds
0,2007-01-16 12:00:00,1,0,10.509481,53.5,10.5,2007-01-01
1,2007-01-16 12:00:00,1,1,10.509481,53.5,10.5,2007-02-01
2,2007-01-16 12:00:00,2,0,11.720570,53.5,10.5,2007-01-01
3,2007-01-16 12:00:00,2,1,11.720570,53.5,10.5,2007-02-01
4,2007-01-16 12:00:00,3,0,13.219014,53.5,10.5,2007-01-01
...,...,...,...,...,...,...,...
763,2008-12-16 12:00:00,14,1,0.235282,53.5,10.5,2009-01-01
764,2008-12-16 12:00:00,15,0,0.109375,53.5,10.5,2008-12-01
765,2008-12-16 12:00:00,15,1,0.109375,53.5,10.5,2009-01-01
766,2008-12-16 12:00:00,16,0,0.007546,53.5,10.5,2008-12-01
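
Because `time_bnds` carries the additional `bnds` dimension, every ozone value appears twice in the dataframe, once per bound. Below is a minimal sketch of how the index can be reset and the duplicates removed, using a small hand-made frame whose values are copied from the table above. The variable names `df`, `flat`, and `unique` are illustrative, and removing the duplicates is an optional step that the notebook itself does not perform.

```python
import pandas as pd

# Two layers of one time step, each repeated for bnds = 0 and bnds = 1.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2007-01-16 12:00:00"]), [1, 2], [0, 1]],
    names=["time", "layers", "bnds"],
)
df = pd.DataFrame(
    {"O3_du": [10.509481, 10.509481, 11.720570, 11.720570]},
    index=index,
)

# Turn the index levels into ordinary columns.
flat = df.reset_index()

# Keep one row per time step and layer.
unique = flat[flat["bnds"] == 0].drop(columns="bnds")

print(unique)
```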


We now create a chart from this dataframe in which each layer is drawn in a different color, so that all layers can be compared in the same chart.

In [10]:
alt.Chart(ozone_point_ts_df).mark_line().encode(
    x='time:T',
    y='O3_du:Q',
    color='layers:N'
)
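
Altair works best with long-form data: the `color` encoding splits the rows by their `layers` value and draws one line per group. To inspect the series that end up in such a chart, the same grouping can be reproduced as a table with pandas. The sketch below uses placeholder values that are not taken from the dataset, and `long_df` and `wide_df` are illustrative names.

```python
import pandas as pd

# Placeholder values for illustration only, not taken from the dataset.
long_df = pd.DataFrame({
    "time": pd.to_datetime(["2007-01-16", "2007-01-16", "2007-02-15", "2007-02-15"]),
    "layers": [1, 2, 1, 2],
    "O3_du": [10.0, 12.0, 11.0, 13.0],
})

# One column per layer, one row per time step:
# each column corresponds to one line in the chart.
wide_df = long_df.pivot(index="time", columns="layers", values="O3_du")

print(wide_df)
```

On the full dataframe, the rows duplicated by `bnds` would have to be removed first, because `pivot` requires unique index and column pairs.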

As in the notebook on visualizing uncertainties, we are interested not only in showing the layer dimension of a time series without spatial extent, but also of a grid for a single time step. For this, we select the last time step of the ozone dataset and again use Plotly. Whereas we created a single plot in the previous notebook, we will now create one surface per layer.

In [11]:
subset_time_index = get_op("subset_temporal_index")
ozone_sub_ds = subset_time_index(ozone_ds, 143, 144)
ozone_sub_ds

time_bnds,Array,Chunk
Bytes,16 B,16 B
Shape,"(1, 2)","(1, 2)"
Dask graph,1 chunks in 3 graph layers,1 chunks in 3 graph layers
Data type,datetime64[ns] numpy.ndarray,datetime64[ns] numpy.ndarray

O3_du,Array,Chunk
Bytes,3.96 MiB,3.96 MiB
Shape,"(1, 16, 180, 360)","(1, 16, 180, 360)"
Dask graph,1 chunks in 3 graph layers,1 chunks in 3 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


We want to remove the time dimension from the dataset so we have a leaner data array to work with.

In [12]:
ozone_sub_ds = ozone_sub_ds.squeeze()
ozone_sub_ds

time_bnds,Array,Chunk
Bytes,16 B,16 B
Shape,"(2,)","(2,)"
Dask graph,1 chunks in 4 graph layers,1 chunks in 4 graph layers
Data type,datetime64[ns] numpy.ndarray,datetime64[ns] numpy.ndarray

O3_du,Array,Chunk
Bytes,3.96 MiB,3.96 MiB
Shape,"(16, 180, 360)","(16, 180, 360)"
Dask graph,1 chunks in 4 graph layers,1 chunks in 4 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray
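
The effect of `squeeze()` can be illustrated with NumPy, whose function of the same name likewise removes all dimensions of length one. Below is a minimal sketch on a dummy array with the shape of the single-time-step subset.

```python
import numpy as np

# Dummy array shaped like O3_du after the temporal subset:
# (time, layers, lat, lon).
single_step = np.zeros((1, 16, 180, 360), dtype=np.float32)

# Remove every axis of length one, here only the time axis.
squeezed = np.squeeze(single_step)

print(single_step.shape, "->", squeezed.shape)
```

Since `squeeze()` without arguments removes every length-one dimension, passing the dimension name, as in `ozone_sub_ds.squeeze("time")`, is a way to restrict it to the time axis.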


Good. Now that this is done, we can prepare the plots. There are sixteen layers, so we set up a grid of 8 rows and 2 columns.

In [13]:
fig = make_subplots(
    rows=8,     
    cols=2,
    subplot_titles=(
        "Layer 1", "Layer 2", "Layer 3", "Layer 4",
        "Layer 5", "Layer 6", "Layer 7", "Layer 8",
        "Layer 9", "Layer 10", "Layer 11", "Layer 12",
        "Layer 13", "Layer 14", "Layer 15", "Layer 16"
    ),
    specs=[[{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}],
           [{'type': 'surface'}, {'type': 'surface'}]
          ]    
)
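
The titles and specs above are written out by hand. As a sketch of an equivalent, more compact construction, both can be generated from the grid size. The names `n_rows`, `n_cols`, `subplot_titles`, and `specs` are chosen here for illustration.

```python
n_rows, n_cols = 8, 2

# One title per layer, in row-major order.
subplot_titles = tuple(f"Layer {n}" for n in range(1, n_rows * n_cols + 1))

# One spec dictionary per grid cell; each cell hosts a 3D surface.
specs = [[{"type": "surface"} for _ in range(n_cols)] for _ in range(n_rows)]

print(subplot_titles[0], subplot_titles[-1], len(specs), len(specs[0]))
```

These two objects can be passed to `make_subplots` through its `subplot_titles` and `specs` arguments in place of the literals.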

We can now create the surfaces. In each surface, longitude and latitude will stand for the x and y dimensions. The z-values will show the ozone average for the layer in question.

In [None]:
y = ozone_sub_ds.lat.values
x = ozone_sub_ds.lon.values

for i in range(len(ozone_sub_ds.layers)):
    r = math.floor(i / 2) + 1    
    c = math.floor(i % 2) + 1
    print(f"Creating plot for layer at {r}, {c}")
    z = ozone_sub_ds.O3_du.isel(layers=i).values
    layer_number = ozone_sub_ds.layers.isel(layers=i).values
    fig.add_trace(
        go.Surface(x=x, y=y, z=z, showscale=False, name=f"Layer {layer_number}"),
        row=r, 
        col=c,        
    )
fig.update_layout(
    title_text='Mole content of ozone in the different atmospheric layers',
    height=1600,
    width=1200,
    showlegend=True
)
fig.update_layout(margin=dict(t=100))
fig.show()

Creating plot for layer at 1, 1
Creating plot for layer at 1, 2
Creating plot for layer at 2, 1
Creating plot for layer at 2, 2
Creating plot for layer at 3, 1
Creating plot for layer at 3, 2
Creating plot for layer at 4, 1
Creating plot for layer at 4, 2
Creating plot for layer at 5, 1
Creating plot for layer at 5, 2
Creating plot for layer at 6, 1
Creating plot for layer at 6, 2
Creating plot for layer at 7, 1
Creating plot for layer at 7, 2
Creating plot for layer at 8, 1
Creating plot for layer at 8, 2
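
The row and column printed above follow from an integer division of the layer index by the number of columns: the quotient gives the row, the remainder the column. As a small sketch of the same mapping, Python's built-in `divmod` returns both in a single call. The names `n_cols` and `positions` are illustrative.

```python
n_cols = 2

# Map the zero-based layer index to one-based (row, column) grid positions.
positions = []
for i in range(16):
    row, col = divmod(i, n_cols)
    positions.append((row + 1, col + 1))

print(positions[:4], positions[-1])
```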
