# Sample of Met Office AWS Earth data in a 'rolling' Zarr format

This is just a small snapshot of some of the data that we hope to expose as a rolling Zarr. It's currently available as individual NetCDF files via [the Met Office on AWS Earth](https://registry.opendata.aws/uk-met-office/).

In this notebook we've taken those files and combined them into a Zarr array. By using the offset we are able to 'roll' this dataset as new forecasts come in.

*Warning:* data access from S3 via my-binder appears very slow, so don't try to access too much data at once. Co-locating the compute with the data in the same AWS region significantly boosts performance.

In [1]:
import rolling_zarr_array_store
import s3fs
import zarr
from bokeh.plotting import figure, output_notebook, show
output_notebook()

ZARR_PATH = "informatics-lab-rolling-zarr-demo/mo-atmospheric-mogreps-uk-prd-air_temperature-at_heights.zarr/air_temperature"


In [12]:
s3fs.S3FileSystem(anon=True).exists('informatics-lab-rolling-zarr-demo/mo-atmospheric-mogreps-uk-prd-air_temperature-at_heights.zarr')

ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

In [2]:
class MyS3OffsetStore(rolling_zarr_array_store.OffsetArrayStoreMixin, s3fs.S3Map):
    pass

In [4]:
store = MyS3OffsetStore(root=ZARR_PATH, s3=s3fs.S3FileSystem(anon=True), check=False, create=False)
data = zarr.open(store)
data

ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

In [4]:
# current offset
data.attrs['_offset']

[145, 0, 0, 0, 0, 0]

In [5]:
y = data[-1,0:5,5,10,100,100].tolist()
x = list(range(len(y)))

In [6]:
p = figure(plot_width=400, plot_height=400)

# add a line renderer
p.line(x,y, line_width=2)

show(p)

We can see the effect of the offset by accessing the same index with and without the `OffsetArrayStoreMixin`:

In [7]:
non_offset_store = s3fs.S3Map(root=ZARR_PATH, s3=s3fs.S3FileSystem(anon=True), check=False, create=False)
non_offset_data = zarr.open(non_offset_store)
non_offset_data

<zarr.core.Array (20, 55, 12, 33, 970, 1042) float32>
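Because reads from S3 can be slow, it can help to estimate how much data a selection covers before requesting it. The sketch below is plain Python arithmetic on the shape and dtype reported above. `selection_nbytes` is a hypothetical helper, not part of zarr, and it reports the uncompressed size of the selected elements only; the bytes actually transferred will differ, since zarr fetches whole chunks and chunks are usually compressed.

```python
from math import prod

SHAPE = (20, 55, 12, 33, 970, 1042)  # shape reported by zarr above
ITEMSIZE = 4                          # bytes per float32 element


def selection_nbytes(selection, shape=SHAPE, itemsize=ITEMSIZE):
    """Uncompressed size in bytes of an int/slice selection (hypothetical helper)."""
    # Dimensions left out of the selection are taken in full.
    selection = tuple(selection) + (slice(None),) * (len(shape) - len(selection))
    counts = []
    for sel, dim in zip(selection, shape):
        if isinstance(sel, slice):
            # slice.indices clamps the slice to the dimension length
            counts.append(len(range(*sel.indices(dim))))
        else:
            counts.append(1)  # a single integer index selects one element
    return prod(counts) * itemsize


# The selection plotted earlier: data[-1, 0:5, 5, 10, 100, 100]
print(selection_nbytes((-1, slice(0, 5), 5, 10, 100, 100)))  # 20
# A full 2D slab over the last two dimensions
print(selection_nbytes((0, 0, 0, 0, slice(None), slice(None))))  # 4042960
```

An empty selection, `selection_nbytes(())`, gives the uncompressed size of the whole array, which is a useful upper bound when deciding how much to request at once.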

In [8]:
data[0,0,0,0,0,0], non_offset_data[0,0,0,0,0,0]

(287.625, nan)

We can 'manually' apply the offset to get the same answer, by reading the chunk stored under the offset key directly from the non-offset store:

In [13]:
offset = '.'.join(str(i) for i in data.attrs['_offset'])
offset

'145.0.0.0.0.0'

In [14]:
data._decode_chunk(non_offset_store[offset])[0,0,0,0,0,0]

287.625
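The manual lookup above suggests how the offset might be applied: the chunk key requested by zarr is shifted by `_offset` along each dimension before the underlying store is read. The following is a minimal sketch of that idea in plain Python, not the actual `rolling_zarr_array_store` implementation. The function name `apply_offset_to_chunk_key` is hypothetical, and the sketch assumes dot-separated chunk keys and an offset expressed in chunks, as in the cells above.

```python
def apply_offset_to_chunk_key(key, offset):
    """Shift a dot-separated zarr chunk key by a per-dimension chunk offset.

    Hypothetical helper for illustration only. Keys that are not chunk keys
    (for example '.zarray' or '.zattrs') are returned unchanged.
    """
    parts = key.split(".")
    # A chunk key has exactly one non-negative integer per dimension.
    if len(parts) != len(offset) or not all(p.isdigit() for p in parts):
        return key
    return ".".join(str(int(p) + o) for p, o in zip(parts, offset))


offset = [145, 0, 0, 0, 0, 0]  # value of data.attrs['_offset'] shown above
print(apply_offset_to_chunk_key("0.0.0.0.0.0", offset))  # 145.0.0.0.0.0
print(apply_offset_to_chunk_key(".zarray", offset))      # .zarray
```

Under this reading, the first chunk seen through the offset store corresponds to the chunk stored as `145.0.0.0.0.0`, which matches the value retrieved manually above.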