## Converting NOAA NDVI CDR data to Zarr

This notebook explores converting the NOAA NDVI Climate Data Record (CDR) from its native format, NetCDF4, to Zarr. [NOAA NDVI CDR](https://www.ncei.noaa.gov/products/climate-data-records/normalized-difference-vegetation-index) is a daily product on a 0.05° by 0.05° grid derived from AVHRR data from 1981 to the present. The data are available from both [NOAA NCEI](https://doi.org/10.7289/V5ZG6QH9) and several cloud platforms.

This notebook accesses NDVI CDR data from the [AWS Open Data Registry](https://registry.opendata.aws/noaa-cdr-terrestrial/), provided through the [NOAA Big Data Program](https://www.noaa.gov/information-technology/big-data).

### Step 1 – Access NDVI CDR data on AWS

First, let's inspect a sample NDVI CDR file on AWS using `xarray`.

Example file for exploration:  

HTTPS: `https://noaa-cdr-ndvi-pds.s3.amazonaws.com/data/1982/AVHRR-Land_v005_AVH13C1_NOAA-07_19820101_c20170610044559.nc`  

S3: `s3://noaa-cdr-ndvi-pds/data/1982/AVHRR-Land_v005_AVH13C1_NOAA-07_19820101_c20170610044559.nc`
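
The two addresses above point at the same object. As a minimal sketch, one form can be derived from the other, assuming the `https://<bucket>.s3.amazonaws.com/<key>` layout seen in the HTTPS example holds for the whole bucket. The helper name `s3_to_https` is hypothetical and is not part of the original notebook.

```python
from urllib.parse import urlparse

def s3_to_https(s3_url):
    """Convert an s3://bucket/key URL to its public HTTPS form (assumed layout)."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got {s3_url!r}")
    # netloc holds the bucket name; path holds "/" followed by the object key
    return f"https://{parsed.netloc}.s3.amazonaws.com{parsed.path}"

s3_url = "s3://noaa-cdr-ndvi-pds/data/1982/AVHRR-Land_v005_AVH13C1_NOAA-07_19820101_c20170610044559.nc"
print(s3_to_https(s3_url))
```

The HTTPS form is convenient for a quick browser download, while the S3 form is what `fsspec` and `boto3` expect later in this notebook.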

In [4]:
import xarray as xr
import boto3
import fsspec
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import cartopy, cartopy.crs as ccrs
from botocore import UNSIGNED
from botocore.client import Config

In [22]:
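## Define a function that returns the S3 URL of an NDVI CDR file matching a key prefix
def get_url_for_prefix(prefix):
    # Unsigned (anonymous) client: the bucket is public, so no AWS credentials are needed
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    paginator = s3.get_paginator('list_objects_v2')
    page_iterator = paginator.paginate(Bucket='noaa-cdr-ndvi-pds', Prefix=prefix)
    # A page with no matching keys has no 'Contents' entry, so default to an empty list
    files_mapper = ["s3://noaa-cdr-ndvi-pds/" + file['Key'] for page in page_iterator for file in page.get('Contents', [])]
    # Index 1 is the second matching object (here, the file for 1982-01-02)
    return files_mapper[1]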

get_url_for_prefix('data/1982/AVHRR-Land_v005_AVH13C1')

's3://noaa-cdr-ndvi-pds/data/1982/AVHRR-Land_v005_AVH13C1_NOAA-07_19820102_c20170610050433.nc'
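
The file names returned above appear to follow a fixed underscore-separated layout. The sketch below splits one into its parts. The field meanings (product, version, product ID, platform, observation day, and a `c`-prefixed creation timestamp) are an assumption read off the two example names in this notebook, not an official specification, and `parse_ndvi_filename` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed layout: <product>_<version>_<product id>_<platform>_<YYYYMMDD>_c<created>.nc
FILENAME_RE = re.compile(
    r"(?P<product>[^_]+)_(?P<version>v\d+)_(?P<product_id>[^_]+)_"
    r"(?P<platform>[^_]+)_(?P<day>\d{8})_c(?P<created>\d{14})\.nc"
)

def parse_ndvi_filename(url):
    """Return the fields of an NDVI CDR file name as a dict."""
    name = url.rsplit("/", 1)[-1]  # drop the bucket and directory part
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    fields = match.groupdict()
    # Convert the YYYYMMDD string to a datetime.date
    fields["day"] = datetime.strptime(fields["day"], "%Y%m%d").date()
    return fields

info = parse_ndvi_filename(
    "s3://noaa-cdr-ndvi-pds/data/1982/AVHRR-Land_v005_AVH13C1_NOAA-07_19820102_c20170610050433.nc"
)
print(info["platform"], info["day"])
```

Knowing which part of the name varies by day and which by processing run is useful when defining a file pattern, since the creation timestamp cannot be predicted from the date alone.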

In [13]:
%%time

ndvi_url = 's3://noaa-cdr-ndvi-pds/data/1982/AVHRR-Land_v005_AVH13C1_NOAA-07_19820101_c20170610044559.nc'
ds = xr.open_dataset(fsspec.open(ndvi_url, anon=True).open())
ds

CPU times: user 186 ms, sys: 60.7 ms, total: 247 ms
Wall time: 3.31 s


### Step 2 – Define file pattern