# Extracting a Time Series

## Overview

In this tutorial, we take MODIS Vegetation Indices Version 6.1 data for the year 2020, which are generated every 16 days at 250 meter (m) spatial resolution, and extract the Normalised Difference Vegetation Index (NDVI) at different point locations.

Data Credit:

*   Didan, K. (2015). MOD13Q1 MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061. NASA EOSDIS Land Processes DAAC. Accessed 2023-05 from https://doi.org/10.5067/MODIS/MOD13Q1.006




## Setup and Data Download

The following blocks of code will install the required packages and download the datasets to your Colab environment.

In [1]:
%%capture
if 'google.colab' in str(get_ipython()):
    !apt install libspatialindex-dev
    !pip install fiona shapely pyproj rtree
    !pip install geopandas
    !pip install rioxarray

In [2]:
import datetime
import glob
import os
import re
import pandas as pd
import geopandas as gpd
import xarray as xr
import rioxarray as rxr
import zipfile

In [3]:
data_folder = 'data'
output_folder = 'output'

if not os.path.exists(data_folder):
    os.mkdir(data_folder)
if not os.path.exists(output_folder):
    os.mkdir(output_folder)

In [4]:
def download(url):
    filename = os.path.join(data_folder, os.path.basename(url))
    if not os.path.exists(filename):
        from urllib.request import urlretrieve
        local, _ = urlretrieve(url, filename)
        print('Downloaded ' + local)


filename = 'modis_vegetation_indices_2020.zip'
url = 'https://storage.googleapis.com/spatialthoughts-public-data/' + filename

download(url)

## Data Pre-Processing

First we unzip and extract the images to a folder.

In [5]:
zipfile_path = os.path.join(data_folder, filename)
with zipfile.ZipFile(zipfile_path) as zf:
  zf.extractall(data_folder)

In [6]:
def path_to_datetimeindex(filepath):
  # Parse the 'doy<YYYYDDD>' token (year + day of year) from the file name
  filename = os.path.basename(filepath)
  pattern = r'doy(\d+)'
  match = re.search(pattern, filename)
  if match:
    doy_value = match.group(1)
    timestamp = datetime.datetime.strptime(doy_value, '%Y%j')
    return timestamp
  else:
    print('Could not extract DOY from filename', filename)
    return None


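timestamps = []
filepaths = []

files = os.path.join(data_folder, 'modis_vegetation_indices_2020', '*.tif')
for filepath in glob.glob(files):
  timestamp = path_to_datetimeindex(filepath)
  if timestamp is None:
    # Skip files whose name carries no DOY token
    continue
  filepaths.append(filepath)
  timestamps.append(timestamp)

# Sort so that the scenes are later stacked in chronological order
unique_timestamps = sorted(set(timestamps))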
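
To make the date-parsing step concrete, the sketch below applies the same `doy(\d+)` pattern and `%Y%j` format to a single file name. The file name is hypothetical, made up only to resemble what the regular expression expects; the names inside the downloaded archive may look different.

```python
import datetime
import re

# Hypothetical file name, used only for illustration.
example_name = 'example_NDVI_doy2020017.tif'

match = re.search(r'doy(\d+)', example_name)
doy_value = match.group(1)  # '2020017' means year 2020, day-of-year 017

# %Y consumes the four-digit year, %j the three-digit day of year
parsed = datetime.datetime.strptime(doy_value, '%Y%j')
print(parsed.date())  # 2020-01-17
```

Day 17 of the year falls on 17 January, so the second 16-day composite of 2020 is stamped with that date.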

In [8]:
scenes = []

for timestamp in unique_timestamps:
  ndvi_filepattern = r'NDVI_doy{}'.format(timestamp.strftime('%Y%j'))
  evi_filepattern = r'EVI_doy{}'.format(timestamp.strftime('%Y%j'))

  ndvi_filepath = [filepath for filepath in filepaths if re.search(ndvi_filepattern, filepath)][0]
  evi_filepath = [filepath for filepath in filepaths if re.search(evi_filepattern, filepath)][0]

  ndvi_band = rxr.open_rasterio(ndvi_filepath, chunks={'x':512, 'y':512})
  ndvi_band.name = 'NDVI'
  evi_band = rxr.open_rasterio(evi_filepath, chunks={'x':512, 'y':512})
  evi_band.name = 'EVI'
  bands = [ndvi_band, evi_band]
  scene = xr.merge(bands)
  scenes.append(scene)
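
The `[0]` lookups in the loop above raise an `IndexError` if either band is missing for a date. One possible way to guard against that is to group the paths by date first and keep only the dates that have both bands. The following is a minimal sketch using only the standard library; the file names, the `band_pattern` expression, and the `group_by_date` helper are all hypothetical and not part of the original notebook.

```python
import re
from collections import defaultdict

# Hypothetical file names, used only for illustration.
example_paths = [
    'data/example_NDVI_doy2020017.tif',
    'data/example_EVI_doy2020001.tif',
    'data/example_NDVI_doy2020001.tif',
    'data/example_EVI_doy2020017.tif',
    'data/example_NDVI_doy2020033.tif',  # EVI is missing for this date
]

# Assumed naming scheme: '<band>_doy<YYYYDDD>' somewhere in the path
band_pattern = re.compile(r'(NDVI|EVI)_doy(\d{7})')


def group_by_date(paths):
    """Return {doy_string: {band_name: path}} for paths matching the pattern."""
    grouped = defaultdict(dict)
    for path in paths:
        match = band_pattern.search(path)
        if match:
            band, doy = match.groups()
            grouped[doy][band] = path
    return dict(grouped)


grouped = group_by_date(example_paths)

# Keep only dates that have both bands, in chronological order
complete = {doy: bands for doy, bands in sorted(grouped.items())
            if {'NDVI', 'EVI'} <= set(bands)}

for doy, bands in complete.items():
    print(doy, bands['NDVI'], bands['EVI'])
```

Because the `YYYYDDD` strings are zero-padded, sorting them as text also sorts them by date.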

In [9]:
time_var = xr.Variable('time', list(unique_timestamps))

time_series_scenes = xr.concat(scenes, dim=time_var)
time_series_scenes

Each data variable (NDVI and EVI) is a Dask-backed array with the following layout:

| | Array | Chunk |
|---|---|---|
| Bytes | 75.74 MiB | 512.00 kiB |
| Shape | (22, 1, 1656, 1090) | (1, 1, 512, 512) |
| Dask graph | 264 chunks in 67 graph layers | |
| Data type | int16 numpy.ndarray | |

In [None]:
# Select the NDVI variable first; imshow plots a single DataArray
time_series_scenes.NDVI.sel(band=1).plot.imshow(col='time', col_wrap=4, robust=True)