In [1]:
from rich import print
import logging

logging.basicConfig(level=logging.INFO)

# MODIS / AρρEEARS

Retrieve MODIS data via [AρρEEARS](https://appeears.earthdatacloud.nasa.gov/).
The web portal is also worth exploring: you can use it to browse and download
data, and to monitor ongoing springtime requests. Note that data downloaded
directly through the web interface (without springtime) cannot currently be
loaded with springtime.

To retrieve data you need a NASA Earthdata account. You can create one
[here](https://urs.earthdata.nasa.gov/users/new) and save your credentials in
`~/.config/springtime/credentials.json` as `{"username": "<your username>",
"password": "<your password>"}`.

Here we will download data from the MODIS Land Cover Dynamics dataset. For more information, see https://lpdaac.usgs.gov/products/mcd12q2v061/


## Explore AppEEARS data

Before downloading anything, use the `products` and `layers` utility functions.


In [12]:
from springtime.datasets.appeears import products

products()

[ProductInfo(Product='GPW_DataQualityInd', Platform='GPW', Description='Quality of Input Data for Population Count and Density Grids', Resolution='1000m', Version='411', ProductAndVersion='GPW_DataQualityInd.411', DOI='10.7927/H42Z13KG', Available=True, RasterType='Tile', TemporalGranularity='Quinquennial', DocLink='https://doi.org/10.7927/H42Z13KG', Source='SEDAC', TemporalExtentStart='2000-01-01', TemporalExtentEnd='2020-12-31', Deleted=False),
 ProductInfo(Product='GPW_UN_Adj_PopCount', Platform='GPW', Description='UN-adjusted Population Count', Resolution='1000m', Version='411', ProductAndVersion='GPW_UN_Adj_PopCount.411', DOI='10.7927/H4PN93PB', Available=True, RasterType='Tile', TemporalGranularity='Quinquennial', DocLink='https://doi.org/10.7927/H4PN93PB', Source='SEDAC', TemporalExtentStart='2000-01-01', TemporalExtentEnd='2020-12-31', Deleted=False),
 ProductInfo(Product='GPW_UN_Adj_PopDensity', Platform='GPW', Description='UN-adjusted Population Density', Resolution='1000m', 

We are interested in product MCD12Q2 on "Land Cover Dynamics". The product has
several layers that we can retrieve.


In [1]:
from springtime.datasets.appeears import layers

layers("MCD12Q2.061")

{'FparExtra_QC': LayerInfo(AddOffset='', Available=True, DataType='byte', Description='Extra detail Quality for Lai and Fpar', Dimensions=['time', 'YDim', 'XDim'], FillValue=255, IsQA=True, Layer='FparExtra_QC', OrigDataType='byte', OrigValidMax=254, OrigValidMin=0, QualityLayers='', QualityProductAndVersion='', ScaleFactor='', Units='class-flag', ValidMax=254, ValidMin=0, XSize=2400, YSize=2400),
 'FparLai_QC': LayerInfo(AddOffset='', Available=True, DataType='byte', Description='Quality for Lai and Fpar', Dimensions=['time', 'YDim', 'XDim'], FillValue=255, IsQA=True, Layer='FparLai_QC', OrigDataType='byte', OrigValidMax=254, OrigValidMin=0, QualityLayers='', QualityProductAndVersion='', ScaleFactor='', Units='class-flag', ValidMax=254, ValidMin=0, XSize=2400, YSize=2400),
 'FparStdDev_500m': LayerInfo(AddOffset=0.0, Available=True, DataType='float32', Description='Standard deviation of Fpar', Dimensions=['time', 'YDim', 'XDim'], FillValue=255, IsQA=False, Layer='FparStdDev_500m', Ori

From the [metadata](https://lpdaac.usgs.gov/products/mcd12q2v061/) we learn that these variables are reported for up to 2 growing seasons per year, depending on vegetation type.

## Retrieving point data

There are two main ways to download AρρEEARS data: as points or as an area. What springtime does depends on which of the `points` and `area` settings are given:

- Points given, area not given: use the point download of AρρEEARS
- Points not given, area given: use the area download of AρρEEARS
- Points and area given: download an area but extract points during load
- Neither points nor area given: invalid.
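
The four cases above can be written as a small decision function. This is only an illustrative sketch: `select_mode` is a hypothetical helper, not part of the springtime API, and the returned strings are just labels for the cases listed.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def select_mode(points: Optional[Sequence[Point]], area: Optional[dict]) -> str:
    """Return a label for the workflow that applies (hypothetical helper)."""
    has_points = bool(points)
    has_area = area is not None
    if has_points and has_area:
        return "area download, points extracted on load"
    if has_points:
        return "point download"
    if has_area:
        return "area download"
    # Neither setting given: there is nothing to request.
    raise ValueError("Provide points, an area, or both.")


print(select_mode([(10.691330, 48.085350)], None))
```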


In [4]:
from springtime.datasets import Appeears

dataset = Appeears(
    years=[2009, 2011],
    product="MCD12Q2",
    version="061",
    layers=["Greenup", "Dormancy"],
    points=[(10.691330, 48.085350), (8.892998, 47.097801)],
)

print(dataset)

In [5]:
dataset.raw_load_points()

INFO:springtime.datasets.appeears:Found /home/peter/.cache/springtime/appeears/MCD12Q2-2009-2011-Dormancy-Greenup-50f0093a3994764340e8bb6f70797f854dd3a4eb-MCD12Q2-061-results.csv


Unnamed: 0,Latitude,Longitude,Date,MODIS_Tile,MCD12Q2_061_Line_Y_500m,MCD12Q2_061_Sample_X_500m,MCD12Q2_061_Dormancy_0,MCD12Q2_061_Dormancy_1,MCD12Q2_061_Greenup_0,MCD12Q2_061_Greenup_1,...,MCD12Q2_061_QA_Detailed_1_Dormancy,MCD12Q2_061_QA_Detailed_1_Dormancy_Description,MCD12Q2_061_QA_Detailed_1_Unused,MCD12Q2_061_QA_Detailed_1_Unused_Description,MCD12Q2_061_QA_Overall_0_bitmask,MCD12Q2_061_QA_Overall_0_Name,MCD12Q2_061_QA_Overall_0_Name_Description,MCD12Q2_061_QA_Overall_1_bitmask,MCD12Q2_061_QA_Overall_1_Name,MCD12Q2_061_QA_Overall_1_Name_Description
0,47.097801,8.892998,2009-01-01,h18v04,696.0,1452.0,14572.0,32767.0,14353.0,32767.0,...,0b11,Poor,0b01,,0b0000000000000000,0b0000000000000000,Best,0b0111111111111111,0b0111111111111111,
1,47.097801,8.892998,2010-01-01,h18v04,696.0,1452.0,14899.0,32767.0,14724.0,32767.0,...,0b11,Poor,0b01,,0b0000000000000001,0b0000000000000001,Good,0b0111111111111111,0b0111111111111111,
2,47.097801,8.892998,2011-01-01,h18v04,696.0,1452.0,15286.0,32767.0,15075.0,32767.0,...,0b11,Poor,0b01,,0b0000000000000001,0b0000000000000001,Good,0b0111111111111111,0b0111111111111111,
3,48.08535,10.69133,2009-01-01,h18v04,459.0,1714.0,14571.0,32767.0,14326.0,32767.0,...,0b11,Poor,0b01,,0b0000000000000000,0b0000000000000000,Best,0b0111111111111111,0b0111111111111111,
4,48.08535,10.69133,2010-01-01,h18v04,459.0,1714.0,14933.0,32767.0,14665.0,32767.0,...,0b11,Poor,0b01,,0b0000000000000000,0b0000000000000000,Best,0b0111111111111111,0b0111111111111111,
5,48.08535,10.69133,2011-01-01,h18v04,459.0,1714.0,15297.0,32767.0,15046.0,32767.0,...,0b11,Poor,0b01,,0b0000000000000000,0b0000000000000000,Best,0b0111111111111111,0b0111111111111111,


Notice that there are many columns that are not necessarily of interest. The geometry is available. As anticipated, there are two greenup cycles; however, the second one only contains the value `32767`, which we recognize as the fill value specified in the metadata (the output of `layers()`). The units ('day') are a bit cryptic, but hypothesizing that they represent days since a certain offset, we can deduce that the offset is probably the "default" offset, i.e. 1970-01-01 00:00.


In [6]:
# guess: the values of greenup represent days since default offset??
from pandas import Timestamp

print(Timestamp(0, unit="D"))
print(Timestamp(14353.0, unit="D"))
print(Timestamp(14353.0, unit="D") - Timestamp("20090101"))

### Harmonization

So, to get the DOY we need to subtract the number of days between 1970-01-01 and the start of the year in question.
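
As a minimal sketch of that subtraction, assuming the raw values really are days since 1970-01-01 (the hypothesis made above): `days_since_epoch_to_doy` is a hypothetical helper for illustration, not the function springtime uses internally. It returns the number of days elapsed since 1 January of the given year.

```python
import pandas as pd

EPOCH = pd.Timestamp("1970-01-01")  # assumed offset, see the guess above


def days_since_epoch_to_doy(value: float, year: int) -> int:
    """Days between 1 January of `year` and the date encoded by `value`."""
    event = EPOCH + pd.Timedelta(days=value)
    year_start = pd.Timestamp(year=year, month=1, day=1)
    return (event - year_start).days


# Raw Greenup_0 values for the first point, taken from the raw output
for raw_value, year in [(14353.0, 2009), (14724.0, 2010), (15075.0, 2011)]:
    print(year, days_since_epoch_to_doy(raw_value, year))
```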

The `load_points()` method, as opposed to the raw load, does the following:

- Remove unnecessary columns (filter the requested layers), and rename them to something more manageable.
- Convert fill-value to NaN and drop columns with only fill value
- Reconstruct the DOY and convert datetime index to year
- Extract geometry and convert to geopandas
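
The first two steps can be illustrated on a tiny hand-made frame that mimics the raw output. This is a sketch with plain pandas, not the springtime implementation; the column prefix `MCD12Q2_061_` and the fill value `32767` are copied from the raw output, and the frame itself is made up for the example.

```python
import numpy as np
import pandas as pd

FILL_VALUE = 32767  # fill value observed in the raw output
PREFIX = "MCD12Q2_061_"  # product/version prefix of the raw column names

# Small stand-in for the raw table (two rows of the first point)
raw = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2009-01-01", "2010-01-01"]),
        "MCD12Q2_061_Greenup_0": [14353.0, 14724.0],
        "MCD12Q2_061_Greenup_1": [32767.0, 32767.0],
    }
)

layer_columns = [c for c in raw.columns if c.startswith(PREFIX)]

clean = raw.copy()
# Replace the fill value with NaN in the layer columns only
clean[layer_columns] = clean[layer_columns].replace(FILL_VALUE, np.nan)
# Drop columns that contained nothing but the fill value
clean = clean.dropna(axis="columns", how="all")
# Shorten the remaining column names
clean = clean.rename(columns=lambda c: c.replace(PREFIX, ""))

print(clean)
```

Here the second greenup cycle disappears entirely because it holds only fill values, which is why the harmonized table contains `Greenup_0` and `Dormancy_0` but no `_1` columns.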


In [7]:
modis_df = dataset.load()
modis_df

INFO:springtime.datasets.appeears:Found /home/peter/.cache/springtime/appeears/MCD12Q2-2009-2011-Dormancy-Greenup-50f0093a3994764340e8bb6f70797f854dd3a4eb-MCD12Q2-061-results.csv


Unnamed: 0,datetime,Dormancy_0,Greenup_0,geometry
0,2009-01-01,327,108,POINT (8.89300 47.09780)
1,2010-01-01,289,114,POINT (8.89300 47.09780)
2,2011-01-01,311,100,POINT (8.89300 47.09780)
3,2009-01-01,326,81,POINT (10.69133 48.08535)
4,2010-01-01,323,55,POINT (10.69133 48.08535)
5,2011-01-01,322,71,POINT (10.69133 48.08535)


## Loading raster data


In [1]:
from springtime.datasets import Appeears

dataset = Appeears(
    years=[2009, 2011],
    product="MCD12Q2",
    version="061",
    layers=["Greenup", "Dormancy"],
    points=[(9.1, 49.1), (9.6, 49.6), (9.9, 49.9)],
    area={"name": "eastfrankfurt2", "bbox": [9.0, 49.0, 10.0, 50.0]},
)

dataset.download_area()  # TODO raw_load should call download
dataset.load()

File /home/peter/.cache/springtime/appeears/eastfrankfurt2/MCD12Q2.061_500m_aid0001.nc exists, not downloading again
  datetimeindex = ds.indexes["time"].to_datetimeindex()


Unnamed: 0,Dormancy,Greenup,geometry,year
0,322,66,POINT (9.10000 49.10000),2009
1,318,83,POINT (9.10000 49.10000),2010
2,318,26,POINT (9.10000 49.10000),2011
3,220,78,POINT (9.60000 49.60000),2009
4,320,81,POINT (9.60000 49.60000),2010
5,213,41,POINT (9.60000 49.60000),2011
6,307,81,POINT (9.90000 49.90000),2009
7,278,81,POINT (9.90000 49.90000),2010
8,250,76,POINT (9.90000 49.90000),2011


In [3]:
from springtime.datasets import Appeears

dataset = Appeears(
    years=[2009, 2011],
    product="MCD15A2H",
    version="061",
    layers=["Fpar_500m", "Lai_500m"],
    points=[(9.1, 49.1), (9.6, 49.6), (9.9, 49.9)],
    area={"name": "eastfrankfurt2", "bbox": [9.0, 49.0, 10.0, 50.0]},
    infer_date_offset=False,
)

dataset.load()

  datetimeindex = ds.indexes["time"].to_datetimeindex()


Unnamed: 0,year,geometry,Fpar_500m|1,Fpar_500m|9,Fpar_500m|17,Fpar_500m|25,Fpar_500m|33,Fpar_500m|41,Fpar_500m|49,Fpar_500m|57,...,Lai_500m|289,Lai_500m|297,Lai_500m|305,Lai_500m|313,Lai_500m|321,Lai_500m|329,Lai_500m|337,Lai_500m|345,Lai_500m|353,Lai_500m|361
0,2008,POINT (9.10000 49.10000),,,,,,,,,...,,,,,,,,,,25.0
1,2008,POINT (9.60000 49.60000),,,,,,,,,...,,,,,,,,,,1.3
2,2008,POINT (9.90000 49.90000),,,,,,,,,...,,,,,,,,,,1.1
3,2009,POINT (9.10000 49.10000),2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,...,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0
4,2009,POINT (9.60000 49.60000),0.44,0.11,0.25,0.35,0.36,0.07,0.21,0.33,...,0.9,0.6,0.3,0.0,0.6,1.8,1.1,1.1,1.4,1.3
5,2009,POINT (9.90000 49.90000),0.41,0.04,0.08,0.32,0.17,0.09,0.15,0.13,...,0.7,0.3,0.3,0.2,0.1,1.1,1.1,0.2,1.2,1.0
6,2010,POINT (9.10000 49.10000),2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,...,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0
7,2010,POINT (9.60000 49.60000),0.04,0.01,0.05,0.01,0.01,0.02,0.35,0.34,...,0.6,0.8,0.8,1.0,0.5,0.2,0.0,0.1,0.0,0.0
8,2010,POINT (9.90000 49.90000),0.02,0.0,0.01,0.02,0.0,0.03,0.17,0.26,...,0.3,0.3,0.4,0.5,0.1,0.1,0.0,0.1,0.0,0.0
9,2011,POINT (9.10000 49.10000),2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,...,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0


### TODO:

- [WIP] Different products have different (time) resolutions, e.g. yearly greenup but daily LAI. Make loading those more flexible.
