## Xarray

This notebook converts the NetCDF4 file that stores the machine-learning Human Footprint Index (ml-HFI) results into a DataFrame for inspection and analysis.

Data from: 
Keys, P. W., Barnes, E. A., & Carter, N. Dataset associated with "A machine-learning approach to human footprint index estimation with applications to sustainable development." Colorado State University. Libraries. http://dx.doi.org/10.25675/10217/216207

Method inspired by https://stackoverflow.com/questions/66169106/transform-part-of-a-netcdf-file-into-a-dataframe-with-xarray:

In [None]:
# Packages that must be installed to run this notebook: xarray, dask, bottleneck, rasterio, rioxarray

In [1]:
import pandas as pd
import numpy as np
import xarray as xr 
import rioxarray as rio 
import rasterio

In [2]:
dp = xr.open_dataset('./data/ml_hfi_v1_2000.nc')

ml = dp.to_dataframe()
ml = ml.dropna().reset_index()
ml.head()

         lat        lon  __xarray_dataarray_variable__
0 -55.609663 -68.108226                   3.328762e-06
1 -55.609663 -68.098333                   1.328018e-08
2 -55.609663 -68.088440                   1.276196e-13
3 -55.609663 -68.078547                   7.129802e-14
4 -55.609663 -68.068655                   6.915253e-18


In [3]:
ml.tail()

                 lat         lon  __xarray_dataarray_variable__
140472840  69.988495  171.830159                   1.399456e-07
140472841  69.988495  171.840051                    0.000279031
140472842  69.988495  171.849944                   3.740434e-05
140472843  69.988495  171.859837                      0.0412801
140472844  69.988495   171.86973                   0.0008492768
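
The conversion above produces roughly 140 million rows for the full file, so it can help to try the same steps on a small grid first. The following is a minimal sketch under stated assumptions: the grid and its values are synthetic, and `ml_hfi` is a hypothetical, more readable name for the placeholder variable `__xarray_dataarray_variable__`; neither comes from the dataset itself.

```python
import numpy as np
import xarray as xr

# Synthetic 3 x 4 grid standing in for the ml-HFI file (values are made up).
lat = np.array([-1.0, 0.0, 1.0])
lon = np.array([10.0, 11.0, 12.0, 13.0])
values = np.arange(12, dtype=float).reshape(3, 4)
values[0, 0] = np.nan  # one missing cell, to show what dropna() removes

ds = xr.Dataset(
    {"__xarray_dataarray_variable__": (("lat", "lon"), values)},
    coords={"lat": lat, "lon": lon},
)

# "ml_hfi" is a hypothetical name chosen for readability (an assumption).
ds = ds.rename({"__xarray_dataarray_variable__": "ml_hfi"})

# Same steps as in the notebook: one row per (lat, lon) pair,
# drop rows with missing values, then turn the index into columns.
df = ds.to_dataframe().dropna().reset_index()
print(df.head())
```

Renaming before the conversion keeps the long placeholder name out of every later column reference.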


### Create the desired subset for the AOI:
The AOI spans 30 to 40°N and 110 to 100°W (latitude 30 to 40, longitude -110 to -100), which aligns with the Hansen Global Forest Cover tile that I've downloaded.

In [4]:
# Using the 2019 ml-HFI predictions: 
dp = xr.open_dataset('./data/ml_hfi_v1_2019.nc')

In [5]:
dp

In [6]:
dp['__xarray_dataarray_variable__']

In [7]:
# From https://stackoverflow.com/questions/29135885/netcdf4-extract-for-subset-of-lat-lon

ds = xr.open_dataset('./data/ml_hfi_v1_2019.nc')
lat_bnds, lon_bnds = [30, 40], [-110, -100]
subset = ds.sel(lat=slice(*lat_bnds), lon=slice(*lon_bnds))
subset

In [8]:
type(subset)

xarray.core.dataset.Dataset
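
The bounding-box selection relies on label-based slicing, where the slice bounds have to follow the order of the coordinate values. The `head()` and `tail()` output earlier suggests that the ml-HFI latitudes are ascending, so `slice(30, 40)` works. Here is a minimal sketch on a synthetic 5-degree grid (made-up values, hypothetical variable name `ml_hfi`) showing both cases:

```python
import numpy as np
import xarray as xr

# Synthetic grid at 5-degree spacing (values are made up).
lat = np.arange(-55.0, 71.0, 5.0)    # ascending latitudes
lon = np.arange(-180.0, 180.0, 5.0)
data = np.zeros((lat.size, lon.size))
ds = xr.Dataset({"ml_hfi": (("lat", "lon"), data)}, coords={"lat": lat, "lon": lon})

# Same bounds as the notebook; both slice ends are inclusive.
lat_bnds, lon_bnds = [30, 40], [-110, -100]
subset = ds.sel(lat=slice(*lat_bnds), lon=slice(*lon_bnds))
print(dict(subset.sizes))

# If the latitudes were stored north to south, the same slice would select
# nothing, and the bounds would have to be reversed.
flipped = ds.sortby("lat", ascending=False)
empty = flipped.sel(lat=slice(30, 40))
fixed = flipped.sel(lat=slice(40, 30))
print(empty.sizes["lat"], fixed.sizes["lat"])
```

Checking `subset.sizes` right after the selection is a cheap way to catch an accidentally empty subset before it is written to disk.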

In [9]:
# Save the subset Dataset to NetCDF
subset.to_netcdf('./data/ml_hfi_subset.nc')

### Convert .nc to GeoTIFF
From: https://help.marine.copernicus.eu/en/articles/5029956-how-to-convert-netcdf-to-geotiff

In [10]:
nc_file = xr.open_dataset('./data/ml_hfi_subset.nc')
nc_file

In [11]:
ml_hfi = nc_file['__xarray_dataarray_variable__']

In [12]:
ml_hfi = ml_hfi.rio.set_spatial_dims(x_dim='lon', y_dim='lat')
ml_hfi.rio.crs

In [13]:
# Assign the CRS (EPSG:4326, geographic lat/lon)
ml_hfi.rio.write_crs("epsg:4326", inplace=True)

In [14]:
ml_hfi.rio.to_raster(r"./data/ml_hfi2019.tiff")

In [15]:
# Confirming that saving to raster was successful:

image_file = "./data/ml_hfi2019.tiff"

mlhfi_image = rasterio.open(image_file)

In [16]:
mlhfi_image

<open DatasetReader name='./data/ml_hfi2019.tiff' mode='r'>