## Demo notebook for accessing FIA data on Azure

This notebook provides an example of accessing USFS Forest Inventory and Analysis (FIA) data from blob storage on Azure. The data are stored as a collection of Parquet datasets.

FIA data are stored in the West Europe Azure region, so this notebook will run most efficiently on Azure compute located in West Europe. We recommend that substantial computation depending on FIA data also be situated in West Europe. If you are using FIA data for environmental science applications, consider applying for an [AI for Earth grant](http://aka.ms/ai4egrants) to support your compute requirements.

### Imports and constants

In [None]:
import dask.dataframe as dd
from adlfs import AzureBlobFileSystem

storage_account_name = 'cpdataeuwest'
folder_name = 'cpdata/raw/fia'

### Listing the data files

The full set of FIA data tables is available (e.g., tree, plot, condition).

We can use `adlfs` to list the files:

In [None]:
fs = AzureBlobFileSystem(account_name=storage_account_name)
parquet_files = fs.glob(folder_name + '/*parquet')
print('Found {} Parquet files'.format(len(parquet_files)))
# Slicing avoids an IndexError if fewer than 10 files are found
for path in parquet_files[:10]:
    print(path)
print('...')
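
The listing above returns full blob paths. As a minimal sketch, the table names can be recovered from those paths with the standard library. The `table_name` helper and the sample paths below are illustrative assumptions; only `cond.parquet` is confirmed elsewhere in this notebook, and in practice you would pass the `parquet_files` list returned by `fs.glob`.

```python
from pathlib import PurePosixPath


def table_name(path):
    """Return the table name for a blob path such as 'cpdata/raw/fia/cond.parquet'."""
    return PurePosixPath(path).stem


# Illustrative paths only (assumed); in the notebook, use parquet_files instead.
sample_files = [
    'cpdata/raw/fia/cond.parquet',
    'cpdata/raw/fia/plot.parquet',
    'cpdata/raw/fia/tree.parquet',
]

# Map each table name to its blob path for easy lookup
tables = {table_name(p): p for p in sample_files}
print(sorted(tables))
```

A dictionary keyed by table name makes it easy to check whether a given table is present before trying to open it.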

### Opening one data file

Here we demonstrate how to open the condition dataset (`cond.parquet`). Calling `.compute()` loads the full table into memory as a pandas DataFrame:

In [None]:
df = dd.read_parquet('az://' + folder_name + '/cond.parquet',
                     storage_options={'account_name': storage_account_name}).compute()
df.head()
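
Because Parquet is a columnar format, it may be cheaper to request only the columns you need rather than the whole table. The sketch below is one possible way to do that, using the `columns` argument of `dd.read_parquet`. The helper names `fia_table_url` and `read_fia_columns` are hypothetical, and the sketch assumes each table is stored as `<table>.parquet` under the folder used above, as is the case for `cond.parquet`.

```python
storage_account_name = 'cpdataeuwest'
folder_name = 'cpdata/raw/fia'


def fia_table_url(table):
    """Build the az:// URL for a table (hypothetical helper; assumes '<table>.parquet' naming)."""
    return 'az://{}/{}.parquet'.format(folder_name, table)


def read_fia_columns(table, columns):
    """Return a lazy Dask DataFrame containing only the requested columns."""
    # Imported here so the URL helper can be used without dask installed
    import dask.dataframe as dd
    return dd.read_parquet(fia_table_url(table),
                           columns=columns,
                           storage_options={'account_name': storage_account_name})


print(fia_table_url('cond'))

# In the notebook (requires network access to the storage account):
# df_small = read_fia_columns('cond', ['STDAGE', 'BALIVE']).compute()
```

The result stays lazy until `.compute()` is called, so the column selection is applied before any data is loaded into memory.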

### A quick plot

Here we make a quick hexbin plot of live basal area (`BALIVE`) against stand age (`STDAGE`), using every 100th row to keep plotting fast.

In [None]:
ax = df[::100].plot.hexbin('STDAGE', 'BALIVE', gridsize=(300, 100),
                           vmax=20, cmap='viridis', colorbar=False)
ax.set_ylim(0, 300)
ax.set_xlim(0, 100)