<img width="100" src="https://carbonplan-assets.s3.amazonaws.com/monogram/dark-small.png" style="margin-left:0px;margin-top:20px"/>

# Download FIA Database

_by Joe Hamman (CarbonPlan), June 29, 2020_

This notebook downloads local copies of the FIA database for processing.

**Inputs:**
- sources.yaml

**Outputs:**
- Local copies of the FIA database

**Notes:**
- No reprojection or processing of the data is done in this notebook.

In [None]:
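# NOTE: a relative import fails when run from a notebook (there is no parent package),
# and `process_sources` is not used below.
# from .._utils import process_sources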

In [1]:
import pathlib
import zipfile

import urlpath
import wget
import yaml


workdir = pathlib.Path('/Users/jhamman/workdir/carbonplan_data_downloads/fia/')
workdir.mkdir(parents=True, exist_ok=True)
workdir

PosixPath('/Users/jhamman/workdir/carbonplan_data_downloads/fia')

In [2]:
with open('../../intake-catalogs/sources.yaml') as f:
    # safe_load is sufficient for a plain mapping and avoids constructing arbitrary Python objects
    sources = yaml.safe_load(f)['fia']

In [3]:
sources

{'description': 'Raw datasets from Forest Inventory Analysis',
 'metadata': {'url': 'https://apps.fs.usda.gov/fia/datamart/datamart.html'},
 'data': {'entire': {'actions': ['download', 'unzip'],
   'urlpath': ['https://apps.fs.usda.gov/fia/datamart/CSV/ENTIRE.zip']}}}
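
The catalog entry above drives the rest of the notebook: each dataset under `data` lists the `actions` to apply and the `urlpath` entries to fetch. As a minimal sketch of how that structure can be read, the snippet below flattens it into one task per URL. The helper name `iter_download_tasks` is hypothetical and not part of the CarbonPlan codebase; the dictionary literal simply repeats the output shown above.

```python
from typing import Iterator, Tuple

# Same structure as the `sources` entry displayed above.
sources = {
    "description": "Raw datasets from Forest Inventory Analysis",
    "metadata": {"url": "https://apps.fs.usda.gov/fia/datamart/datamart.html"},
    "data": {
        "entire": {
            "actions": ["download", "unzip"],
            "urlpath": ["https://apps.fs.usda.gov/fia/datamart/CSV/ENTIRE.zip"],
        }
    },
}


def iter_download_tasks(entry: dict) -> Iterator[Tuple[str, str, bool]]:
    """Yield (dataset key, url, unzip flag) for every URL flagged for download.

    Hypothetical helper for illustration only.
    """
    for key, dset in entry["data"].items():
        actions = dset.get("actions", [])
        if "download" not in actions:
            continue  # dataset is not meant to be fetched
        for url in dset["urlpath"]:
            yield key, url, "unzip" in actions


tasks = list(iter_download_tasks(sources))
print(tasks)
```

Separating "what to fetch" from "how to fetch it" in this way makes it easy to inspect the task list before starting a large download.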

In [6]:
for key, dset in sources['data'].items():
    if 'download' in dset['actions']:
        for url in dset['urlpath']:
            url = urlpath.URL(url)
            out = workdir / url.name  # e.g. <workdir>/ENTIRE.zip
            # Skip the download if the archive is already on disk.
            if not out.exists():
                print(f'downloading {url}')
                wget.download(str(url), out=str(out))

            if 'unzip' in dset['actions']:
                # Extract into a directory named after the archive, e.g. <workdir>/ENTIRE.
                outdir = workdir / out.stem
                if not outdir.exists():
                    outdir.mkdir(parents=True)
                    with zipfile.ZipFile(out, 'r') as f:
                        print(f'extracting contents of {out}')
                        f.extractall(outdir)

downloading https://apps.fs.usda.gov/fia/datamart/CSV/ENTIRE.zip
extracting contents of /Users/jhamman/workdir/carbonplan_data_downloads/fia/ENTIRE.zip
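
The loop above skips work whenever the archive or the output directory already exists, so an interrupted download or extraction can leave a partial result that later runs treat as complete. The following is a minimal sketch of one way to guard against that, assuming it is acceptable to swap `wget` and `urlpath` for the standard library's `urllib`. The helper name `fetch_and_extract` and the `.part` / `.extracting` suffixes are hypothetical choices for illustration, not something the notebook defines.

```python
import pathlib
import urllib.request
import zipfile
from urllib.parse import urlparse


def fetch_and_extract(url: str, workdir: pathlib.Path, unzip: bool = True) -> pathlib.Path:
    """Download `url` into `workdir` and optionally extract it.

    Hypothetical helper: returns the extraction directory when `unzip`
    is true, otherwise the path of the downloaded file.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    out = workdir / pathlib.PurePosixPath(urlparse(url).path).name

    if not out.exists():
        # Download under a temporary name so an interrupted transfer
        # is not mistaken for a complete archive on the next run.
        tmp = out.with_name(out.name + ".part")
        urllib.request.urlretrieve(url, str(tmp))
        tmp.replace(out)

    if not unzip:
        return out

    outdir = workdir / out.stem
    if not outdir.exists():
        # Extract to a scratch directory and rename it only on success.
        tmpdir = workdir / (out.stem + ".extracting")
        with zipfile.ZipFile(out) as zf:
            zf.extractall(tmpdir)
        tmpdir.rename(outdir)
    return outdir
```

With the catalog entry above, a call such as `fetch_and_extract(url, workdir)` for each URL would return the path of the extracted directory, and repeated calls would do no further work once that directory exists.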


In [8]:
import pandas as pd

In [10]:
df = pd.read_csv(workdir / 'ENTIRE' / 'BOUNDARY.csv')
df.head()

Unnamed: 0,CN,PLT_CN,INVYR,STATECD,UNITCD,COUNTYCD,PLOT,SUBP,SUBPTYP,BNDCHG,...,DISTCORN,AZMRIGHT,CYCLE,SUBCYCLE,CREATED_BY,CREATED_DATE,CREATED_IN_INSTANCE,MODIFIED_BY,MODIFIED_DATE,MODIFIED_IN_INSTANCE
0,357483133489998,264159792489998,2015,1,6,79,21,3,1,0.0,...,,200,10,3,,2015-11-17,489998,,,
1,357483134489998,264159792489998,2015,1,6,79,21,3,2,2.0,...,,4,10,3,,2015-11-17,489998,,,
2,357483135489998,264159792489998,2015,1,6,79,21,4,1,1.0,...,8.0,169,10,3,,2015-11-17,489998,,,
3,357483136489998,264159792489998,2015,1,6,79,21,4,2,1.0,...,6.0,290,10,3,,2015-11-17,489998,,,
4,357483487489998,264159804489998,2015,1,6,79,48,1,1,1.0,...,,156,10,3,,2015-11-17,489998,,,


In [None]:
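# to_parquet needs a destination; this file name is a placeholder, not the original author's choice.
df.to_parquet(workdir / 'ENTIRE' / 'BOUNDARY.parquet')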