# Set up and load packages

To run the NEON data stacker from Python, use the rpy2 package, which creates an R environment you can interact with from Python. If you are only interested in remote sensing data files, complete the setup in this section and then skip ahead to the remote sensing section.

These instructions cover running the functions in the NEON utilities package via Python. For information about running the package in R directly, see the readme for the package on the [NEON-utilities GitHub repo](https://github.com/NEONScience/NEON-utilities/tree/master/neonDataStackR), and a short tutorial for the `stackByTable()` function on the [NEON Data Skills page](http://www.neonscience.org/neonDataStackR).

If rpy2 is already installed, skip this step. Otherwise, install it from the command line, not from within IPython. The error message below shows what happens if you run the command inside an IPython session.

In [1]:
pip install rpy2


The following command must be run outside of the IPython shell:

    $ pip install rpy2

The Python package manager (pip) can only be used from outside of IPython.
Please reissue the `pip` command in a separate terminal or command prompt.

See the Python documentation for more informations on how to install packages:

    https://docs.python.org/3/installing/
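
A quick way to tell whether the install step is needed is to ask Python whether it can locate the package. The sketch below uses only the standard library; the helper name `is_installed` is made up for this example and is not part of rpy2 or the NEON package.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("rpy2"):
    print("rpy2 is already installed; skip the pip step.")
else:
    print("rpy2 not found; run 'pip install rpy2' in a terminal.")
```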



Now import rpy2 into your session.

In [2]:
import rpy2
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr

Import the core R packages, `base` and `utils`.

In [3]:
base = importr('base')
utils = importr('utils')

Import the NEON data stacker package from GitHub. Note the `importr` function in rpy2 has its own method for this, rather than using the devtools package from R.

In [4]:
stackr = importr(name='neonDataStackR', lib_loc='NEONScience/NEON-utilities/neonDataStackR')

Suppress R warnings. This step is optional, but if you skip it, messages passed through from R will be shown as Python warnings.

In [5]:
import warnings
from rpy2.rinterface import RRuntimeWarning
warnings.filterwarnings("ignore", category=RRuntimeWarning)
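
The filter above applies for the rest of the session. If you would rather silence warnings only around particular calls, the standard library's `warnings.catch_warnings()` context manager restores the previous filters on exit. The sketch below shows the pattern with a stand-in warning class so that it runs without R; `ExampleRWarning`, `noisy_call`, and `call_quietly` are hypothetical names, and in a real session you would pass `RRuntimeWarning` and an actual function call instead.

```python
import warnings


class ExampleRWarning(RuntimeWarning):
    """Hypothetical stand-in for the warning category imported from rpy2."""


def noisy_call() -> str:
    """Pretend to call an R function whose message surfaces as a warning."""
    warnings.warn("message passed through from R", ExampleRWarning)
    return "done"


def call_quietly(func, category):
    """Run func() while ignoring one warning category, then restore filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=category)
        return func()


result = call_quietly(noisy_call, ExampleRWarning)
print(result)
```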

# Stack data files

The function `stackByTable()` in neonDataStackR merges the monthly files the [NEON Data Portal](http://data.neonscience.org/home) provides. Start by downloading the dataset you're interested in from the Portal. It will download as a single zip file. Note the file path it's saved to and proceed.

The data stacker package includes a data file, `table_types`, that the package needs in order to work. Because rpy2 doesn't load package data by default, we need to load it into the Python session and then pass it back to the R environment.

First, load the data file:

In [6]:
stackrdata = stackr.__rdata__
table_types = stackrdata.fetch('table_types')['table_types']

Now, pass it back to the R environment:

In [7]:
robjects.globalenv['table_types'] = table_types

Now run the `stackByTable()` function to stack the data. The only input needed here is the path to the zip file you downloaded from the NEON Data Portal.

The semicolon at the end of the line, here and in the function calls below, is optional. It suppresses a note indicating that the function's output is null. The note appears because these functions download or modify files on your local drive but do not read any data into the Python or R environments.

In [8]:
stackr.stackByTable('/Users/clunch/Downloads/NEON_isotope-soil-distrib-periodic.zip');

Unpacked  NEON.D02.SCBI.DP1.10100.001.2014-08.expanded.20180308T180515Z.zip

Unpacked  NEON.D07.ORNL.DP1.10100.001.2014-07.expanded.20180308T183441Z.zip

Unpacked  NEON.D10.CPER.DP1.10100.001.2014-07.expanded.20180308T181823Z.zip

Unpacked  NEON.D09.WOOD.DP1.10100.001.2014-08.expanded.20180308T182715Z.zip

Unpacked  NEON.D10.STER.DP1.10100.001.2014-07.expanded.20180308T180231Z.zip

Unpacked  NEON.D15.ONAQ.DP1.10100.001.2014-08.expanded.20180308T182837Z.zip

Unpacked  NEON.D08.TALL.DP1.10100.001.2014-07.expanded.20180308T182745Z.zip

Unpacked  NEON.D01.HARV.DP1.10100.001.2014-07.expanded.20180308T182420Z.zip

Unpacked  NEON.D05.UNDE.DP1.10100.001.2014-07.expanded.20180308T180317Z.zip

Unpacked  NEON.D01.BART.DP1.10100.001.2014-08.expanded.20180308T175912Z.zip

Finished: All of the data are stacked into  1  tables!

Copied the first available NEON.University_of_Wyoming_Stable_Isotope_Facility.bgc_CNiso_externalSummary.csv to /stackedFiles
Copied the first available variable definition fi

Check the folder containing the original zip file from the Data Portal; you should now have a subfolder containing the unzipped and stacked files.
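
You can also do this check from Python. The sketch below assumes that the zip file is unpacked into a folder named after the zip file (without the `.zip` extension) and that the stacked tables are written to a `stackedFiles` subfolder, as the messages above suggest; verify both against your own output. The helper name `list_stacked_files` is made up for this example.

```python
from pathlib import Path


def list_stacked_files(zip_path: str, subfolder: str = "stackedFiles") -> list:
    """List file names in the stacked-output folder next to a Portal zip file.

    Assumes the unpacked folder shares the zip file's name (minus .zip)
    and that stacked results are stored in `subfolder` inside it.
    """
    stacked_dir = Path(zip_path).with_suffix("") / subfolder
    if not stacked_dir.is_dir():
        return []  # nothing stacked yet, or the layout differs from the assumption
    return sorted(p.name for p in stacked_dir.iterdir() if p.is_file())


# Path from the example above; returns an empty list if the folder is absent.
print(list_stacked_files("/Users/clunch/Downloads/NEON_isotope-soil-distrib-periodic.zip"))
```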

# Download files to be stacked

The function `zipsByProduct()` uses the [NEON API](http://data.neonscience.org/data-api) to programmatically download data files for a given product. The files downloaded by `zipsByProduct()` can then be fed into `stackByTable()`.

The function creates a new folder in the R working directory and writes the files there. If the working directory isn't where you want the files to go, set it first.

In [9]:
base.setwd('/Users/clunch');

Run the downloader with these inputs: a data product ID, a site (or "all" for all sites), a package (either basic or expanded), and a flag indicating whether to check the size of the download before proceeding.

There are two differences relative to running this function in R directly: 

1. `check.size` becomes `check_size`, because dots have programmatic meaning in Python.
2. `TRUE` (or `T`) becomes `"TRUE"`, because TRUE and FALSE have no special meaning in Python the way they do in R; unquoted, Python interprets them as variable names.

When running code in a programmatic workflow, rather than line by line, set `check_size="FALSE"` so that the download does not wait for a y/n confirmation.
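
If you are translating calls from existing R code, the two renaming rules can be applied mechanically. The helper below is a minimal sketch, not part of rpy2 or the NEON package; the name `to_rpy2_kwargs` is made up for this example. It takes arguments written with R-style names and Python booleans and returns them in the form used in this tutorial.

```python
def to_rpy2_kwargs(r_args: dict) -> dict:
    """Convert R-style argument names and logical values to the form used here.

    Dots in names become underscores; Python booleans become "TRUE"/"FALSE".
    Other values are passed through unchanged.
    """
    converted = {}
    for name, value in r_args.items():
        py_name = name.replace(".", "_")
        if isinstance(value, bool):
            value = "TRUE" if value else "FALSE"
        converted[py_name] = value
    return converted


kwargs = to_rpy2_kwargs(
    {"dpID": "DP1.10023.001", "site": "HARV", "package": "basic", "check.size": False}
)
print(kwargs)
```

With `stackr` imported as shown earlier, the resulting dictionary could then be unpacked into the call, for example `stackr.zipsByProduct(**kwargs)`.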

In [10]:
stackr.zipsByProduct(dpID='DP1.10023.001', site='HARV', package='basic', check_size='TRUE');

Continuing will download files totaling approximately 0.165245 MB. Do you want to proceed y/n: y
6 zip files downloaded to /Users/clunch/filesToStack10023



The message output by `zipsByProduct()` indicates the file path where the files have been downloaded.
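
In a scripted workflow you may want to build that path rather than copy it from the message. The sketch below rests on an assumption drawn from the single example above: that the folder is named `filesToStack` followed by the middle, five-digit part of the product ID. Check it against the message from your own download. The helper name `expected_download_dir` is made up for this example.

```python
from pathlib import PurePosixPath


def expected_download_dir(working_dir: str, dp_id: str) -> str:
    """Build the folder path suggested by the example message.

    Assumption: the folder is "filesToStack" plus the middle part of the
    product ID (DP1.10023.001 -> 10023), created in the working directory.
    """
    parts = dp_id.split(".")
    if len(parts) != 3:
        raise ValueError(f"unexpected product ID format: {dp_id!r}")
    return str(PurePosixPath(working_dir) / f"filesToStack{parts[1]}")


print(expected_download_dir("/Users/clunch", "DP1.10023.001"))
```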

Now take that file path and pass it to `stackByTable()`. The file structure is slightly different from the zip file returned by the Portal, so we need an additional input, folder="TRUE".

In [11]:
stackr.stackByTable("/Users/clunch/filesToStack10023", folder="TRUE");

Unpacked  NEON.D01.HARV.DP1.10023.001.2013-07.basic.20180226T180545Z.zip

Unpacked  NEON.D01.HARV.DP1.10023.001.2014-07.basic.20180226T174946Z.zip

Unpacked  NEON.D01.HARV.DP1.10023.001.2015-06.basic.20180226T174941Z.zip

Unpacked  NEON.D01.HARV.DP1.10023.001.2015-07.basic.20180226T175005Z.zip

Unpacked  NEON.D01.HARV.DP1.10023.001.2016-07.basic.20180226T174902Z.zip

Unpacked  NEON.D01.HARV.DP1.10023.001.2017-07.basic.20180226T174924Z.zip

Finished: All of the data are stacked into  2  tables!

Copied the first available variable definition file to /stackedFiles and renamed as variables.csv
Copied the first available validation file to /stackedFiles and renamed as validation.csv
Stacked  hbp_massdata
Stacked  hbp_perbout



# Download remote sensing files

The function `byFileAOP()` uses the [NEON API](http://data.neonscience.org/data-api) to programmatically download data files for remote sensing (AOP) data products. These files cannot be stacked by `stackByTable()` because they are not tabular data. The function simply creates a folder in your working directory and writes the files there. It preserves the folder structure for the subproducts.
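
Because the files end up in nested subfolders, a recursive search is a convenient way to see what a finished download contains. The sketch below uses only the standard library. The folder name `DP3.30015.001` matches the example download in this section; the helper name `find_downloaded_files` and the `*.tif` pattern are choices made for this example.

```python
from pathlib import Path


def find_downloaded_files(download_root: str, pattern: str = "*.tif") -> list:
    """Return paths (relative to download_root) of matching files at any depth."""
    root = Path(download_root)
    if not root.is_dir():
        return []  # nothing downloaded here yet
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob(pattern) if p.is_file()
    )


# Prints nothing if the folder does not exist in the current directory.
for rel_path in find_downloaded_files("DP3.30015.001"):
    print(rel_path)
```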

The inputs to `byFileAOP()` are a data product ID, a site, a year, and a flag indicating whether to check the size of the download before proceeding.

In [12]:
stackr.byFileAOP(dpID='DP3.30015.001', site='HOPB', year='2017', check_size='TRUE');

Continuing will download 36 files totaling approximately 140.3 MB . Do you want to proceed y/n: y
Successfully downloaded  36  files.

NEON_D01_HOPB_DP3_718000_4709000_CHM.tif downloaded to /Users/clunch/DP3.30015.001/2017/FullSite/D01/2017_HOPB_2/L3/DiscreteLidar/CanopyHeightModelGtif
NEON_D01_HOPB_DP3_717000_4707000_CHM.tif downloaded to /Users/clunch/DP3.30015.001/2017/FullSite/D01/2017_HOPB_2/L3/DiscreteLidar/CanopyHeightModelGtif
NEON_D01_HOPB_DP3_720000_4708000_CHM.tif downloaded to /Users/clunch/DP3.30015.001/2017/FullSite/D01/2017_HOPB_2/L3/DiscreteLidar/CanopyHeightModelGtif
NEON_D01_HOPB_DP3_716000_4706000_CHM.tif downloaded to /Users/clunch/DP3.30015.001/2017/FullSite/D01/2017_HOPB_2/L3/DiscreteLidar/CanopyHeightModelGtif
NEON_D01_HOPB_DP3_716000_4708000_CHM.tif downloaded to /Users/clunch/DP3.30015.001/2017/FullSite/D01/2017_HOPB_2/L3/DiscreteLidar/CanopyHeightModelGtif
NEON_D01_HOPB_DP3_716000_4704000_CHM.tif downloaded to /Users/clunch/DP3.30015.001/2017/FullSite/D01/2017