# Postprocessing Directory Setup 

This notebook populates the working directory with the correct subdirectories and **should be run once at the beginning of the project**. Doing so speeds up the creation of directories with similar structure, organizes the notebooks and data in a sensible way, and adds a layer of consistency. Once the directories have been created and the input data have been standardized, the analyst can copy the template notebooks into the correct directory and modify them to proceed with the analysis while remaining organized.

In particular, this notebook creates
1. One directory for every year of unvented data, notebooks, and logs --> /unvented_YYYY
2. One directory for vented data, notebooks, and logs --> /vented
3. One directory for previously compiled or published data --> /compiled_data
4. One directory for rating curve notebooks --> /ratingcurve
5. One directory for notebooks and data used to create the stitched discharge time series --> /output

Author of Template and Underlying Code: Joe Ammatelli | (jamma@uw.edu) | July 2022

## Import Relevant Libraries
**Analyst TODO**: Nothing

In [7]:
import numpy as np
import os
import sys

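# temporarily add the project's src directory to the module search path
sys.path.insert(0, os.path.abspath(os.path.join('..', 'src')))

import config

# restore the module search path
sys.path.remove(os.path.abspath(os.path.join('..', 'src')))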

## Specify Data Years
**Analyst TODO**: 
* assign an integer (format 'YYYY') representing the first year of data collection to `start_year`
* assign an integer (format 'YYYY') representing the last year of data collection to `end_year`

Note: if processing only one year, set `end_year` to `None` or to the same value as `start_year`.

In [None]:
# TODO: specify start and end year of postprocessing
# example
# start_year = 2019
# end_year = 2021

start_year = 1234
end_year = 1234

In [None]:
# create a list of all years of data (as strings)
if end_year is None:
    years = [str(start_year)]
else:
    years = [str(year) for year in np.arange(start_year, end_year + 1)]
years
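
The cell above assumes `start_year` does not exceed `end_year`; if it does, `years` is silently empty and no year directories are created later. A minimal sketch of a guarded version, using only the built-in `range`, might look like the following. The helper name `build_year_labels` is hypothetical and not part of the template code.

```python
def build_year_labels(start_year, end_year=None):
    """Return the years from start_year through end_year as 'YYYY' strings."""
    # Treat a missing end year as a single-year project.
    if end_year is None:
        end_year = start_year
    # Fail loudly instead of returning an empty list.
    if end_year < start_year:
        raise ValueError(
            f'end_year ({end_year}) is before start_year ({start_year})'
        )
    return [str(year) for year in range(start_year, end_year + 1)]
```

For example, `build_year_labels(2019, 2021)` gives `['2019', '2020', '2021']`, and `build_year_labels(2019)` gives `['2019']`.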

## Create directory for each year of unvented data, notebooks, and logs

**Analyst TODO**: Nothing

In [None]:
for year in years:    
    loc_year_dir_path = os.path.join('..', f'unvented_{year}')
    if not os.path.exists(loc_year_dir_path):
        os.mkdir(loc_year_dir_path)
        os.mkdir(os.path.join(loc_year_dir_path, 'data'))
        os.mkdir(os.path.join(loc_year_dir_path, 'notebooks'))
        os.mkdir(os.path.join(loc_year_dir_path, 'logs'))
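
Note that the cell above skips a year entirely when `unvented_YYYY` already exists, so a folder that was created by hand or only partially populated will not receive its missing subdirectories. If that matters for a project, one option is `os.makedirs` with `exist_ok=True`, which is safe to rerun. This is a sketch of an alternative, not the template's method; `ensure_subdirs` is a hypothetical helper name.

```python
import os

def ensure_subdirs(parent, subdirs=('data', 'notebooks', 'logs')):
    """Create parent and each subdirectory, skipping any that already exist."""
    os.makedirs(parent, exist_ok=True)
    for name in subdirs:
        os.makedirs(os.path.join(parent, name), exist_ok=True)
```

Inside the loop, the call would be `ensure_subdirs(os.path.join('..', f'unvented_{year}'))`.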

## Create directory for vented data, notebooks, and logs

**Analyst TODO**: Nothing

In [None]:
loc_vented_dir_path = os.path.join('..', 'vented')
if not os.path.exists(loc_vented_dir_path):
    os.mkdir(loc_vented_dir_path)
    os.mkdir(os.path.join(loc_vented_dir_path, 'data'))
    os.mkdir(os.path.join(loc_vented_dir_path, 'notebooks'))
    os.mkdir(os.path.join(loc_vented_dir_path, 'logs'))

## Create directory for previously compiled or published data

**Analyst TODO**: Nothing

In [None]:
data_path = os.path.join('..', 'compiled_data')
if not os.path.exists(data_path):
    os.mkdir(data_path)
    os.mkdir(os.path.join(data_path, 'published'))
    os.mkdir(os.path.join(data_path, 'stage'))
    os.mkdir(os.path.join(data_path, 'stageQ'))

## Create directory for rating curve notebooks

**Analyst TODO**: Nothing

In [None]:
ratingcurve_path = os.path.join('..', 'ratingcurve')
if not os.path.exists(ratingcurve_path):
    os.mkdir(ratingcurve_path)

## Create directory for final discharge time series data and notebooks

**Analyst TODO**: Nothing

In [None]:
discharge_path = os.path.join('..', 'output')
if not os.path.exists(discharge_path):
    os.mkdir(discharge_path)
    os.mkdir(os.path.join(discharge_path, 'data'))
    os.mkdir(os.path.join(discharge_path, 'metadata'))
    os.mkdir(os.path.join(discharge_path, 'notebooks'))
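
After running every cell, it can be useful to confirm that the full layout exists. The sketch below lists any expected directory that is missing. The directory names mirror the cells in this notebook; `EXPECTED_LAYOUT` and `find_missing_dirs` are hypothetical names introduced here for illustration.

```python
import os

# Directories created by the cells above; year folders are added per call.
EXPECTED_LAYOUT = {
    'vented': ['data', 'notebooks', 'logs'],
    'compiled_data': ['published', 'stage', 'stageQ'],
    'ratingcurve': [],
    'output': ['data', 'metadata', 'notebooks'],
}

def find_missing_dirs(root, years, layout=EXPECTED_LAYOUT):
    """Return the expected directories under root that do not exist."""
    expected = dict(layout)
    for year in years:
        expected[f'unvented_{year}'] = ['data', 'notebooks', 'logs']
    missing = []
    for parent, subdirs in expected.items():
        # Check the parent itself, then each of its subdirectories.
        for path in [parent] + [os.path.join(parent, s) for s in subdirs]:
            if not os.path.isdir(os.path.join(root, path)):
                missing.append(path)
    return missing
```

Calling `find_missing_dirs('..', years)` from this notebook should return an empty list once setup has completed.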
    