# Welcome to the flow quickstarter!

## Introduction

flow is an open-source Python package for collecting, calculating, organizing, and visualizing cross-sectoral resource interdependencies and flows.

This Jupyter Notebook serves as a ready-to-go introduction to the package using sample data for 3,145 United States counties for the year 2015. The sample data is used to analyze the interdependencies between water and energy across various sectors.

## Importing the package

Click on the cell below and press Ctrl+Enter to import the flow package and its modules.

In [1]:
import flow

## Get sample data
The flow package comes with sample data for all counties in the United States for the year 2015. To load the input data, run the cell below.

In [2]:
data_input = flow.read_sample_data()

## Run the Model
Now that our input data is prepared, we can run some or all of it through the model to start collecting, computing, and organizing our water and energy flows.

#### Selecting a Region
The US sample data covers over 3,000 different regions (US counties). The flow package can run a single county at a time or the entire dataset of counties. Counties are identified by their Federal Information Processing Standards (FIPS) code rather than by name. The cell below sets the region for analysis to the FIPS code for New York County, NY (36061), which corresponds to Manhattan.

To select other counties from this dataset, any FIPS code can be chosen from the input dataset or retrieved from the list presented here: [Link to page with FIPS]
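County FIPS codes are five characters long: a two-digit state code followed by a three-digit county code. Codes that have passed through a spreadsheet or been read as integers often lose their leading zero. The sketch below shows one way to restore it before passing a code to the model. `normalize_fips` is a hypothetical helper written for this example, not part of the flow package, and it assumes the regions are keyed by string codes, as in the cell that sets `region = '36061'`.

```python
def normalize_fips(code):
    """Return a county FIPS code as a zero-padded, five-character string.

    Hypothetical helper for illustration; not part of the flow package.
    """
    text = str(code).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {code!r}")
    return text.zfill(5)


region = normalize_fips(36061)   # New York County, NY
autauga = normalize_fips(1001)   # Autauga County, AL (leading zero restored)
```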

#### Run the model for a single region
Run the cells below to set the region and run the model for the selected region.

In [3]:
# set the region equal to the FIPS code for New York County, NY
region = '36061'

In [4]:
# run the model for the selected region
output = flow.calculate(data=data_input, region_name=region)

## Observe the output dataset
The output dataset is a pandas DataFrame of flow values from a source node (described by columns S1 through S5) to a target node (described by columns T1 through T5), in the units given in the `units` column. The cell below shows the first five rows of the output.

In [5]:
output.head()

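|   | region | S1 | S2 | S3 | S4 | S5 | T1 | T2 | T3 | T4 | T5 | units | value |
|---|--------|----|----|----|----|----|----|----|----|----|----|-------|-------|
| 0 | 36061 | EPD | biomass | total | total | total | COM | biomass | demand | total | total | bbtu | 2.31993 |
| 1 | 36061 | EGD | total | total | total | total | COM | electricity | demand | total | total | bbtu | 59.80044 |
| 2 | 36061 | EPD | geothermal | total | total | total | COM | geothermal | demand | total | total | bbtu | 0.170928 |
| 3 | 36061 | EPD | natgas | total | total | total | COM | natgas | demand | total | total | bbtu | 73.153253 |
| 4 | 36061 | EPD | petroleum | total | total | total | COM | petroleum | demand | total | total | bbtu | 18.340491 |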
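Because the output is an ordinary pandas DataFrame, standard pandas operations can be used to explore it. As a minimal sketch, the example below rebuilds the five rows shown above by hand (so it runs without the flow package) and totals the energy flows by their top-level source, `S1`. With the real `output` DataFrame the same filter and `groupby` should apply, assuming the column names match those shown.

```python
import pandas as pd

# The five sample rows shown above, re-entered by hand for illustration.
columns = ["region", "S1", "S2", "S3", "S4", "S5",
           "T1", "T2", "T3", "T4", "T5", "units", "value"]
rows = [
    ["36061", "EPD", "biomass", "total", "total", "total",
     "COM", "biomass", "demand", "total", "total", "bbtu", 2.31993],
    ["36061", "EGD", "total", "total", "total", "total",
     "COM", "electricity", "demand", "total", "total", "bbtu", 59.80044],
    ["36061", "EPD", "geothermal", "total", "total", "total",
     "COM", "geothermal", "demand", "total", "total", "bbtu", 0.170928],
    ["36061", "EPD", "natgas", "total", "total", "total",
     "COM", "natgas", "demand", "total", "total", "bbtu", 73.153253],
    ["36061", "EPD", "petroleum", "total", "total", "total",
     "COM", "petroleum", "demand", "total", "total", "bbtu", 18.340491],
]
sample = pd.DataFrame(rows, columns=columns)

# Keep only energy flows (bbtu), then total them by top-level source.
energy = sample[sample["units"] == "bbtu"]
by_source = energy.groupby("S1")["value"].sum()
print(by_source)
```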


## Visualize the flows between sectors
The output dataset provides the flow values between nodes for both water and energy. On its own, however, it does not make it easy to see how nodes relate to one another or how resources pass from one node to the next. The visualization tools integrated into the package help with this.

### Sankey Diagrams
Sankey diagrams show flows between nodes and can represent how resources are passed along in a network. The cell below produces two Sankey diagrams from the sample run output: one for water flows (given in million gallons per day) and one for energy flows (given in billion British thermal units per day).

In [7]:
viz = flow.plot_sankey(data=output, region_name=region,
                       unit_type1='mgd', unit_type2='bbtu', output_level=2, strip='total')
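A Sankey diagram is drawn from a list of links, each with a source label, a target label, and a value. The sketch below illustrates how such links could be derived from rows like those in the output. It is not the package's implementation: it assumes, purely for illustration, that a level setting such as `output_level=2` means "use the first two S and T columns" and that `strip='total'` means "drop placeholder levels named `total`". The helpers `node_label` and `build_links` are hypothetical names.

```python
from collections import defaultdict

# Two rows copied from the sample output: (S1..S5), (T1..T5), units, value.
rows = [
    (("EPD", "biomass", "total", "total", "total"),
     ("COM", "biomass", "demand", "total", "total"), "bbtu", 2.31993),
    (("EGD", "total", "total", "total", "total"),
     ("COM", "electricity", "demand", "total", "total"), "bbtu", 59.80044),
]


def node_label(levels, depth, strip="total"):
    """Join the first `depth` levels, skipping any level equal to `strip`."""
    kept = [name for name in levels[:depth] if name != strip]
    return "-".join(kept)


def build_links(rows, units, depth):
    """Sum values per (source label, target label) pair for one unit type."""
    links = defaultdict(float)
    for source, target, row_units, value in rows:
        if row_units == units:
            key = (node_label(source, depth), node_label(target, depth))
            links[key] += value
    return dict(links)


links = build_links(rows, units="bbtu", depth=2)
for (source, target), value in links.items():
    print(f"{source} -> {target}: {value} bbtu")
```

Under these assumptions, a larger depth yields more detailed node labels, while a depth of 1 collapses every flow to its top-level sectors.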

### Stacked Sector Bar Charts

### Regional Shaded Maps