# [CE-110] Lab 2: Analyzing Water Networks and Energy Use in CA

*Estimated Time: 50 minutes*

### Table of Contents:

1. [Visualizing the Network](#section_network)
2. [Sankey Diagrams: Visualizing Water Flow](#section_sankey)

In this lab, we will visualize and analyze water networks for specific utilities in CA. First, we will visualize the network associated with your utility (i.e., the sources that lead into it). Then, we will analyze the flow of water in these networks and how it has changed over time. Finally, we will perform energy analysis calculations to see in more depth how the flow of water has changed and what effects this has had. By now you should have been assigned a utility to visualize and analyze. Here is a mapping of utility codes to their respective names in the graphs and data.

Utility Code|Utility Name
-|-
1805003E|Alameda County Water District
1803033E|Fresno City of
1810019E|Hi-Desert Water District
1809027E|Palmdale Water District
1805085E|San Jose Water Company
1807341E|Santa Monica City of
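
If you want to switch between a utility's name (used in the network section) and its code (used in the Sankey section) without scrolling back up, the table can be kept as a dictionary. This is only a convenience sketch: the names `UTILITY_NAMES` and `code_for` are made up for this illustration and are not part of the lab's tools.

```python
# Utility codes and names copied from the table above.
UTILITY_NAMES = {
    "1805003E": "Alameda County Water District",
    "1803033E": "Fresno City of",
    "1810019E": "Hi-Desert Water District",
    "1809027E": "Palmdale Water District",
    "1805085E": "San Jose Water Company",
    "1807341E": "Santa Monica City of",
}


def code_for(name):
    """Return the utility code for an exact utility name (hypothetical helper)."""
    for code, utility_name in UTILITY_NAMES.items():
        if utility_name == name:
            return code
    raise KeyError(f"No utility named {name!r}; check the spelling against the table.")


print(code_for("Fresno City of"))  # prints 1803033E
```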

<div class="alert alert-info"> 

**QUESTION** cells are in blue and ask you to make graphs, answer conceptual questions, or do other lab tasks. To receive full credit for your lab, you must complete all **QUESTION** cells.

</div>

Note: There is a folder named "output" located in the same directory where you opened this lab. It is currently empty, but later on you will call functions that write to this folder, and you will need to copy and paste that output into a web page. Try to quickly locate that folder in the tab where you opened this lab.

To start off the lab, we have to import some tools that let us select and visualize certain utilities. These tools will build graphs showing which water sources and which types of water are upstream from a certain utility. The tools will also enable us to perform energy analysis on the utilities. Just run the cell below to set up our environment (this will take a minute to run).

In [2]:
!pip install pydot
!pip install -q ipywidgets
!pip install -q ipysankeywidget
!pip install -q widgetsnbextension
!jupyter nbextension enable --py --sys-prefix ipysankeywidget

from sub_network import subWESTnet
import pandas as pd
import numpy as np
import json, urllib, ast
import pydot
from ipysankeywidget import SankeyWidget
from ipywidgets import Layout



You are using pip version 19.0.2, however version 19.0.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
Enabling notebook extension jupyter-sankey-widget/extension...
      - Validating: ok


## Visualizing the Network<a id='section_network'></a>

In this section, we will complete our first task: making a bare-bones visualization of our network.
1. First, run the code cells, replacing anything they ask you to replace (don't worry about understanding what the code does).
2. Then locate the output in the 'output' folder.
3. Copy and paste it into the Graphviz Online page linked later in this section.

Here, we create what is called an instance of a graph object. This is what lets us visualize the water network associated with a certain utility: all the sources and the types of water that lead into it. We enter the number 2010 because we will be looking at your utility's network for that year.

In [3]:
new_upstream_example = subWESTnet('data/fixed_times_erl.csv', 2010)

This line of code specifies that we want to look at all the sources upstream from the utility. We want to analyze what types of water are coming into these utilities, how much is coming in, and where it is coming from. In the spot that says "YOUR UTILITY NAME HERE", put your **utility name** in the quotes. To find your utility's name, look at the table at the top of this page and copy the name exactly. **Note: do not put your utility code in the text; put its actual name there. If you make a spelling mistake, rerun the cell above first, because that refreshes the object we are using and lets you run the cell below again.**

In [4]:
new_upstream_example.upstream("YOUR UTILITY NAME HERE")

Unweighted upstream graph built.
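
To make "upstream" concrete, here is a minimal sketch of the idea: start at the utility and keep following links backwards until no new sources turn up. This is not the `subWESTnet` implementation (which this lab does not show); the function name `upstream_nodes` is hypothetical, and the links are a handful of source/target pairs from the Fresno example table shown later in this section.

```python
from collections import deque


def upstream_nodes(edges, utility):
    """Collect every node that can reach `utility` by following links backwards."""
    # Map each target to the list of sources that feed it.
    feeders = {}
    for source, target in edges:
        feeders.setdefault(target, []).append(source)

    seen = set()
    queue = deque([utility])
    while queue:
        node = queue.popleft()
        for source in feeders.get(node, []):
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


# (source, target) pairs: water flows from the first node to the second.
edges = [
    ("RES_MLRTN", "SW_CVPFKC"),
    ("SW_CVPFKC", "1803033PD"),
    ("1803033PD", "Fresno City of"),
    ("KINGS", "1803033GW"),
    ("1803033GW", "1803033PD"),
]
print(sorted(upstream_nodes(edges, "Fresno City of")))
```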


This line of code computes the amount of water going into each node in the network. All you have to do is run the following cell.

In [5]:
new_upstream_example.balance_graph()

Graph is weighted.


The next two cells show the network as a table of links (using source and target labels) and the energy intensity associated with each node in the graph. Volumes are in acre-feet (af) and energy intensities are in kWh per acre-foot (kWh/af). Browse the tables and try to work out what each column refers to and how the rows relate utilities and sources to energy levels and to the amount of water moving through each point in the network.

In [6]:
new_upstream_example.table()

Unnamed: 0|source|target|cumulative_volume_af|transmission_kwh/af|treatment_kwh/af|used_vol_af
-|-|-|-|-|-|-
0|SW_CVPFKC|Fresno City of|53121.0|23.22|0.0|53121.0
1|1803033GW|1803033PD|30799.03923|399.0|26.0|29572.94
2|FRES_ID|1803033PD|19023.93641|87.0|224.0|18266.6
3|SW_CVPFKC|1803033PD|14701.51218|23.22|224.0|14116.25
4|RES_MLRTN|SW_CVPFKC|67956.99412|0.0|0.0|67371.23
5|1803033PD|Fresno City of|61955.79|17.6|0.0|61955.79
6|1803033REC|1803033NPD|88.0|0.0|236.0|88.0
7|1803033NPD|Fresno City of|88.0|21.12|0.0|88.0
8|KINGS|1803033GW|30799.03923|0.0|0.0|29572.94
9|SAN JOAQUIN VALLEY|KINGS|51162.00619|0.0|0.0|49779.01
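
One way to read this table is as a water balance: each row is a link, and `used_vol_af` is the volume moving along it. The sketch below copies a few rows from the Fresno example and adds up what flows into and out of the `1803033PD` node. It only illustrates how to read the columns; the variable names are our own.

```python
import pandas as pd

# A few rows copied from the example table above (Fresno, 2010).
links = pd.DataFrame({
    "source": ["SW_CVPFKC", "1803033GW", "FRES_ID", "SW_CVPFKC", "1803033PD", "1803033NPD"],
    "target": ["Fresno City of", "1803033PD", "1803033PD", "1803033PD",
               "Fresno City of", "Fresno City of"],
    "used_vol_af": [53121.0, 29572.94, 18266.6, 14116.25, 61955.79, 88.0],
})

# Total volume arriving at each node, and total volume leaving each node.
inflow_by_target = links.groupby("target")["used_vol_af"].sum()
outflow_by_source = links.groupby("source")["used_vol_af"].sum()

pd_in = inflow_by_target["1803033PD"]
pd_out = outflow_by_source["1803033PD"]
print(round(pd_in, 2), round(pd_out, 2))
```

In these rows, the three links entering `1803033PD` add up to the single link leaving it, so the water entering that node is accounted for on the way out.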


In [7]:
new_upstream_example.energy

Unnamed: 0|node|kwh/af
-|-|-
0|WWT1803033|0.0
1|FRES_ID|0.0
2|RES_MLRTN|0.0
3|1803033REC|0.642618
4|1803033NPD|236.642618
5|SAN JOAQUIN VALLEY|0.0
6|KINGS|0.0
7|GWR1803-5-22.08|1844.642618
8|SW_CVPFKC|0.0
9|1803033GW|604.416306
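
Because intensities are given per acre-foot, multiplying an intensity by a volume gives an energy in kWh. The sketch below does this arithmetic for two links from the `table()` output, under the assumption that the transmission and treatment intensities on a link simply add. It is not a reconstruction of how `subWESTnet` arrives at the node-level `kwh/af` values above, which this lab does not show.

```python
import pandas as pd

# Two rows copied from the example output of table() (Fresno, 2010).
links = pd.DataFrame({
    "source": ["1803033GW", "FRES_ID"],
    "target": ["1803033PD", "1803033PD"],
    "used_vol_af": [29572.94, 18266.6],
    "transmission_kwh/af": [399.0, 87.0],
    "treatment_kwh/af": [26.0, 224.0],
})

# Assumption: energy on a link = volume (af) x total intensity (kWh/af).
links["intensity_kwh/af"] = links["transmission_kwh/af"] + links["treatment_kwh/af"]
links["energy_kwh"] = links["used_vol_af"] * links["intensity_kwh/af"]

# Volume-weighted average intensity across these two links.
weighted_avg = links["energy_kwh"].sum() / links["used_vol_af"].sum()

print(links[["source", "intensity_kwh/af", "energy_kwh"]])
print(round(weighted_avg, 1))
```

The weighted average falls between the two link intensities and sits closer to the groundwater link, because that link carries more water.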


The following cell takes the upstream table and turns it into a dot file. The dot file is placed in a folder called 'output'.

There are some things to note about the nodes in the graph. Nodes that start with "R_" are rivers; nodes that contain "SWP" or "CVP" are parts of the state and federal government water management projects in CA, respectively; nodes with "LK" are lakes; nodes with "RES" are reservoirs; "SW" nodes are unspecified surface water sources; and "GW" nodes are groundwater aquifers. Nodes that contain your utility's code and end with "PD" combine all treatable water, while nodes that end with "NPD" contain all non-potable sources flowing into your utility's distribution system.

In [8]:
new_upstream_example.to_dot()

dot saved
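
For orientation, a dot file is plain text in the Graphviz DOT language: a list of nodes and the arrows between them. The sketch below builds such text from a couple of source/target pairs. It is a simplified illustration, not what `to_dot()` actually writes (the real file may carry extra attributes), and `edges_to_dot` is a hypothetical name.

```python
def edges_to_dot(edges, graph_name="network"):
    """Build DOT text for a directed graph from (source, target) pairs."""
    lines = [f"digraph {graph_name} {{"]
    for source, target in edges:
        # Quote node names so spaces and punctuation are handled safely.
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


edges = [
    ("RES_MLRTN", "SW_CVPFKC"),
    ("SW_CVPFKC", "Fresno City of"),
]
print(edges_to_dot(edges))
```

Each `->` line becomes one arrow in the rendered picture, which is why the utility ends up at the bottom of the graph with its sources above it.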


Go to the dot file (inside the output directory, in the same location as the lab2 notebook) and paste its contents [here](https://dreampuf.github.io/GraphvizOnline/). Make sure to replace the entire code block already on that page with the entire contents of your dot file.

Here are the steps to get to the dot file:
 - Go to the tab where you opened this lab, on datahub
 - In the same directory where this lab is located, there is a folder called 'output'
 - Click on that folder; it should contain the dot file with your utility name in the filename
 - Open that file, then copy and paste its entire contents into the left side of the page that the link brings you to
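
If you would rather copy the text from inside the notebook, a small helper like the one below can print whatever dot files are in the folder. This is an optional sketch: it assumes the files end in `.dot` and live in a folder named `output` next to the notebook, and the function name `show_dot_files` is made up.

```python
from pathlib import Path


def show_dot_files(folder="output"):
    """Print every .dot file found in `folder` and return the list of paths."""
    paths = sorted(Path(folder).glob("*.dot"))
    for path in paths:
        print(f"--- {path.name} ---")
        print(path.read_text())
    return paths


show_dot_files()
```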

<div class="alert alert-info"> 

**QUESTION** What are some of the sources connected to your utility (which would be located at the bottom of the graph)?  **QUESTION** 

</div>

Your Answer Here (Double click the cell to replace the text)

## The Sankey Diagram: Visualizing Water Flow<a id='section_sankey'></a>

In this next section, we will visualize our networks with a lot more information embedded in them. We will again look at the water network associated with your utility, but this time we will see how much water is actually flowing between the nodes in the network. You will have to run some of the following cells to set up our environment again, and input your utility code to see the Sankey diagrams that give us this visualization.

The next cell is quite complicated; don't bother trying to understand what it is doing. All it does is format our network so that we can use something called a Sankey diagram. This diagram lets us visualize the water network associated with our utility in a way that shows how much water is actually traveling between utilities, sources, and other things like reservoirs, rivers, and aqueducts.

In [9]:
def get_sankey_file(data_path, year, utility, upstream=True):
    gi = subWESTnet(data_path, year)
    gi.upstream(utility)
    
    sank_dicts = []
    for v in gi.edges.values():
        sank_dict = {}
        sank_dict['source'] = v['source']
        sank_dict['target'] = v['target']
        sank_dict['value'] = v['used_vol_af']
        sank_dict['color'] = 'steelblue'
        if v['used_vol_af'] == 0:
            sank_dict['color'] = 'goldenrod'
            sank_dict['value'] = .00001
        sank_dicts.append(sank_dict)
    
    df = pd.DataFrame(sank_dicts)
    df = df[df['color'] != 'goldenrod']
    data = pd.read_csv(data_path)
    
    nodes = []
    for i in df['source'].unique():
        sdict = {}
    
        # check if resource/end or not
        check_resource = data[data['target'] == i].shape
        if check_resource[0] == 0:
            sdict['is resource'] = True
        else:
            sdict['is resource'] = False
        if i[-1:] == 'E':
            sdict['is end'] = True
        else:
            sdict['is end'] = False
    
        # compute in/out volumes
        outv = df[df['source'] == i]['value'].sum()
        inv = df[df['target'] == i]['value'].sum()
    
        # get values to compensate for missing volumes
        extra_case = {}
        if outv < inv:
            if not sdict['is end']:
                extra_case['value'] = inv - outv
                extra_case['source'] = i
                extra_case['target'] = i + " to Other"
                nodes.append(extra_case)
        elif outv > inv:
            if not sdict['is resource']:
                extra_case['value'] = outv - inv
                extra_case['source'] = "Other to " + i
                extra_case['target'] = i
                nodes.append(extra_case)
    
    # DataFrame.append was removed in pandas 2.0, so combine the tables with pd.concat.
    extra_ = pd.concat([df.drop('color', axis=1), pd.DataFrame(nodes)]).reset_index(drop=True)

    # Prune dead-end links: repeatedly drop any link whose target never appears as a
    # source, unless that target is an "... Other" sink or the utility (code ends in 'E').
    while True:
        sources = set(extra_['source'])
        drop = {t for t in extra_['target']
                if t not in sources and t[-5:] != 'Other' and t[-1:] != 'E'}
        if not drop:
            break
        extra_ = extra_[~extra_['target'].isin(drop)]
    
    return extra_
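
The one idea in `get_sankey_file` worth a closer look is the volume balancing: when a node sends out less (or more) water than it receives, the function adds an "Other" link for the difference so that the bars in the diagram line up. Below is a stripped-down sketch of that idea on a made-up three-link table. The node names `A`, `B`, `C`, and `UtilE` and the helper name `balance_links` are hypothetical, and the resource/end checks of the real function are left out.

```python
import pandas as pd


def balance_links(df):
    """Add 'Other' links so each intermediate node's inflow equals its outflow."""
    extra = []
    sources, targets = set(df["source"]), set(df["target"])
    # Only nodes that both receive and send water are balanced in this sketch.
    for node in sorted(sources & targets):
        outv = df.loc[df["source"] == node, "value"].sum()
        inv = df.loc[df["target"] == node, "value"].sum()
        if outv < inv:
            extra.append({"source": node, "target": node + " to Other", "value": inv - outv})
        elif outv > inv:
            extra.append({"source": "Other to " + node, "target": node, "value": outv - inv})
    return pd.concat([df, pd.DataFrame(extra)], ignore_index=True)


links = pd.DataFrame({
    "source": ["A", "B", "C"],
    "target": ["C", "C", "UtilE"],
    "value": [60.0, 40.0, 70.0],
})
print(balance_links(links))
```

Here node `C` receives 100 units but passes on only 70, so a `C` to `C to Other` link of 30 is added to account for the rest.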


Run the following cell, but make sure to put **your utility's code, not its name**, in the quotes that say "YOUR UTILITY CODE HERE". Remember what all the different codes mean from the previous graph we made. These encodings still apply to the Sankey diagram we are going to make in the following cells, except this time your utility will be on the right-hand side of the diagram, labeled by its utility code rather than its name. The graph below represents the flow of water involved with this utility in the year 2010.

In [13]:
table_2010 = get_sankey_file('data/links_erl.csv', 2010, 'YOUR UTILITY CODE HERE', upstream=True)

Unweighted upstream graph built.



Next we are going to save this table as a csv in order to visualize the diagram on a website [here](https://jasonsjiang.github.io/sankey). The following line will save a file called 2010_sankey.csv to the output folder where your dot file was saved previously. This time you will follow these steps to visualize the graph:

### Sankey Instructions <a id='sankey_instructions'></a>
1. Run the cell below
2. Go to the output folder on datahub, where you were before
3. Click on the small checkbox to the left of your file
4. A few options will then appear at the top of the directory listing; click the download option
5. This will download the file to your downloads folder on your computer
6. Go to the website linked above
7. Click on the 'Choose file' button
8. Choose the csv file, 2010_sankey.csv

In [17]:
table_2010.to_csv('output/2010_sankey.csv', index=False)
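
Besides hovering over the diagram, you can read the direct inflows straight out of the table. The sketch below filters a source/target/value table down to the links that end at the utility. The helper name `direct_inflows` and the example rows are illustrative only (loosely based on the Fresno table earlier); it assumes the utility appears in the `target` column under its code, as described above.

```python
import pandas as pd


def direct_inflows(links, utility_code):
    """Return the links that flow straight into the utility, largest first."""
    direct = links[links["target"] == utility_code]
    return direct.sort_values("value", ascending=False).reset_index(drop=True)


# Illustrative rows in the same source/target/value layout as the saved CSV.
example = pd.DataFrame({
    "source": ["1803033PD", "1803033NPD", "SW_CVPFKC"],
    "target": ["1803033E", "1803033E", "1803033PD"],
    "value": [61955.79, 88.0, 14116.25],
})
print(direct_inflows(example, "1803033E"))
```

With your own data, the call would look like `direct_inflows(table_2010, 'YOUR UTILITY CODE HERE')`.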

<div class="alert alert-info"> 

**QUESTION** Examine the graph. Your utility should be on the right hand side and the sources that have water flowing into it to the left of it. You can run your mouse over the graph and see how much water is flowing in each connection. How much water of each type is flowing directly into your utility? **QUESTION**

</div>

Your answer here.

The next graph shows the predicted flow of water associated with your utility in 2015. These predictions were made in 2010. Make sure to fill in the "YOUR UTILITY CODE HERE" portion of the cell so that it can be run, using your utility's code just as in the last Sankey diagram you made. Follow the same steps you used for the last Sankey diagram to visualize it. You will want to have multiple tabs open in order to compare the different Sankey diagrams. [Here](#sankey_instructions) are the directions again.

In [18]:
table_2015_predicted = get_sankey_file('data/links_erl.csv', 2015, 'YOUR UTILITY CODE HERE', upstream=True)

Unweighted upstream graph built.


In [19]:
table_2015_predicted.to_csv('output/2015_predicted.csv', index = False)

Finally, the last graph shows the actual 2015 flow of water associated with your utility. We are going to zoom into just the portion that contains your utility and the types of water going into it. Make sure to fill in the "YOUR UTILITY CODE HERE" part of the cell. Again, follow the same steps as before; you will want to have another tab open for the actual 2015 Sankey diagram.

In [21]:
table_2015_actual = get_sankey_file('data/links_2015module.csv', 2015, 'YOUR UTILITY CODE HERE', upstream=True)

Unweighted upstream graph built.


In [22]:
table_2015_actual.to_csv('output/2015_actual.csv', index = False)
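
To compare the predicted and actual tables side by side rather than by eye, the two can be lined up link by link. The sketch below is one possible approach, not part of the lab's tools: `compare_flows` is a hypothetical helper, the numbers are invented, and it assumes both tables use the `source`, `target`, and `value` columns produced by `get_sankey_file`.

```python
import pandas as pd


def compare_flows(predicted, actual):
    """Line up two source/target/value tables and compute actual minus predicted."""
    merged = predicted.merge(
        actual, on=["source", "target"], how="outer",
        suffixes=("_predicted", "_actual"),
    ).fillna(0.0)  # a link missing from one table counts as zero flow
    merged["difference"] = merged["value_actual"] - merged["value_predicted"]
    return merged


# Hypothetical numbers, only to show the shape of the result.
predicted = pd.DataFrame({
    "source": ["GW", "SW"], "target": ["UtilE", "UtilE"], "value": [100.0, 50.0],
})
actual = pd.DataFrame({
    "source": ["GW", "REC"], "target": ["UtilE", "UtilE"], "value": [120.0, 10.0],
})
print(compare_flows(predicted, actual))
```

With the lab's tables, the call would be `compare_flows(table_2015_predicted, table_2015_actual)`. A positive difference means more water flowed along that link than was predicted.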

<div class="alert alert-info"> 

**QUESTION** Examine the two graphs above and look at the direct connections to your utility. How much water of each type seems to be flowing into your utility in both the predicted 2015 data and the actual 2015 data? How do these levels of water flow compare to 2010, and how do they compare to each other? Do the types of water flowing into your utility change at all? Given the change in the flow of water between 2010 and 2015, and how 2015 compares with what was predicted, are there any recent events in California which might explain these differences?  **QUESTION**

</div>

Your answers here.