<div style="color:dark green"> **All of my edits will be in green. I've made this revision version to write on here but make your edits in your original notebook!**</div>

# [CE-110] Lab 2: Analyzing Water Networks and Energy Use in CA

*Estimated Time: 50 minutes*

### Table of Contents:

1. [Visualizing the Network](#section_network)
2. [The Sankey Diagram: Visualizing Water Flow](#section_sankey)

In this lab, we will visualize and analyze water networks for specific utilities in CA. <div style="color:dark green">**Elaborate a little more here to set up the lab: e.g. "We will first visualize networks surrounding your utility and water flow from immediate sources, then conduct energy analysis to see ..."**</div> At this point you should have been assigned a utility to visualize and analyze. Here is a mapping of utility names to their respective codes in the graphs and data.

Utility Code|Utility Name
-|-
1805003E|Alameda County Water District
1803033E|Fresno City of
1810019E|Hi-Desert Water District
1809027E|Palmdale Water District
1805085E|San Jose Water Company
1807341E|Santa Monica City of

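Later cells ask for either the utility name or the utility code, so it can help to keep the table above as a small lookup. The sketch below is only a convenience; `UTILITY_NAMES` and `code_for` are hypothetical names and are not part of the lab's tools.

```python
# Hypothetical lookup built from the table above (not part of sub_network).
UTILITY_NAMES = {
    "1805003E": "Alameda County Water District",
    "1803033E": "Fresno City of",
    "1810019E": "Hi-Desert Water District",
    "1809027E": "Palmdale Water District",
    "1805085E": "San Jose Water Company",
    "1807341E": "Santa Monica City of",
}


def code_for(name):
    """Return the utility code for an exact utility name, or raise KeyError."""
    for code, utility_name in UTILITY_NAMES.items():
        if utility_name == name:
            return code
    raise KeyError(name)


print(code_for("Palmdale Water District"))
```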
<div class="alert alert-info"> 

**QUESTION** cells are in blue and ask you to make graphs, answer conceptual questions, or do other lab tasks. To receive full credit for your lab, you must complete all **QUESTION** cells.

</div>

Note: There is a folder named "output" in the same directory where you opened this lab. It is currently empty, but later on you will call functions that write files to this folder, and you will need to copy and paste that output into a web page. Take a moment now to locate that folder in the tab where you opened this lab.

To start off the lab, we have to import some tools that let us select and visualize certain utilities. These tools will build graphs showing which water sources and which types of water are upstream from a certain utility. The tools will also let us perform energy analysis on the utilities. Just run the cell below to set up our environment (this will take a minute to run).

In [None]:
!pip install pydot
!pip install -q ipywidgets
!pip install -q ipysankeywidget
!pip install -q widgetsnbextension
!jupyter nbextension enable --py --sys-prefix ipysankeywidget

from sub_network import subWESTnet
import pandas as pd
import numpy as np
import json, urllib, ast
import pydot
from ipysankeywidget import SankeyWidget
from ipywidgets import Layout

## Visualizing the Network<a id='section_network'></a>


1.  <div style="color:dark green"> **Instead of going straight into explaining what the code is doing, briefly introduce what we are doing in this section. What are we trying to visualize? I think it would also help to provide a very simple outline of the steps students will take to get to the graphs (i.e. You will: 1. Run the following lines of code (descriptions provided but don't worry too much about what they do); 2. follow the steps to access something called a "dotfile"; 3. Use a website called Graphviz to display your network)**</div>


2. <div style="color:dark green"> **I don't think students will understand how paths work: when instructing students how to access their dotfile, I think it's worth taking a moment to put down a bullet list of steps to take in order to access their dot file (i.e. 1. Go back to Datahub where you opened this lab; 2. locate a folder titled "output"; etc.)**</div>


3. <div style="color:dark green">**Jennifer missed that you're supposed to input the utility NAME for networks, and utility CODE for Sankey and got confused. Make this a little clearer (use bold/capitalization?) if you can.**</div>

Here, we create what is called an instance of a graph object. This is what lets us visualize the water network associated with a certain utility: all the sources and the types of water leading into it. We enter the number 2010 because we will be looking at your utility's network for that year.

In [None]:
new_upstream_example = subWESTnet('data/fixed_times_erl.csv', 2010)

This line of code specifies that we want to look at all the sources upstream from the utility. We want to analyze what types of water are coming into the utility, how much is coming in, and where it is coming from. Replace "Your Utility Name Here" with your utility NAME (not its code), keeping the quotes. Use the name exactly as it appears in the table at the top of this page.

In [None]:
new_upstream_example.upstream("Your Utility Name Here")

This line of code computes the amount of water going into each node in the network. All you have to do is run the following cell.

In [None]:
new_upstream_example.balance_graph()

The following cells show the tables of the actual network (using source and target labels) and the energies associated with each node in the graph. Browse the tables and try to work out what each column refers to and how the rows relate utilities and sources to energy levels and to the amount of water moving through each point in the network.

In [None]:
new_upstream_example.table()

In [None]:
new_upstream_example.energy

The following cell takes the upstream table and turns it into a dot file. The dot file is placed in

`output/sub_(your_utility_name)_2010__upstream/downstream.dot`. The "output" folder is on the same level as this lab.

<div style="color:dark green">**It might be easier for students to understand this if this info is organized in a table?**</div>

There are some things to note about the nodes in the graph. Names that start with "R_" are rivers; names that contain "SWP" or "CVP" are parts of the State Water Project or the federal Central Valley Project, two large water management projects in CA; names with "LK" are lakes; names with "RES" are reservoirs; and nodes that contain your utility's code and end with PD, NPD, or GW are the potable, non-potable, and groundwater flows into your utility, respectively.
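As a rough illustration of these naming conventions, the sketch below sorts a node name into one of the categories just described. The function `classify_node` and the sample node names are hypothetical; real node names in your graph may be formatted differently, so treat this as a reading aid rather than part of the lab's tools.

```python
def classify_node(node, utility_code):
    """Guess a node's type from its name, following the conventions above."""
    if utility_code in node:
        # Check NPD before PD, because "NPD" also ends with "PD".
        if node.endswith("NPD"):
            return "non-potable water"
        if node.endswith("PD"):
            return "potable water"
        if node.endswith("GW"):
            return "groundwater"
    if node.startswith("R_"):
        return "river"
    if "SWP" in node or "CVP" in node:
        return "water project"
    if "LK" in node:
        return "lake"
    if "RES" in node:
        return "reservoir"
    return "other"


# Made-up node names, used only to exercise each branch.
for name in ["R_EXAMPLE", "SWP_EXAMPLE", "LK_EXAMPLE", "RES_EXAMPLE", "1809027E_NPD"]:
    print(name, "->", classify_node(name, "1809027E"))
```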

In [None]:
new_upstream_example.to_dot()

Open the dot file (inside the output directory located next to the lab2 notebook) and paste its contents [here](https://dreampuf.github.io/GraphvizOnline/). Make sure to replace the entire code block that is already on that page with the entire contents of your dot file.
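If you have trouble finding the dot file by hand, one option is to search the output folder from a notebook cell and print what is found, so the text can be copied from the cell output. This is a minimal sketch: `find_dot_files` is a hypothetical helper, and it assumes the folder is named `output` and sits next to the notebook.

```python
from pathlib import Path


def find_dot_files(output_dir="output"):
    """Return every .dot file under output_dir, sorted by path."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing to search yet (for example, to_dot() has not been run).
        return []
    return sorted(root.rglob("*.dot"))


# Print each dot file's path followed by its contents.
for dot_path in find_dot_files():
    print(dot_path)
    print(dot_path.read_text())
```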

<div class="alert alert-info"> 

**QUESTION:** What are some of the sources connected to your utility (which should be located at the bottom of the graph)? What are the types of water flowing into your utility?

</div>

Your Answer Here (Double click the cell to replace the text)

## The Sankey Diagram: Visualizing Water Flow<a id='section_sankey'></a>


1. <div style="color:dark green">**Again, introduce the objectives of this section. Provide a transition from the networks section, e.g. "Now that you know what the water network around your utility looks like, we'll zoom in to its immediate connections to visualize the volume of water transported to your utility from different sources."**</div>

The next cell is quite complicated; don't worry about understanding what it is doing. All it does is format our network so that we can use something called a Sankey diagram. This diagram lets us visualize the water network associated with our utility in a way that shows how much water is actually travelling between utilities, sources, and other things like reservoirs, rivers, and aqueducts.

In [None]:
def get_sankey_file(data_path, year, utility, upstream=True):
    gi = subWESTnet(data_path, year)
    gi.upstream(utility)
    
    sank_dicts = []
    for v in gi.edges.values():
        sank_dict = {}
        sank_dict['source'] = v['source']
        sank_dict['target'] = v['target']
        sank_dict['value'] = v['used_vol_af']
        sank_dict['color'] = 'steelblue'
        if v['used_vol_af'] == 0:
            sank_dict['color'] = 'goldenrod'
            sank_dict['value'] = .00001
        sank_dicts.append(sank_dict)
    
    df = pd.DataFrame(sank_dicts)
    df = df[df['color'] != 'goldenrod']
    data = pd.read_csv(data_path)
    
    nodes = []
    for i in df['source'].unique():
        sdict = {}
    
        # check if resource/end or not
        check_resource = data[data['target'] == i].shape
        if check_resource[0] == 0:
            sdict['is resource'] = True
        else:
            sdict['is resource'] = False
        if i[-1:] == 'E':
            sdict['is end'] = True
        else:
            sdict['is end'] = False
    
        # compute in/out volumes
        outv = df[df['source'] == i]['value'].sum()
        inv = df[df['target'] == i]['value'].sum()
    
        # get values to compensate for missing volumes
        extra_case = {}
        if outv < inv:
            if not sdict['is end']:
                extra_case['value'] = inv - outv
                extra_case['source'] = i
                extra_case['target'] = i + " to Other"
                nodes.append(extra_case)
        elif outv > inv:
            if not sdict['is resource']:
                extra_case['value'] = outv - inv
                extra_case['source'] = "Other to " + i
                extra_case['target'] = i
                nodes.append(extra_case)
    
    # DataFrame.append was removed in pandas 2.0, so combine with pd.concat
    extra_ = pd.concat([df.drop('color', axis=1), pd.DataFrame(nodes)]).reset_index(drop=True)

    # Repeatedly prune dead-end links: targets that feed nothing further and
    # are neither an "... to Other" node nor an end node (name ending in 'E').
    # Run at most 50 passes and stop early once a pass drops nothing.
    for _ in range(50):
        drop = []
        for i in extra_['target']:
            if extra_[extra_['source'] == i].shape[0] == 0:
                if i[-5:] != 'Other' and i[-1:] != 'E':
                    drop.append(i)
        if not drop:
            break
        extra_ = extra_.loc[[k for k, v in extra_.iterrows() if v['target'] not in drop]]
    
    return extra_

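The table returned by `get_sankey_file` has one row per connection, with `source`, `target`, and `value` columns, and `.to_dict('records')` turns it into the list of dictionaries passed as `links`. As a rough sketch of that shape, the toy list below uses invented node names and volumes (not real data) and totals the water flowing directly into one node. The helper `direct_inflows` is hypothetical, and the values are presumably in acre-feet, judging by the `used_vol_af` field.

```python
# Toy links in the same shape as the records passed to SankeyWidget.
# Node names and volumes are invented for illustration only.
toy_links = [
    {"source": "R_EXAMPLE", "target": "RES_EXAMPLE", "value": 120.0},
    {"source": "RES_EXAMPLE", "target": "UTILITY_PD", "value": 80.0},
    {"source": "UTILITY_PD", "target": "UTILITY", "value": 80.0},
    {"source": "UTILITY_GW", "target": "UTILITY", "value": 30.0},
]


def direct_inflows(links, node):
    """Map each source feeding `node` directly to its total volume."""
    totals = {}
    for link in links:
        if link["target"] == node:
            totals[link["source"]] = totals.get(link["source"], 0.0) + link["value"]
    return totals


print(direct_inflows(toy_links, "UTILITY"))
```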

Run the following cell, but make sure to put your utility's CODE, not its name, in the quotes that say "Your Utility CODE Here". Remember what the different node labels mean from the previous graph we made. These encodings still apply to the Sankey diagram we are going to make in the following cells, except this time your utility will be on the right-hand side of the diagram, labeled by its utility code, not its actual name. The graph below represents the flow of water involved with this utility in the year 2010.

In [None]:
SankeyWidget(layout=Layout(width="1500", height="1000"), 
             margins=dict(top=10, bottom=0, left=100, right=100),
             links=get_sankey_file('data/links_erl.csv', 2010, 'Your Utility CODE Here', upstream=True).to_dict('records'))

<div class="alert alert-info"> 

**QUESTION** Examine the above graph. Your utility should be on the right-hand side, with the sources that have water flowing into it to its left. You can hover your mouse over the graph to see how much water is flowing in each connection. How much water of each type is flowing directly into your utility?

</div>

Your answer here.

The next graph shows the predicted flow of water associated with your utility in 2015. These predictions were made in 2010. Make sure to fill in the "Your Utility Code Here" portion of the cell so that it can be run, and use your utility's code just like in the last Sankey diagram you made.

In [None]:
SankeyWidget(layout=Layout(width="1500", height="1000"), 
             margins=dict(top=10, bottom=0, left=100, right=100),
             links=get_sankey_file('data/links_erl.csv', 2015, 'Your Utility Code Here', upstream=True).to_dict('records'))

Finally, the last graph shows the actual 2015 flow of water associated with your utility. We are going to zoom in on just the portion that contains your utility and the types of water going into it. Make sure to fill in the "Your Utility Code Here" part of the cell.

In [None]:
SankeyWidget(layout=Layout(width="1000", height="500"), 
             margins=dict(top=10, bottom=0, left=100, right=100),
             links=get_sankey_file('data/links_2015module.csv', 2015, 'Your Utility Code Here', upstream=True).to_dict('records'))

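To compare the 2010, predicted 2015, and actual 2015 diagrams, one simple approach is to read the volume for a given water type off each diagram and compute percent changes. The sketch below is only an illustration: `percent_change` is a hypothetical helper, and the numbers in `flows` are placeholders, not values from the lab data.

```python
def percent_change(old, new):
    """Percent change from old to new; old must be nonzero."""
    if old == 0:
        raise ValueError("percent change is undefined when the old value is 0")
    return (new - old) * 100.0 / old


# Placeholder volumes for one water type (replace with values from your diagrams).
flows = {"2010": 100.0, "2015 predicted": 110.0, "2015 actual": 85.0}

print("2010 to predicted 2015:", percent_change(flows["2010"], flows["2015 predicted"]))
print("2010 to actual 2015:", percent_change(flows["2010"], flows["2015 actual"]))
print("predicted to actual 2015:", percent_change(flows["2015 predicted"], flows["2015 actual"]))
```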
<div style="color:dark green">**Jennifer's notes on your discussion questions:**</div>
![image](./images/Jennifer_discnotes1.PNG)

<div class="alert alert-info"> 

**QUESTION** Examine the two graphs above and look at the direct connections to your utility. How much water of each type seems to be flowing into your utility in both the predicted 2015 data and the actual 2015 data? How do these levels of water flow compare to 2010, and how do they compare to each other? Do the types of water flowing into your utility change at all? Given the change in the flow of water between 2010 and 2015, and how 2015 compares with what was predicted, are there any recent events in California which might explain these differences?

</div>

Your answers here.