# **Phase 1 Exploratory Analysis: Water Usage in Utah**

In this section I conduct an exploratory analysis of water usage in Utah. I start with overall water usage across the state from 2015 to 2019 to see whether usage changed significantly over those years and how much of it goes toward potable versus secondary uses. I also break down how much of the state's water is used for residential, commercial, institutional, and industrial purposes.

## Outline

* State Water Usage
    - table of state-wide GPCD data and summary statistics across different years
    - time series of total GPCD, potable GPCD and secondary GPCD 
    - time series of potable GPCD for residential, commercial, industrial and institutional
        - if total doesn't change much over time, look at average in bar plot
    - time series of secondary GPCD for residential, commercial, industrial and institutional
        - if total doesn't change much over time, look at average in bar plot


* County Water Usage
    - table of GPCD data per county 
    - table of GPCD data per county summary statistics across each year
    - bar plot of counties total water usage w/ mean/median/state-wide GPCD (select year)
    - bar plot of counties potable vs secondary water usage w/ mean/median/state-wide GPCD (select year)
    - bar plot of counties w/ each type of potable water usage w/ mean/median/state-wide GPCD aggregated mean across all years (select potable type)
    - bar plot of counties w/ each type of secondary water usage w/ mean/median/state-wide GPCD aggregated mean across all years (select secondary type)
    - investigate factors for top rank and bottom rank
        - population density (map visualization)
        - number of vacation homes
        - metering policies
        - temperature and precipitation (map visualization)
        - landscaping
        - presence of institutional properties
        - presence of industrial properties
        - presence of commercial properties
        
* Basin Water Usage


In [77]:
%run 'dataCleaning.ipynb'

## **State of Utah Water Usage**

First, I wanted to visualize overall water usage trends across the state of Utah between 2015 and 2019. Below is a table with the Total, Potable, and Secondary gallons per capita per day (GPCD) used between 2015 and 2019 for each type of property. Notice that GPCD data for each type of secondary water use is missing for 2018 and 2019. My best guess is that the Utah Open Water Data team is behind on estimating those values. Secondary data is more difficult to acquire because most counties do not meter secondary water, so the values are estimated from average lot sizes and evapotranspiration data.

In [78]:
%run 'utils.py'
format_table(df_state, title="Utah Water Usage in Gallons per Capita per Day (GPCD) Between 2015-2019")
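For readers unfamiliar with the unit, here is a minimal sketch of how a GPCD value is derived from a total volume. The function name `gpcd` and the numbers are illustrative assumptions, not values from the dataset.

```python
def gpcd(total_gallons: float, population: int, days: int = 365) -> float:
    """Gallons per capita per day: total volume divided by people and by days."""
    if population <= 0 or days <= 0:
        raise ValueError("population and days must be positive")
    return total_gallons / population / days


# Made-up inputs, purely illustrative: 100 billion gallons delivered
# to 1 million residents over one year.
example_gpcd = gpcd(100e9, 1_000_000, days=365)
print(round(example_gpcd, 1))
```

Because the figure is normalized by population, it lets areas of very different sizes be compared on the same scale.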

Plotting total, potable, and secondary water usage in GPCD between 2015 and 2019 shows that usage remains relatively constant, with potable water accounting for the majority. For that reason, I broke down usage by property type for both potable and secondary water as averages across the five-year period.

_TBD: get feedback on the best way to visualize, consider normalizing as % of total_
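The normalization mentioned in the note above could be sketched as follows. The column names and values are hypothetical stand-ins; the real layout of `df_state` may differ.

```python
import pandas as pd

# Hypothetical GPCD values and column names, for illustration only.
totals = pd.DataFrame(
    {"Potable": [150.0, 160.0], "Secondary": [50.0, 40.0]},
    index=pd.Index([2015, 2016], name="Year"),
)

# Divide each row by its row sum so the columns add up to 100% per year.
shares = totals.div(totals.sum(axis=1), axis=0).mul(100)
print(shares.round(1))
```

Plotting `shares` instead of raw GPCD would show whether the potable/secondary split shifts over time even when the total stays flat.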

In [79]:
%run 'utils.py'
plot_state_totals(df_state, color1, color2, color3)

Potable water is treated to meet EPA regulations for drinking or culinary use and is typically metered, which makes estimates relatively accurate across the different property types. Secondary, or untreated, water is used to irrigate lawns, landscapes, golf courses, gardens, and other open areas. It typically comes from a water company's pressurized pipelines or open ditch systems and is typically unmetered in Utah. Secondary water use is therefore estimated from the average lot size, a green space estimate, estimated evapotranspiration rates, and the percentage of customers connected to the water system that use secondary water. Keep in mind that agricultural water use is excluded from this dataset.
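To make the inputs listed above concrete, here is a rough back-of-envelope sketch of how they might combine into a GPCD figure. This is not the dataset's actual formula, which the source does not spell out; the function name, parameter names, and example numbers are all assumptions.

```python
# One inch of water over one square foot is 1/12 cubic foot,
# and a cubic foot holds about 7.48052 US gallons.
GALLONS_PER_SQFT_INCH = 7.48052 / 12


def estimate_secondary_gpcd(
    avg_lot_sqft: float,
    green_fraction: float,       # share of the lot that is irrigated green space
    et_inches: float,            # evapotranspiration depth over the period
    connections: int,            # customers connected to the water system
    secondary_share: float,      # fraction of connections using secondary water
    population: int,
    days: int = 365,
) -> float:
    """Hypothetical estimate: irrigated area x ET depth, scaled to per capita per day."""
    irrigated_sqft = avg_lot_sqft * green_fraction
    gallons_per_lot = irrigated_sqft * et_inches * GALLONS_PER_SQFT_INCH
    total_gallons = gallons_per_lot * connections * secondary_share
    return total_gallons / population / days


# Made-up inputs, for illustration only.
example_estimate = estimate_secondary_gpcd(
    avg_lot_sqft=10_000, green_fraction=0.4, et_inches=30,
    connections=1_000, secondary_share=0.5, population=3_000,
)
print(round(example_estimate, 1))
```

Every input here is itself an estimate, which is why the secondary figures carry more uncertainty than the metered potable ones.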

When we break down water usage in the state, averaged across the five years, we see that most of both Potable and Secondary water use comes from Residential properties (over 60% for each). With agricultural water use excluded, residential properties are therefore the largest users of both potable and irrigation water among the four property types. Investigating this on a county-by-county level will be interesting, given that most of the state has a desert climate.

The property type with the second-highest average use across the state is Commercial for Potable water and Institutional for Secondary water.

Note that for 2018 and 2019, Secondary water use was recorded only as a total, not per property type. The per-property-type averages for Secondary water use therefore cover only 2015-2017.
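The effect of the missing years on the averages can be illustrated with a small sketch. The values are hypothetical; the point is that pandas' `mean` skips missing values by default, so the average silently reflects only the years that were reported.

```python
import math

import pandas as pd

# Hypothetical secondary GPCD for one property type; 2018 and 2019 are missing.
secondary_residential = pd.Series(
    [60.0, 66.0, 63.0, float("nan"), float("nan")],
    index=[2015, 2016, 2017, 2018, 2019],
)

mean_default = secondary_residential.mean()             # skipna=True: uses 2015-2017 only
years_used = int(secondary_residential.count())         # number of non-missing years
mean_strict = secondary_residential.mean(skipna=False)  # NaN when any year is missing

print(mean_default, years_used, math.isnan(mean_strict))
```

Reporting `years_used` alongside each average makes it explicit which means rest on three years rather than five.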

_Some things to look at: bring in agricultural water use, breakdown by residential property type, # of commercial vs industrial vs institutional facilities, vs homes_

In [82]:
plot_avg_property_type(df_state_means,title="Percent of Water Usage per Property Type for Potable and Secondary Use in Utah<br><sup>Averaged GPCD across 2015-2019</sup>")




The methodology for this dataset assumes that all Industrial water use is for indoor use only.

In [83]:
plot_avg_water_type(df_state_means,
                    title="Percent of Water Usage per Water Type for each Property Type in Utah<br><sup>Averaged GPCD across 2015-2019</sup>")

