# In 2023 there was major flooding in the Pajaro Valley, CA
![Pajaro Valley Flooding - houses partially submerged in brown water](https://s.hdnux.com/photos/01/31/67/75/23560057/3/1200x0.jpg)

> Image source: <a href="https://www.sfchronicle.com/weather/article/monterey-county-pajaro-flood-17833831.php">The San Francisco Chronicle: 2023 Pajaro Valley Flooding</a>

In [1]:
import subprocess
from io import BytesIO

import folium
import hvplot.pandas
import pandas as pd
import requests

### In the Pajaro Valley, roads were flooded and levees failed

![The Pajaro levee failure](https://s.hdnux.com/photos/01/31/70/04/23560452/3/1200x0.jpg)

> Image source: [The San Francisco Chronicle](https://s.hdnux.com/photos/01/31/70/04/23560452/3/1200x0.jpg/)


# Site Description:

### Climate of Watsonville, CA: 

The city of Watsonville, CA sits in the Pajaro Valley about 10 miles south of Santa Cruz. In general, it has cool, relatively wet winters and mild, dry summers. The average temperature in the summer months is around 66-77 degrees Fahrenheit (19-25 degrees Celsius), and in the winter months it drops to around 40-50 degrees Fahrenheit (4-10 degrees Celsius). Fog and low overcast are common at night and in the morning, especially in the summer, when warmer air from inland areas mixes with the cool, moist air near Monterey Bay.
 
Watsonville, CA has a mild and pleasant climate throughout the year and is most often classified as having a Mediterranean climate.
 
 The city receives an annual average of 17.5 inches of rain spread out over 80 days, making Watsonville one of the driest regions in California.


### Watsonville Ecology, Habitat & Wildlife

Watsonville is home to one of the last remaining large coastal freshwater ecosystems in California. The Watsonville Sloughs include approximately 800 acres of freshwater marsh, seasonal wetland, and estuarine habitat with six major slough branches, which drain to the Pajaro River. The sloughs are pivotal in providing a crucial resting place for many species of migrating birds. Among the thousands of birds and other abundant wildlife frequenting the sloughs are a variety of rare species, including 23 native plant and animal species that are listed as threatened, endangered, or of special concern. 


### Communities and Infrastructure:

The city of Watsonville is classified as a "disadvantaged community". A disadvantaged community (DAC) in California is defined in Water Code 79505.5 as a community with an annual median household income that is less than 80% of the statewide annual median household income (a threshold of $56,982). The term also generally refers to areas that are disproportionately impacted by, or vulnerable to, environmental pollution. In the Pajaro Valley, environmental pollution often stems from large-scale agricultural operations and contaminated groundwater (seawater intrusion).
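The income test in that definition can be written out as a minimal sketch. The function name `is_disadvantaged` and the two example incomes are hypothetical and used only for illustration; the $56,982 threshold and the 80% fraction come from the text above.

```python
# Income threshold and fraction quoted in the text above
DAC_THRESHOLD_USD = 56_982
DAC_INCOME_FRACTION = 0.80


def is_disadvantaged(median_household_income, threshold=DAC_THRESHOLD_USD):
    """Return True when a community's median household income is below the threshold."""
    return median_household_income < threshold


# Statewide median household income implied by the quoted threshold
implied_statewide_mhi = DAC_THRESHOLD_USD / DAC_INCOME_FRACTION

# Hypothetical incomes, not real community data
print(is_disadvantaged(50_000))
print(is_disadvantaged(75_000))
print(round(implied_statewide_mhi, 2))
```

This only covers the income criterion; the pollution-burden sense of the term is not captured by a single number.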

A new infrastructure bill was passed in late 2022, shortly before the levee failed in 2023. The Pajaro River Flood Risk Management Project received a sizeable federal investment to rebuild the river’s dilapidated levees, which are intended to protect the surrounding crops as well as the Watsonville and Pajaro communities from flooding. Unfortunately, these levees were not strengthened in time to withstand the massive amounts of water that the atmospheric rivers of 2022 and 2023 delivered to California.

In [2]:
# Define stream gauge latitude and stream gauge longitude
sg_lat = 36.9002307
sg_lon = -121.5977215

# Initialize map and tweak settings

m = folium.Map(
    # Location to display
    location=(sg_lat, sg_lon),
    # Turns off annoying zooming while trying to scroll to the next cell
    scrollWheelZoom=False)

# Put a marker at the stream gauge location
folium.Marker(
    [sg_lat, sg_lon], 
    popup="Stream Gauge on the Pajaro River at Chittenden Gap"
    ).add_to(m)

# Display the map
m

### US streamflow data are available from the National Water Information System (NWIS)

### Date range from October 1, 1928 to September 18, 2023
> NOTE: For hydrologic data, we often use the **Water Year**, which starts on October 1 of the preceding calendar year in order to capture the full snow season. In this case, we are downloading WY1929-WY2023.
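As a small sketch of the water year convention, the helper below (a hypothetical function named `water_year`, not part of the notebook) assigns a date to the water year that ends on the following September 30.

```python
from datetime import date


def water_year(day):
    """Return the water year (October 1 to September 30, named for the year in which it ends)."""
    return day.year + 1 if day.month >= 10 else day.year


# October through December belong to the next calendar year's water year
print(water_year(date(1928, 10, 1)))
print(water_year(date(2022, 12, 31)))
print(water_year(date(2023, 9, 18)))
```

Under this convention a winter storm season that straddles New Year's Day stays inside a single water year.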

&#9998; USGS streamflow URL: https://waterdata.usgs.gov/nwis/dv?cb_00060=on&format=rdb&site_no=11159000&legacy=&referred_module=sw&period=&begin_date=1928-10-01&end_date=2023-09-30
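The query string in that URL can also be assembled from its individual parameters, which makes it easier to change the site or date range later. This is a minimal sketch using only the standard library; `build_nwis_url` is a hypothetical helper name, and the parameter names are copied from the URL above rather than from any USGS client library.

```python
from urllib.parse import urlencode

NWIS_DV_URL = "https://waterdata.usgs.gov/nwis/dv"


def build_nwis_url(site_no, begin_date, end_date, parameter_code="00060"):
    """Assemble a daily-values request URL from the parameters seen in the URL above."""
    params = {
        f"cb_{parameter_code}": "on",  # 00060 is the code used in the URL above
        "format": "rdb",               # tab-delimited output
        "site_no": site_no,
        "legacy": "",
        "referred_module": "sw",
        "period": "",
        "begin_date": begin_date,
        "end_date": end_date,
    }
    return f"{NWIS_DV_URL}?{urlencode(params)}"


example_url = build_nwis_url("11159000", "1928-10-01", "2023-09-30")
print(example_url)
```

Passing the site number as a string keeps it intact if another station's number begins with a zero.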

### Data description and citation

&#9998; In the cell below, describe your data. Include the following information:
  1. A 1-2 sentence description of the data
  2. Data citation
  3. What are the units?
  4. What is the time interval for each data point?
  5. Is there a "no data" value, or a value used to indicate when the sensor was broken or didn't detect anything? (These are also known as NA, N/A, NaN, nan, or nodata values)

&#128214; The [NWIS data format page](https://waterdata.usgs.gov/nwis/?tab_delimited_format_info) might be helpful.

  1. The data we are interested in (linked above) document streamflow values at the USGS gauging station (no. 11159000). These data are recorded in cubic feet per second and were requested for the time frame 10-01-1928 to 09-18-2023 (the returned record begins on 10-01-1939). We chose these dates because they correspond with water years, so as to capture complete precipitation profiles over time (rainfall or snow frequently happens at the end or very beginning of a calendar year). This is particularly important in the Mediterranean climate of the central coast of California.

  2. Pajaro R a Chittenden CA. USGS Water Data for the Nation. (n.d.). https://waterdata.usgs.gov/monitoring-location/11159000/#parameterCode=00065&period=P7D&showMedian=true

  3. The units for streamflow/discharge are cubic feet per second or cfs (ft^3/sec)

  4. The time interval for each data point is 1 day. (That is the time in between each data point is one day)
  
  5. In the data collected at Chittenden, CA, there does not appear to be a "no data" value. There is, however, an "A:e" data qualification code indicating that, for a handful of days, the streamflow value was approved for publication (processing and review completed) but is an estimated value.
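A compound qualification code such as "A:e" can be split on the colon and each part looked up separately. The sketch below is illustrative only: `QUALIFIER_MEANINGS` and `describe_code` are hypothetical names, the "A" and "e" meanings paraphrase the description above, and the "P" meaning is an assumption based on the provisional-data notice in the file header.

```python
# Meanings paraphrased from the text above; "P" is an assumption based on
# the provisional-data notice at the top of the downloaded file
QUALIFIER_MEANINGS = {
    "A": "approved for publication",
    "P": "provisional, subject to revision",
    "e": "estimated value",
}


def describe_code(code):
    """Split a code such as 'A:e' and describe each part."""
    return [QUALIFIER_MEANINGS.get(part, "unknown qualifier") for part in code.split(":")]


print(describe_code("A:e"))
print(describe_code("P"))
```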

In [3]:
# Let's download our data (request code written with help from ChatGPT)

nwis_url = (
    "https://waterdata.usgs.gov/nwis/dv"
    "?cb_00060=on"
    "&format=rdb"
    "&site_no=11159000"
    "&legacy="
    "&referred_module=sw"
    "&period="
    "&begin_date=1928-10-01"
    "&end_date=2023-09-30")

# Send an HTTP GET request to the URL
nwis_response = requests.get(nwis_url)
nwis_response.raise_for_status()

nwis_response

<Response [200]>

In [4]:
# Print the top of the data
for i, line in enumerate(nwis_response.content.splitlines()[:10]):
    print(i, line)

1 b'# Some of the data that you have obtained from this U.S. Geological Survey database'
2 b"# may not have received Director's approval. Any such data values are qualified"
3 b'# as provisional and are subject to revision. Provisional data are released on the'
4 b'# condition that neither the USGS nor the United States Government may be held liable'
5 b'# for any damages resulting from its use.'
6 b'#'
7 b'# Additional info: https://help.waterdata.usgs.gov/policies/provisional-data-statement'
8 b'#'
9 b'# File-format description:  https://help.waterdata.usgs.gov/faq/about-tab-delimited-output'


In [5]:
# Let's take a look at the data and see what got downloaded.

for i, line in enumerate(nwis_response.content.splitlines()[:35]):
    # Skip commented lines
    if not line.startswith(b'#'):
        print(i, line)

28 b'agency_cd\tsite_no\tdatetime\t8757_00060_00003\t8757_00060_00003_cd'
29 b'5s\t15s\t20d\t14n\t10s'
30 b'USGS\t11159000\t1939-10-01\t8.00\tA'
31 b'USGS\t11159000\t1939-10-02\t8.00\tA'
32 b'USGS\t11159000\t1939-10-03\t8.00\tA'
33 b'USGS\t11159000\t1939-10-04\t8.00\tA'
34 b'USGS\t11159000\t1939-10-05\t8.00\tA'
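The output above shows the layout of the tab-delimited file: comment lines starting with `#`, one row of column names, one row of column formats (such as `5s` and `20d`), and then the data rows. As a rough sketch of that layout, the hypothetical helper `parse_rdb` below parses a short sample copied from the output using only the standard library. Reading the format row as width-and-type codes is an assumption here; the helper simply skips it.

```python
# A short sample in the same layout as the output above
sample_rdb = (
    "# comment lines start with a hash\n"
    "agency_cd\tsite_no\tdatetime\t8757_00060_00003\t8757_00060_00003_cd\n"
    "5s\t15s\t20d\t14n\t10s\n"
    "USGS\t11159000\t1939-10-01\t8.00\tA\n"
    "USGS\t11159000\t1939-10-02\t8.00\tA\n"
)


def parse_rdb(text):
    """Return the data rows as dictionaries keyed by column name."""
    # Drop blank lines and comment lines
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    header = lines[0].split("\t")
    # lines[1] is the column-format row, so the data start at lines[2]
    return [dict(zip(header, ln.split("\t"))) for ln in lines[2:]]


rows = parse_rdb(sample_rdb)
print(rows[0]["datetime"], float(rows[0]["8757_00060_00003"]))
```

Every value comes back as a string here, which is one reason to hand the real file to `pd.read_csv` instead.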


In [6]:
# Now let's import the data with pandas. 
pajaro_q_df = pd.read_csv(
    BytesIO(nwis_response.content),
    comment='#',
    delimiter='\t',
    skiprows=[28, 29],
    names=['agency', 'site_no', 'datetime', 'streamflow_cfs', 'code'],
    index_col='datetime',
    parse_dates=True,
)
pajaro_q_df

datetime,agency,site_no,streamflow_cfs,code
1939-10-01,USGS,11159000,8.0,A
1939-10-02,USGS,11159000,8.0,A
1939-10-03,USGS,11159000,8.0,A
1939-10-04,USGS,11159000,8.0,A
1939-10-05,USGS,11159000,8.0,A
...,...,...,...,...
2023-09-14,USGS,11159000,16.7,P
2023-09-15,USGS,11159000,17.1,P
2023-09-16,USGS,11159000,18.0,P
2023-09-17,USGS,11159000,18.0,P


In [7]:
# Lets check our data types in our `pd.DataFrame`

pajaro_q_df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 30669 entries, 1939-10-01 to 2023-09-18
Data columns (total 4 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   agency          30669 non-null  object 
 1   site_no         30669 non-null  int64  
 2   streamflow_cfs  30669 non-null  float64
 3   code            30669 non-null  object 
dtypes: float64(1), int64(1), object(2)
memory usage: 1.2+ MB
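The summary above can double as a quick completeness check: if there is exactly one row per day, the number of entries should equal the number of calendar days between the first and last dates. The sketch below does that arithmetic with the standard library, using the dates and row count printed above. The variable names are illustrative, and the check assumes there are no duplicate dates.

```python
from datetime import date

# Values copied from the DataFrame summary above
first_day = date(1939, 10, 1)
last_day = date(2023, 9, 18)
reported_rows = 30669

# Inclusive count of calendar days in the record
expected_rows = (last_day - first_day).days + 1
missing_days = expected_rows - reported_rows

print(expected_rows, missing_days)
```

A result of zero missing days is consistent with the observation that this record has no "no data" gaps.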


In [8]:
# Subset the stream discharge data to the timeframe that you are interested in: October 2018 - September 2023.

pajaro_flood_df = pajaro_q_df['2018-10':'2023-09']
pajaro_flood_df

datetime,agency,site_no,streamflow_cfs,code
2018-10-01,USGS,11159000,2.37,A
2018-10-02,USGS,11159000,2.31,A
2018-10-03,USGS,11159000,2.33,A
2018-10-04,USGS,11159000,2.39,A
2018-10-05,USGS,11159000,2.12,A
...,...,...,...,...
2023-09-14,USGS,11159000,16.70,P
2023-09-15,USGS,11159000,17.10,P
2023-09-16,USGS,11159000,18.00,P
2023-09-17,USGS,11159000,18.00,P


In [9]:
# Let's plot the subsetted data (2018-2023)

pajaro_flood_df.streamflow_cfs.hvplot(
    title='Discharge of the Pajaro River at Chittenden, CA',
    xlabel='Year', ylabel='Discharge (cfs)'
    )  



In [10]:
# Let's now plot ALL the data.
pajaro_q_df.streamflow_cfs.hvplot(
    title='Discharge of the Pajaro River at Chittenden, CA',
    xlabel='Year', ylabel='Discharge (cfs)'
    )



In [11]:
# One way to improve this is by **resampling** the data to annual maxima. That way we still get the same peak streamflows,
# but the computer will be able to plot all the values without overlapping.
# 'YS' (year start) is the current alias for the older 'AS' frequency string.

pajaro_ann_max_q_df = pajaro_q_df[['streamflow_cfs']].resample('YS').max()
pajaro_ann_max_q_df

datetime,streamflow_cfs
1939-01-01,11.0
1940-01-01,9530.0
1941-01-01,9550.0
1942-01-01,4560.0
1943-01-01,8010.0
...,...
2019-01-01,4320.0
2020-01-01,844.0
2021-01-01,1050.0
2022-01-01,1780.0
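The table above groups by calendar year, which is why the first row (1939) reflects only October through December. One alternative, sketched here under the water year convention, is to group by water year so that each wet season stays in a single bin. The discharge values below are hypothetical and are not taken from the Pajaro record; `sample_q`, `water_year`, and `wy_max` are illustrative names.

```python
import pandas as pd

# Hypothetical daily discharge values (cfs), not real gauge data
sample_q = pd.Series(
    [5.0, 120.0, 900.0, 40.0, 3000.0, 15.0],
    index=pd.to_datetime([
        "2021-09-30", "2021-10-01", "2021-12-25",
        "2022-09-30", "2022-10-01", "2023-03-11",
    ]),
    name="streamflow_cfs",
)

# Water year = calendar year, plus one for October through December
water_year = sample_q.index.year + (sample_q.index.month >= 10).astype(int)

# Maximum discharge within each water year
wy_max = sample_q.groupby(water_year).max()
print(wy_max)
```

Grouping on a computed label avoids relying on any particular resampling frequency alias.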


In [12]:
# Plot resampled data

pajaro_ann_max_q_df.hvplot(
     title='Annual Maxima Discharge of the Pajaro River at Chittenden, CA',
    xlabel='Year', ylabel='Discharge (cfs)'
    ) 




# Historical Trends Suggest Large Streamflow Events in the Pajaro Valley Every Decade

Floods have been recorded in the Pajaro Valley since the 1940s. By the turn of the century, there was a major flooding event every decade, and in 1949, in response to repeated flooding and the need to protect a growing population and agricultural land, the Army Corps of Engineers built the present-day levee on the Pajaro River. The trend in the figure above suggests that high-streamflow years are often followed by years of drought on the central coast of California. While not pictured in this graph, the atmospheric rivers that hit California in 2022 and 2023 increased streamflow to the point that the levees failed.
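One simple way to look at the "every decade" pattern is to take the largest annual maximum within each decade. The sketch below uses only the annual maxima that are visible in the truncated table printed earlier, so it is incomplete by construction. `peak_by_decade` is a hypothetical helper, and the 1939 value covers only October through December.

```python
# Annual maxima (cfs) visible in the truncated table above; elided rows are omitted
visible_annual_max_cfs = {
    1939: 11.0, 1940: 9530.0, 1941: 9550.0, 1942: 4560.0, 1943: 8010.0,
    2019: 4320.0, 2020: 844.0, 2021: 1050.0, 2022: 1780.0,
}


def peak_by_decade(annual_max):
    """Return the largest annual maximum in each decade."""
    peaks = {}
    for year, discharge in annual_max.items():
        decade = year // 10 * 10
        peaks[decade] = max(discharge, peaks.get(decade, discharge))
    return peaks


print(peak_by_decade(visible_annual_max_cfs))
```

Applied to the full record, the same grouping would show whether each decade contains at least one large peak.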

In [13]:
%%capture
%%bash
jupyter nbconvert watsonville_time_series.ipynb --to html --no-input