---
title: Exploring water conflicts in the Colorado River Basin
subtitle: Week 2 - Discussion section 
week: 2
image: images/Tecopa_site2.JPG
sidebar: false
---

This discussion section will guide you through answering questions about water-related conflicts in the Colorado River Basin using data from the [U.S. Geological Survey (USGS)](https://www.usgs.gov). In this discussion section, you will:

- Practice version control using git via the terminal
- Obtain information about a dataset from an online data repository
- Use core `pandas.DataFrame` methods to answer questions 
- Discuss the advantages and disadvantages of different methods of data loading
- Apply best practices for clean code

## Setup

:::{.callout-tip appearance="minimal"}
1. On the Taylor server, start a new JupyterLab session or access an active one.

2. In the terminal, use `cd` to navigate into the `eds-220-sections` directory. Use `pwd` to verify `eds-220-sections` is your current working directory.

3. Create a new Python Notebook inside your `eds-220-sections` directory and rename it to `section-2-co-basin-water-conflicts.ipynb`. 

4. Use the terminal to stage, commit, and push this file to the remote repository. Remember:
    1. `git status` : check git status
    2. `git add FILE-NAME` : stage updated file
    3. `git status` : check git status again to confirm
    4. `git commit -m "Commit message"` : commit with message
    5. `git pull` : check local repo is up to date (best practice)
    6. `git push` : push changes to upstream repository

<p style="text-align: center;">
**CHECK IN WITH YOUR TEAM** 
</p>
<p style="text-align: center;">
**MAKE SURE YOU'VE ALL SUCCESSFULLY SET UP YOUR NOTEBOOKS BEFORE CONTINUING**
</p>
:::

## General directions
:::{.callout-tip appearance="minimal"}
- Add comments to each of your code cells. 
- In each exercise, include markdown cells between your code cells to add titles and information.
- Reminders about when to commit and push changes are included, but you are encouraged to commit and push more often. 
:::

## About the data
For these exercises we will use data about [Water Conflict and Crisis Events in the Colorado River Basin](https://www.sciencebase.gov/catalog/item/63acac09d34e92aad3ca1480) @holloman_coded_2023. This dataset is stored at [ScienceBase](https://www.sciencebase.gov/catalog/), a digital repository from the U.S. Geological Survey (USGS) created to share scientific data products and USGS resources. 

The dataset is a CSV file containing records of conflict or crisis events around water resource management in the Colorado River Basin. 
The Colorado River Basin, inhabited by several Native American tribes for centuries, is a crucial water source in the southwestern United States and northern Mexico, supporting over 40 million people, extensive agricultural lands, and diverse ecosystems. 
Its management is vital due to the region's arid climate and the competing demands for water, leading to significant challenges related to water allocation and conservation. 

![Colorado River Basin.  U.S. Bureau of Reclamation. ](/discussion-sections-upcoming/images/co-river-basin.png)

<!-- 10 minutes -->
## 1. Archive exploration
Take some time to look through the dataset's description in the ScienceBase repository. Discuss the following questions with your team:

a. Where was the data collected from?
<!-- 
articles from newspapers describing water-related events in geographic areas in the Basin
-->
b. During what time frame were the observations in the dataset collected?
<!--
2005-2021
-->
c. What was the author's perceived value of this dataset?
<!--
 examining crisis on a continual basis toward identification of hotspots from conflict, identifying primary stakeholders, and who experiences crises.
-->
d. Briefly discuss anything else that seems like relevant information.

In your notebook, use a markdown cell to add a brief description of the dataset, including a citation, date of access, and a link to the archive. 

<p style="text-align: center;">
**check git status -> stage changes -> check git status -> commit with message -> pull -> push  changes**
</p>

<!-- 3 minutes -->
## 2. Data loading

a. So far in class we have loaded data into our workspace in two ways: by downloading the file and storing a copy on our computer, and by accessing it directly through a URL. With your team, discuss the general advantages and disadvantages of these two methods of data access. 

b. Import the `Colorado River Basin Water Conflict Table.csv` file [from the ScienceBase repository](https://www.sciencebase.gov/catalog/item/63acac09d34e92aad3ca1480) into your workspace using its URL and store it as a variable named `df`.
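
As a starting point for the discussion in (a), here is a minimal sketch of how `pd.read_csv` treats its input. It assumes nothing about the real file: `csv_text` and `example_df` are hypothetical names, and the tiny in-memory CSV only stands in for the dataset so the cell runs without a download or network access.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: a tiny CSV held in memory,
# so this cell runs without network access or a downloaded copy.
csv_text = "Event,State\n1,CO\n2,UT\n"

# pd.read_csv accepts a local file path, a URL string, or a file-like object.
# For the real dataset you would pass either the file's download URL
# (copied from the ScienceBase item page) or the path to your local copy.
example_df = pd.read_csv(io.StringIO(csv_text))

# Quick check that the data loaded as expected
print(example_df.shape)
```

Because the call has the same shape in every case, the trade-offs between the two methods come from where the data lives (availability, versioning, reproducibility) rather than from the code itself.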

<p style="text-align: center;">
**check git status -> stage changes -> check git status -> commit with message -> pull -> push  changes**
</p>

<p style="text-align: center;">
**CHECK IN WITH YOUR TEAM** 
</p>
<p style="text-align: center;">
**MAKE SURE YOU'VE ALL SUCCESSFULLY LOADED THE DATA BEFORE CONTINUING**
</p>

In [2]:
import pandas as pd

df = pd.read_csv('data/Colorado River Basin Water Conflict Table.csv')
df.head(10)

| Unnamed: 0 | Event | Search Source | Newspaper | Article Title | Duplicate | Report Date | Report Year | Event Date | Event Day | Event Month | ... | Article Text Search - water rights | Article Text Search - intergovernmental | Article Text Search - water transfers | Article Text Search - navigation | Article Text Search - fish | Article Text Search - invasive | Article Text Search - diversion | Article Text Search - water diversion | Article Text Search - instream | Article Text Search - aquatic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | USGS1-50.docx | The Durango Herald (Colorado) | Tribes assert water rights on Colorado River B... | False | 7-Apr-22 | 2022.0 |  |  | 4.0 | ... | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 2 | USGS1-50.docx | Journal, The (Cortez, Dolores, Mancos, CO) | Native American tribes assert water rights on ... | False | 7-Apr-22 | 2022.0 |  |  | 4.0 | ... | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 3 | USGS1-50.docx | The Salt Lake Tribune | 'Very positive change.' New Utah law will be a... | False | 17-Mar-22 | 2022.0 |  |  | 3.0 | ... | 12 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 12 | 1 |
| 3 | 4 | USGS1-50.docx | Casa Grande Dispatch (AZ) | Legislation would let an Arizona tribe lease C... | False | 11-Dec-21 | 2021.0 |  |  | 12.0 | ... | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 5 | USGS1-50.docx | The Aspen Times (Colorado) | Historically excluded from Colorado River poli... | False | 19-Dec-21 | 2021.0 |  |  | 11.0 | ... | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 6 | USGS1-50.docx | The Arizona Republic (Phoenix) | Everyone loses if we cannot agree on how we us... | False | 22-Apr-17 | 2017.0 |  |  | 4.0 | ... | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 7 | USGS1-50.docx | Arizona Daily Star (Tucson) | Long-term solutions are needed to keep Colorad... | False | 15-Jun-18 | 2018.0 |  |  | 6.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 8 | USGS1-50.docx | Navajo Times (Window Rock, Arizona) | Colorado River, stolen by law; Indigenous nati... | False | 17-Mar-22 | 2022.0 |  |  | 3.0 | ... | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 9 | USGS1-50.docx | The Arizona Republic (Phoenix) | Tribes seek a greater role in managing Colorad... | False | 18-Dec-21 | 2021.0 | December 14-16, 2021 |  | 12.0 | ... | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 10 | USGS1-50.docx | Journal, The (Cortez, Dolores, Mancos, CO) | Colorado's water guardians agree on one thing:... | False | 29-Jan-21 | 2021.0 | January 25 - 26, 2021 |  | 1.0 | ... | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |


## Preliminary data exploration

In [5]:
df.shape

(268, 48)

In [6]:
df.columns

Index(['Event', 'Search Source', 'Newspaper', 'Article Title', 'Duplicate',
       'Report Date', 'Report Year', 'Event Date', 'Event Day', 'Event Month',
       'Event Year', 'Conflict Present', 'Crisis Present', 'Basin', 'HUC6',
       'HUC2', 'Place', 'County', 'County FIPS', 'State', 'State FIPS',
       'Urban or Rural', 'Issue Type', 'Event Summary', 'Stakeholders',
       'Intensity Value', 'Comments', 'Related Observation Themes',
       'Article Text Search - water quality',
       'Article Text Search - invasive species',
       'Article Text Search - conservation', 'Article Text Search - drought',
       'Article Text Search - flood',
       'Article Text Search - ground water depletion',
       'Article Text Search - depletion',
       'Article Text Search - infrastructure',
       'Article Text Search - fish passage',
       'Article Text Search - instream water rights',
       'Article Text Search - water rights',
       'Article Text Search - intergovernmental',
       '

In [10]:
df.Stakeholders.unique()

array(['Tribal Nations, State Government, Federal Government',
       'Southern Ute Indian Tribe, Ute Mountain Tribe, State Government, Federal Government',
       'State Government, Any Water Rights Holder, Agriculture',
       'Colorado River Indian Tribes, State Government, Federal Government, Agriculture',
       'Sothern Ute Indian Tribe, Ute Mountain Tribe, State Government, Federal Government, All Water Users',
       'Water Managers, All Water Users, Conservation District',
       'State Government, Federal Government, Water Managers, All Water Users',
       'Colorado River Indian Tribes, State Government, Federal Government',
       'Ten Tribes Partnership, State Government, Conservation District, Federal Government, NGO',
       'State Government, Federal Government, Private Utility, Local Government, County Government,  Business, Other Interest Groups, Agriculture',
       'State Government, Federal Government, Conservation District, Private Utility, Any Water Rights Holder

In [14]:
df['Urban or Rural'].value_counts()

Urban or Rural
Both       184
Rural       45
Urban       13
Both         6
Urban        5
Urban        1
Name: count, dtype: int64

In [16]:
df['Urban or Rural'].unique()

array(['Both', 'Rural', 'Urban  ', nan, 'Urban ', 'Urban', 'Both '],
      dtype=object)
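
The repeated `Both` and `Urban` labels in the counts above come from trailing spaces, which the `unique()` output makes visible (`'Urban  '`, `'Urban '`, `'Both '`). One possible way to normalize them is `Series.str.strip()`. The sketch below uses a toy Series (`urban_rural`, a hypothetical name) built from the labels printed above rather than the real column, so the counts it prints are not the dataset's counts.

```python
import numpy as np
import pandas as pd

# Toy Series reusing the labels printed above (not the real column)
urban_rural = pd.Series(
    ["Both", "Rural", "Urban  ", np.nan, "Urban ", "Urban", "Both "]
)

# .str.strip() removes leading and trailing whitespace; NaN stays NaN
cleaned = urban_rural.str.strip()

# After stripping, each category appears only once in the counts
print(cleaned.value_counts())
```

On the real data, the same idea would be applied to `df['Urban or Rural']`, assigning the result back to the column before counting.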

In [18]:
df['State'].unique()

array(['CO', 'UT', nan, 'AZ', 'OH; UT', 'AZ; CO; NM; UT', 'CA', 'AZ; UT',
       'AZ; NV', 'CO; UT; WY; NM', 'AZ; CA', 'UT; AZ', 'CO; WY', 'NV; AZ',
       'CO; AZ', 'AZ; CA; CO; NV; NM; UT; WY', 'AZ; CA; NV', 'NV', 'NM',
       'UT; CO; WY', 'CA; NV; AZ', 'AZ; NM', 'WY; UT; CO', 'TX'],
      dtype=object)

In [19]:
for x in df['State'].unique():
    print(x)

CO
UT
nan
AZ
OH; UT
AZ; CO; NM; UT
CA
AZ; UT
AZ; NV
CO; UT; WY; NM
AZ; CA
UT; AZ
CO; WY
NV; AZ
CO; AZ
AZ; CA; CO; NV; NM; UT; WY
AZ; CA; NV
NV
NM
UT; CO; WY
CA; NV; AZ
AZ; NM
WY; UT; CO
TX
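
Many entries in `State` hold several codes separated by `; `, so `unique()` reports combinations rather than individual states. A minimal sketch of one way to get one code per row is to split on the separator and then use `explode()`. The Series `states` below is a hypothetical toy sample built from a few of the values printed above, not the full column.

```python
import numpy as np
import pandas as pd

# Toy Series with a few of the values printed above (not the full column)
states = pd.Series(["CO", "UT", np.nan, "AZ; CO; NM; UT", "CO; WY"])

# Split each entry on ';', give every code its own row,
# then strip the spaces left around each code
state_codes = states.str.split(";").explode().str.strip()

# Count how many times each individual state code appears
print(state_codes.value_counts())
```

Note that `explode()` repeats the original index for rows that came from the same entry, which makes it possible to trace each code back to its event.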
