# Nevada Mines and Water Exploratory Data Analysis
*Class project for Data Science for GIS course at Claremont Graduate University, Fall 2022*

By: Ainslee Archibald

This code is meant to be run in an ArcGIS Online Jupyter Notebook, with access to computing resources. It may be possible to run locally on a machine with ArcGIS installed, but it was not optimized for that and may require edits in order to run properly.

#### Installations
These are the installations required for the code to run in the ArcGIS Online Jupyter Notebooks in Fall 2022.

In [None]:
!pip install pandasql
!pip install -U pandas-profiling[notebook,unicode]

#### Imports

In [2]:
import os
from datetime import datetime as dt

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.transforms
import seaborn as sns
from IPython.display import HTML, Image, display
from pandas_profiling import ProfileReport
from pandasql import PandaSQL

import arcgis
import arcpy
from arcgis import features
from arcgis.gis import GIS
from arcgis.mapping import WebMap, WebScene

# Helper for running SQL queries against pandas DataFrames
pdsql = PandaSQL()

#### ArcGIS Online connection

In [None]:
# https://developers.arcgis.com/python/guide/working-with-different-authentication-schemes/
# gis = GIS() # Connect to ArcGIS Online as an anonymous user
gis = GIS("home")
print("Successfully logged in as: " + gis.properties.user.username)

## Introduction

There were 161 mines in Nevada with some kind of activity in 2019. Of those, 40 were gold mines (Muntean, 2021). Mines were key to establishing Nevada as a state and continue to be essential to the state’s economy. Mining is the state’s largest export industry (Mining, 2022) and provides the highest average salary in the state (Mining in Nevada, 2022). Nevada is also home to the only major lithium mine in the United States (Milman, 2022). It is difficult to overstate how central mining is to Nevada’s history, economy, and identity. But the industry has its downsides, and it leaves a long trail of environmental consequences in its wake.

### Literature Review

One of the most prominent effects mining has had is on water quality. Toxic releases from mining are especially a problem in Nevada because of the quantity of gold, silver, and zinc mining in the state (Environmental Protection Agency, 2022). Gold mines have been found to have an effect on groundwater, resulting in high acidity and high metal concentrations near tailings (Vega, 2004). The only Superfund site in Nevada is the Carson River Mercury Superfund site, also a consequence of ore processing from mines in the area (Nevada Department of Conservation and Natural Resources, 2020).

The historic consequences of mining, specifically gold mining, can be seen in the water dataset used in this project, which identifies forty waterbodies as not supporting consumption of fish due to mercury. According to the Nevada 2020-2022 Water Quality Integrated Report, for which the data was produced, “most of the waterbodies with mercury-impaired fish reflect legacy pollution from historical mining operations where mercury was used during the processing of gold ore” (Nevada Department of Conservation and Natural Resources, 2022). Groundwater contamination isn’t the only concern. The tailings from mines “are associated with the surface impacts which greatly affect surface and ground water quality” (Fashola, 2016). While mines have to abide by higher environmental standards now than they did in the early days of the state, the legacy of mining pollution looms large in Nevada.

### Organizational Understanding

#### Organizational Objectives

This semester, a non-profit I’ve worked for in the past, The Progressive Leadership Alliance of Nevada, approached me with an idea for a GIS project. The project as initially proposed was creating a gold mine tracker for the state of Nevada that includes all current and proposed gold mines, which is a resource that doesn't yet exist. One of the goals of the non-profit is to decrease the number of new gold mines in Nevada as much as possible, and having good, spatially organized data about current and proposed gold mines would be a great resource for that goal.

Over the course of the project, my objectives evolved as I discovered what was and wasn’t possible for me to do with the data and time that I had. My focus widened from just gold mines to all mines, active and historical. My initial goals, based on what the non-profit requested from me, were as follows:

**Objective 1.** Create a reference map in ArcGIS Pro that contains the locations of all existing gold mine sites and some data associated with them. Share this map with the non-profit. Extension: Add proposed gold mine sites.

**Objective 2.** Gather data on a specific aspect of the possible effects of gold mines and perform analysis to determine if a correlation exists between proximity to a mine and the other data point, and if so, what is its nature. This analysis will likely focus on water quality due to the availability of data. Extension: Investigate an additional data point.

**Objective 3.** Create a web app in ArcGIS Web AppBuilder that presents the reference map and my analysis and allows users to filter information about the gold mines via layers. Extension: Make this web app public in collaboration with the non-profit.

My goals for further work, specifically coordinating with the non-profit and meeting their needs, were to include proposed gold mine sites on the reference map, build out a more robust dataset about the existing and proposed mines, sharpen the data analysis so it can be used by the non-profit, and make the web app public.

After assessing the situation, I updated my success criteria for this project. By the end of the semester, I would complete Objective 1 (reference map with existing gold mine sites with minimal associated data), Objective 2 (basic correlation analysis with another dataset using a few methods from class), and enough of Objective 3 to present it coherently to the class. As you will see throughout this notebook, these objectives continued to evolve as I handled challenges.

#### Academic Objectives

My research question was: to what extent does proximity to a mine have environmental effects, specifically on water quality, in the state of Nevada? In addition to wanting to help the non-profit, I am interested in mining policy myself and have worked on it. I think that it’s important to have better data on the effects of mines. Nevada’s economy is dominated by tourism and mining, so there is very little political support for doing anything that might impact the success of mining operations. I think this makes it more difficult for communities to have real input in the decisions that affect their local environment, and makes the state rather unlikely to investigate the environmental effects of mines.

## Understanding the Data

### Set the Environment to our Geodatabase 

In [9]:
gdb_path = './project.gdb'
arcpy.env.workspace = gdb_path

In [None]:
print(arcpy.env.workspace)

In [11]:
arcpy.ListFeatureClasses()

['Assessed_Sample_Sites_2022',
 'Historic_Production__NDOM_',
 'Active_Mines_and_Energy_Producers_2021__NBMG_',
 'water_quality']

### Create SDFs and Data Description

The water data I used was collected as part of the Nevada 2020-2022 Water Quality Integrated Report prepared by the Nevada Division of Environmental Protection, Bureau of Water Quality Planning (Nevada Division of Environmental Protection, 2022). It can be downloaded from their open data website (Nevada Department of Environmental Protection, BWQP).

In [12]:
water_locations_sdf = pd.DataFrame.spatial.from_featureclass(os.path.join(gdb_path,'Assessed_Sample_Sites_2022'))

The two mines datasets for this project were found through the tool Nevada Mineral Explorer (The University of Nevada, Reno, 2020). This tool was created by the Nevada Bureau of Mines and Geology, University of Nevada Reno. It was later changed to remove much of the data from view, including the active mines dataset used in this project. That dataset can still be downloaded from the tool, but not viewed in it. The datasets can be downloaded from the links in References.

In [13]:
active_mines_sdf = pd.DataFrame.spatial.from_featureclass(os.path.join(gdb_path,'Active_Mines_and_Energy_Producers_2021__NBMG_'))

In [14]:
historic_mines_sdf = pd.DataFrame.spatial.from_featureclass(os.path.join(gdb_path,'Historic_Production__NDOM_'))

The first water dataset, called simply Assessment 2022, contains assessments of the water quality of the waterbodies that were measured (Most, 2022). The waterbodies are identified by code, and the assessments are split up by use. It will be merged with the second dataset, BWQP Assessed Sample Sites 2022, which is previewed below. That dataset contains the codes, names, and locations of the sites the samples were collected from, as well as administrative, geologic, and hydrographic information (Most, 2022).

In [15]:
water_locations_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Organizati,StationID,StationCod,StationNam,WaterbodyC,WaterbodyN,AlternateC,LocationDe,...,GlobalID,Waterbody_Code,Attainment,Cause,ATTAINMENT_and_CAUSE,Waterbody_Code_1,Attainment_1,Cause_1,ATTAINMENT_and_CAUSE_1,SHAPE
0,1,672,NDEP,19,GC1,Golconda Canyon Creek @ House,UNK,Unspecified,,,...,{5BFC00DB-4486-4B64-8439-B0B442D9E476},,,,,,,,,"{""x"": 460877.7647000002, ""y"": 4529717.3892, ""s..."
1,2,673,NDEP,117,RPRA,Rye Patch Reservoir near Dam - Surface,NV04-HR-81_00,Rye Patch Reservoir,,The entire reservoir,...,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Aquatic Life, Not Supporting","Phosphorus Total SA Apr to Nov AQL, Selenium ...","Aquatic Life, Not Supporting, Phosphorus Total...",NV04-HR-81_00,"Aquatic Life, Not Supporting","Phosphorus Total SA Apr to Nov AQL, Selenium ...","Aquatic Life, Not Supporting, Phosphorus Total...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
2,3,674,NDEP,42,SB6,Steamboat Ditch @ Rhodes Road,NV06-SC-73_00,Steamboat Ditch,,STEAMBOAT CREEK LONG TERM MONITORING SITE. SI...,...,{F3FA89FC-E687-471C-BE6E-747FA0EF82CC},,,,,,,,,"{""x"": 263659.5251000002, ""y"": 4362257.7042, ""s..."
3,4,675,NDEP,70,HS74,Thomas Creek East of 2 Drumlin-Like Hills,NV04-HR-173_00,Thomas Creek,,,...,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,"Aquatic Life, Fully Supporting",,"Aquatic Life, Fully Supporting,",NV04-HR-173_00,"Aquatic Life, Fully Supporting",,"Aquatic Life, Fully Supporting,","{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
4,5,676,NDEP,55,p-NVW04485-0196,Boone Creek Upper &[BIOP-0135],UNK,Unspecified,BIOP-0135,BIOASSESSMENT SAMPLING SITE SAMPLED 8/13/2013,...,{CAFFBCDF-532C-471C-9B37-C1475BB5A288},,,,,,,,,"{""x"": 501875.8312999997, ""y"": 4400162.4877, ""s..."


The first of the two mining datasets I used was the Active Mines and Energy Producers 2021 dataset, which can be found on ArcGIS Online (Muntean, 2021). This dataset was created by the Nevada Bureau of Mines and Geology. It contains names, locations, operators, commodities, and counties for mines that had activity in 2021.

In [16]:
active_mines_sdf.head()

Unnamed: 0,OBJECTID,TYPE,NAME,OPERATOR,COMMODITY,COUNTY,Y_U83N,X_U83E,SHAPE
0,1,M,Arturo Mine Project (open pit),"Nevada Gold Mines, LLC (joint venture: Nevada ...","Gold, silver",Eureka,4543001.0,548221.0,"{""x"": 548221, ""y"": 4543001, ""spatialReference""..."
1,2,M,Aurora Mine (reprocessing),"Klondex Aurora Mine, Inc.","Gold, silver",Esmeralda,4240220.0,334720.0,"{""x"": 334720, ""y"": 4240220, ""spatialReference""..."
2,3,M,Bald Mountain Mine North Operations Area (open...,"KG Mining (Bald Mountain), Inc.","Gold, silver",White Pine,4422221.0,624582.0,"{""x"": 624582, ""y"": 4422221, ""spatialReference""..."
3,4,M,Bald Mountain Mine South Operations Area (open...,"KG Mining (Bald Mountain), Inc.","Gold, silver",White Pine,4402001.0,626802.0,"{""x"": 626802, ""y"": 4402001, ""spatialReference""..."
4,5,M,Betze-Post (open pit),"Nevada Gold Mines, LLC (joint venture: Barrick...","Gold, silver",Eureka,4537038.0,551878.0,"{""x"": 551878, ""y"": 4537038, ""spatialReference""..."


The second mining dataset I downloaded was Historic Production, which can be found on ArcGIS Online under the name Production of Minerals in Nevada 1987-Present (Impatterson_NDOM, 2022). This dataset was created by the Nevada Division of Minerals. It contains the names, locations, commodities, and production in tons by year from 1987 to 2019.

In [17]:
historic_mines_sdf.head()

Unnamed: 0,OBJECTID,MshaNo,Opname,F1987,F1988,F1989,F1990,F1991,F1992,F1993,...,F2015,F2016,F2017,F2018,F2019,NAD83_N,NAD83_E,Commodity,Units,SHAPE
0,1,26-00411,Greystone Mine,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,145367.0,373815.0,221700.0,712200.0,222000.0,4457767.0,510921.0008,Barite,Tons,"{""x"": 510921.0007999996, ""y"": 4457767.4498, ""s..."
1,2,26-02239,Rossi Mine And Dunphy Mill,0.0,0.0,0.0,0.0,0.0,0.0,571917.0,...,193046.0,0.0,0.0,0.0,0.0,4547148.0,548535.1061,Barite,Tons,"{""x"": 548535.1061000004, ""y"": 4547147.5704, ""s..."
2,3,26-01152,Argenta Mine/Mill,0.0,0.0,0.0,150907.0,115631.0,75361.0,84112.0,...,0.0,10000.0,81000.0,105000.0,53749.5,4498523.0,522828.7986,Barite,Tons,"{""x"": 522828.79860000033, ""y"": 4498523.3342, ""..."
3,4,26-02730,Slaven Canyon Mine,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,38360.0,0.0,0.0,95260.0,211695.5,4480635.0,520901.1667,Barite,Tons,"{""x"": 520901.16669999994, ""y"": 4480635.4669, ""..."
4,5,26-02603,Big Ledge Mine/Osino Mill,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,4595988.0,663231.7094,Barite,Tons,"{""x"": 663231.7094, ""y"": 4595987.9604, ""spatial..."
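
The yearly production columns follow an F-plus-year naming pattern (F1987 through F2019), so they can be selected by name. The following is a minimal sketch, assuming each record is available as a plain dictionary shaped like the rows above. The `total_production` helper is hypothetical, and the sample record only fills in the 2015 to 2019 values visible in the first row of the preview.

```python
import re

# Matches the yearly production columns, e.g. F1987 ... F2019
YEAR_COLUMN = re.compile(r"^F(\d{4})$")

def total_production(record, first_year=1987, last_year=2019):
    """Sum the yearly production columns of one mine record (hypothetical helper)."""
    total = 0.0
    for column, value in record.items():
        match = YEAR_COLUMN.match(column)
        if match is None or value is None:
            continue  # skip non-year columns and missing values
        if first_year <= int(match.group(1)) <= last_year:
            total += value
    return total

# Values copied from the first row of the preview above (Greystone Mine);
# the years hidden behind "..." in the preview are left out on purpose.
greystone = {
    "Opname": "Greystone Mine",
    "Commodity": "Barite",
    "F2015": 145367.0,
    "F2016": 373815.0,
    "F2017": 221700.0,
    "F2018": 712200.0,
    "F2019": 222000.0,
}

print(total_production(greystone, first_year=2015))
```

Because the active mines dataset has no comparable columns, a total like this is only available for mines that appear in the historic dataset.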


These SDFs don't initially register as having spatial information when validated. I'm not sure why, but I figured it was worth demonstrating.

In [18]:
water_locations_sdf.spatial.validate()

False

In [19]:
historic_mines_sdf.spatial.validate()

False

In [20]:
active_mines_sdf.spatial.validate()

False

In [21]:
water_locations_fl = water_locations_sdf.spatial.to_featurelayer(f"Water_Locations_FL_{dt.now().strftime('%Y%m%d%H%M%S')}")

In [22]:
water_locations_fl

In [23]:
testmap = gis.map("Nevada")
testmap.add_layer(water_locations_fl)
testmap

MapView(layout=Layout(height='400px', width='100%'))

In [24]:
water_locations_sdf.spatial.validate()

True

Now the water locations SDF does register as having spatial information.

### Data Exploration

Now that I’d looked through the data on the web viewers of the various sources, I downloaded it and brought it into a notebook. This was the first major hurdle. I struggled a bit with the file system management in ArcGIS Online, and failed to organize things properly the first few times, but eventually I was able to set up the folders how I wanted, import the data, and unzip everything. I did my best to keep track of where everything was by making variables for the various paths. This became difficult and unwieldy as I kept changing the organization, but it’s something I cleaned up in the final product and something I’ll try to be better about in future work. Eventually, I elected to just upload it into My Content and import the data from there, sidestepping the file management issues going forward.

My initial approach did not involve a GeoDatabase, as I hoped I could just work directly with the data I’d imported by transforming it into spatial dataframes. I made decent progress exploring the data in this format, and I got to know what kind of issues I might encounter and what sort of cleaning I would have to do.

For the mines data, the most pressing issue was that many mines that were in the historical dataset also appeared in the active mines dataset, but they appeared under different names and with slightly different locations. In order to have a single layer of mine information, these datasets would need to be combed through for duplicates. Additionally, there were discrepancies between the two datasets in terms of how names were recorded and what ID information was available. The active mines listed the name and operator of the mines separately, while the historic mines dataset included just a single variable “Opname”. The historic mines dataset included valuable yield data from 1987 to 2019 in tons, but the active mines dataset had no yield data at all. There was no apparent key to merge the data, so dealing with duplicates and non-standard data collection would be a problem.
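
One way to start combing for duplicates without a shared key is to pair records that sit close together and then compare their names by eye. The sketch below is not what this notebook does. It assumes both datasets store coordinates in the same projected system with units of meters, uses an arbitrary 1000 m search radius, and runs on made-up records. The function names are hypothetical.

```python
import math
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Case-insensitive similarity between two mine names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def candidate_duplicates(active, historic, max_distance_m=1000.0):
    """Pair up records that lie within max_distance_m of each other.

    Each record is a (name, easting, northing) tuple. Returns a list of
    (active_name, historic_name, distance_m, similarity) tuples, nearest
    pairs first, meant for manual review rather than automatic merging.
    """
    pairs = []
    for a_name, a_x, a_y in active:
        for h_name, h_x, h_y in historic:
            distance = math.hypot(a_x - h_x, a_y - h_y)
            if distance <= max_distance_m:
                pairs.append((a_name, h_name, distance, name_similarity(a_name, h_name)))
    return sorted(pairs, key=lambda pair: pair[2])

# Made-up records for illustration only; they are not rows from either dataset.
active = [
    ("Example Mine (open pit)", 500000.0, 4500000.0),
    ("Other Quarry", 600000.0, 4400000.0),
]
historic = [
    ("Example Mine/Mill", 500300.0, 4500400.0),
    ("Faraway Mill", 300000.0, 4200000.0),
]

for pair in candidate_duplicates(active, historic):
    print(pair)
```

The similarity score is reported rather than used as a filter, since the same mine can be recorded under quite different names in the two datasets.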

For the water data, the first and most obvious issue was that the water quality data did not come with location data. Instead, the waterbodies were assigned codes, and these codes were also present in the sample sites dataset. There, they were listed with location data, as well as names and other important information. When I dug into the data more, I realized that the sample sites data was pretty messy and, in addition to many variables with poorly organized data, included more than one location for many of the waterbody codes. This was of concern, because I was hoping to have one data point per waterbody to allow me to do better analysis, and I didn’t know how to fix the multiple locations problem.
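
The multiple locations problem can be made concrete by counting distinct coordinates per waterbody code. This is a minimal sketch, assuming the sample sites are available as (code, x, y) tuples. The helper names and the sample codes are hypothetical, and skipping the UNK code is a choice based on the unspecified waterbodies visible in the preview.

```python
from collections import defaultdict

def sites_per_waterbody(sites):
    """Group distinct (x, y) sample locations by waterbody code."""
    grouped = defaultdict(set)
    for code, x, y in sites:
        grouped[code].add((x, y))
    return grouped

def multi_location_codes(sites, ignore=("UNK",)):
    """Return {code: number of distinct locations} for codes with more than one."""
    grouped = sites_per_waterbody(sites)
    return {
        code: len(points)
        for code, points in grouped.items()
        if len(points) > 1 and code not in ignore
    }

# Made-up sample sites for illustration only.
sample_sites = [
    ("WB-A", 100.0, 200.0),
    ("WB-A", 150.0, 250.0),
    ("WB-B", 300.0, 400.0),
    ("UNK", 500.0, 600.0),
    ("UNK", 700.0, 800.0),
]

print(multi_location_codes(sample_sites))
```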

Additional problems presented themselves as well. The assessment dataset reports if a waterbody fully supports or does not support a particular use. If it doesn’t support the use, a cause is recorded, usually some kind of contaminant(s). However, not every waterbody has records for all uses. Also, some assessments are recorded as not measured or insufficient information.
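
To tally how many records fall into each status, the use and the status first have to be separated. The sketch below assumes each Attainment value holds a single use followed by a comma and a status, as in the two values visible in the preview ("Aquatic Life, Not Supporting" and "Aquatic Life, Fully Supporting"). Values that list several uses, or that use other wording, would need extra handling. The `split_attainment` helper is hypothetical.

```python
from collections import Counter

def split_attainment(value):
    """Split 'Use, Status' into a (use, status) pair.

    Missing or blank values give (None, None); a value with no comma is
    returned as (value, None) because the status cannot be identified.
    """
    if value is None or not value.strip():
        return (None, None)
    use, comma, status = value.rpartition(",")
    if not comma:
        return (value.strip(), None)
    return (use.strip(), status.strip())

# The two non-empty values come from the preview; None and "" stand in for
# sites that have no assessment record.
values = ["Aquatic Life, Not Supporting", "Aquatic Life, Fully Supporting", None, ""]

status_counts = Counter(split_attainment(v)[1] for v in values)
print(status_counts)
```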

### Data Quality

I realized I didn’t understand what the water datasets were conveying, or how they were collected, well enough to make an informed decision about the multiple locations problem, so I reached out to the department that put together the report the datasets were created for. My contact at the Nevada Department of Conservation and Natural Resources put me in touch with Dave Simpson, an environmental scientist and supervisor at the Bureau of Water Quality Planning. When I called Mr. Simpson, he helped me understand that each water quality assessment was based on all the data collected from the waterbody. Therefore, I could merge the datasets and attach the same set of water quality assessments to every sample site on a waterbody without misrepresenting the data. Additionally, he explained some of the discrepancies in what data was reported: if the agency didn’t have enough data for a certain waterbody, they would carry forward assessments made in previous years with slightly different collection approaches, or would get information from a different source.

## Data Preparation

### Integrate data

Early versions of this project saw me selecting, cleaning, and constructing new variables for the water quality data before I merged it with the sampling locations dataset. In this final version, the merge is done with ArcPy and exported as a feature class in a GeoDatabase.

In [25]:
water_locations_sdf.columns

Index(['OBJECTID_1', 'OBJECTID', 'Organizati', 'StationID', 'StationCod',
       'StationNam', 'WaterbodyC', 'WaterbodyN', 'AlternateC', 'LocationDe',
       'Latitude', 'Longitude', 'PLSSTown', 'PLSSRange', 'PLSSSectio', 'PLSSQ',
       'PLSSQQ', 'FieldOffic', 'CountyName', 'Basin', 'Level4Ecor',
       'HUC12Code', 'Designated', 'CatchmentA', 'OwnershipN', 'Elevation',
       'FirstMonDa', 'ReferenceC', 'PrimaryLan', 'SecondaryL', 'PrimeGeolo',
       'PrimeSoilT', 'Channelize', 'Dam', 'StreamOrig', 'StreamSubs',
       'Comment_', 'ResultsSum', 'ContinousD', 'HasProfile', 'WQXExport',
       'WQXMonLocC', 'CreatedBy', 'CreatedDat', 'ModifiedBy', 'ModifiedDa',
       'Reporting_', 'Station_Co', 'Station_Na', 'Organiza_1', 'Hydrograph',
       'Waterbody_', 'Waterbod_1', 'Descriptio', 'NAC', 'GlobalID',
       'Waterbody_Code', 'Attainment', 'Cause', 'ATTAINMENT_and_CAUSE',
       'Waterbody_Code_1', 'Attainment_1', 'Cause_1', 'ATTAINMENT_and_CAUSE_1',
       'SHAPE'],
      dtype='object')

In [26]:
arcpy.management.JoinField("Assessed_Sample_Sites_2022", "WaterbodyC", "WaterQuality", "Waterbody_Code", "Waterbody_Code;Attainment;Cause;ATTAINMENT_and_CAUSE", "NOT_USE_FM", None)
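
Conceptually, this join is a lookup from waterbody code to assessment: every sample site keeps its own row and picks up the assessment fields for its code, and sites with no matching code are left empty. The sketch below illustrates that idea in plain Python and is not how JoinField is implemented. Keeping only the first assessment per code is an assumption of the sketch, and the `join_assessments` helper is hypothetical. The sample values are taken from the previews above.

```python
def join_assessments(sites, assessments, site_key="WaterbodyC",
                     join_key="Waterbody_Code", fields=("Attainment",)):
    """Attach assessment fields to each site by matching waterbody codes."""
    lookup = {}
    for row in assessments:
        # Assumption: keep the first assessment seen for each code
        lookup.setdefault(row[join_key], row)

    joined = []
    for site in sites:
        match = lookup.get(site[site_key])
        merged = dict(site)  # copy so the input rows are not modified
        for field in fields:
            merged[field] = match[field] if match else None
        joined.append(merged)
    return joined

sites = [
    {"StationCod": "GC1", "WaterbodyC": "UNK"},
    {"StationCod": "RPRA", "WaterbodyC": "NV04-HR-81_00"},
    {"StationCod": "HS74", "WaterbodyC": "NV04-HR-173_00"},
]
assessments = [
    {"Waterbody_Code": "NV04-HR-81_00", "Attainment": "Aquatic Life, Not Supporting"},
    {"Waterbody_Code": "NV04-HR-173_00", "Attainment": "Aquatic Life, Fully Supporting"},
]

for row in join_assessments(sites, assessments):
    print(row["StationCod"], row["Attainment"])
```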

In [27]:
arcpy.ListFeatureClasses()

['Assessed_Sample_Sites_2022',
 'Historic_Production__NDOM_',
 'Active_Mines_and_Energy_Producers_2021__NBMG_',
 'water_quality']

In [None]:
try:
    arcpy.conversion.ExportFeatures("Assessed_Sample_Sites_2022", "water_quality", '', "USE_ALIAS", 'OBJECTID "OBJECTID" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,OBJECTID,-1,-1;Organizati "Organizati" true true false 10 Text 0 0,First,#,Assessed_Sample_Sites_2022,Organizati,0,10;StationID "StationID" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,StationID,-1,-1;StationCod "StationCod" true true false 20 Text 0 0,First,#,Assessed_Sample_Sites_2022,StationCod,0,20;StationNam "StationNam" true true false 98 Text 0 0,First,#,Assessed_Sample_Sites_2022,StationNam,0,98;WaterbodyC "WaterbodyC" true true false 15 Text 0 0,First,#,Assessed_Sample_Sites_2022,WaterbodyC,0,15;WaterbodyN "WaterbodyN" true true false 103 Text 0 0,First,#,Assessed_Sample_Sites_2022,WaterbodyN,0,103;AlternateC "AlternateC" true true false 20 Text 0 0,First,#,Assessed_Sample_Sites_2022,AlternateC,0,20;LocationDe "LocationDe" true true false 254 Text 0 0,First,#,Assessed_Sample_Sites_2022,LocationDe,0,254;Latitude "Latitude" true true false 8 Double 0 0,First,#,Assessed_Sample_Sites_2022,Latitude,-1,-1;Longitude "Longitude" true true false 8 Double 0 0,First,#,Assessed_Sample_Sites_2022,Longitude,-1,-1;PLSSTown "PLSSTown" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,PLSSTown,-1,-1;PLSSRange "PLSSRange" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,PLSSRange,-1,-1;PLSSSectio "PLSSSectio" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,PLSSSectio,-1,-1;PLSSQ "PLSSQ" true true false 2 Text 0 0,First,#,Assessed_Sample_Sites_2022,PLSSQ,0,2;PLSSQQ "PLSSQQ" true true false 2 Text 0 0,First,#,Assessed_Sample_Sites_2022,PLSSQQ,0,2;FieldOffic "FieldOffic" true true false 11 Text 0 0,First,#,Assessed_Sample_Sites_2022,FieldOffic,0,11;CountyName "CountyName" true true false 13 Text 0 0,First,#,Assessed_Sample_Sites_2022,CountyName,0,13;Basin "Basin" true true false 24 Text 0 
0,First,#,Assessed_Sample_Sites_2022,Basin,0,24;Level4Ecor "Level4Ecor" true true false 49 Text 0 0,First,#,Assessed_Sample_Sites_2022,Level4Ecor,0,49;HUC12Code "HUC12Code" true true false 8 Double 0 0,First,#,Assessed_Sample_Sites_2022,HUC12Code,-1,-1;Designated "Designated" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,Designated,0,1;CatchmentA "CatchmentA" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,CatchmentA,0,1;OwnershipN "OwnershipN" true true false 25 Text 0 0,First,#,Assessed_Sample_Sites_2022,OwnershipN,0,25;Elevation "Elevation" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,Elevation,-1,-1;FirstMonDa "FirstMonDa" true true false 8 Date 0 0,First,#,Assessed_Sample_Sites_2022,FirstMonDa,-1,-1;ReferenceC "ReferenceC" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,ReferenceC,0,1;PrimaryLan "PrimaryLan" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,PrimaryLan,0,1;SecondaryL "SecondaryL" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,SecondaryL,0,1;PrimeGeolo "PrimeGeolo" true true false 65 Text 0 0,First,#,Assessed_Sample_Sites_2022,PrimeGeolo,0,65;PrimeSoilT "PrimeSoilT" true true false 50 Text 0 0,First,#,Assessed_Sample_Sites_2022,PrimeSoilT,0,50;Channelize "Channelize" true true false 5 Text 0 0,First,#,Assessed_Sample_Sites_2022,Channelize,0,5;Dam "Dam" true true false 5 Text 0 0,First,#,Assessed_Sample_Sites_2022,Dam,0,5;StreamOrig "StreamOrig" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,StreamOrig,0,1;StreamSubs "StreamSubs" true true false 1 Text 0 0,First,#,Assessed_Sample_Sites_2022,StreamSubs,0,1;Comment_ "Comment_" true true false 92 Text 0 0,First,#,Assessed_Sample_Sites_2022,Comment_,0,92;ResultsSum "ResultsSum" true true false 100 Text 0 0,First,#,Assessed_Sample_Sites_2022,ResultsSum,0,100;ContinousD "ContinousD" true true false 4 Long 0 0,First,#,Assessed_Sample_Sites_2022,ContinousD,-1,-1;HasProfile "HasProfile" true true false 
3 Text 0 0,First,#,Assessed_Sample_Sites_2022,HasProfile,0,3;WQXExport "WQXExport" true true false 5 Text 0 0,First,#,Assessed_Sample_Sites_2022,WQXExport,0,5;WQXMonLocC "WQXMonLocC" true true false 15 Text 0 0,First,#,Assessed_Sample_Sites_2022,WQXMonLocC,0,15;CreatedBy "CreatedBy" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,CreatedBy,-1,-1;CreatedDat "CreatedDat" true true false 8 Date 0 0,First,#,Assessed_Sample_Sites_2022,CreatedDat,-1,-1;ModifiedBy "ModifiedBy" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,ModifiedBy,-1,-1;ModifiedDa "ModifiedDa" true true false 8 Date 0 0,First,#,Assessed_Sample_Sites_2022,ModifiedDa,-1,-1;Reporting_ "Reporting_" true true false 13 Text 0 0,First,#,Assessed_Sample_Sites_2022,Reporting_,0,13;Station_Co "Station_Co" true true false 20 Text 0 0,First,#,Assessed_Sample_Sites_2022,Station_Co,0,20;Station_Na "Station_Na" true true false 98 Text 0 0,First,#,Assessed_Sample_Sites_2022,Station_Na,0,98;Organiza_1 "Organiza_1" true true false 43 Text 0 0,First,#,Assessed_Sample_Sites_2022,Organiza_1,0,43;Hydrograph "Hydrograph" true true false 24 Text 0 0,First,#,Assessed_Sample_Sites_2022,Hydrograph,0,24;Waterbody_ "Waterbody_" true true false 15 Text 0 0,First,#,Assessed_Sample_Sites_2022,Waterbody_,0,15;Waterbod_1 "Waterbod_1" true true false 103 Text 0 0,First,#,Assessed_Sample_Sites_2022,Waterbod_1,0,103;Descriptio "Descriptio" true true false 229 Text 0 0,First,#,Assessed_Sample_Sites_2022,Descriptio,0,229;NAC "NAC" true true false 2 Short 0 0,First,#,Assessed_Sample_Sites_2022,NAC,-1,-1;GlobalID "GlobalID" true true false 38 Text 0 0,First,#,Assessed_Sample_Sites_2022,GlobalID,0,38;Waterbody_Code "Waterbody_Code" true true false 8000 Text 0 0,First,#,Assessed_Sample_Sites_2022,Waterbody_Code,0,8000;Attainment "Attainment" true true false 8000 Text 0 0,First,#,Assessed_Sample_Sites_2022,Attainment,0,8000;Cause "Cause" true true false 8000 Text 0 
0,First,#,Assessed_Sample_Sites_2022,Cause,0,8000;ATTAINMENT_and_CAUSE "ATTAINMENT_and_CAUSE" true true false 8000 Text 0 0,First,#,Assessed_Sample_Sites_2022,ATTAINMENT_and_CAUSE,0,8000', None)
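except arcpy.ExecuteError:
    # The export fails if 'water_quality' is already in the geodatabase
    print("Export failed (the feature class may already exist):")
    print(arcpy.GetMessages(2))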
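
The long field-map string in the cell above repeats one pattern per field, so it could be assembled from a short list of field definitions instead of being written out by hand. The sketch below only reproduces the string format as it appears in that cell (text fields end in 0 and their length, all other types end in -1,-1). The helper names are hypothetical, and I have not run the resulting string through ExportFeatures.

```python
def field_map_entry(source, name, length, field_type):
    """Build one field-map entry in the format used in the cell above."""
    # Text fields carry a character range; other types use -1,-1
    span = f"0,{length}" if field_type == "Text" else "-1,-1"
    return (
        f'{name} "{name}" true true false {length} {field_type} 0 0,'
        f"First,#,{source},{name},{span}"
    )

def field_map_string(source, fields):
    """Join entries for (name, length, type) tuples with semicolons."""
    return ";".join(field_map_entry(source, *field) for field in fields)

# The first few fields from the cell above, as (name, length, type)
fields = [
    ("OBJECTID", 2, "Short"),
    ("Organizati", 10, "Text"),
    ("StationID", 2, "Short"),
]

print(field_map_string("Assessed_Sample_Sites_2022", fields))
```

Leaving a field out of the list drops it from the exported feature class, which is how the duplicated join columns are excluded here.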

In [29]:
arcpy.ListFeatureClasses()

['Assessed_Sample_Sites_2022',
 'Historic_Production__NDOM_',
 'Active_Mines_and_Energy_Producers_2021__NBMG_',
 'water_quality']

In [30]:
water_quality_sdf = pd.DataFrame.spatial.from_featureclass('water_quality')

In [31]:
water_quality_sdf.columns

Index(['OBJECTID_1', 'OBJECTID', 'Organizati', 'StationID', 'StationCod',
       'StationNam', 'WaterbodyC', 'WaterbodyN', 'AlternateC', 'LocationDe',
       'Latitude', 'Longitude', 'PLSSTown', 'PLSSRange', 'PLSSSectio', 'PLSSQ',
       'PLSSQQ', 'FieldOffic', 'CountyName', 'Basin', 'Level4Ecor',
       'HUC12Code', 'Designated', 'CatchmentA', 'OwnershipN', 'Elevation',
       'FirstMonDa', 'ReferenceC', 'PrimaryLan', 'SecondaryL', 'PrimeGeolo',
       'PrimeSoilT', 'Channelize', 'Dam', 'StreamOrig', 'StreamSubs',
       'Comment_', 'ResultsSum', 'ContinousD', 'HasProfile', 'WQXExport',
       'WQXMonLocC', 'CreatedBy', 'CreatedDat', 'ModifiedBy', 'ModifiedDa',
       'Reporting_', 'Station_Co', 'Station_Na', 'Organiza_1', 'Hydrograph',
       'Waterbody_', 'Waterbod_1', 'Descriptio', 'NAC', 'GlobalID',
       'Waterbody_Code', 'Attainment', 'Cause', 'ATTAINMENT_and_CAUSE',
       'SHAPE'],
      dtype='object')

In [32]:
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Organizati,StationID,StationCod,StationNam,WaterbodyC,WaterbodyN,AlternateC,LocationDe,...,Waterbody_,Waterbod_1,Descriptio,NAC,GlobalID,Waterbody_Code,Attainment,Cause,ATTAINMENT_and_CAUSE,SHAPE
0,1,672,NDEP,19,GC1,Golconda Canyon Creek @ House,UNK,Unspecified,,,...,UNK,Unspecified,Waterbody Not Specified,0,{5BFC00DB-4486-4B64-8439-B0B442D9E476},,,,,"{""x"": 460877.7647000002, ""y"": 4529717.3892, ""s..."
1,2,673,NDEP,117,RPRA,Rye Patch Reservoir near Dam - Surface,NV04-HR-81_00,Rye Patch Reservoir,,The entire reservoir,...,NV04-HR-81_00,Rye Patch Reservoir,The entire reservoir,1448,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Aquatic Life, Not Supporting","Phosphorus Total SA Apr to Nov AQL, Selenium ...","Aquatic Life, Not Supporting, Phosphorus Total...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
2,3,674,NDEP,42,SB6,Steamboat Ditch @ Rhodes Road,NV06-SC-73_00,Steamboat Ditch,,STEAMBOAT CREEK LONG TERM MONITORING SITE. SI...,...,NV06-SC-73_00,Steamboat Ditch,Its entire length,0,{F3FA89FC-E687-471C-BE6E-747FA0EF82CC},,,,,"{""x"": 263659.5251000002, ""y"": 4362257.7042, ""s..."
3,4,675,NDEP,70,HS74,Thomas Creek East of 2 Drumlin-Like Hills,NV04-HR-173_00,Thomas Creek,,,...,NV04-HR-173_00,Thomas Creek,From its origin to Sec 19 T35N R38E,1446,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,"Aquatic Life, Fully Supporting",,"Aquatic Life, Fully Supporting,","{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
4,5,676,NDEP,55,p-NVW04485-0196,Boone Creek Upper &[BIOP-0135],UNK,Unspecified,BIOP-0135,BIOASSESSMENT SAMPLING SITE SAMPLED 8/13/2013,...,UNK,Unspecified,Waterbody Not Specified,0,{CAFFBCDF-532C-471C-9B37-C1475BB5A288},,,,,"{""x"": 501875.8312999997, ""y"": 4400162.4877, ""s..."


In [33]:
water_quality_map = water_quality_sdf.spatial.plot()
water_quality_map

MapView(layout=Layout(height='400px', width='100%'))

### Select Data

After exploring the data and generating the spatial dataframes, I realized I would need to pare down the datasets somewhat. Many of the variables in the sampling locations dataset duplicated other columns (Organizati, WaterbodyC, Latitude, Longitude, WQXMonLocC, Station_Co, Station_Na, Hydrograph, Waterbody_, Waterbod_1, Descriptio), were opaque and not defined in the metadata (PLSSTown, PLSSRange, PLSSSectio, FirstMonDa, WQXExport), were missing values for much of the dataset (AlternateC, LocationDe, PLSSQ, PLSSQQ, FieldOffic, Designated, CatchmentA, ReferenceC, PrimaryLan, SecondaryL, Channelize, Dam, StreamOrig, StreamSubs, Comment_, ContinousD, HasProfile, CreatedBy, CreatedDat, ModifiedBy, ModifiedDa), or were otherwise not useful for this analysis (StationID, StationCod, StationNam, Reporting_, NAC). I removed these from the dataset.

In [34]:
water_quality_sdf = water_quality_sdf.drop(columns=['Organizati', 'StationID', 'StationCod',
                                                     'StationNam', 'WaterbodyC', 'AlternateC', 'LocationDe',
                                                     'Latitude', 'Longitude', 'PLSSTown', 'PLSSRange', 'PLSSSectio', 'PLSSQ',
                                                     'PLSSQQ', 'FieldOffic', 'Designated', 'CatchmentA',
                                                     'FirstMonDa', 'ReferenceC', 'PrimaryLan', 'SecondaryL',
                                                     'Channelize', 'Dam', 'StreamOrig', 'StreamSubs',
                                                     'Comment_', 'ContinousD', 'HasProfile', 'WQXExport',
                                                     'WQXMonLocC', 'CreatedBy', 'CreatedDat', 'ModifiedBy', 'ModifiedDa',
                                                     'Reporting_', 'Station_Co', 'Station_Na', 'Hydrograph',
                                                     'Waterbody_', 'Waterbod_1', 'Descriptio', 'NAC'])
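
A quick way to back up the "missing values for much of the dataset" judgment is to compute the share of empty values per column. This is a minimal sketch, assuming the rows are available as plain dictionaries and treating both None and blank strings as missing. The `missing_share` helper is hypothetical, and the five sample rows use the AlternateC and CountyName values visible in the previews.

```python
def missing_share(rows, columns):
    """Fraction of rows in which each column is None or a blank string."""
    if not rows:
        return {}
    shares = {}
    for column in columns:
        missing = sum(
            1 for row in rows
            if row.get(column) is None or str(row[column]).strip() == ""
        )
        shares[column] = missing / len(rows)
    return shares

# First five rows of the preview: AlternateC is empty in four of them
rows = [
    {"AlternateC": None, "CountyName": "Pershing"},
    {"AlternateC": None, "CountyName": "Pershing"},
    {"AlternateC": None, "CountyName": "Washoe"},
    {"AlternateC": None, "CountyName": "Humboldt"},
    {"AlternateC": "BIOP-0135", "CountyName": "Lander"},
]

print(missing_share(rows, ["AlternateC", "CountyName"]))
```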

In [35]:
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,Attainment,Cause,ATTAINMENT_and_CAUSE,SHAPE
0,1,672,Unspecified,Pershing,Central Region,Upper Lahontan Basin,160401000000.0,Bureau of Land Management,0,"Miocene to Quaternary, basalt",Whirlo-Oxcorel-Beoska (s5629),"Bacteria:3, General:56, :12, Field:4, Metal:70",Nevada Division of Environmental Protection,{5BFC00DB-4486-4B64-8439-B0B442D9E476},,,,,"{""x"": 460877.7647000002, ""y"": 4529717.3892, ""s..."
1,2,673,Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Aquatic Life, Not Supporting","Phosphorus Total SA Apr to Nov AQL, Selenium ...","Aquatic Life, Not Supporting, Phosphorus Total...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
2,3,674,Steamboat Ditch,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4606,"Quaternary, alluvium",Xman-Old Camp-Mizel (s5423),"Bacteria:89, General:1044, :267, Field:103, M...",Nevada Division of Environmental Protection,{F3FA89FC-E687-471C-BE6E-747FA0EF82CC},,,,,"{""x"": 263659.5251000002, ""y"": 4362257.7042, ""s..."
3,4,675,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,"Aquatic Life, Fully Supporting",,"Aquatic Life, Fully Supporting,","{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
4,5,676,Unspecified,Lander,Humboldt River Basin,Central Nevada Mid-Slope Woodland and Brushland,160401000000.0,Bureau of Land Management,0,"Middle Cambrian to Late Cambrian, shale",Xine-Hymas-Hapgood-Halacan-Attella (s5778),"Bacteria:4, General:57, :13, Field:5, Metal:56",Nevada Division of Environmental Protection,{CAFFBCDF-532C-471C-9B37-C1475BB5A288},,,,,"{""x"": 501875.8312999997, ""y"": 4400162.4877, ""s..."


### Data Cleaning, Construction, and Formatting

In addition to removing superfluous data, I tidied the water quality portion of the water dataset with Python in the notebook. Initially, the “Attainment” column held strings of the form “Attainment Category, Assessment”. I created two new columns, one containing just the attainment category and the other containing the water quality assessment for that category.

In [36]:
# Split each "Attainment" entry into its category and its assessment.
attainment_category = []
assessment = []
for cell in water_quality_sdf["Attainment"]:
    if cell == 'None':
        # Sites without an assessment keep the placeholder string in both columns.
        attainment_category.append('None')
        assessment.append('None')
    else:
        splitcell = cell.split(", ")
        attainment_category.append(splitcell[0])
        assessment.append(splitcell[1])
water_quality_sdf.insert(loc=2, column='Attainment Category', value=attainment_category)
water_quality_sdf.insert(loc=3, column='Assessment', value=assessment)
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Assessment,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,...,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,Attainment,Cause,ATTAINMENT_and_CAUSE,SHAPE
0,1,672,,,Unspecified,Pershing,Central Region,Upper Lahontan Basin,160401000000.0,Bureau of Land Management,...,"Miocene to Quaternary, basalt",Whirlo-Oxcorel-Beoska (s5629),"Bacteria:3, General:56, :12, Field:4, Metal:70",Nevada Division of Environmental Protection,{5BFC00DB-4486-4B64-8439-B0B442D9E476},,,,,"{""x"": 460877.7647000002, ""y"": 4529717.3892, ""s..."
1,2,673,Aquatic Life,Not Supporting,Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,...,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Aquatic Life, Not Supporting","Phosphorus Total SA Apr to Nov AQL, Selenium ...","Aquatic Life, Not Supporting, Phosphorus Total...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
2,3,674,,,Steamboat Ditch,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,...,"Quaternary, alluvium",Xman-Old Camp-Mizel (s5423),"Bacteria:89, General:1044, :267, Field:103, M...",Nevada Division of Environmental Protection,{F3FA89FC-E687-471C-BE6E-747FA0EF82CC},,,,,"{""x"": 263659.5251000002, ""y"": 4362257.7042, ""s..."
3,4,675,Aquatic Life,Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,...,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,"Aquatic Life, Fully Supporting",,"Aquatic Life, Fully Supporting,","{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
4,5,676,,,Unspecified,Lander,Humboldt River Basin,Central Nevada Mid-Slope Woodland and Brushland,160401000000.0,Bureau of Land Management,...,"Middle Cambrian to Late Cambrian, shale",Xine-Hymas-Hapgood-Halacan-Attella (s5778),"Bacteria:4, General:57, :13, Field:5, Metal:56",Nevada Division of Environmental Protection,{CAFFBCDF-532C-471C-9B37-C1475BB5A288},,,,,"{""x"": 501875.8312999997, ""y"": 4400162.4877, ""s..."
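
The same split can also be sketched with pandas string methods instead of an explicit loop. This is an illustrative alternative on a toy table, not the code that produced the output above. The helper name `split_attainment` is hypothetical, and the sketch assumes every entry other than the placeholder string "None" has the form "Category, Assessment".

```python
import pandas as pd

def split_attainment(df):
    """Return a copy with 'Attainment Category' and 'Assessment' columns added."""
    out = df.copy()
    # Split on the first ", " only, so each entry yields at most two pieces.
    parts = out["Attainment"].str.split(", ", n=1, expand=True)
    # Rows holding the placeholder string "None" have no second piece.
    is_none = out["Attainment"] == "None"
    out.insert(loc=2, column="Attainment Category",
               value=parts[0].where(~is_none, "None"))
    out.insert(loc=3, column="Assessment",
               value=parts[1].where(~is_none, "None"))
    return out

# Toy rows modeled on the table above (values abbreviated).
toy = pd.DataFrame({
    "OBJECTID_1": [1, 2, 4],
    "OBJECTID": [672, 673, 675],
    "Attainment": ["None",
                   "Aquatic Life, Not Supporting",
                   "Aquatic Life, Fully Supporting"],
})
result = split_attainment(toy)
```

Limiting the split to the first separator means an assessment that itself contained ", " would stay in one piece, which is a design choice of this sketch rather than something the source data is known to need.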


Then, I removed the old “Attainment” column and the “ATTAINMENT_and_CAUSE” column.

In [37]:
water_quality_sdf = water_quality_sdf.drop(columns=['Attainment', 'ATTAINMENT_and_CAUSE']) #remove duplicate data
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Assessment,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,Cause,SHAPE
0,1,672,,,Unspecified,Pershing,Central Region,Upper Lahontan Basin,160401000000.0,Bureau of Land Management,0,"Miocene to Quaternary, basalt",Whirlo-Oxcorel-Beoska (s5629),"Bacteria:3, General:56, :12, Field:4, Metal:70",Nevada Division of Environmental Protection,{5BFC00DB-4486-4B64-8439-B0B442D9E476},,,"{""x"": 460877.7647000002, ""y"": 4529717.3892, ""s..."
1,2,673,Aquatic Life,Not Supporting,Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Phosphorus Total SA Apr to Nov AQL, Selenium ...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
2,3,674,,,Steamboat Ditch,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4606,"Quaternary, alluvium",Xman-Old Camp-Mizel (s5423),"Bacteria:89, General:1044, :267, Field:103, M...",Nevada Division of Environmental Protection,{F3FA89FC-E687-471C-BE6E-747FA0EF82CC},,,"{""x"": 263659.5251000002, ""y"": 4362257.7042, ""s..."
3,4,675,Aquatic Life,Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,,"{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
4,5,676,,,Unspecified,Lander,Humboldt River Basin,Central Nevada Mid-Slope Woodland and Brushland,160401000000.0,Bureau of Land Management,0,"Middle Cambrian to Late Cambrian, shale",Xine-Hymas-Hapgood-Halacan-Attella (s5778),"Bacteria:4, General:57, :13, Field:5, Metal:56",Nevada Division of Environmental Protection,{CAFFBCDF-532C-471C-9B37-C1475BB5A288},,,"{""x"": 501875.8312999997, ""y"": 4400162.4877, ""s..."


Because my goal was to eventually run some sort of data analysis, turning the categorical variable "Assessment" into a numerical value would be helpful. The given data contains four possible assessments: Fully Supporting, Not Supporting, Insufficient Information, and Not Assessed. Additionally, after the merge, there were “None” values for observations that did not have water quality assessments. Because Insufficient Information and Not Assessed don't provide useful information for my purposes, I removed those from the dataset, along with the “None” observations.

In [38]:
# Collect the index labels of rows whose assessment carries no usable information.
uninformative = ["Not Assessed", "Insufficient Information", "None"]
drop_list = []
for rnum in range(len(water_quality_sdf)):
    if water_quality_sdf.iloc[rnum]["Assessment"] in uninformative:
        # drop() works on index labels, so store the label rather than the position.
        drop_list.append(water_quality_sdf.index[rnum])
water_quality_sdf = water_quality_sdf.drop(drop_list)
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Assessment,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,Cause,SHAPE
1,2,673,Aquatic Life,Not Supporting,Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Phosphorus Total SA Apr to Nov AQL, Selenium ...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
3,4,675,Aquatic Life,Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,,"{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
5,6,677,Aquatic Life,Not Supporting,Lamoille Creek at the Humboldt River,Elko,Humboldt River Basin,Upper Humboldt Plains,160401000000.0,Private,5648,"Quaternary, alluvium",Welsum-Upville-Halleck-Crooked Creek (s5491),"Daily Continuous Digest:228, Bacteria:8, Gene...",Nevada Division of Environmental Protection,{32E8EE82-76F7-49F3-ACE0-CFA945A826FD},NV04-HR-15-B_00,Temperature SV AQL,"{""x"": 628431.2467, ""y"": 4513886.2609, ""spatial..."
6,7,678,Aquatic Life,Not Supporting,Steamboat Creek at the Truckee River,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4451,"Late Miocene to Middle Miocene, andesite",Voltaire-Vamp-Truckee-Fettic (s5405),"Bacteria:7, General:82, :21, Field:13, Metal:58",Nevada Division of Environmental Protection,{293AC1C2-2B35-4DB1-82AD-55966D76B7EB},NV06-SC-42-D_00,"Arsenic 1-hour AQL, Arsenic 96-hour AQL, Iron ...","{""x"": 264692.5533999996, ""y"": 4374140.4789, ""s..."
7,8,679,Aquatic Life,Not Supporting,"Carson River, East Fork at the West Fork",Douglas,Carson River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160502000000.0,Private,0,"Quaternary, alluvium",Voltaire-Cradlebaugh (s5719),"Bacteria:14, General:329, :44, Metal:87",Nevada Division of Environmental Protection,{3B873931-D200-4499-B8EA-A772B10A231E},NV08-CR-05_02,"Iron 96-hour, Silver 1-hour AQL, Total Phospho...","{""x"": 257918.74849999975, ""y"": 4316643.8409, ""..."
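
As a hedged alternative to the row-by-row loop, the same filter can be sketched with a boolean mask. The mask selects rows by value, so it does not depend on index labels matching row positions. The helper name `keep_informative` and the toy rows are illustrative assumptions.

```python
import pandas as pd

# Assessment values treated as uninformative in the text above.
UNINFORMATIVE = ["Not Assessed", "Insufficient Information", "None"]

def keep_informative(df):
    """Keep only rows whose Assessment is not in the uninformative list."""
    mask = df["Assessment"].isin(UNINFORMATIVE)
    return df[~mask]

# Toy rows; waterbody names are taken from the output above, assessments are illustrative.
toy = pd.DataFrame({
    "WaterbodyN": ["Unspecified", "Rye Patch Reservoir", "Steamboat Ditch", "Thomas Creek"],
    "Assessment": ["None", "Not Supporting", "Insufficient Information", "Fully Supporting"],
})
filtered = keep_informative(toy)
```

As in the notebook output, the surviving rows keep their original index labels (here 1 and 3).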


For the remaining observations, I assigned a numerical value, or score: 1 if the waterbody was fully supporting that attainment category, and 0 if it was not.

In [39]:
num_list = []
for rnum in range(len(water_quality_sdf)):
    if water_quality_sdf.iloc[rnum]["Assessment"] == "Fully Supporting":
        num_list.append(1)
    else:
        num_list.append(0)
water_quality_sdf.insert(loc=3, column='Score', value=num_list)
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Score,Assessment,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,Cause,SHAPE
1,2,673,Aquatic Life,0,Not Supporting,Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"Phosphorus Total SA Apr to Nov AQL, Selenium ...","{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
3,4,675,Aquatic Life,1,Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,,"{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
5,6,677,Aquatic Life,0,Not Supporting,Lamoille Creek at the Humboldt River,Elko,Humboldt River Basin,Upper Humboldt Plains,160401000000.0,Private,5648,"Quaternary, alluvium",Welsum-Upville-Halleck-Crooked Creek (s5491),"Daily Continuous Digest:228, Bacteria:8, Gene...",Nevada Division of Environmental Protection,{32E8EE82-76F7-49F3-ACE0-CFA945A826FD},NV04-HR-15-B_00,Temperature SV AQL,"{""x"": 628431.2467, ""y"": 4513886.2609, ""spatial..."
6,7,678,Aquatic Life,0,Not Supporting,Steamboat Creek at the Truckee River,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4451,"Late Miocene to Middle Miocene, andesite",Voltaire-Vamp-Truckee-Fettic (s5405),"Bacteria:7, General:82, :21, Field:13, Metal:58",Nevada Division of Environmental Protection,{293AC1C2-2B35-4DB1-82AD-55966D76B7EB},NV06-SC-42-D_00,"Arsenic 1-hour AQL, Arsenic 96-hour AQL, Iron ...","{""x"": 264692.5533999996, ""y"": 4374140.4789, ""s..."
7,8,679,Aquatic Life,0,Not Supporting,"Carson River, East Fork at the West Fork",Douglas,Carson River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160502000000.0,Private,0,"Quaternary, alluvium",Voltaire-Cradlebaugh (s5719),"Bacteria:14, General:329, :44, Metal:87",Nevada Division of Environmental Protection,{3B873931-D200-4499-B8EA-A772B10A231E},NV08-CR-05_02,"Iron 96-hour, Silver 1-hour AQL, Total Phospho...","{""x"": 257918.74849999975, ""y"": 4316643.8409, ""..."


Lastly, I changed the “None” entries in the cause column to “N/A - Fully Supporting” to make clear what a “None” conveyed.

In [40]:
#https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html
# replacing the Nones in the Cause column with a string
na_list = []
for rnum in range(len(water_quality_sdf)):
    if water_quality_sdf.iloc[rnum]["Cause"] == "None":
        na_list.append("N/A - Fully Supporting")
    else:
        na_list.append(water_quality_sdf.iloc[rnum]["Cause"])
water_quality_sdf.insert(loc=5, column='Causes', value=na_list)
water_quality_sdf = water_quality_sdf.drop(columns=['Cause'])
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Score,Assessment,Causes,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,SHAPE
1,2,673,Aquatic Life,0,Not Supporting,"Phosphorus Total SA Apr to Nov AQL, Selenium ...",Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
3,4,675,Aquatic Life,1,Fully Supporting,N/A - Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,"{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
5,6,677,Aquatic Life,0,Not Supporting,Temperature SV AQL,Lamoille Creek at the Humboldt River,Elko,Humboldt River Basin,Upper Humboldt Plains,160401000000.0,Private,5648,"Quaternary, alluvium",Welsum-Upville-Halleck-Crooked Creek (s5491),"Daily Continuous Digest:228, Bacteria:8, Gene...",Nevada Division of Environmental Protection,{32E8EE82-76F7-49F3-ACE0-CFA945A826FD},NV04-HR-15-B_00,"{""x"": 628431.2467, ""y"": 4513886.2609, ""spatial..."
6,7,678,Aquatic Life,0,Not Supporting,"Arsenic 1-hour AQL, Arsenic 96-hour AQL, Iron ...",Steamboat Creek at the Truckee River,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4451,"Late Miocene to Middle Miocene, andesite",Voltaire-Vamp-Truckee-Fettic (s5405),"Bacteria:7, General:82, :21, Field:13, Metal:58",Nevada Division of Environmental Protection,{293AC1C2-2B35-4DB1-82AD-55966D76B7EB},NV06-SC-42-D_00,"{""x"": 264692.5533999996, ""y"": 4374140.4789, ""s..."
7,8,679,Aquatic Life,0,Not Supporting,"Iron 96-hour, Silver 1-hour AQL, Total Phospho...","Carson River, East Fork at the West Fork",Douglas,Carson River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160502000000.0,Private,0,"Quaternary, alluvium",Voltaire-Cradlebaugh (s5719),"Bacteria:14, General:329, :44, Metal:87",Nevada Division of Environmental Protection,{3B873931-D200-4499-B8EA-A772B10A231E},NV08-CR-05_02,"{""x"": 257918.74849999975, ""y"": 4316643.8409, ""..."
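
The scoring and relabeling steps above can also be sketched without loops. This is a minimal illustration on toy rows. The helper name `add_score_and_causes` is hypothetical, and unlike the notebook cells it appends the new columns at the end rather than inserting them at fixed positions.

```python
import pandas as pd

def add_score_and_causes(df):
    """Return a copy with a numeric Score and a relabeled Causes column."""
    out = df.copy()
    # 1 when the waterbody fully supports the category, 0 otherwise.
    out["Score"] = (out["Assessment"] == "Fully Supporting").astype(int)
    # Replace the placeholder string "None" (exact matches only) with a clearer label.
    out["Causes"] = out["Cause"].replace("None", "N/A - Fully Supporting")
    return out.drop(columns=["Cause"])

# Toy rows modeled on the output above.
toy = pd.DataFrame({
    "Assessment": ["Not Supporting", "Fully Supporting", "Not Supporting"],
    "Cause": ["Temperature SV AQL", "None", "Iron 96-hour"],
}, index=[5, 3, 7])
scored = add_score_and_causes(toy)
```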


Because I spent so long tidying the water data and trying to solve the problems that arose from it, I did not merge the mining datasets and remove duplicates.

After my imports and tidying, I had three tables. The water table included the codes, names, counties, basins and other geographic designations, ownership, elevation, geologic era and soil composition, numeric contaminants, water quality attainment categories, assessments, causes, and numeric water quality scores for the waterbodies measured. The active mines table included the name, operator, commodity, county, and location of the mines active in 2021. The historical mines table included the name, commodity, location, and yield in tons per year from 1987 to 2019.

## Analysis

### Create feature layers and maps

First, I attempted to create feature layers directly from the spatial dataframes and map them with the GIS map feature. I created one map for each of the mines datasets, one map that included both of the mines datasets even though it also included duplicates, and one map with the merged water quality dataset. This worked in some cases, but sometimes revealed that many of the variables were missing from the attribute table; this error is discussed below.

In [41]:
water_fl = water_quality_sdf.spatial.to_featurelayer(f"NV_Water_FL_{dt.now().strftime('%Y%m%d%H%M%S')}")
water_fl

The operation was attempted on an empty geometry.


According to Esri, “while this error can occur, it occurs so rarely that the typical causes have not been identified so no solution is available at this time” (Esri, n.d.). Despite my best efforts and help from several sources, I was unable to reliably fix this. One of my next steps is to figure out how to reproduce it in another notebook so I can share it with Esri via their feedback form. This error made spatial analysis of the water quality data nearly impossible.

In [42]:
#https://pro.arcgis.com/en/pro-app/latest/tool-reference/tool-errors-and-warnings/160001-170000/tool-errors-and-warnings-160101-160125-160111.htm
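
One way to start narrowing down an "empty geometry" message is to look for rows whose SHAPE value is missing or lacks usable coordinates before publishing. The sketch below is an assumption-laden illustration in plain pandas, not a confirmed fix. It treats each geometry as a dict-like point with "x" and "y" keys, as the truncated SHAPE output above suggests, and it uses plain dicts as stand-ins rather than ArcGIS geometry objects. The helper name `has_usable_point` is hypothetical.

```python
import math
import pandas as pd

def has_usable_point(shape):
    """True when shape is a dict-like point with finite numeric x and y values."""
    if shape is None or not hasattr(shape, "get"):
        return False
    x, y = shape.get("x"), shape.get("y")
    return all(isinstance(v, (int, float)) and math.isfinite(v) for v in (x, y))

# Toy rows: one valid point, one missing geometry, one point with a NaN coordinate.
toy = pd.DataFrame({
    "WaterbodyN": ["Thomas Creek", "Unspecified", "Steamboat Ditch"],
    "SHAPE": [
        {"x": 439672.0476, "y": 4528543.5872},
        None,
        {"x": float("nan"), "y": 4362257.7042},
    ],
})
usable = toy["SHAPE"].apply(has_usable_point)
suspect_rows = toy[~usable]
```

Whether rows like these are what triggered the error in this notebook is not established; the check only offers a starting point for building a reproducible example.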

In [43]:
watermap = gis.map("Nevada")
watermap.add_layer(water_fl)
watermap

MapView(layout=Layout(height='400px', width='100%'))

When mapped, the feature layer reveals it is missing many of the variables.

In [44]:
active_mines_fl = active_mines_sdf.spatial.to_featurelayer(f"NV_active_mines_fl_{dt.now().strftime('%Y%m%d%H%M%S')}")
active_mines_fl

In [45]:
activeminemap = gis.map("Nevada")
activeminemap.add_layer(active_mines_fl)
activeminemap

MapView(layout=Layout(height='400px', width='100%'))

In [46]:
historic_mines_fl = historic_mines_sdf.spatial.to_featurelayer(f"NV_historic_mines_fl_{dt.now().strftime('%Y%m%d%H%M%S')}")
historic_mines_fl

In [47]:
historicminemap = gis.map("Nevada")
historicminemap.add_layer(historic_mines_fl)
historicminemap

MapView(layout=Layout(height='400px', width='100%'))

In [48]:
minemap = gis.map("Nevada")
minemap.add_layer(active_mines_fl)
minemap.add_layer(historic_mines_fl)
minemap

MapView(layout=Layout(height='400px', width='100%'))

I was able to create a reference map that contains the locations of current and historical mines. I was not able to remove duplicates between the two datasets. The symbology is also not very informative and the map would need more work in order to be useful to the non-profit or other people.

### Pandas Profiling Report

I ran a Pandas Profiling Report on the water dataset. This allowed me to see histograms of the variables, which revealed a few interesting things about the data (for instance, the waterbodies were privately owned by a wide margin, with the second-largest owner being the Forest Service). It also showed me a Phik correlation matrix which unfortunately didn’t reveal much in terms of what was correlated with the water quality score, but did show that there might be a correlation between what county, basin, and ecoregion (so, general location) a waterbody was in, and what the assessment causes were.

First, I tried to run it on the SDF as is.

In [49]:
water_quality_sdf.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Score,Assessment,Causes,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code,SHAPE
1,2,673,Aquatic Life,0,Not Supporting,"Phosphorus Total SA Apr to Nov AQL, Selenium ...",Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00,"{""x"": 388956.9402999999, ""y"": 4480801.9739, ""s..."
3,4,675,Aquatic Life,1,Fully Supporting,N/A - Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00,"{""x"": 439672.04760000017, ""y"": 4528543.5872000..."
5,6,677,Aquatic Life,0,Not Supporting,Temperature SV AQL,Lamoille Creek at the Humboldt River,Elko,Humboldt River Basin,Upper Humboldt Plains,160401000000.0,Private,5648,"Quaternary, alluvium",Welsum-Upville-Halleck-Crooked Creek (s5491),"Daily Continuous Digest:228, Bacteria:8, Gene...",Nevada Division of Environmental Protection,{32E8EE82-76F7-49F3-ACE0-CFA945A826FD},NV04-HR-15-B_00,"{""x"": 628431.2467, ""y"": 4513886.2609, ""spatial..."
6,7,678,Aquatic Life,0,Not Supporting,"Arsenic 1-hour AQL, Arsenic 96-hour AQL, Iron ...",Steamboat Creek at the Truckee River,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4451,"Late Miocene to Middle Miocene, andesite",Voltaire-Vamp-Truckee-Fettic (s5405),"Bacteria:7, General:82, :21, Field:13, Metal:58",Nevada Division of Environmental Protection,{293AC1C2-2B35-4DB1-82AD-55966D76B7EB},NV06-SC-42-D_00,"{""x"": 264692.5533999996, ""y"": 4374140.4789, ""s..."
7,8,679,Aquatic Life,0,Not Supporting,"Iron 96-hour, Silver 1-hour AQL, Total Phospho...","Carson River, East Fork at the West Fork",Douglas,Carson River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160502000000.0,Private,0,"Quaternary, alluvium",Voltaire-Cradlebaugh (s5719),"Bacteria:14, General:329, :44, Metal:87",Nevada Division of Environmental Protection,{3B873931-D200-4499-B8EA-A772B10A231E},NV08-CR-05_02,"{""x"": 257918.74849999975, ""y"": 4316643.8409, ""..."


In [50]:
#https://pypi.org/project/pandas-profiling/
water_report = ProfileReport(water_quality_sdf, title="Pandas Profiling Report")

In [51]:
water_report

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

KeyError: 'n_distinct'



Unfortunately, it doesn't work. I don't know what it means by n_distinct. I tried to solve this by removing columns and rerunning, and eventually narrowed the issue down to the SHAPE column. So, for just this analysis, I'll be removing the SHAPE column.

In [52]:
water_quality_sdf_profile = water_quality_sdf.drop(columns=['SHAPE']).copy(deep=True)
water_quality_sdf_profile.head()

Unnamed: 0,OBJECTID_1,OBJECTID,Attainment Category,Score,Assessment,Causes,WaterbodyN,CountyName,Basin,Level4Ecor,HUC12Code,OwnershipN,Elevation,PrimeGeolo,PrimeSoilT,ResultsSum,Organiza_1,GlobalID,Waterbody_Code
1,2,673,Aquatic Life,0,Not Supporting,"Phosphorus Total SA Apr to Nov AQL, Selenium ...",Rye Patch Reservoir,Pershing,Humboldt River Basin,Lahontan Salt Shrub Basin,160401000000.0,Private,4137,"Quaternary, alluvium",Water (s8369),"Bacteria:22, General:446, :102, Field:57, Org...",Nevada Division of Environmental Protection,{989F4263-52CD-4E97-A44C-0839F78D4025},NV04-HR-81_00
3,4,675,Aquatic Life,1,Fully Supporting,N/A - Fully Supporting,Thomas Creek,Humboldt,Humboldt River Basin,Lahontan Sagebrush Slopes,160401000000.0,Bureau of Land Management,4874,"Quaternary, alluvium",Shabliss-Rad-Bliss (s5630),"Bacteria:12, General:160, :36, Field:22, Meta...",Nevada Division of Environmental Protection,{C58DB131-C1E0-459B-95C4-3545DE76180C},NV04-HR-173_00
5,6,677,Aquatic Life,0,Not Supporting,Temperature SV AQL,Lamoille Creek at the Humboldt River,Elko,Humboldt River Basin,Upper Humboldt Plains,160401000000.0,Private,5648,"Quaternary, alluvium",Welsum-Upville-Halleck-Crooked Creek (s5491),"Daily Continuous Digest:228, Bacteria:8, Gene...",Nevada Division of Environmental Protection,{32E8EE82-76F7-49F3-ACE0-CFA945A826FD},NV04-HR-15-B_00
6,7,678,Aquatic Life,0,Not Supporting,"Arsenic 1-hour AQL, Arsenic 96-hour AQL, Iron ...",Steamboat Creek at the Truckee River,Washoe,Truckee River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160501000000.0,Private,4451,"Late Miocene to Middle Miocene, andesite",Voltaire-Vamp-Truckee-Fettic (s5405),"Bacteria:7, General:82, :21, Field:13, Metal:58",Nevada Division of Environmental Protection,{293AC1C2-2B35-4DB1-82AD-55966D76B7EB},NV06-SC-42-D_00
7,8,679,Aquatic Life,0,Not Supporting,"Iron 96-hour, Silver 1-hour AQL, Total Phospho...","Carson River, East Fork at the West Fork",Douglas,Carson River Basin,Sierra Nevada-Influenced Semiarid Hills and Basin,160502000000.0,Private,0,"Quaternary, alluvium",Voltaire-Cradlebaugh (s5719),"Bacteria:14, General:329, :44, Metal:87",Nevada Division of Environmental Protection,{3B873931-D200-4499-B8EA-A772B10A231E},NV08-CR-05_02
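
The manual search described above (removing columns and rerunning) could be partly automated by testing each column on its own. The sketch below tries to count distinct values per column and reports the columns where pandas raises a `TypeError`, which happens for unhashable values such as dicts. It is only a proxy: it is an assumption, not something verified here, that the `KeyError: 'n_distinct'` has this cause. The helper name `columns_failing_nunique` is hypothetical and plain dicts stand in for the geometry objects.

```python
import pandas as pd

def columns_failing_nunique(df):
    """Return names of columns where counting distinct values raises TypeError."""
    failing = []
    for col in df.columns:
        try:
            df[col].nunique()
        except TypeError:
            # Raised when values are unhashable, for example dicts.
            failing.append(col)
    return failing

# Toy table; the SHAPE values are plain dicts used as stand-ins.
toy = pd.DataFrame({
    "CountyName": ["Pershing", "Humboldt"],
    "Score": [0, 1],
    "SHAPE": [{"x": 388956.94, "y": 4480801.97},
              {"x": 439672.05, "y": 4528543.59}],
})
problem_columns = columns_failing_nunique(toy)
```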


In [53]:
water_report2 = ProfileReport(water_quality_sdf_profile, title="Pandas Profiling Report")

In [54]:
water_report2

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Generate report structure:   0%|          | 0/1 [00:00<?, ?it/s]

Render HTML:   0%|          | 0/1 [00:00<?, ?it/s]



Something that this report makes particularly clear is that the only Attainment Category is "Aquatic Life". Early in this project, there were many attainment categories. Somewhere down the line, these observations were either removed or had their attainment categories changed. I've tried to track down where this happens, and I don't know. I think it might be in the join, but that's mostly because it's the only part I don't feel like I fully understand.

Regardless, the report works now. The most interesting part of the analysis was the appearance of a correlation between general location and assessment cause. I would have to do further exploration of the data to make sure this is a genuine correlation and not created by confounding factors. I was not able to determine if a correlation exists between proximity to a mine and water quality, mostly because I was not able to add a data point about proximity to a mine to the water quality dataset. 
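
As a possible starting point for the missing proximity variable, the distance from each sampling site to its nearest mine can be sketched with NumPy broadcasting. This is a rough illustration under stated assumptions: both layers are taken to share one projected coordinate system with units of meters (the x and y values in the SHAPE output look projected, but the spatial reference is truncated there and the mines' coordinate system is not shown), and the coordinates below are invented. The function name `nearest_distance` is hypothetical.

```python
import numpy as np

def nearest_distance(sites_xy, mines_xy):
    """For each site, return the straight-line distance to the closest mine."""
    sites = np.asarray(sites_xy, dtype=float)   # shape (n_sites, 2)
    mines = np.asarray(mines_xy, dtype=float)   # shape (n_mines, 2)
    # Pairwise coordinate differences via broadcasting: (n_sites, n_mines, 2).
    diffs = sites[:, None, :] - mines[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Smallest distance along the mines axis, one value per site.
    return dists.min(axis=1)

# Invented coordinates in meters, for illustration only.
sites = [(0.0, 0.0), (1000.0, 0.0)]
mines = [(3000.0, 4000.0), (1000.0, 500.0)]
nearest = nearest_distance(sites, mines)
```

The resulting array could be added to the water table as a new column and compared against the score, though any relationship found would still need checking against confounders such as basin or ownership.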

### Calculate hotspots

I calculated hotspots for the historic mines dataset. I created two maps that showed the hotspots for largest yield in 1987 and in 2019. Because the dataset is so wide, I suspect this is less a hotspot and more that these were the only mines producing anything in those years, but it does show that the highly productive mining districts may have shifted over those 32 years.

In [55]:
# Find hot spots of mine yield, using 2019 yield (F2019) as the analysis field
historic_mines_hotspots_item = features.analyze_patterns.find_hot_spots(
    historic_mines_sdf, analysis_field='F2019',
    output_name=f"2019_mines_hotspots012_{dt.now().strftime('%Y%m%d%H%M%S')}")

{"cost": 0.298}


In [56]:
# Create a map centered on Ely, NV to show the 2019 hot spot output
hot_spots_map = gis.map("Ely, NV")
hot_spots_map.add_layer(historic_mines_hotspots_item)
hot_spots_map.zoom = 7
hot_spots_map.basemap = 'gray'
hot_spots_map

MapView(layout=Layout(height='400px', width='100%'))

In [57]:
# Run the hot spot analysis again, using 1987 yield (F1987) as the analysis field
historic_mines_hotspots_item_2 = features.analyze_patterns.find_hot_spots(
    historic_mines_sdf, analysis_field='F1987',
    output_name=f"1987_mines_hotspots02_{dt.now().strftime('%Y%m%d%H%M%S')}")

{"cost": 0.298}


In [58]:
# Create a map of Nevada to show the 1987 hot spot output
hot_spots_map = gis.map("Nevada")
hot_spots_map.add_layer(historic_mines_hotspots_item_2)
hot_spots_map.zoom = 6
hot_spots_map.basemap = 'gray'
hot_spots_map

MapView(layout=Layout(height='400px', width='100%'))
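
The suspicion noted above, that the hot spots mostly mark the few mines with any recorded production in those years, could be checked by counting how many mines have a yield above zero in each year column. The sketch below uses the `F1987` and `F2019` field names from the notebook, but the mine names and yield values are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the historic mines table; names and yields are made up.
toy = pd.DataFrame({
    "Name": ["Mine A", "Mine B", "Mine C", "Mine D"],
    "F1987": [1200.0, 0.0, None, 0.0],
    "F2019": [0.0, 3500.0, 800.0, None],
})
year_cols = ["F1987", "F2019"]
# A mine counts as producing in a year if its recorded yield is above zero;
# comparisons against missing values evaluate to False.
producing = (toy[year_cols] > 0).sum()
```

If only a handful of mines turn out to be producing in a given year, the hot spot result for that year is better read as a map of where production was recorded than as a statistical cluster.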

I presented this analysis and a cleaned-up notebook to the class. I also communicated about my progress with the non-profit, allowing me to move on to the evaluation stage.

## Evaluation and Conclusion

The non-profit initially came to me interested in a gold mine tracker that would show existing gold mines, proposed new gold mines, and proposed expansions of existing gold mines. Additionally, it would collect data points related to environmental justice and ecological preservation. They wanted it to be interactive and easily usable by the communities that are impacted by mining. The goal was to help communities become aware of when projects are proposed or expanded, to help fill the gap in the existing public input process and allow them to advocate for themselves in a more informed way.

It goes without saying that the project I did for this course did not meet this goal. However, this is a product I believe in, and one I would like to be a part of creating. This project allowed me to gain familiarity with the tools I could use to create that web app, begin to gather the data that would propel it, and get the ball rolling on the non-profit’s end to making it a reality. When I reached out to show them the progress I made during this course, I let them know that while I’m disappointed I haven’t made much progress toward their vision, I am still interested in seeing it through. While it would be dependent on funding, the non-profit staff member I’ve been working with would like to see it happen, and we’ll hopefully work together on it in the future.

Additionally, I’d like to return to this project in a future course or on my own time. I think that I’ve made a decent amount of progress toward the kind of final product I could try to get published or presented at a conference. It’s been a lot of time and effort, and I’d like to see it continue past this semester. At the very least, I plan on making a public-facing version that I could host as part of my data science portfolio going forward.

I’ve learned a lot from this project. It was definitely the first time I really engaged with the data science process, and I think now that I’ve been around the block I’ll be more confident tackling big projects like this in the future. I really enjoyed returning to Python for the notebooks and hope to integrate them into future GIS projects. Re-learning how to write for-loops, how to troubleshoot errors, how to read and understand the documentation of the various tools, and most of all how to persevere through frustration was a very valuable experience. I hope I’m able to make progress on this project again in the future, but regardless, I know I made progress on building my GIS and data science skills.

## References

Donnelly, Patrick. (2022). Western U.S. lithium. Google My Maps. Retrieved November 4, 2022, from https://www.google.com/maps/d/u/0/viewer?mid=1kq8TRUSMR97kg-XQ22kdQpE4lUT0Rj49&ll=37.972185499091744%2C-116.537113075&z=7

Environmental Protection Agency. (2022, March). Metal Mining. TRI National Analysis. Retrieved December 16, 2022, from https://www.epa.gov/trinationalanalysis/metal-mining

Esri. (n.d.). 160111: The feature has empty geometry. ArcGIS Pro. Retrieved December 16, 2022, from https://pro.arcgis.com/en/pro-app/latest/tool-reference/tool-errors-and-warnings/160001-170000/tool-errors-and-warnings-160101-160125-160111.htm

Fashola, M. O., Ngole-Jeme, V. M., & Babalola, O. O. (2016). Heavy Metal Pollution from Gold Mines: Environmental Effects and Bacterial Strategies for Resistance. International Journal of Environmental Research and Public Health, 13(11), 1047. https://doi.org/10.3390/ijerph13111047

Impatterson_NDOM. (2022, September). Production of Minerals in Nevada 1987-Present. Arcgis.com. Retrieved December 16, 2022, from https://www.arcgis.com/home/item.html?id=9ee00802ed9544ee9850d6510e8d5885

Milman, O. (2022, October 18). There's lithium in them thar hills – but fears grow over US 'white gold' boom. The Guardian. Retrieved December 16, 2022, from https://www.theguardian.com/us-news/2022/oct/18/lithium-mining-nevada-boom-car-battery-us-climate-crisis

Mining in Nevada. Nevada Mining Association. (2022, August 1). Retrieved December 16, 2022, from https://www.nevadamining.org/mining-in-nevada/

Mining. Nevada Governor's Office of Economic Development. (2022, August 17). Retrieved December 16, 2022, from https://goed.nv.gov/key-industries/mining/

Most, M. (2022, December). Assessment 2022. Open Data. Retrieved December 16, 2022, from https://data-ndep-gis.opendata.arcgis.com/datasets/f9b1a5a981694d879148e414da5686f8_0/about

Most, M. (2022, December). BWQP Assessed Sample Sites 2022. Open Data. Retrieved December 16, 2022, from https://data-ndep-gis.opendata.arcgis.com/datasets/1872ab5db02e4e0081a74246b9b43c64_0/about

Muntean, J. L., & Davis, D. A. (2021). Nevada active mines and energy producers (Open-File Report 21-1, compilation scale 1:1,000,000). Nevada Bureau of Mines and Geology. https://www.arcgis.com/home/item.html?id=826bb580b0a44bbbbc85d46133746747

Nevada Department of Conservation and Natural Resources. (2020, February). Nevada Quality Assurance Program Plan for Surface Water Sampling. Nevada Division of Environmental Protection, Bureau of Water Quality Planning. Retrieved December 16, 2022, from https://ndep.nv.gov/uploads/water-wqm-docs/QAPP_FINAL_2020.pdf

Nevada Department of Conservation and Natural Resources. (2022, February). Nevada 2020-2022 Water Quality Integrated Report. Nevada Division of Environmental Protection. Retrieved December 16, 2022, from https://ndep.nv.gov/uploads/water-wqm-docs/IR2022FINAL_Report.pdf

Nevada Department of Environmental Protection. (n.d.). BWQP. Open Data. Retrieved December 16, 2022, from https://data-ndep-gis.opendata.arcgis.com/search?q=2022&tags=bwqp

The University of Nevada, Reno. (2020). Nevada Mineral Explorer. ArcGIS web application. Retrieved December 16, 2022, from https://nbmg.maps.arcgis.com/apps/webappviewer/index.html?id=e279fb2d805945b59dea1cf661f5b4e6 

Vega, F. A., Covelo, E. F., Andrade, M. L., & Marcet, P. (2004). Relationships between heavy metals content and soil properties in minesoils. Analytica Chimica Acta, 524(1-2), 141–150. https://doi.org/10.1016/j.aca.2004.06.073