# CSO QaQc: Large and Frequent Depth Values

In this notebook I'm going to explore large `depth` values as well as values that occur unusually often. Frequently repeated values could introduce a bias into our data, so they are worth a closer look.

In [1]:
import requests
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.io.img_tiles as cimgt

# Import necessary packages, may need more or less as I go.

In [2]:
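# Load the California subset of the CSO observations.
CSO_CA = gpd.read_file('CSO_CA')
# Parse the timestamp strings into timezone-aware datetimes.
CSO_CA['timestamp'] = pd.to_datetime(CSO_CA['timestamp'])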

We are going to start with the California domain, just for the sake of simplicity. We'll also add a `flags` column to the data frame, as we usually do when sorting through CSO data.

In [3]:
CSO_CA['flags'] = False
CSO_CA

Unnamed: 0,id,author,depth,source,timestamp,elevation,geometry,flags
0,qi3PPSnQ,Brandon Schwartz,234.000000,SnowPilot,2017-01-02 23:43:42+00:00,2414.250977,POINT (-120.35359 39.36812),False
1,2ZNehL12,Andy Anderson,170.000000,SnowPilot,2017-01-05 23:15:22+00:00,2204.346436,POINT (-120.23536 39.22963),False
2,/c2MfVDw,Andy Anderson,212.000000,SnowPilot,2017-01-07 03:30:39+00:00,2453.741699,POINT (-120.36517 39.35684),False
3,agjWJrxV,Brandon Schwartz,205.000000,SnowPilot,2017-01-11 23:58:35+00:00,2092.592285,POINT (-120.29299 39.34429),False
4,0nxaaHLX,Andy Anderson,505.000000,SnowPilot,2017-01-13 01:57:59+00:00,2424.332275,POINT (-120.25981 39.23891),False
...,...,...,...,...,...,...,...,...
412,MRe9TADy,Yunqing Cao,368.999986,MountainHub,2019-04-21 00:23:11.476000+00:00,2248.087646,POINT (-120.21737 38.62686),False
413,s9m3kkTq,Yunqing Cao,370.999986,MountainHub,2019-04-21 00:25:31.330000+00:00,2247.337402,POINT (-120.21720 38.62713),False
414,3muH06T8,Alexander Wong,200.000000,MountainHub,2019-05-11 22:52:38.848000+00:00,2249.148193,POINT (-120.21753 38.62687),False
415,4pmY8RAI,Alexander Wong,217.000000,MountainHub,2019-05-11 22:54:49.887001+00:00,2249.259766,POINT (-120.21756 38.62685),False


### We will look for three specific `depth` values and see how often they occur. These values are:

* 240 cm, the usual length of a snow probe available at retail.

* 300 cm, a length considered more professional but still available for purchase.

* 330 cm, a probe length that Dave has reportedly used before.

In [4]:
Depth1 = 240
Depth2 = 300
Depth3 = 330
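
Beyond these three candidate lengths, a plain frequency count can show whether any other depth value repeats often. The following is a minimal sketch on made-up numbers, not on `CSO_CA`; the `toy_depths` series and the cutoff of two occurrences are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical depth values in cm, standing in for CSO_CA['depth'].
toy_depths = pd.Series(
    [240.0, 300.0, 300.0, 212.0, 170.0, 300.0, 240.0, 505.0],
    name="depth",
)

# Round to the nearest centimetre so near-identical floats are counted together,
# then count how often each value occurs.
counts = toy_depths.round().value_counts()

# Keep only values that appear at least twice (arbitrary cutoff).
repeated = counts[counts >= 2]
print(repeated)
```

On the real data, the same two lines could be run on `CSO_CA['depth']`, with the cutoff adjusted to the size of the data set.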

In [5]:
# Flag observations whose depth equals the 240 cm probe length.
CSO_CA['flags'] = CSO_CA['depth'] == Depth1
# Keep only the flagged rows.
CSO_CADepth1 = CSO_CA.loc[CSO_CA['flags']]
CSO_CADepth1

Unnamed: 0,id,author,depth,source,timestamp,elevation,geometry,flags
28,q1jTwQxR,Logan Talbott,240.0,SnowPilot,2017-02-12 02:01:57+00:00,2130.639404,POINT (-120.17395 39.25671),True
393,GPAFfWUr,Benjamin Hatchett,240.0,MountainHub,2019-03-23 17:05:35.131001+00:00,2378.8125,POINT (-119.91522 38.95086),True


In [6]:
# Reset the flags and mark observations at the 300 cm probe length.
CSO_CA['flags'] = CSO_CA['depth'] == Depth2
# Keep only the flagged rows.
CSO_CADepth2 = CSO_CA.loc[CSO_CA['flags']]
CSO_CADepth2

Unnamed: 0,id,author,depth,source,timestamp,elevation,geometry,flags
29,y8XlLEr/,Aaron Liimatainen,300.0,SnowPilot,2017-02-12 03:59:22+00:00,2437.01416,POINT (-119.98492 38.78380),True
39,4YLCFydh,Nick Schiestel,300.0,SnowPilot,2017-02-20 05:18:11+00:00,2398.224854,POINT (-120.36093 39.34935),True
236,BNrZdbdy,Andy Anderson,300.0,SnowPilot,2018-03-24 00:42:14+00:00,2680.141846,POINT (-119.91744 39.31752),True
385,j+RX9nCS,Andy Anderson,300.0,SnowPilot,2019-02-19 21:00:00+00:00,2241.675293,POINT (-120.31284 39.30204),True
392,DNVMNHVS,Dan McEvoy,300.0,MountainHub,2019-03-15 19:44:12.853000+00:00,2160.469971,POINT (-120.32355 39.31506),True
411,IgtHYuGD,David Miller,300.0,SnowPilot,2019-04-03 19:00:00+00:00,2223.294678,POINT (-120.24971 39.15719),True


In [7]:
# Reset the flags and mark observations at the 330 cm probe length.
CSO_CA['flags'] = CSO_CA['depth'] == Depth3
# Keep only the flagged rows.
CSO_CADepth3 = CSO_CA.loc[CSO_CA['flags']]
CSO_CADepth3

Unnamed: 0,id,author,depth,source,timestamp,elevation,geometry,flags
340,WLc5wmSI,Logan Talbott,330.0,SnowPilot,2019-01-19 19:15:00+00:00,2114.563721,POINT (-120.28033 39.20698),True
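
One caveat about the exact `==` comparisons above: some MountainHub depths in the first table look like 368.999986 rather than 369, so a probe-length reading stored as, say, 239.999986 would not be matched. The sketch below checks all three lengths in one pass with a small tolerance. It runs on a made-up `toy` frame, and the `tolerance` value and the `nearest_probe_length` helper are hypothetical names introduced only for this example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for CSO_CA; the values are invented for illustration.
toy = pd.DataFrame({
    "depth": [240.0, 239.999986, 300.0, 330.0, 212.0],
    "source": ["SnowPilot", "MountainHub", "SnowPilot", "SnowPilot", "SnowPilot"],
})

probe_lengths = [240, 300, 330]  # cm, the same values as Depth1..Depth3
tolerance = 0.01                 # cm, an assumed threshold

def nearest_probe_length(depth, lengths=probe_lengths, tol=tolerance):
    """Return the probe length that depth matches within tol, else NaN."""
    for length in lengths:
        if abs(depth - length) <= tol:
            return length
    return np.nan

# Record which probe length (if any) each observation matches.
toy["probe_match"] = toy["depth"].apply(nearest_probe_length)
toy["flags"] = toy["probe_match"].notna()

# Count the flagged observations per probe length and source.
counts = toy.loc[toy["flags"]].groupby(["probe_match", "source"]).size()
print(counts)
```

Whether a tolerance is needed at all depends on how the depths were converted upstream, which this notebook does not show.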


Looking at the data frames, we don't see many values that fit these criteria: only 9 observations in total (2 at 240 cm, 6 at 300 cm, and 1 at 330 cm). An interesting thing to note is that 7 of those 9 come from SnowPilot, which is the source where I least expected to see some form of snow probe length bias. However, the total number of observations that fit this description is so low that assessing it as a "bias" might not be worth it. Such a bias would really only matter for observations from `MountainHub`, since that is where large values could be limited by snow probe length.
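
To put "so low" into numbers, the flagged share can be computed per source. This is a rough sketch on an invented `toy` frame rather than on `CSO_CA`; the column names mirror the notebook, but the rows and the `summary` table are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for CSO_CA; the values are invented for illustration.
toy = pd.DataFrame({
    "depth": [240.0, 300.0, 212.0, 170.0, 300.0, 205.0, 217.0],
    "source": [
        "SnowPilot", "SnowPilot", "SnowPilot", "SnowPilot",
        "MountainHub", "MountainHub", "MountainHub",
    ],
})

probe_lengths = [240, 300, 330]  # cm, the same values as Depth1..Depth3

# Flag any observation that sits exactly on one of the probe lengths.
toy["flags"] = toy["depth"].isin(probe_lengths)

# Per source: number of flagged rows, total rows, and the flagged share.
summary = toy.groupby("source")["flags"].agg(flagged="sum", total="count")
summary["share"] = summary["flagged"] / summary["total"]
print(summary)
```

Applied to `CSO_CA`, a share near zero for `MountainHub` would support the conclusion that probe length is not limiting the reported depths in a noticeable way.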