# Grabbing Jewel's SnowPit Data

1. Grab one pit and format it to align with the SnowEx database
2. Loop through a folder of .csv files and build a temperature profile for each
3. Merge all profiles into one structure (a dictionary of profiles, or a single GeoPandas dataframe?)

GoogleDrive PitDataTemperatures Folder: https://drive.google.com/drive/folders/1SFyBKULqiRLi52yiKO5mxRn5Icsp83Lm?usp=sharing

Test Blob (single temp profile .csv): 1DufHRYtWqxQUFrfqdFx1FViFbaTCkmAV

In [1]:
# set up and select a pit to work with
import pandas as pd
df = pd.read_csv('https://drive.google.com/uc?export=download&id=1DufHRYtWqxQUFrfqdFx1FViFbaTCkmAV', header=None)
df

Unnamed: 0,0,1,2
0,Location:,County Line,
1,Type:,Open,
2,Easting:,756905,
3,Northing:,4324353,
4,Surveyor:,J. Lund,
5,Time:,3/12/20 9:58,
6,Time Type:,AM,
7,Air Temp:,-2.8,
8,Hs:,111,
9,Ground:,"Rough ground, no info on vegetation.",


Pandas subsetting during import: we read the snow temperature data and the metadata separately, then transpose the metadata, duplicate it to the length of the profile, and combine the two.
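
The two-pass read can be tried on an in-memory file before touching Google Drive. This is a minimal sketch: the `SAMPLE` text is a made-up stand-in for one pit file, and its layout (three metadata rows, a blank line, then the profile table) is an assumption, not the real file. `nrows` limits the first read to the metadata block and `skiprows` starts the second read at the profile header.

```python
import io

import pandas as pd

# Hypothetical stand-in for one pit file: key/value metadata, a blank line, then the profile table
SAMPLE = """Location:,County Line,
Type:,Open,
Hs:,111,

Hs,Temperature,Notes
110,-4.4,
100,-3.5,
"""

N_META = 3  # metadata rows in SAMPLE (assumed layout; the notebook's file has more)

# First pass: metadata rows only, no header row
meta = pd.read_csv(io.StringIO(SAMPLE), header=None, nrows=N_META)

# Second pass: skip the metadata rows; blank lines are skipped by default,
# so the next non-blank line becomes the header
profile = pd.read_csv(io.StringIO(SAMPLE), skiprows=N_META)

print(meta)
print(profile)
```

Reading the same source twice keeps each read simple, at the cost of downloading the file twice when the source is a URL.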

In [2]:
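# grab the snow temperature data
# header=13 uses row 13 as the column names (0-indexed; blank lines are not counted) and skips everything above it
dfdata = pd.read_csv('https://drive.google.com/uc?export=download&id=1DufHRYtWqxQUFrfqdFx1FViFbaTCkmAV', header=13)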
dfdata

Unnamed: 0,Hs,Temperature,Notes
0,110,-4.4,
1,100,-3.5,
2,90,-3.2,
3,80,-3.1,
4,70,-3.1,
5,60,-3.0,
6,50,-2.8,
7,40,-2.7,
8,30,-2.6,
9,20,-2.2,


In [3]:
# grab the metadata only
dfmeta = df.head(12)
dfmeta

Unnamed: 0,0,1,2
0,Location:,County Line,
1,Type:,Open,
2,Easting:,756905,
3,Northing:,4324353,
4,Surveyor:,J. Lund,
5,Time:,3/12/20 9:58,
6,Time Type:,AM,
7,Air Temp:,-2.8,
8,Hs:,111,
9,Ground:,"Rough ground, no info on vegetation.",


In [4]:
# transpose the metadata to columns
dftranspose = dfmeta.transpose().head(2)
dftranspose

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11
0,Location:,Type:,Easting:,Northing:,Surveyor:,Time:,Time Type:,Air Temp:,Hs:,Ground:,Notes:,Wx:
1,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."


In [5]:
# grab column names from the first row
newheader = dftranspose.iloc[0]
newheader

0      Location:
1          Type:
2       Easting:
3      Northing:
4      Surveyor:
5          Time:
6     Time Type:
7      Air Temp:
8            Hs:
9        Ground:
10        Notes:
11           Wx:
Name: 0, dtype: object

In [6]:
# remove the column names row
dftranspose = dftranspose[1:]
dftranspose

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11
1,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."


In [7]:
# add column names to transposed metadata
dftranspose.columns = newheader
dftranspose

Unnamed: 0,Location:,Type:,Easting:,Northing:,Surveyor:,Time:,Time Type:,Air Temp:,Hs:,Ground:,Notes:,Wx:
1,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
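
The transpose, header-grab, and rename steps can also be collapsed into one chain. This is only a sketch on a small hand-typed frame shaped like `dfmeta` (keys in column 0, values in column 1); `dfmeta_demo` and `meta_row` are illustrative names, and stripping the trailing colon from the keys is an extra step the notebook does not do.

```python
import pandas as pd

# Hand-typed stand-in for dfmeta: keys in column 0, values in column 1, empty column 2
dfmeta_demo = pd.DataFrame({
    0: ["Location:", "Type:", "Air Temp:"],
    1: ["County Line", "Open", "-2.8"],
    2: [None, None, None],
})

# Use the key column as the index, keep the value column, and flip it into a single row
meta_row = dfmeta_demo.set_index(0)[1].to_frame().T.reset_index(drop=True)

# Optional: drop the trailing colon so the column names are easier to type
meta_row.columns = meta_row.columns.str.rstrip(":")
meta_row.columns.name = None

print(meta_row)
```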


In [8]:
# duplicate rows of metadata to length of snow pit temperature profile
dfdup = dftranspose.reindex(dftranspose.index.repeat(len(dfdata))).reset_index(drop=True)
dfdup

Unnamed: 0,Location:,Type:,Easting:,Northing:,Surveyor:,Time:,Time Type:,Air Temp:,Hs:,Ground:,Notes:,Wx:
0,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
1,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
2,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
3,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
4,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
5,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
6,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
7,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
8,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."
9,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,111,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th..."


In [9]:
# merge dfdata with dfdup to make a complete df of snow profile temps
dfmerged = pd.concat([dfdup, dfdata], axis = 1)
del dfmerged['Hs:'] # remove the total HS measurement column
dfmerged

Unnamed: 0,Location:,Type:,Easting:,Northing:,Surveyor:,Time:,Time Type:,Air Temp:,Ground:,Notes:,Wx:,Hs,Temperature,Notes
0,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",110,-4.4,
1,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",100,-3.5,
2,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",90,-3.2,
3,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",80,-3.1,
4,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",70,-3.1,
5,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",60,-3.0,
6,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",50,-2.8,
7,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",40,-2.7,
8,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",30,-2.6,
9,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",20,-2.2,
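
Repeating the metadata row and then concatenating can be done in one step with a cross merge, which pairs the single metadata row with every profile row. A minimal sketch, assuming pandas 1.2 or newer (where `how="cross"` is available) and using small made-up frames in place of `dftranspose` and `dfdata`:

```python
import pandas as pd

# Made-up stand-ins: one row of metadata and a three-row temperature profile
meta_row = pd.DataFrame({"Location:": ["County Line"], "Air Temp:": [-2.8]})
profile = pd.DataFrame({"Hs": [110, 100, 90], "Temperature": [-4.4, -3.5, -3.2]})

# A cross merge combines every left row with every right row;
# with one metadata row, the result has one row per profile measurement
merged = meta_row.merge(profile, how="cross")

print(merged)
```

The metadata columns stay on the left and the profile order is preserved, so the result has the same shape as the concat-based version.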


In [10]:
# create snow pit location geometry
import geopandas 
gdf = geopandas.GeoDataFrame(
    dfmerged, geometry=geopandas.points_from_xy(dfmerged['Easting:'], dfmerged['Northing:']))

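# add lat/lon data
import utm
def utm_to_latlon(coords, zone_number, zone_letter):
    easting = coords[0]
    northing = coords[1]
    return utm.to_latlon(easting, northing, zone_number, zone_letter)

# Using nested list comprehension
# NOTE: zone 13 here disagrees with utm_zone = 12 set in Sitedf below; the longitude
# this returns (about -102) is far east of Grand Mesa (about -108), so double-check the zone
gdf["lat_lon_tuple"] = [[utm_to_latlon(xy, 13, "N") for xy in tuple(geom.coords)] for geom in gdf.geometry]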

gdf

Unnamed: 0,Location:,Type:,Easting:,Northing:,Surveyor:,Time:,Time Type:,Air Temp:,Ground:,Notes:,Wx:,Hs,Temperature,Notes,geometry,lat_lon_tuple
0,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",110,-4.4,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
1,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",100,-3.5,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
2,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",90,-3.2,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
3,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",80,-3.1,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
4,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",70,-3.1,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
5,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",60,-3.0,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
6,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",50,-2.8,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
7,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",40,-2.7,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
8,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",30,-2.6,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"
9,County Line,Open,756905,4324353,J. Lund,3/12/20 9:58,AM,-2.8,"Rough ground, no info on vegetation.",Temperature 9:58-10:10.,"Calm, mostly cloudy, but sun still shining (th...",20,-2.2,,POINT (756905.000 4324353.000),"[(39.03049699624331, -102.03214985021661)]"


Set new column names and reorder to align with SnowEx database column headers

In [11]:
# grab the database column names for LayerData (in correct order)
db_colnames = ["site_name", "date", "time_created", "time_updated", "id", "doi", "date_accessed", "instrument", "type", "units", "...", "geom", "time", "depth", "site_id", 
               "bottom_depth", "comments", "sample_a", "sample_b", "sample_c", "value"]
db_colnames

['site_name',
 'date',
 'time_created',
 'time_updated',
 'id',
 'doi',
 'date_accessed',
 'instrument',
 'type',
 'units',
 '...',
 'geom',
 'time',
 'depth',
 'site_id',
 'bottom_depth',
 'comments',
 'sample_a',
 'sample_b',
 'sample_c',
 'value']

In [12]:
# reorder dfmerged to match database order and add blank columns to fill
dfreorder = pd.DataFrame(columns = db_colnames)
dfreorder

Unnamed: 0,site_name,date,time_created,time_updated,id,doi,date_accessed,instrument,type,units,...,geom,time,depth,site_id,bottom_depth,comments,sample_a,sample_b,sample_c,value


In [13]:
# add Jewel's pit data into appropriate columns - probably a cleaner way to do this, but I'm too new to Py...
dfreorder['geom'] = gdf['geometry']
dfreorder['depth'] = dfmerged['Hs']
dfreorder['value'] = dfmerged['Temperature']
dfreorder['site_id'] = dfmerged['Location:']
dfreorder['type'] = 'temperature'
dfreorder['site_name'] = 'Grand Mesa'
dfreorder['time_updated'] = 'None'
dfreorder['instrument'] = 'None'
dfreorder['units'] = 'None'
dfreorder

Unnamed: 0,site_name,date,time_created,time_updated,id,doi,date_accessed,instrument,type,units,...,geom,time,depth,site_id,bottom_depth,comments,sample_a,sample_b,sample_c,value
0,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,110,County Line,,,,,,-4.4
1,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,100,County Line,,,,,,-3.5
2,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,90,County Line,,,,,,-3.2
3,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,80,County Line,,,,,,-3.1
4,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,70,County Line,,,,,,-3.1
5,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,60,County Line,,,,,,-3.0
6,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,50,County Line,,,,,,-2.8
7,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,40,County Line,,,,,,-2.7
8,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,30,County Line,,,,,,-2.6
9,Grand Mesa,,,,,,,,temperature,,...,POINT (756905.000 4324353.000),,20,County Line,,,,,,-2.2
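
One possibly cleaner way to do the column mapping is `rename` followed by `reindex`. This is a sketch under assumptions: `merged` is a small stand-in for `dfmerged`, and `target_cols` is a shortened, hypothetical subset of the database column list, not the full schema.

```python
import pandas as pd

# Small stand-in for dfmerged
merged = pd.DataFrame({
    "Location:": ["County Line", "County Line"],
    "Hs": [110, 100],
    "Temperature": [-4.4, -3.5],
})

# Hypothetical, shortened list of target columns (the real list is longer)
target_cols = ["site_name", "type", "depth", "site_id", "comments", "value"]

# Map the pit-file column names onto the database column names
column_map = {"Hs": "depth", "Temperature": "value", "Location:": "site_id"}

layer = (
    merged.rename(columns=column_map)
    .reindex(columns=target_cols)  # reorder, and add any missing columns filled with NaN
    .assign(site_name="Grand Mesa", type="temperature")  # constants for every row
)

print(layer)
```

Keeping the mapping in one dictionary makes it easy to see which source column feeds which database column.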


In [14]:
# split date/time to separate columns
date_time = dfmerged['Time:'].str.split(expand=True)
dfreorder['date'] = date_time[0]
dfreorder['time'] = date_time[1]
dfreorder

Unnamed: 0,site_name,date,time_created,time_updated,id,doi,date_accessed,instrument,type,units,...,geom,time,depth,site_id,bottom_depth,comments,sample_a,sample_b,sample_c,value
0,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,110,County Line,,,,,,-4.4
1,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,100,County Line,,,,,,-3.5
2,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,90,County Line,,,,,,-3.2
3,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,80,County Line,,,,,,-3.1
4,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,70,County Line,,,,,,-3.1
5,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,60,County Line,,,,,,-3.0
6,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,50,County Line,,,,,,-2.8
7,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,40,County Line,,,,,,-2.7
8,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,30,County Line,,,,,,-2.6
9,Grand Mesa,3/12/20,,,,,,,temperature,,...,POINT (756905.000 4324353.000),9:58,20,County Line,,,,,,-2.2
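
The split above leaves `date` and `time` as plain strings. If real date and time objects are wanted, `pd.to_datetime` with an explicit format is one option. A minimal sketch, assuming the stamps are month/day/two-digit-year with a 24-hour clock (the file does not state this explicitly):

```python
import pandas as pd

# Two stamps in the same style as the 'Time:' column
stamps = pd.Series(["3/12/20 9:58", "3/12/20 9:58"])

# Assumed format: month/day/two-digit year, then hour:minute
parsed = pd.to_datetime(stamps, format="%m/%d/%y %H:%M")

dates = parsed.dt.date  # datetime.date objects
times = parsed.dt.time  # datetime.time objects

print(dates.iloc[0], times.iloc[0])
```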


Create a SiteData dataframe that conforms to the SnowEx database structure

In [18]:
# create dataframe with headers from SnowEx SiteData
Sitedf = pd.DataFrame(columns = ['site_name', 'date', 'time_created', 'time_updated', 'id', 'site_id',
       'doi', 'date_accessed', 'latitude', 'longitude', 'northing', 'easting',
       'elevation', 'utm_zone', 'geom', 'time', 'slope_angle', 'aspect',
       'air_temp', 'total_depth', 'weather_description', 'precip', 'sky_cover',
       'wind', 'ground_condition', 'ground_roughness', 'ground_vegetation',
       'vegetation_height', 'tree_canopy', 'site_notes'])
Sitedf

Unnamed: 0,site_name,date,time_created,time_updated,id,site_id,doi,date_accessed,latitude,longitude,...,weather_description,precip,sky_cover,wind,ground_condition,ground_roughness,ground_vegetation,vegetation_height,tree_canopy,site_notes


In [19]:
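# Fill the columns with metadata from Jewel's snow pits
# NOTE: Sitedf starts with no rows, so the scalar assignments below leave their columns empty
# (NaN in the output); only the Series pulled from dftranspose create the row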
Sitedf['site_name'] = 'Grand Mesa'
Sitedf['date'] = dfreorder['date'].iloc[0]
Sitedf['time_updated'] = 'None'
Sitedf['site_id'] = dfreorder['site_id'].iloc[0]
Sitedf['geom'] = dfreorder['geom'].iloc[0]
Sitedf['utm_zone'] = 12
Sitedf['time'] = dfreorder['time'].iloc[0]
Sitedf['air_temp'] = dftranspose['Air Temp:']
Sitedf['total_depth'] = dftranspose['Hs:']
Sitedf['weather_description'] = dftranspose['Wx:']
Sitedf['ground_roughness'] = dftranspose['Ground:']
Sitedf['tree_canopy'] = dftranspose['Type:']
Sitedf['site_notes'] = dftranspose['Notes:']

Sitedf

Unnamed: 0,site_name,date,time_created,time_updated,id,site_id,doi,date_accessed,latitude,longitude,...,weather_description,precip,sky_cover,wind,ground_condition,ground_roughness,ground_vegetation,vegetation_height,tree_canopy,site_notes
1,,,,,,,,,,,...,"Calm, mostly cloudy, but sun still shining (th...",,,,,"Rough ground, no info on vegetation.",,,Open,Temperature 9:58-10:10.
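
Assigning a scalar to a column of a dataframe that has no rows creates the column but no row, which is why `site_name` and the other scalar fields come out as NaN here. One way around this is to collect the values in a dict and build the single row in one go. A sketch, with `site_cols` as a shortened, hypothetical subset of the SiteData columns:

```python
import pandas as pd

# Shortened, hypothetical subset of the SiteData columns
site_cols = ["site_name", "site_id", "utm_zone", "air_temp", "site_notes"]

# Values for one pit, collected in a plain dict
site_values = {
    "site_name": "Grand Mesa",
    "site_id": "County Line",
    "utm_zone": 12,
    "air_temp": -2.8,
}

# Wrapping the dict in a list makes exactly one row;
# reindex puts the columns in order and adds the unfilled ones as NaN
site_df = pd.DataFrame([site_values]).reindex(columns=site_cols)

print(site_df)
```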


In [20]:
# grab lat/lon data by separating the column in gdf
latlon = gdf['lat_lon_tuple'].iloc[0]
LL = latlon[0]
Sitedf['latitude'] = LL[0]
Sitedf['longitude'] = LL[1]

Sitedf

Unnamed: 0,site_name,date,time_created,time_updated,id,site_id,doi,date_accessed,latitude,longitude,...,weather_description,precip,sky_cover,wind,ground_condition,ground_roughness,ground_vegetation,vegetation_height,tree_canopy,site_notes
1,,,,,,,,,39.030497,-102.03215,...,"Calm, mostly cloudy, but sun still shining (th...",,,,,"Rough ground, no info on vegetation.",,,Open,Temperature 9:58-10:10.


# Here we've got a LayerData dataframe and a SiteData dataframe for a single pit profile
- We could create a PointData dataframe to comply with the SnowEx database...
- We need to loop this script over every pit file to build one combined pair of dataframes for use in our project
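
The looping step might look something like the sketch below. Everything here is an assumption made for illustration: the toy file layout (two metadata rows, a blank line, then the profile table), the helper names `load_pit` and `load_folder`, and the local folder of files (the real data lives on Google Drive and has more metadata rows). It also assumes pandas 1.2 or newer for `how="cross"`.

```python
import tempfile
from pathlib import Path

import pandas as pd

N_META = 2  # metadata rows per file in this toy layout (assumption; the real files have more)


def load_pit(path, n_meta=N_META):
    """Read one pit file and return its profile with the metadata repeated on every row."""
    meta = pd.read_csv(path, header=None, nrows=n_meta)
    meta_row = meta.set_index(0)[1].to_frame().T.reset_index(drop=True)
    profile = pd.read_csv(path, skiprows=n_meta)
    return meta_row.merge(profile, how="cross")


def load_folder(folder):
    """Stack every .csv in a folder into one DataFrame."""
    frames = [load_pit(p) for p in sorted(Path(folder).glob("*.csv"))]
    return pd.concat(frames, ignore_index=True)


# Demo with two made-up files written to a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    for name, temp in [("Pit A", -4.4), ("Pit B", -1.0)]:
        text = f"Location:,{name},\nHs:,111,\n\nHs,Temperature,Notes\n110,{temp},\n100,-3.5,\n"
        (Path(tmp) / f"{name}.csv").write_text(text)
    all_pits = load_folder(tmp)

print(all_pits)
```

Wrapping the single-pit steps in a function keeps the loop itself to one line, and `pd.concat` with `ignore_index=True` gives the combined frame a clean running index.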

In [21]:
# LayerData dataframe
print(dfreorder)

     site_name     date time_created time_updated   id  doi date_accessed  \
0   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
1   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
2   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
3   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
4   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
5   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
6   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
7   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
8   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
9   Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
10  Grand Mesa  3/12/20          NaN         None  NaN  NaN           NaN   
11  Grand Mesa  3/12/20          NaN         N

In [22]:
# SiteData dataframe
print(Sitedf.head())

  site_name date time_created time_updated   id site_id  doi date_accessed  \
1       NaN  NaN          NaN          NaN  NaN     NaN  NaN           NaN   

    latitude  longitude  ...  \
1  39.030497 -102.03215  ...   

                                 weather_description precip sky_cover  wind  \
1  Calm, mostly cloudy, but sun still shining (th...    NaN       NaN   NaN   

  ground_condition                      ground_roughness ground_vegetation  \
1              NaN  Rough ground, no info on vegetation.               NaN   

  vegetation_height tree_canopy               site_notes  
1               NaN        Open  Temperature 9:58-10:10.  

[1 rows x 30 columns]
