### Soil Moisture Data Cleaning

Rev0

Samir D. Patel

1/2/2018

This notebook reads the "_Decagon_Corrections" Excel workbooks of soil moisture data for each of the four field sites (Aeschliman, Jones, OD and Wolff), cleans and reshapes the relevant moisture data, and exports the result in CSV format.

In [1]:
import pandas as pd

#### Aeschliman

Import the data from the Excel sheet into pandas. The file location is currently set to a local path and will need to be changed.
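Since the file location will need to change, one option is to keep the base directory in a single place and build each workbook path from it. The sketch below is only an illustration: `DATA_DIR`, `WORKBOOKS` and `workbook_path` are hypothetical names that do not appear elsewhere in this notebook, while the file names and field codes are the ones used in the cells that follow.

```python
from pathlib import Path

# Hypothetical base directory; adjust to wherever the workbooks live.
DATA_DIR = Path("./decagon-soilmoisture")

# File names as used in this notebook, keyed by the field codes assigned below.
WORKBOOKS = {
    "AES": "Aeschliman_Decagon_corrections_FINAL.xlsx",
    "J": "Jones_Decagon_corrections_FINAL.xlsx",
    "OD": "OD_Decagon_corrections_FINAL.xlsx",
    "W": "Wolff_Decagon_Corrections_FINAL.xlsx",
}


def workbook_path(field_code, data_dir=DATA_DIR):
    """Return the full path to the workbook for one field site."""
    return Path(data_dir) / WORKBOOKS[field_code]


print(workbook_path("AES"))
```

With this in place, only `DATA_DIR` would need editing when the files move.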

In [47]:
# sheet_name replaces the deprecated sheetname keyword
data_aes = pd.read_excel("./decagon-soilmoisture/Aeschliman_Decagon_corrections_FINAL.xlsx", sheet_name='Data', skiprows=0, usecols=range(36), header=1)

In [48]:
data_aes = data_aes.rename(columns = {'Date-Time':'Date'})

In [49]:
data_aes["Field"] = "AES"

In [50]:
data_aes = pd.melt(data_aes, id_vars=['Depth', 'Date', 'Field'], var_name='Sensor_Full_Name', value_name='Soil_Moisture')
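To make the reshaping step concrete, here is a minimal sketch of the same `pd.melt` call on a toy table. The sensor columns `A1-C` and `A2-C` and all values are made up for illustration; only the column names `Depth`, `Date`, `Field`, `Sensor_Full_Name` and `Soil_Moisture` come from this notebook.

```python
import pandas as pd

# Toy wide table: one row per depth/date, one column per sensor (values made up).
wide = pd.DataFrame({
    "Depth": [1, 1],
    "Date": pd.to_datetime(["2016-09-29", "2016-09-30"]),
    "Field": ["AES", "AES"],
    "A1-C": [0.21, 0.22],
    "A2-C": [0.30, None],
})

# Each sensor column becomes a block of rows; its header moves into Sensor_Full_Name.
long = pd.melt(
    wide,
    id_vars=["Depth", "Date", "Field"],
    var_name="Sensor_Full_Name",
    value_name="Soil_Moisture",
)
print(long.shape)  # (4, 5): 2 rows x 2 sensor columns; 3 id columns + 2 new columns
```

Missing readings stay in the long table as NaN rows, which is why the tail of the melted data can be empty in the `Soil_Moisture` column.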

In [52]:
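# Trim stray whitespace from the sensor names
data_aes['Sensor Number'] = data_aes['Sensor_Full_Name'].str.strip()

In [53]:
# Pull the first run of digits out of each name; expand=False returns a Series
data_aes['Sensor Number'] = data_aes['Sensor Number'].str.extract(r'(\d+)', expand=False).astype(int)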

In [55]:
data_aes.tail(5)

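| | Depth | Date | Field | Sensor_Full_Name | Soil_Moisture | Sensor Number |
|---|---|---|---|---|---|---|
| 305825 | 5 | 2016-09-26 | AES | A12-C | NaN | 12 |
| 305826 | 5 | 2016-09-27 | AES | A12-C | NaN | 12 |
| 305827 | 5 | 2016-09-28 | AES | A12-C | NaN | 12 |
| 305828 | 5 | 2016-09-29 | AES | A12-C | NaN | 12 |
| 305829 | 5 | 2016-09-30 | AES | A12-C | NaN | 12 |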
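The sensor-number extraction used above can be checked in isolation. This sketch applies the same `str.strip` / `str.extract` chain to the four styles of sensor name that appear in this notebook's outputs; the Series `names` is a stand-in, not part of the original data.

```python
import pandas as pd

# Sensor names of the kinds shown in this notebook's outputs.
names = pd.Series(["A12-C", "J12.obs_hand", "OD-1", "W1"])

# expand=False returns a Series for a single capture group;
# the raw string keeps the backslash in \d from being read as a Python escape.
numbers = names.str.strip().str.extract(r"(\d+)", expand=False).astype(int)
print(numbers.tolist())  # [12, 12, 1, 1]
```

A name with no digits would produce NaN and make `astype(int)` fail, so this approach assumes every sensor column carries a number in its header.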


#### Jones

In [57]:
# sheet_name replaces the deprecated sheetname keyword
data_j = pd.read_excel("./decagon-soilmoisture/Jones_Decagon_corrections_FINAL.xlsx", sheet_name='Data', usecols=range(76), skiprows=0, header=1)

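Note: this sheet labels the date column "Date" rather than "Date-Time", so no rename from "Date-Time" is needed. The extra "Week" and "Year" columns are dropped.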

In [58]:
data_j = data_j.drop(labels = ["Week", "Year"], axis = 1)

Rename the "Date " column (removing the trailing space) to "Date" to match the other field sites.

In [59]:
data_j = data_j.rename(columns = {'Date ':'Date'})
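A trailing space in a header is easy to miss. As an alternative to renaming one column by hand, the sketch below strips whitespace from every column label at once. The toy frame and its values are made up; this is one possible approach, not the method used in this notebook.

```python
import pandas as pd

# Toy frame with a trailing space in one header, as in the Jones sheet.
df = pd.DataFrame({"Depth": [1], "Date ": ["2016-09-30"], "J12.obs_hand": [0.2]})

# Strip leading/trailing whitespace from every column label at once.
df.columns = df.columns.str.strip()
print(list(df.columns))  # ['Depth', 'Date', 'J12.obs_hand']
```

This also trims the sensor headers, which is consistent with the later `str.strip()` applied to `Sensor_Full_Name`.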

In [61]:
data_j["Field"] = "J"

In [63]:
data_j = pd.melt(data_j, id_vars=['Depth', 'Date', 'Field'], var_name='Sensor_Full_Name', value_name='Soil_Moisture')

In [65]:
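# Trim stray whitespace from the sensor names
data_j['Sensor Number'] = data_j['Sensor_Full_Name'].str.strip()

In [66]:
# Pull the first run of digits out of each name; expand=False returns a Series
data_j['Sensor Number'] = data_j['Sensor Number'].str.extract(r'(\d+)', expand=False).astype(int)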

In [68]:
data_j.tail(5)

| | Depth | Date | Field | Sensor_Full_Name | Soil_Moisture | Sensor Number |
|---|---|---|---|---|---|---|
| 642595 | 5 | 2016-09-26 | J | J12.obs_hand | NaN | 12 |
| 642596 | 5 | 2016-09-27 | J | J12.obs_hand | NaN | 12 |
| 642597 | 5 | 2016-09-28 | J | J12.obs_hand | NaN | 12 |
| 642598 | 5 | 2016-09-29 | J | J12.obs_hand | NaN | 12 |
| 642599 | 5 | 2016-09-30 | J | J12.obs_hand | NaN | 12 |


#### OD

In [69]:
# sheet_name replaces the deprecated sheetname keyword
data_od = pd.read_excel("./decagon-soilmoisture/OD_Decagon_corrections_FINAL.xlsx", sheet_name='Data', skiprows=0, header=1, usecols=range(76))

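Note: this sheet labels the date column "Date" rather than "Date-Time", so no rename is needed. The extra "Week" and "Year" columns are dropped.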

In [70]:
data_od = data_od.drop(labels = ["Week", "Year"], axis = 1)

In [71]:
data_od["Field"] = "OD"

In [72]:
data_od = pd.melt(data_od, id_vars=['Depth', 'Date', 'Field'], var_name='Sensor_Full_Name', value_name='Soil_Moisture')

In [74]:
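# Trim stray whitespace from the sensor names
data_od['Sensor Number'] = data_od['Sensor_Full_Name'].str.strip()

In [75]:
# Pull the first run of digits out of each name; expand=False returns a Series
data_od['Sensor Number'] = data_od['Sensor Number'].str.extract(r'(\d+)', expand=False).astype(int)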

In [76]:
data_od.head(5)

| | Depth | Date | Field | Sensor_Full_Name | Soil_Moisture | Sensor Number |
|---|---|---|---|---|---|---|
| 0 | 1 | 2011-12-04 | OD | OD-1 | 0.185089 | 1 |
| 1 | 1 | 2011-12-05 | OD | OD-1 | 0.184942 | 1 |
| 2 | 1 | 2011-12-06 | OD | OD-1 | 0.184778 | 1 |
| 3 | 1 | 2011-12-07 | OD | OD-1 | 0.184778 | 1 |
| 4 | 1 | 2011-12-08 | OD | OD-1 | 0.184581 | 1 |


#### Wolff

In [78]:
# sheet_name replaces the deprecated sheetname keyword
data_w = pd.read_excel("./decagon-soilmoisture/Wolff_Decagon_Corrections_FINAL.xlsx", sheet_name='Data', skiprows=0, header=1, usecols=range(76))

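Note: this sheet labels the date column "Date" rather than "Date-Time", so no rename is needed. The extra "Week" and "Year" columns are dropped.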

In [79]:
data_w = data_w.drop(labels = ["Week", "Year"], axis = 1)

In [80]:
data_w["Field"] = "W"

In [81]:
data_w = pd.melt(data_w, id_vars=['Depth', 'Date', 'Field'], var_name='Sensor_Full_Name', value_name='Soil_Moisture')

In [83]:
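# Trim stray whitespace from the sensor names
data_w['Sensor Number'] = data_w['Sensor_Full_Name'].str.strip()

In [84]:
# Pull the first run of digits out of each name; expand=False returns a Series
data_w['Sensor Number'] = data_w['Sensor Number'].str.extract(r'(\d+)', expand=False).astype(int)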

In [85]:
data_w.head(5)

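| | Depth | Date | Field | Sensor_Full_Name | Soil_Moisture | Sensor Number |
|---|---|---|---|---|---|---|
| 0 | 1 | 2011-11-22 | W | W1 | 0.241346 | 1 |
| 1 | 1 | 2011-11-23 | W | W1 | 0.232718 | 1 |
| 2 | 1 | 2011-11-24 | W | W1 | 0.242795 | 1 |
| 3 | 1 | 2011-11-25 | W | W1 | 0.247554 | 1 |
| 4 | 1 | 2011-11-26 | W | W1 | 0.237257 | 1 |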


In [90]:
frames = [data_aes, data_j, data_od, data_w]

In [93]:
data_all = pd.concat(frames)
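The four sites go through nearly identical steps before being concatenated, so the pipeline could be wrapped in one function. The following is a rough sketch under stated assumptions: `tidy_site` is a hypothetical helper that is not part of this notebook, the toy inputs and their values are made up, and `ignore_index=True` is an optional addition that gives the combined frame a fresh 0..n-1 index.

```python
import pandas as pd


def tidy_site(wide, field_code):
    """Reshape one site's wide table to long form (hypothetical helper)."""
    # Harmonise the date header; labels that are absent are simply ignored.
    df = wide.rename(columns={"Date-Time": "Date", "Date ": "Date"})
    # Week/Year are only present in some sheets; skip them when absent.
    df = df.drop(columns=["Week", "Year"], errors="ignore")
    df["Field"] = field_code
    df = pd.melt(df, id_vars=["Depth", "Date", "Field"],
                 var_name="Sensor_Full_Name", value_name="Soil_Moisture")
    df["Sensor Number"] = (df["Sensor_Full_Name"].str.strip()
                           .str.extract(r"(\d+)", expand=False).astype(int))
    return df


# Toy inputs standing in for two of the workbooks (values made up).
aes = pd.DataFrame({"Depth": [1], "Date-Time": ["2016-09-30"], "A12-C": [0.2]})
jones = pd.DataFrame({"Depth": [1], "Date ": ["2016-09-30"], "Week": [39],
                      "Year": [2016], "J12.obs_hand": [0.3]})

data_all = pd.concat([tidy_site(aes, "AES"), tidy_site(jones, "J")],
                     ignore_index=True)
print(data_all.shape)  # (2, 6)
```

Without `ignore_index=True`, each site's rows keep their own index starting at 0, which is harmless here because the CSV is written with `index=False`.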

Check the shape of each site's data and of the combined data, then save the combined output to CSV.

In [86]:
data_aes.shape

(305830, 6)

In [87]:
data_j.shape

(642600, 6)

In [88]:
data_od.shape

(634680, 6)

In [89]:
data_w.shape

(585504, 6)

In [94]:
data_all.shape

(2168614, 6)
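Because concatenation only stacks rows, the combined row count should equal the sum of the per-site counts. The check below uses the shapes reported above; the names `site_rows` and `combined_rows` are introduced here only for illustration.

```python
# Row counts reported by the .shape cells above.
site_rows = {"AES": 305830, "J": 642600, "OD": 634680, "W": 585504}
combined_rows = 2168614

# Concatenation stacks rows, so the combined count should equal the sum.
assert sum(site_rows.values()) == combined_rows
print(sum(site_rows.values()))  # 2168614
```

In the notebook itself the same check could be written as `assert len(data_all) == sum(len(f) for f in frames)`.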

In [96]:
data_all.to_csv('./output_data/soil_moisture.csv', index=False)
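The export assumes that `./output_data` already exists. A minimal sketch of a more defensive export follows, using a made-up stand-in for `data_all` and a temporary directory so that it runs anywhere; `out_dir`, `out_file` and `check` are illustrative names.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy frame standing in for data_all (values made up).
data_all = pd.DataFrame({
    "Depth": [1, 5],
    "Date": pd.to_datetime(["2011-11-22", "2016-09-30"]),
    "Field": ["W", "AES"],
    "Sensor_Full_Name": ["W1", "A12-C"],
    "Soil_Moisture": [0.241346, None],
    "Sensor Number": [1, 12],
})

# A temporary directory is used so the sketch runs anywhere;
# in the notebook this would be ./output_data.
out_dir = Path(tempfile.mkdtemp()) / "output_data"
out_dir.mkdir(parents=True, exist_ok=True)  # to_csv does not create directories
out_file = out_dir / "soil_moisture.csv"
data_all.to_csv(out_file, index=False)

# Read the file back, parsing dates, to confirm the round trip.
check = pd.read_csv(out_file, parse_dates=["Date"])
print(check.shape)  # (2, 6)
```

Reading the file back is a cheap way to confirm that missing readings survive as NaN and that the dates parse as expected.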