### Soil Moisture Data Cleaning

Rev0

Samir D. Patel

1/2/2018

This notebook takes the "_Decagon_Corrections" Excel data for soil moisture at each of the four field sites (Aeschliman, Jones, OD and Wolff), then imports, cleans and exports the relevant moisture data in CSV format.

In [1]:
import pandas as pd

#### Aeschliman

Import of data from the Excel sheet into pandas. The file location is currently set to a local relative path and will need to be changed to run elsewhere.

In [2]:
data_aes = pd.read_excel("./decagon-soilmoisture/Aeschliman_Decagon_corrections_FINAL.xlsx", sheet_name='Data', skiprows=0, usecols=range(36), header=1)
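Since the file location is hard-coded, one option is to build every workbook path from a single configurable folder. The following is only a sketch: the environment variable name `SOIL_MOISTURE_DATA_DIR` and the helper `site_workbook` are hypothetical and not part of the original notebook.

```python
import os
from pathlib import Path

# Hypothetical environment variable; falls back to the notebook's relative folder.
DATA_DIR = Path(os.environ.get("SOIL_MOISTURE_DATA_DIR", "./decagon-soilmoisture"))


def site_workbook(filename):
    """Return the path to one site's Excel workbook inside DATA_DIR."""
    return DATA_DIR / filename


aes_path = site_workbook("Aeschliman_Decagon_corrections_FINAL.xlsx")
print(aes_path)
```

The resulting path can be passed to `pd.read_excel` in place of the literal string, so only `DATA_DIR` has to change when the data moves.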

Pivoting the data to widen the data set, so that each sensor/location's moisture readings are split into one column per sensor depth.

In [3]:
data_aes = data_aes.pivot(index="Date-Time", columns="Depth")
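To illustrate what the pivot does, here is a small sketch on made-up data (the sensor names `A1`/`A2` and the values are invented for the example, not taken from the workbook). Pivoting on the date and depth columns without naming a value column gives one column per (sensor, depth) pair, labelled by a tuple.

```python
import pandas as pd

# Toy long-format data: one row per date and depth, one column per sensor.
long_df = pd.DataFrame({
    "Date-Time": pd.to_datetime(
        ["2011-10-29", "2011-10-29", "2011-10-30", "2011-10-30"]
    ),
    "Depth": [1, 2, 1, 2],
    "A1": [0.26, 0.29, 0.27, 0.30],
    "A2": [0.24, 0.25, 0.23, 0.26],
})

# One row per date; columns become (sensor, depth) tuples.
wide_df = long_df.pivot(index="Date-Time", columns="Depth")
print(wide_df.columns.tolist())
```

These tuple labels are why the column headers need flattening before export.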

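After pivoting, each column label is a tuple of (sensor, depth) with mixed data types. Every element of the tuples must be converted to a string before the labels can be joined into flat column headers.

Reference: https://stackoverflow.com/questions/22113668/convert-list-of-tuples-of-mixed-data-types-into-all-string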

In [4]:
tuples = [tuple(map(str, val2)) for val2 in data_aes]

In [5]:
flat_col_names = ['_'.join(col_tuple) for col_tuple in tuples]

In [6]:
data_aes.columns = flat_col_names

In [7]:
data_aes = data_aes.reset_index()
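The flattening cells above could be wrapped in one small helper. This is a sketch, not the notebook's own code: the function name `flatten_columns` and the toy data are assumptions made for illustration, and the helper expects tuple column labels as produced by the pivot.

```python
import pandas as pd


def flatten_columns(df, sep="_"):
    """Join each tuple column label into a single string label."""
    out = df.copy()
    # str() is applied to every element so integer depths can be joined.
    out.columns = [sep.join(map(str, col)) for col in out.columns]
    return out


# Toy pivoted frame with (sensor, depth) column tuples.
wide = pd.DataFrame(
    [[0.26, 0.29], [0.27, 0.30]],
    index=pd.Index(["2011-10-29", "2011-10-30"], name="Date-Time"),
    columns=pd.MultiIndex.from_tuples([("A1", 1), ("A1", 2)]),
)

flat = flatten_columns(wide).reset_index()
print(flat.columns.tolist())
```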

Changing "Date-Time" column name to "Date" to match other field sites

In [8]:
data_aes = data_aes.rename(columns = {'Date-Time':'Date'})

In [9]:
data_aes.head(5)

Unnamed: 0,Date,A1_1,A1_2,A1_3,A1_4,A1_5,A2_1,A2_2,A2_3,A2_4,...,A11-C_1,A11-C_2,A11-C_3,A11-C_4,A11-C_5,A12-C_1,A12-C_2,A12-C_3,A12-C_4,A12-C_5
0,2011-10-29,0.264665,0.2925,0.282253,,0.260778,0.245481,0.250873,0.248524,0.208491,...,0.069194,0.172116,0.248614,,,0.211317,0.161496,0.245646,,
1,2011-10-30,0.264987,0.291608,0.281947,,0.260126,0.246836,0.249533,0.24785,0.208864,...,0.070121,0.163867,0.23898,,,0.217598,0.167845,0.236705,,
2,2011-10-31,0.264665,0.291013,0.28164,,0.259799,0.246836,0.249533,0.247512,0.208864,...,0.070121,0.160181,0.235352,,,0.209858,0.169812,0.231497,,
3,2011-11-01,0.264987,0.290416,0.281026,,0.259799,0.246836,0.249533,0.247512,0.208864,...,0.071045,0.157406,0.233159,,,0.216637,0.170597,0.228205,,
4,2011-11-02,0.264342,0.290117,0.281026,,0.259472,0.246159,0.249533,0.247174,0.208864,...,0.069194,0.15648,0.231324,,,0.20201,0.170989,0.225551,,


#### Jones

In [10]:
data_j = pd.read_excel("./decagon-soilmoisture/Jones_Decagon_corrections_FINAL.xlsx", sheet_name='Data', usecols=range(76), skiprows=0, header=1)

Note: This sheet uses "Date " (with a trailing space) instead of "Date-Time". The "Week" and "Year" columns are dropped before pivoting.

In [11]:
data_j = data_j.drop(labels = ["Week", "Year"], axis = 1)

In [12]:
data_j = data_j.pivot(index="Date ", columns="Depth")

The column headers are flattened the same way as for Aeschliman (https://stackoverflow.com/questions/22113668/convert-list-of-tuples-of-mixed-data-types-into-all-string).

In [13]:
tuples = [tuple(map(str, val2)) for val2 in data_j]

In [14]:
flat_col_names = ['_'.join(col_tuple) for col_tuple in tuples]

In [15]:
data_j.columns = flat_col_names

In [16]:
data_j = data_j.reset_index()

Changing "Date " column name (removing space at the end) to "Date" to match other field sites

In [17]:
data_j = data_j.rename(columns = {'Date ':'Date'})
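Stray whitespace such as the trailing space in "Date " can also be removed from all column names at once instead of renaming a single column. A minimal sketch on invented data, assuming every column label is already a string (as is the case after flattening):

```python
import pandas as pd

# Toy frame with padded column names; the values are made up.
df = pd.DataFrame({"Date ": ["2011-11-12"], " J1_1": [0.12], "J1_2": [0.24]})

# Strip leading and trailing whitespace from every column label.
df.columns = df.columns.str.strip()
print(df.columns.tolist())
```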

In [18]:
data_j.head(5)

Unnamed: 0,Date,J1_1,J1_2,J1_3,J1_4,J1_5,J2_1,J2_2,J2_3,J2_4,...,J11.obs_hand_1,J11.obs_hand_2,J11.obs_hand_3,J11.obs_hand_4,J11.obs_hand_5,J12.obs_hand_1,J12.obs_hand_2,J12.obs_hand_3,J12.obs_hand_4,J12.obs_hand_5
0,2011-11-12,0.121137,0.242148,0.20507,0.2994,0.280056,,,,,...,,,,,,,,,,
1,2011-11-13,0.129111,0.236851,0.202477,0.290752,0.27843,0.126477,,0.295852,0.255225,...,,,,,,,,,,
2,2011-11-14,0.154589,0.231111,0.199288,0.280979,0.276956,0.12441,,0.290487,0.255446,...,,,,,,,,,,
3,2011-11-15,0.156646,0.226416,0.196443,0.273185,0.275528,0.122798,,0.283174,0.254936,...,,,,,,,,,,
4,2011-11-16,0.155028,0.222312,0.193868,0.266936,0.274225,0.121387,,0.278055,0.254329,...,,,,,,,,,,


#### OD

In [19]:
data_od = pd.read_excel("./decagon-soilmoisture/OD_Decagon_corrections_FINAL.xlsx", sheet_name='Data', skiprows=0, header=1, usecols=range(76))

Note: This sheet uses "Date" instead of "Date-Time". The "Week" and "Year" columns are dropped before pivoting.

In [20]:
data_od = data_od.drop(labels = ["Week", "Year"], axis = 1)

In [21]:
data_od = data_od.pivot(index="Date", columns="Depth")

The column headers are flattened the same way as for Aeschliman (https://stackoverflow.com/questions/22113668/convert-list-of-tuples-of-mixed-data-types-into-all-string).

In [22]:
tuples = [tuple(map(str, val2)) for val2 in data_od]

In [23]:
flat_col_names = ['_'.join(col_tuple) for col_tuple in tuples]

In [24]:
data_od.columns = flat_col_names

In [25]:
data_od = data_od.reset_index()

In [26]:
data_od.head(5)

Unnamed: 0,Date,OD-1_1,OD-1_2,OD-1_3,OD-1_4,OD-1_5,OD-2_1,OD-2_2,OD-2_3,OD-2_4,...,OD-11.obs_new_1,OD-11.obs_new_2,OD-11.obs_new_3,OD-11.obs_new_4,OD-11.obs_new_5,OD-12.obs_new_1,OD-12.obs_new_2,OD-12.obs_new_3,OD-12.obs_new_4,OD-12.obs_new_5
0,2011-12-04,0.185089,0.245197,0.288532,0.380638,0.393566,0.198131,0.353326,0.334477,0.272076,...,,,,,,,,,,
1,2011-12-05,0.184942,0.243595,0.287064,0.380433,0.394028,0.19794,0.344783,0.327793,0.270786,...,,,,,,,,,,
2,2011-12-06,0.184778,0.24227,0.285843,0.380173,0.394564,0.197797,0.337872,0.322693,0.269239,...,,,,,,,,,,
3,2011-12-07,0.184778,0.241142,0.28482,0.379834,0.394932,0.197718,0.332383,0.318751,0.267632,...,,,,,,,,,,
4,2011-12-08,0.184581,0.240268,0.283932,0.379512,0.395199,0.197463,0.328092,0.315557,0.266058,...,,,,,,,,,,


#### Wolff

In [27]:
data_w = pd.read_excel("./decagon-soilmoisture/Wolff_Decagon_Corrections_FINAL.xlsx", sheet_name='Data', skiprows=0, header=1, usecols=range(76))

Note: This sheet uses "Date" instead of "Date-Time". The "Week" and "Year" columns are dropped before pivoting.

In [28]:
data_w = data_w.drop(labels = ["Week", "Year"], axis = 1)

In [29]:
data_w = data_w.pivot(index="Date", columns="Depth")

The column headers are flattened the same way as for Aeschliman (https://stackoverflow.com/questions/22113668/convert-list-of-tuples-of-mixed-data-types-into-all-string).

In [30]:
tuples = [tuple(map(str, val2)) for val2 in data_w]

In [31]:
flat_col_names = ['_'.join(col_tuple) for col_tuple in tuples]

In [32]:
data_w.columns = flat_col_names

In [33]:
data_w = data_w.reset_index()

In [34]:
data_w.head(5)

Unnamed: 0,Date,W1_1,W1_2,W1_3,W1_4,W1_5,W2_1,W2_2,W2_3,W2_4,...,W.obs_11_new_1,W.obs_11_new_2,W.obs_11_new_3,W.obs_11_new_4,W.obs_11_new_5,W.obs_12_new_1,W.obs_12_new_2,W.obs_12_new_3,W.obs_12_new_4,W.obs_12_new_5
0,2011-11-22,0.241346,0.200636,0.220599,0.255078,0.202288,0.199415,0.155185,0.165937,0.163477,...,,,,,,,,,,
1,2011-11-23,0.232718,0.200586,0.220705,0.255364,0.202698,0.205393,0.155185,0.165937,0.16382,...,,,,,,,,,,
2,2011-11-24,0.242795,0.200791,0.221472,0.256038,0.203028,0.207029,0.155115,0.16575,0.164401,...,,,,,,,,,,
3,2011-11-25,0.247554,0.200823,0.222043,0.25667,0.203295,0.223914,0.155185,0.165255,0.165016,...,,,,,,,,,,
4,2011-11-26,0.237257,0.200223,0.222043,0.257177,0.203625,0.219615,0.154959,0.165238,0.165221,...,,,,,,,,,,
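The four site sections repeat the same drop, pivot, flatten and reset steps, differing only in the name of the date column. As a sketch of how they might be consolidated, the hypothetical function `reshape_site` below takes an already-loaded sheet; the function name, its parameters and the toy data are assumptions for illustration and do not appear in the original notebook.

```python
import pandas as pd


def reshape_site(raw, date_col="Date", drop_cols=("Week", "Year")):
    """Pivot one site's long-format sheet to wide format with flat headers."""
    # errors="ignore" skips any listed column that a sheet does not contain.
    df = raw.drop(columns=list(drop_cols), errors="ignore")
    wide = df.pivot(index=date_col, columns="Depth")
    # Convert (sensor, depth) tuples to "sensor_depth" strings.
    wide.columns = ["_".join(map(str, col)) for col in wide.columns]
    # Use a common "Date" column name across all sites.
    return wide.reset_index().rename(columns={date_col: "Date"})


# Toy data shaped like the Jones sheet, including the "Date " header.
raw = pd.DataFrame({
    "Date ": ["2011-11-12", "2011-11-12", "2011-11-13", "2011-11-13"],
    "Week": [46, 46, 46, 46],
    "Year": [2011, 2011, 2011, 2011],
    "Depth": [1, 2, 1, 2],
    "J1": [0.12, 0.24, 0.13, 0.23],
})

clean = reshape_site(raw, date_col="Date ")
print(clean.columns.tolist())
```

Under these assumptions, each site would need only a `pd.read_excel` call followed by one call to `reshape_site` with the appropriate `date_col`.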


Saving the output data for each site to CSV

In [35]:
data_aes.to_csv('./output_data/soil_moisture_AES.csv', index=False)

In [36]:
data_j.to_csv('./output_data/soil_moisture_J.csv', index=False)

In [37]:
data_od.to_csv('./output_data/soil_moisture_OD.csv', index=False)

In [38]:
data_w.to_csv('./output_data/soil_moisture_W.csv', index=False)
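The four export cells could also be written as one loop that creates the output folder if it is missing. This is a sketch under assumptions: `save_sites` is a hypothetical helper, and the demo writes a one-row made-up frame to a temporary directory rather than to `./output_data`.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_sites(frames, out_dir):
    """Write each DataFrame to soil_moisture_<KEY>.csv inside out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    written = []
    for key, df in frames.items():
        path = out_dir / f"soil_moisture_{key}.csv"
        df.to_csv(path, index=False)
        written.append(path)
    return written


# Demo with a made-up frame and a temporary folder.
demo = {"AES": pd.DataFrame({"Date": ["2011-10-29"], "A1_1": [0.264665]})}
with tempfile.TemporaryDirectory() as tmp:
    paths = save_sites(demo, tmp)
    round_trip = pd.read_csv(paths[0])
```

In the notebook itself, the call would be along the lines of `save_sites({"AES": data_aes, "J": data_j, "OD": data_od, "W": data_w}, "./output_data")`.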