### MAC Season 6 Data Cleaning
#### Traits
- aboveground dry biomass
- canopy height

#### Season Dates
- Planting: 2018-04-25
- Last Day of Harvest: 2018-08-01

This notebook contains the code used to clean Season 6 sorghum data from the Maricopa Agricultural Center (MAC). The raw input data were queried from BETYdb using the following `R` code:
```
library(traits)

options(betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
        betydb_api_version = 'v1',
        betydb_key = 'abcde_super_secret_key_1234')

season_6 <- betydb_query(sitename  = "~Season 6",
                         limit     =  "none")

write.csv(season_6, file = 'mac_season_six_2020-04-22.csv')
```
- Environmental data were downloaded from the MAC weather station [website](https://cals.arizona.edu/azmet/06.htm). 
- Please email ejcain@arizona.edu with any questions or comments. 

#### Custom Functions Used

In [3]:
def check_for_subplots(df):
    """
    Take a dataframe and check for subplot sitenames ending in ' E' or ' W'.
    Return the rows with subplots, if any; otherwise return an empty dataframe.
    """
    return df.loc[(df.sitename.str.endswith(' E')) | (df.sitename.str.endswith(' W'))]
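
As a quick illustration of what this helper returns, here is a minimal sketch on a made-up dataframe. The sitenames are hypothetical and only the `sitename` column is needed.

```python
import pandas as pd


def check_for_subplots(df):
    """Return rows whose sitename ends in ' E' or ' W' (subplots)."""
    return df.loc[(df.sitename.str.endswith(' E')) | (df.sitename.str.endswith(' W'))]


# hypothetical sitenames, for illustration only
toy = pd.DataFrame({'sitename': [
    'MAC Field Scanner Season 6 Range 41 Column 5',
    'MAC Field Scanner Season 6 Range 41 Column 5 E',
    'MAC Field Scanner Season 6 Range 41 Column 5 W',
]})

# only the two subplot rows are returned; the whole-plot row is filtered out
subplot_rows = check_for_subplots(toy)
print(subplot_rows)
```

When no sitename ends in ' E' or ' W', the result is an empty dataframe that still carries the column headers.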

#### I. Import python packages

In [4]:
import datetime
import numpy as np
import pandas as pd
import sqlalchemy
import sqlite3

#### II. Read in dataset

In [5]:
df_0 = pd.read_csv('data/raw/mac_season_six_2020-04-22.csv', low_memory=False)
print(df_0.shape)
# df_0.head(3)

(380574, 39)


Unnamed: 0.1,Unnamed: 0,checked,result_type,id,citation_id,site_id,treatment_id,sitename,city,lat,...,n,statname,stat,notes,access_level,cultivar,entity,method_name,view_url,edit_url
0,1,0,traits,6003833949,6000000020,6000014606,6000000032,MAC Field Scanner Season 6 Range 41 Column 5,Maricopa,33.075985,...,,,,,2,PI329519,,Whole above ground biomass moisture content,https://terraref.ncsa.illinois.edu/bety/traits...,https://terraref.ncsa.illinois.edu/bety/traits...
1,2,0,traits,6003833979,6000000020,6000015300,6000000032,MAC Field Scanner Season 6 Range 41 Column 15,Maricopa,33.075985,...,,,,,2,PI585954,,Whole above ground biomass moisture content,https://terraref.ncsa.illinois.edu/bety/traits...,https://terraref.ncsa.illinois.edu/bety/traits...
2,3,0,traits,6003833985,6000000020,6000015304,6000000032,MAC Field Scanner Season 6 Range 42 Column 3,Maricopa,33.076021,...,,,,,2,PI330807,,Whole above ground biomass moisture content,https://terraref.ncsa.illinois.edu/bety/traits...,https://terraref.ncsa.illinois.edu/bety/traits...


#### III. Drop Columns

In [6]:
# df_0.columns

Index(['Unnamed: 0', 'checked', 'result_type', 'id', 'citation_id', 'site_id',
       'treatment_id', 'sitename', 'city', 'lat', 'lon', 'scientificname',
       'commonname', 'genus', 'species_id', 'cultivar_id', 'author',
       'citation_year', 'treatment', 'date', 'time', 'raw_date', 'month',
       'year', 'dateloc', 'trait', 'trait_description', 'mean', 'units', 'n',
       'statname', 'stat', 'notes', 'access_level', 'cultivar', 'entity',
       'method_name', 'view_url', 'edit_url'],
      dtype='object')

In [8]:
cols_to_drop = ['Unnamed: 0', 'checked', 'result_type', 'id', 'citation_id', 'site_id', 'treatment_id', 'city', 
                'scientificname', 'commonname', 'genus', 'species_id', 'cultivar_id', 'author',
                'citation_year', 'time', 'raw_date', 'month', 'year', 'dateloc', 'n', 'statname', 'stat', 'notes', 
                'access_level', 'entity', 'view_url', 'edit_url', 'treatment']

In [9]:
df_1 = df_0.drop(labels=cols_to_drop, axis=1)
print(df_1.shape)
# df_1.head(3)

(380574, 10)


Unnamed: 0,sitename,lat,lon,date,trait,trait_description,mean,units,cultivar,method_name
0,MAC Field Scanner Season 6 Range 41 Column 5,33.075985,-111.974983,2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,Whole above ground biomass moisture content,80.3,%,PI329519,Whole above ground biomass moisture content
1,MAC Field Scanner Season 6 Range 41 Column 15,33.075985,-111.974819,2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,Whole above ground biomass moisture content,73.8,%,PI585954,Whole above ground biomass moisture content
2,MAC Field Scanner Season 6 Range 42 Column 3,33.076021,-111.975015,2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,Whole above ground biomass moisture content,76.5,%,PI330807,Whole above ground biomass moisture content
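
An alternative to dropping columns after loading is to name the columns to keep when reading the file. The sketch below is not what this notebook does; it assumes a tiny in-memory CSV (made up for illustration) and uses the `usecols` parameter of `pd.read_csv`.

```python
import io
import pandas as pd

# made-up CSV standing in for the raw BETYdb export
raw_csv = io.StringIO(
    "checked,sitename,lat,lon,date,trait,mean,units,city\n"
    "0,MAC Field Scanner Season 6 Range 41 Column 5,33.075985,-111.974983,"
    "2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,80.3,%,Maricopa\n"
)

cols_to_keep = ['sitename', 'lat', 'lon', 'date', 'trait', 'mean', 'units']

# only the listed columns are kept; 'checked' and 'city' never reach the dataframe
kept = pd.read_csv(raw_csv, usecols=cols_to_keep)
print(kept.columns.tolist())
```

This can reduce memory use on a wide export, at the cost of having to maintain a keep-list instead of a drop-list.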


In [10]:
for col in df_1.columns:
    print(f'{col}: {df_1[col].nunique()}')

sitename: 847
lat: 847
lon: 847
date: 127
trait: 16
trait_description: 9
mean: 189620
units: 8
cultivar: 326
method_name: 13
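
The same per-column counts can be obtained in one call with `DataFrame.nunique()`. A minimal sketch on made-up values follows; note that missing values are not counted by default (`dropna=True`).

```python
import pandas as pd

# made-up rows; the second 'units' entry is missing on purpose
toy = pd.DataFrame({
    'trait': ['canopy_height', 'emergence_count', 'leaf_length'],
    'units': ['cm', None, 'mm'],
})

# one count per column, returned as a Series indexed by column name
counts = toy.nunique()
print(counts)
```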


#### IV. Change `date` format

In [11]:
new_dates = []

for d in df_1.date.values:
    
    # strip the trailing ' (America/Phoenix)' suffix (18 characters) from date
    if 'Phoenix' in d:
        new_name = d[:-18]
        new_dates.append(new_name)
    
    else:
        new_name = d
        new_dates.append(new_name)
        

# check that length of new dates matches number of rows
print(len(new_dates))
print(df_1.shape[0])

380574
380574


Convert string dates to datetime

In [12]:
iso_format_dates = pd.to_datetime(new_dates)
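
The strip-and-convert steps above can also be written without an explicit loop. The following is a sketch on a made-up sample of the two date styles seen in the raw data, not a replacement for the cells above; the explicit `format` string is an assumption based on values such as `2018 Aug 1`.

```python
import pandas as pd

# made-up sample of the two date styles seen in the raw export
raw_dates = pd.Series([
    '2018 Aug 1 (America/Phoenix)',
    '2018 Jun 24',
    '2018 May 26 (America/Phoenix)',
])

# remove the trailing ' (America/Phoenix)' wherever it occurs
stripped = raw_dates.str.replace(r'\s*\(America/Phoenix\)$', '', regex=True)

# explicit format: four-digit year, abbreviated month name, day of month
parsed = pd.to_datetime(stripped, format='%Y %b %d')
print(parsed)
```

Matching the suffix with a pattern avoids relying on its exact length, which the slice `d[:-18]` does.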

Add new column with datetime values

In [13]:
# copy df to avoid SettingWithCopyWarning

df_2 = df_1.copy()
df_2['date_1'] = iso_format_dates

print(df_2.shape)
# df_2.head(3)

(380574, 11)


Unnamed: 0,sitename,lat,lon,date,trait,trait_description,mean,units,cultivar,method_name,date_1
0,MAC Field Scanner Season 6 Range 41 Column 5,33.075985,-111.974983,2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,Whole above ground biomass moisture content,80.3,%,PI329519,Whole above ground biomass moisture content,2018-08-01
1,MAC Field Scanner Season 6 Range 41 Column 15,33.075985,-111.974819,2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,Whole above ground biomass moisture content,73.8,%,PI585954,Whole above ground biomass moisture content,2018-08-01
2,MAC Field Scanner Season 6 Range 42 Column 3,33.076021,-111.975015,2018 Aug 1 (America/Phoenix),aboveground_biomass_moisture,Whole above ground biomass moisture content,76.5,%,PI330807,Whole above ground biomass moisture content,2018-08-01


#### V. Extract Range & Column Values for Location

In [14]:
df_3 = df_2.copy()

# raw strings keep '\d' from being treated as an invalid escape sequence
df_3['range'] = df_3['sitename'].str.extract(r"Range (\d+)").astype(int)
df_3['column'] = df_3['sitename'].str.extract(r"Column (\d+)").astype(int)

# df_3.sample(n=3)

Unnamed: 0,sitename,lat,lon,date,trait,trait_description,mean,units,cultivar,method_name,date_1,range,column
85630,MAC Field Scanner Season 6 Range 28 Column 14,33.075518,-111.974835,2018 May 26 (America/Phoenix),leaf_length,Length of leaf from tip to stem along the midrib.,196.132,mm,PI156463,3D scanner to leaf length and width,2018-05-26,28,14
296023,MAC Field Scanner Season 6 Range 30 Column 7,33.07559,-111.97495,2017 Apr 27,emergence_count,number of plants counted within plot or subplo...,70.0,,PI217691,Stereo RGB data to emergence count,2017-04-27,30,7
50545,MAC Field Scanner Season 6 Range 11 Column 3,33.074907,-111.975015,2018 Jun 24,canopy_cover,Fraction of ground covered by plant,97.891245,%,PI170787,Green Canopy Cover Estimation from Field Scann...,2018-06-24,11,3
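
Both values can also be pulled out with a single pattern that uses named groups. The sketch below runs on two sitenames in the style of this dataset and adds a check for non-matching rows, since a sitename without a range or column would yield NaN and make `astype(int)` fail.

```python
import pandas as pd

sitenames = pd.Series([
    'MAC Field Scanner Season 6 Range 41 Column 5',
    'MAC Field Scanner Season 6 Range 9 Column 12',
])

# named groups become the column names of the returned dataframe
location = sitenames.str.extract(r'Range (?P<range>\d+) Column (?P<column>\d+)')

# count rows where either group failed to match
n_unmatched = int(location.isna().any(axis=1).sum())

# safe to cast only when every row matched
location = location.astype(int)
print(location)
```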


#### VI. Drop, Rename, & Reorder Columns

In [15]:
# drop string date column

df_4 = df_3.drop(labels=['date'], axis=1)

In [16]:
df_5 = df_4.rename({'date_1': 'date', 'mean': 'value'}, axis=1)

In [17]:
new_col_order = ['sitename', 'range', 'column', 'lat', 'lon', 'date', 'trait', 'trait_description', 'method_name', 'cultivar', 'value', 'units']

df_6 = pd.DataFrame(data=df_5, columns=new_col_order).reset_index(drop=True)
print(df_6.shape)
# df_6.head(3)

(380574, 12)


Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units
0,MAC Field Scanner Season 6 Range 41 Column 5,41,5,33.075985,-111.974983,2018-08-01,aboveground_biomass_moisture,Whole above ground biomass moisture content,Whole above ground biomass moisture content,PI329519,80.3,%
1,MAC Field Scanner Season 6 Range 41 Column 15,41,15,33.075985,-111.974819,2018-08-01,aboveground_biomass_moisture,Whole above ground biomass moisture content,Whole above ground biomass moisture content,PI585954,73.8,%
2,MAC Field Scanner Season 6 Range 42 Column 3,42,3,33.076021,-111.975015,2018-08-01,aboveground_biomass_moisture,Whole above ground biomass moisture content,Whole above ground biomass moisture content,PI330807,76.5,%


In [16]:
df_6.trait.unique()

array(['surface_temperature', 'canopy_height', 'canopy_cover',
       'leaf_angle_mean', 'leaf_angle_alpha', 'leaf_angle_beta',
       'leaf_angle_chi', 'aboveground_biomass_moisture',
       'aboveground_fresh_biomass', 'leaf_width', 'leaf_length',
       'aboveground_dry_biomass', 'panicle_count', 'panicle_volume',
       'panicle_surface_area', 'stalk_diameter_fixed_height',
       'emergence_count'], dtype=object)

#### VII. Select for specific traits
- aboveground dry biomass
- canopy height

### A. Aboveground Dry Biomass

In [19]:
adb_0 = df_6.loc[df_6.trait == 'aboveground_dry_biomass']
print(adb_0.shape)
# adb_0.tail(3)

(617, 12)


Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units
335172,MAC Field Scanner Season 6 Range 52 Column 14,52,14,33.076381,-111.974836,2018-08-01,aboveground_dry_biomass,Aboveground Dry Biomass,Whole above ground biomass at harvest,SP1516,11800.0,kg / ha
335173,MAC Field Scanner Season 6 Range 52 Column 15,52,15,33.076381,-111.974819,2018-08-01,aboveground_dry_biomass,Aboveground Dry Biomass,Whole above ground biomass at harvest,SP1516,6910.0,kg / ha
345474,MAC Field Scanner Season 6 Range 30 Column 14,30,14,33.07559,-111.974835,2018-08-01,aboveground_dry_biomass,Aboveground Dry Biomass,Whole above ground biomass at harvest,PI576401,10500.0,kg / ha


##### Check for E and W subplots

In [21]:
# returns an empty dataframe if there are no subplots

check_for_subplots(adb_0)

Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units


In [23]:
# check date range - should span only the two days of harvesting

print(adb_0.date.min())
print(adb_0.date.max())

2018-07-31 00:00:00
2018-08-01 00:00:00


#### Drop & Rename Columns, Reset Index

In [31]:
adb_1 = adb_0.drop(labels=['trait', 'trait_description', 'method_name'], axis=1)
print(adb_1.shape)
# adb_1.head()

(617, 9)


Unnamed: 0,sitename,range,column,lat,lon,date,cultivar,value,units
5938,MAC Field Scanner Season 6 Range 30 Column 13,30,13,33.07559,-111.974852,2018-08-01,PI535792,8450.0,kg / ha
5939,MAC Field Scanner Season 6 Range 16 Column 13,16,13,33.075087,-111.974851,2018-08-01,PI570074,7140.0,kg / ha
5940,MAC Field Scanner Season 6 Range 19 Column 14,19,14,33.075195,-111.974835,2018-08-01,PI570431,15600.0,kg / ha
5941,MAC Field Scanner Season 6 Range 20 Column 6,20,6,33.075231,-111.974966,2018-08-01,PI562981,13000.0,kg / ha
5942,MAC Field Scanner Season 6 Range 22 Column 8,22,8,33.075302,-111.974933,2018-08-01,PI563021,8570.0,kg / ha


In [32]:
adb_2 = adb_1.rename({'value': 'aboveground_dry_biomass'}, axis=1)
print(adb_2.shape)
# adb_2.tail()

(617, 9)


Unnamed: 0,sitename,range,column,lat,lon,date,cultivar,aboveground_dry_biomass,units
335170,MAC Field Scanner Season 6 Range 50 Column 3,50,3,33.076309,-111.975016,2018-08-01,PI540518,13900.0,kg / ha
335171,MAC Field Scanner Season 6 Range 51 Column 5,51,5,33.076345,-111.974983,2018-08-01,PI563022,8270.0,kg / ha
335172,MAC Field Scanner Season 6 Range 52 Column 14,52,14,33.076381,-111.974836,2018-08-01,SP1516,11800.0,kg / ha
335173,MAC Field Scanner Season 6 Range 52 Column 15,52,15,33.076381,-111.974819,2018-08-01,SP1516,6910.0,kg / ha
345474,MAC Field Scanner Season 6 Range 30 Column 14,30,14,33.07559,-111.974835,2018-08-01,PI576401,10500.0,kg / ha


In [33]:
adb_3 = adb_2.reset_index(drop=True)
print(adb_3.shape)
# adb_3.head()

(617, 9)


Unnamed: 0,sitename,range,column,lat,lon,date,cultivar,aboveground_dry_biomass,units
0,MAC Field Scanner Season 6 Range 30 Column 13,30,13,33.07559,-111.974852,2018-08-01,PI535792,8450.0,kg / ha
1,MAC Field Scanner Season 6 Range 16 Column 13,16,13,33.075087,-111.974851,2018-08-01,PI570074,7140.0,kg / ha
2,MAC Field Scanner Season 6 Range 19 Column 14,19,14,33.075195,-111.974835,2018-08-01,PI570431,15600.0,kg / ha
3,MAC Field Scanner Season 6 Range 20 Column 6,20,6,33.075231,-111.974966,2018-08-01,PI562981,13000.0,kg / ha
4,MAC Field Scanner Season 6 Range 22 Column 8,22,8,33.075302,-111.974933,2018-08-01,PI563021,8570.0,kg / ha


#### Check for duplicates

In [38]:
adb_3.duplicated().value_counts()

False    362
True     255
dtype: int64

In [39]:
adb_duplicates = adb_3.loc[adb_3.duplicated()]
print(adb_duplicates.shape)
# adb_duplicates.sample(n=3)

(255, 9)


Unnamed: 0,sitename,range,column,lat,lon,date,cultivar,aboveground_dry_biomass,units
426,MAC Field Scanner Season 6 Range 37 Column 5,37,5,33.075842,-111.974983,2018-08-01,PI569423,6190.0,kg / ha
536,MAC Field Scanner Season 6 Range 4 Column 8,4,8,33.074655,-111.974933,2018-07-31,PI329711,12300.0,kg / ha
383,MAC Field Scanner Season 6 Range 8 Column 11,8,11,33.074799,-111.974884,2018-08-01,PI570042,9730.0,kg / ha


In [40]:
adb_3.loc[(adb_3.sitename == 'MAC Field Scanner Season 6 Range 37 Column 5') & (adb_3.cultivar == 'PI569423')]

Unnamed: 0,sitename,range,column,lat,lon,date,cultivar,aboveground_dry_biomass,units
341,MAC Field Scanner Season 6 Range 37 Column 5,37,5,33.075842,-111.974983,2018-08-01,PI569423,6190.0,kg / ha
426,MAC Field Scanner Season 6 Range 37 Column 5,37,5,33.075842,-111.974983,2018-08-01,PI569423,6190.0,kg / ha
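
By default `duplicated()` flags only the later copies of a repeated row, which is why the first occurrence had to be looked up separately above. A minimal sketch on made-up rows shows how `keep=False` returns every member of a duplicated group at once.

```python
import pandas as pd

# made-up rows; the first and third are identical
toy = pd.DataFrame({
    'sitename': ['Range 37 Column 5', 'Range 4 Column 8', 'Range 37 Column 5'],
    'aboveground_dry_biomass': [6190.0, 12300.0, 6190.0],
})

# default keep='first' flags only the later copies
later_copies = toy.loc[toy.duplicated()]

# keep=False flags every member of a duplicated group
all_copies = toy.loc[toy.duplicated(keep=False)]
print(all_copies)
```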


#### Drop Duplicates

In [41]:
adb_4 = adb_3.drop_duplicates(ignore_index=True)
print(adb_4.shape)
adb_4.tail()

(362, 9)


Unnamed: 0,sitename,range,column,lat,lon,date,cultivar,aboveground_dry_biomass,units
357,MAC Field Scanner Season 6 Range 52 Column 2,52,2,33.076381,-111.975032,2018-08-01,SP1516,9940.0,kg / ha
358,MAC Field Scanner Season 6 Range 52 Column 4,52,4,33.076381,-111.974999,2018-08-01,SP1516,12300.0,kg / ha
359,MAC Field Scanner Season 6 Range 52 Column 5,52,5,33.076381,-111.974983,2018-08-01,SP1516,8240.0,kg / ha
360,MAC Field Scanner Season 6 Range 52 Column 12,52,12,33.076381,-111.974868,2018-08-01,SP1516,12900.0,kg / ha
361,MAC Field Scanner Season 6 Range 30 Column 15,30,15,33.07559,-111.974819,2018-08-01,PI641830,10500.0,kg / ha


#### Write dataframe to csv file 

In [42]:
timestamp = datetime.datetime.now().replace(microsecond=0).isoformat()
output_filename = f'data/processed/aboveground_dry_biomass_season_6_{timestamp}.csv'.replace(':', '')

adb_4.to_csv(output_filename, index=False)
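
To show what filename this produces, here is the same construction with a fixed, made-up moment in place of `datetime.datetime.now()`. Removing the colons keeps the name valid on file systems that do not allow them, such as Windows.

```python
import datetime

# fixed, made-up moment so the result is reproducible
moment = datetime.datetime(2020, 4, 22, 13, 5, 9)

# ISO 8601 timestamp without microseconds, then strip the colons
timestamp = moment.replace(microsecond=0).isoformat()
output_filename = f'data/processed/aboveground_dry_biomass_season_6_{timestamp}.csv'.replace(':', '')

print(output_filename)
# data/processed/aboveground_dry_biomass_season_6_2020-04-22T130509.csv
```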

### B. Canopy Height

In [43]:
ch_0 = df_6.loc[df_6.trait == 'canopy_height']
print(ch_0.shape)
# ch_0.head(3)

(56707, 12)


Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units
15464,MAC Field Scanner Season 6 Range 9 Column 7,9,7,33.074835,-111.974949,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI152730,11.0,cm
15465,MAC Field Scanner Season 6 Range 9 Column 8,9,8,33.074835,-111.974933,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI569418,11.0,cm
15466,MAC Field Scanner Season 6 Range 9 Column 9,9,9,33.074835,-111.974917,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI455301,11.0,cm


In [44]:
# should return empty df if no subplots

subplots = check_for_subplots(ch_0)
subplots

Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units


#### Check for duplicates by sitename, date, and method name

In [50]:
ch_0.duplicated(subset=['sitename', 'date', 'method_name']).value_counts()

False    28762
True     27945
dtype: int64

In [51]:
duplicates = ch_0.loc[ch_0.duplicated(subset=['sitename', 'date', 'method_name'])]
print(duplicates.shape)
# duplicates.sample(n=3)

(27945, 12)


Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units
130078,MAC Field Scanner Season 6 Range 43 Column 13,43,13,33.076057,-111.974852,2018-05-20,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI329338,11.0,cm
148966,MAC Field Scanner Season 6 Range 15 Column 7,15,7,33.075051,-111.97495,2018-05-28,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI646251,29.0,cm
269104,MAC Field Scanner Season 6 Range 43 Column 8,43,8,33.076057,-111.974934,2018-06-29,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI329394,245.0,cm


In [52]:
ch_0.loc[(ch_0.sitename == 'MAC Field Scanner Season 6 Range 43 Column 13') & (ch_0.date == '2018-05-20')]

Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units
61927,MAC Field Scanner Season 6 Range 43 Column 13,43,13,33.076057,-111.974852,2018-05-20,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI329338,11.0,cm
130078,MAC Field Scanner Season 6 Range 43 Column 13,43,13,33.076057,-111.974852,2018-05-20,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI329338,11.0,cm


#### Drop Duplicates with same sitename, date, method name, cultivar, value

In [54]:
ch_1 = ch_0.drop_duplicates(subset=['sitename', 'date', 'method_name', 'cultivar', 'value'], ignore_index=True)
print(ch_1.shape)
# ch_1.head()

(28762, 12)


Unnamed: 0,sitename,range,column,lat,lon,date,trait,trait_description,method_name,cultivar,value,units
0,MAC Field Scanner Season 6 Range 9 Column 7,9,7,33.074835,-111.974949,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI152730,11.0,cm
1,MAC Field Scanner Season 6 Range 9 Column 8,9,8,33.074835,-111.974933,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI569418,11.0,cm
2,MAC Field Scanner Season 6 Range 9 Column 9,9,9,33.074835,-111.974917,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI455301,11.0,cm
3,MAC Field Scanner Season 6 Range 9 Column 10,9,10,33.074835,-111.9749,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI329645,12.0,cm
4,MAC Field Scanner Season 6 Range 9 Column 12,9,12,33.074835,-111.974868,2018-04-19,canopy_height,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI196598,11.0,cm


#### Use sqlite database to select columns and order by `date`

In [55]:
conn = sqlite3.connect('canopy_heights_season_6.sqlite')
cursor = conn.cursor()
print("Opened database successfully")

Opened database successfully


In [56]:
# comment next line out if the table has already been created; to_sql raises a ValueError when the table exists
ch_1.to_sql('canopy_heights_season_6.sqlite', conn)

In [59]:
# drop trait and units columns, rename value column

ch_2 = pd.read_sql_query("""
                            SELECT sitename, range, column, lat, lon, date, trait_description, method_name, 
                            cultivar, value as canopy_height_cm 
                            FROM 'canopy_heights_season_6.sqlite'
                            ORDER BY date ASC;
                            """, conn)

print(ch_2.shape)
# ch_2.head(3)

(28762, 10)


Unnamed: 0,sitename,range,column,lat,lon,date,trait_description,method_name,cultivar,canopy_height_cm
0,MAC Field Scanner Season 6 Range 9 Column 7,9,7,33.074835,-111.974949,2018-04-19 00:00:00,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI152730,11.0
1,MAC Field Scanner Season 6 Range 9 Column 8,9,8,33.074835,-111.974933,2018-04-19 00:00:00,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI569418,11.0
2,MAC Field Scanner Season 6 Range 9 Column 9,9,9,33.074835,-111.974917,2018-04-19 00:00:00,"top of the general canopy of the plant, discou...",3D scanner to 98th quantile height,PI455301,11.0
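
The output above shows that `date` comes back from SQLite as text (`2018-04-19 00:00:00`). The sketch below, on made-up rows and an in-memory database, shows two optional parameters that may help here: `if_exists='replace'` lets the write cell be rerun, and `parse_dates` converts the column back to datetime on read. The table name `canopy_heights` and the plot names are hypothetical.

```python
import sqlite3
import pandas as pd

# made-up rows; dates are stored as text, as in the notebook's query output
toy = pd.DataFrame({
    'sitename': ['plot_b', 'plot_a'],
    'date': ['2018-07-30 00:00:00', '2018-04-19 00:00:00'],
    'value': [336.0, 11.0],
})

conn = sqlite3.connect(':memory:')

# replace the table if it already exists, and do not store the dataframe index
toy.to_sql('canopy_heights', conn, if_exists='replace', index=False)

# rename the value column, order by date, and parse the date text on read
out = pd.read_sql_query(
    """
    SELECT sitename, date, value AS canopy_height_cm
    FROM canopy_heights
    ORDER BY date ASC;
    """,
    conn,
    parse_dates=['date'],
)
conn.close()
print(out.dtypes)
```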


#### Set date as index

In [60]:
ch_3 = ch_2.set_index(keys='date')
print(ch_3.shape)
ch_3.tail(3)

(28762, 9)


date,sitename,range,column,lat,lon,trait_description,method_name,cultivar,canopy_height_cm
2018-07-30 00:00:00,MAC Field Scanner Season 6 Range 53 Column 15,53,15,33.076417,-111.974819,"top of the general canopy of the plant, discou...",Scanner 3d ply data to height,SP1516,336.0
2018-07-30 00:00:00,MAC Field Scanner Season 6 Range 29 Column 6,29,6,33.075554,-111.974966,"top of the general canopy of the plant, discou...",Scanner 3d ply data to height,PI521019,340.0
2018-07-30 00:00:00,MAC Field Scanner Season 6 Range 39 Column 16,39,16,33.075913,-111.974803,"top of the general canopy of the plant, discou...",Scanner 3d ply data to height,SP1516,328.0
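
For comparison, the select, rename, order, and set-index steps can be sketched in pandas alone, which also keeps `date` as a datetime index. This is an alternative under the assumption that no other SQL features are needed, shown on made-up rows with hypothetical plot names.

```python
import pandas as pd

# made-up rows, deliberately out of date order
toy = pd.DataFrame({
    'sitename': ['plot_b', 'plot_a'],
    'date': pd.to_datetime(['2018-07-30', '2018-04-19']),
    'trait': ['canopy_height', 'canopy_height'],
    'value': [336.0, 11.0],
    'units': ['cm', 'cm'],
})

result = (
    toy.drop(columns=['trait', 'units'])               # drop trait and units columns
       .rename(columns={'value': 'canopy_height_cm'})  # rename value column
       .sort_values('date', kind='stable')             # order by date, ascending
       .set_index('date')                              # set date as index
)
print(result)
```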


#### Write dataframe to csv file 

In [None]:
timestamp = datetime.datetime.now().replace(microsecond=0).isoformat()
output_filename = f'data/processed/canopy_height_season_6_{timestamp}.csv'.replace(':', '')

ch_3.to_csv(output_filename, index=True)