## Import data into DataFrame with DateTime index

This code imports the spider activity data and converts it into a usable format by adding a 'Date_Time' column and setting it as the index. It also includes a short piece of code that counts the total number of days in the experiment, which becomes more important in the next step.

In [24]:
file_names = "Metazygia wittfeldae Monitor 1 activity"

import pandas as pd

def importData(file_names):
    # Declares df as a global variable so it can be referenced outside the function
    global df
    df = pd.read_csv(file_names + '.csv')
    # Combines the Date and Time columns into a Date_Time column and sets it as the index
    df["Date_Time"] = pd.to_datetime(df["Date"] + ' ' + df["Time"])
    df = df.set_index("Date_Time")

    # Counts the number of distinct dates in the data
    total_days = df["Date"].nunique()
    print("The experiment spanned a total of", total_days, "days.")

importData(file_names)

display(df)

The experiment spanned a total of 19 days.


Date_Time,Date,Time,lights,s1,s2,s3,s4,s5,s6,s7,...,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32
2017-06-10 12:25:00,10-Jun-17,12:25:00,1,0,3,0,0,0,15,15,...,0,0,5,0,0,0,0,0,0,0
2017-06-10 12:26:00,10-Jun-17,12:26:00,1,0,3,0,0,0,63,2,...,0,0,5,0,0,0,0,0,0,0
2017-06-10 12:27:00,10-Jun-17,12:27:00,1,0,1,0,0,7,5,0,...,1,0,0,18,1,0,0,0,0,0
2017-06-10 12:28:00,10-Jun-17,12:28:00,1,0,5,0,0,3,11,0,...,2,0,0,4,1,0,0,0,0,0
2017-06-10 12:29:00,10-Jun-17,12:29:00,1,0,0,0,0,0,0,0,...,0,0,4,1,2,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2017-06-28 13:44:00,28-Jun-17,13:44:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2017-06-28 13:45:00,28-Jun-17,13:45:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2017-06-28 13:46:00,28-Jun-17,13:46:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2017-06-28 13:47:00,28-Jun-17,13:47:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
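
As a variation on the import step above, the sketch below returns the DataFrame instead of storing it in a global variable, and passes an explicit `format` to `pd.to_datetime`. The function name `import_data`, the format string, and the three sample rows are assumptions made for illustration. The format is inferred from values such as `10-Jun-17` and `0:00:00` shown in the output, and `%b` expects English month abbreviations.

```python
import io
import pandas as pd

def import_data(source):
    """Read a monitor CSV and index it by a combined Date_Time column."""
    frame = pd.read_csv(source)
    # Assumed layout: dates like '10-Jun-17' and times like '12:25:00'
    frame["Date_Time"] = pd.to_datetime(
        frame["Date"] + " " + frame["Time"], format="%d-%b-%y %H:%M:%S"
    )
    return frame.set_index("Date_Time")

# Tiny stand-in for the real CSV file (hypothetical values)
sample_csv = io.StringIO(
    "Date,Time,lights,s1\n"
    "10-Jun-17,23:59:00,1,0\n"
    "11-Jun-17,0:00:00,0,6\n"
    "11-Jun-17,0:01:00,0,0\n"
)
sample = import_data(sample_csv)

# Count distinct calendar days directly from the DatetimeIndex
total_days = sample.index.normalize().nunique()
print("Days covered:", total_days)
```

Returning the frame keeps each step testable on its own. With the real file, the call would look like `df = import_data(file_names + '.csv')`.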


## Remove incomplete days from DataFrame

This function looks through the data frame and removes incomplete days. It does so by counting how many rows (one per minute) are recorded for each day. If the count does not equal 1440 (60 minutes * 24 hours), that day is deleted from the data frame. The day count from the previous step is repeated here so the number of complete days can be compared with the total number of days.

In [25]:
def deleteIncompleteDays():
    # Declares df as a global variable so it can be referenced outside the function and reassigned here
    global df
    # Counts the rows (one per minute) recorded on each date
    mins_in_day = df.Date.value_counts()
    complete_days = mins_in_day[mins_in_day == 24 * 60]
    # Keeps only the complete days; .copy() makes df an independent frame so later column edits do not raise warnings
    df = df[df.Date.isin(complete_days.index)].copy()
    
    total_complete_days = df.Date.value_counts().shape[0]
    print("The total amount of complete (24 hour) days conducted in the experiment is", total_complete_days, "days.")

deleteIncompleteDays()

display(df.head())
display(df.tail())

The total amount of complete (24 hour) days conducted in the experiment is 17 days.


Date_Time,Date,Time,lights,s1,s2,s3,s4,s5,s6,s7,...,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32
2017-06-11 00:00:00,11-Jun-17,0:00:00,0,6,2,0,0,0,0,7,...,5,1,0,0,0,0,0,7,6,0
2017-06-11 00:01:00,11-Jun-17,0:01:00,0,0,6,0,0,0,0,8,...,0,7,0,4,0,0,0,0,0,0
2017-06-11 00:02:00,11-Jun-17,0:02:00,0,0,11,0,0,0,0,0,...,0,0,0,17,0,0,0,0,1,0
2017-06-11 00:03:00,11-Jun-17,0:03:00,0,0,7,0,0,0,0,0,...,0,4,0,0,0,0,0,0,0,0
2017-06-11 00:04:00,11-Jun-17,0:04:00,0,0,20,0,0,0,5,4,...,0,0,0,0,0,0,0,0,0,0


Date_Time,Date,Time,lights,s1,s2,s3,s4,s5,s6,s7,...,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32
2017-06-27 23:55:00,27-Jun-17,23:55:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,2,0,1,0
2017-06-27 23:56:00,27-Jun-17,23:56:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,3,0,0,0
2017-06-27 23:57:00,27-Jun-17,23:57:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,3,0,0,3
2017-06-27 23:58:00,27-Jun-17,23:58:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
2017-06-27 23:59:00,27-Jun-17,23:59:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
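
The same filter can also be written against the DatetimeIndex rather than the 'Date' text column. The sketch below is one possible version, assuming one row per minute. The name `keep_complete_days` and the demo data are hypothetical and not part of the original notebook.

```python
import pandas as pd

MINUTES_PER_DAY = 24 * 60  # one row per minute is assumed

def keep_complete_days(frame, expected_rows=MINUTES_PER_DAY):
    """Return only the rows that belong to days with exactly expected_rows rows."""
    day = frame.index.normalize()              # midnight timestamp of each row
    rows_per_day = frame.groupby(day).size()   # number of rows recorded per day
    complete = rows_per_day[rows_per_day == expected_rows].index
    return frame[day.isin(complete)]

# Hypothetical data: 2 minutes of 10 June, all of 11 June, 2 minutes of 12 June
index = pd.date_range("2017-06-10 23:58", periods=MINUTES_PER_DAY + 4, freq="min")
demo = pd.DataFrame({"s1": 0}, index=index)

complete_only = keep_complete_days(demo)
print(len(demo), "rows before,", len(complete_only), "rows after")
```

Only the full day in the middle survives, which mirrors how the first and last (partial) days of the experiment are dropped above.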


## Split DataFrame into LD and DD dataframes

This function separates the data frame into LD (light-dark) and DD (constant darkness) cycles. First, it creates a new column named 'switch' that records every change in the lights value (1->0 or 0->1). Next, it keeps only the rows where switch is equal to -1 (when the lights turn off). The date of the last of these rows is the final day of LD. The data frame is then split into two separate data frames based on this date. If the date is on or before the LD end date, the row is classified as LD and saved into its own CSV file. If the date is after the LD end date, the row is saved into its own DD CSV file.
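
To make the 'switch' idea concrete, here is a minimal sketch on a made-up lights series (the values and timestamps are hypothetical). `diff()` subtracts the previous value from each value, so a change from 1 to 0 shows up as -1 and a change from 0 to 1 shows up as 1.

```python
import pandas as pd

# Hypothetical lights record: on, on, off, off, on, off, off
lights = pd.Series(
    [1, 1, 0, 0, 1, 0, 0],
    index=pd.date_range("2017-06-17 19:58", periods=7, freq="min"),
)

# Difference between each row and the row before it (first row is NaN)
switch = lights.diff()

# Timestamps where the lights go from on (1) to off (0)
lights_off_times = switch[switch == -1].index

# The last lights-off event marks the end of the LD portion
last_lights_off = lights_off_times[-1]
print(last_lights_off)
```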

In [26]:
def splitDF():
    # Creates global variables so LD and DD can be referenced outside the function
    global LD, DD
    # Creates a new column that records each change in the 'lights' value
    df["switch"] = df['lights'].diff()
    # Selects the rows where switch == -1 (the lights turn off)
    lights_turn_off = df[df['switch'] == -1].dropna()
    # Takes the day of the last lights-off event from the DatetimeIndex;
    # comparing real dates avoids alphabetical comparison of text such as '9-Jun-17'
    LD_end_date = lights_turn_off.index[-1].normalize()
    # For cosmetic purposes, so our dataframe only has the necessary columns
    del df['switch']
    
    # Day (midnight timestamp) of every row, used for the comparisons below
    row_dates = df.index.normalize()
    # If the date is on or before the end date, it is in the LD cycle
    LD = df[row_dates <= LD_end_date]
    LD.to_csv(file_names + "_LD.csv")
    # If the date is after the end date, it is in the DD cycle
    DD = df[row_dates > LD_end_date]
    DD.to_csv(file_names + "_DD.csv")
    
splitDF()

In [27]:
display(LD)
display(DD)

Date_Time,Date,Time,lights,s1,s2,s3,s4,s5,s6,s7,...,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32
2017-06-11 00:00:00,11-Jun-17,0:00:00,0,6,2,0,0,0,0,7,...,5,1,0,0,0,0,0,7,6,0
2017-06-11 00:01:00,11-Jun-17,0:01:00,0,0,6,0,0,0,0,8,...,0,7,0,4,0,0,0,0,0,0
2017-06-11 00:02:00,11-Jun-17,0:02:00,0,0,11,0,0,0,0,0,...,0,0,0,17,0,0,0,0,1,0
2017-06-11 00:03:00,11-Jun-17,0:03:00,0,0,7,0,0,0,0,0,...,0,4,0,0,0,0,0,0,0,0
2017-06-11 00:04:00,11-Jun-17,0:04:00,0,0,20,0,0,0,5,4,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2017-06-18 23:55:00,18-Jun-17,23:55:00,0,0,14,0,0,0,37,0,...,0,1,0,0,0,0,0,0,16,0
2017-06-18 23:56:00,18-Jun-17,23:56:00,0,0,2,0,0,0,13,0,...,0,2,0,0,0,0,0,0,8,8
2017-06-18 23:57:00,18-Jun-17,23:57:00,0,0,6,0,0,0,7,0,...,13,3,0,0,0,0,0,0,0,4
2017-06-18 23:58:00,18-Jun-17,23:58:00,0,0,5,0,0,0,22,0,...,4,5,0,0,0,0,0,0,0,0


Date_Time,Date,Time,lights,s1,s2,s3,s4,s5,s6,s7,...,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32
2017-06-19 00:00:00,19-Jun-17,0:00:00,0,0,8,0,0,0,10,0,...,0,0,0,0,0,0,0,0,17,0
2017-06-19 00:01:00,19-Jun-17,0:01:00,0,0,8,0,0,0,8,0,...,0,0,0,0,0,0,0,0,3,0
2017-06-19 00:02:00,19-Jun-17,0:02:00,0,0,1,0,0,0,11,0,...,0,12,0,0,0,0,0,0,0,0
2017-06-19 00:03:00,19-Jun-17,0:03:00,0,0,7,0,0,0,0,0,...,0,3,0,0,0,0,0,0,0,0
2017-06-19 00:04:00,19-Jun-17,0:04:00,0,0,0,0,0,0,0,0,...,0,9,0,0,0,0,2,0,0,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2017-06-27 23:55:00,27-Jun-17,23:55:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,2,0,1,0
2017-06-27 23:56:00,27-Jun-17,23:56:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,3,0,0,0
2017-06-27 23:57:00,27-Jun-17,23:57:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,3,0,0,3
2017-06-27 23:58:00,27-Jun-17,23:58:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
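
The split can also be sketched as a function that returns both frames and compares days on the DatetimeIndex. This is only an illustration: the name `split_ld_dd`, the `lights_col` parameter, and the hourly demo data are assumptions, and writing the CSV files is left out.

```python
import pandas as pd

def split_ld_dd(frame, lights_col="lights"):
    """Split frame into (LD, DD) at the day of the last lights-off event."""
    switch = frame[lights_col].diff()
    off_times = frame.index[(switch == -1).to_numpy()]
    if len(off_times) == 0:
        raise ValueError("no lights-off transition found")
    ld_end_day = off_times[-1].normalize()   # midnight of the last LD day
    day = frame.index.normalize()            # midnight timestamp of each row
    return frame[day <= ld_end_day], frame[day > ld_end_day]

# Hypothetical hourly data: lights on 08:00-19:59 on 17 and 18 June, dark on 19 June
index = pd.date_range("2017-06-17", periods=72, freq="60min")
lit = (index.hour >= 8) & (index.hour < 20) & (index.normalize() < pd.Timestamp("2017-06-19"))
demo = pd.DataFrame({"lights": lit.astype(int), "s1": 0}, index=index)

ld, dd = split_ld_dd(demo)
print(len(ld), "LD rows,", len(dd), "DD rows")
```

Comparing timestamps instead of text matters when the dates are stored as strings, because strings are ordered character by character: `"9-Jun-17" <= "18-Jun-17"` is `False` even though 9 June comes before 18 June.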
