## Import Test Data

The "bin_test" data is a small filler data set made specifically for testing this program. It is much smaller than the actual data we will be working with and exists only to check that our functions behave as expected.

In [82]:
import numpy as np
import pandas as pd
pd.options.mode.chained_assignment = None

df = pd.read_csv("bin_test.csv")
df

Unnamed: 0,s1,s2
0,1,1
1,1,1
2,1,0
3,0,1
4,1,0
5,1,0
6,1,1
7,1,0
8,1,1
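
If `bin_test.csv` is not at hand, the same frame can be rebuilt inline. This is a minimal sketch that copies the `s1` and `s2` values from the output above; it assumes the leftmost column of that output is just the row index.

```python
import pandas as pd

# Values copied from the bin_test output shown above.
df = pd.DataFrame({
    "s1": [1, 1, 1, 0, 1, 1, 1, 1, 1],
    "s2": [1, 1, 0, 1, 0, 0, 1, 0, 1],
})
print(df)
```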


## Where is a 0 in between 2 values of 1?

This section visualizes where we want to replace data. The `switch` column shows where the value switches off or on: -1.0 marks a step from 1 to 0, and 1.0 marks a step from 0 to 1. Since our goal is to replace 0s in between 1s, we are looking for a -1.0 immediately followed by a 1.0. This is based on a single column, 's1'.

In [83]:
df_switch = df.copy()

# diff() subtracts the previous row: -1.0 is a step from 1 to 0, 1.0 a step from 0 to 1.
df_switch['switch'] = df_switch['s1'].diff()
from_1_to_0 = df_switch[df_switch['switch'] == -1]
from_0_to_1 = df_switch[df_switch['switch'] == 1]

display(df_switch)


Unnamed: 0,s1,s2,switch
0,1,1,
1,1,1,0.0
2,1,0,0.0
3,0,1,-1.0
4,1,0,1.0
5,1,0,0.0
6,1,1,0.0
7,1,0,0.0
8,1,1,0.0
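
The pattern can also be located programmatically rather than by eye. The sketch below rebuilds the `s1` values from the test data and flags every row where a -1.0 in the diff is immediately followed by a 1.0; the name `isolated_zero` is introduced here for illustration only.

```python
import pandas as pd

s1 = pd.Series([1, 1, 1, 0, 1, 1, 1, 1, 1], name="s1")

# -1.0 marks a 1 -> 0 step, 1.0 marks a 0 -> 1 step, NaN for the first row.
switch = s1.diff()

# A single 0 between two 1s: the drop happens at this row and the rise at the next one.
isolated_zero = (switch == -1) & (switch.shift(-1) == 1)

print(s1.index[isolated_zero].tolist())  # [3]
```

Row 3 is the only match, which agrees with the `switch` column shown above.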


## Reduce noise in a single column

Since there are many errors in our locomotor activity data and its collection, we assume that when a 0 is surrounded by 1s, the spider is actually in an uninterrupted bout of activity and not resting. This code replaces each 0 that sits directly between two 1s with a 1, which essentially "de-noises" the data. It applies to only one column, 's1'.

In [84]:
reduce_noise = df.copy()

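# Flag rows whose previous and next values are both 1, then set those rows to 1.
# Using & (not ==) leaves runs of 0s untouched; the NaN neighbours at either end compare as False.
surrounded = (reduce_noise['s1'].shift(1) == 1) & (reduce_noise['s1'].shift(-1) == 1)
reduce_noise.loc[surrounded, 's1'] = 1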

display(reduce_noise)
display(df)




Unnamed: 0,s1,s2
0,1,1
1,1,1
2,1,0
3,1,1
4,1,0
5,1,0
6,1,1
7,1,0
8,1,1


Unnamed: 0,s1,s2
0,1,1
1,1,1
2,1,0
3,0,1
4,1,0
5,1,0
6,1,1
7,1,0
8,1,1
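
The single-column rule can be wrapped in a small function so it is easy to test on its own. This is a sketch, not necessarily the final implementation; the function name `fill_isolated_zeros` is hypothetical.

```python
import pandas as pd

def fill_isolated_zeros(series: pd.Series) -> pd.Series:
    """Return a copy where any value whose two neighbours are both 1 is set to 1."""
    prev_is_one = series.shift(1) == 1    # NaN at the first row compares as False
    next_is_one = series.shift(-1) == 1   # NaN at the last row compares as False
    return series.mask(prev_is_one & next_is_one, 1)

s1 = pd.Series([1, 1, 1, 0, 1, 1, 1, 1, 1], name="s1")
cleaned = fill_isolated_zeros(s1)

print(cleaned.tolist())                                         # [1, 1, 1, 1, 1, 1, 1, 1, 1]
print(fill_isolated_zeros(pd.Series([1, 0, 0, 1])).tolist())    # [1, 0, 0, 1]
```

The two neighbour checks are combined with `&` so that only a value with a 1 on both sides is changed. A run of two or more 0s is left as it is, and the input series is not modified.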


## Noise reduction throughout multiple columns

Here we apply the same noise reduction to every column in our dataframe. Because the single-column code handles one column at a time, we loop through a list of column names and run it on each column in turn.

In [86]:
df1 = df.copy()

col = list(df1)
for x in col:
    # Same rule as for 's1': set a value to 1 when both of its neighbours are 1.
    surrounded = (df1[x].shift(1) == 1) & (df1[x].shift(-1) == 1)
    df1.loc[surrounded, x] = 1
    
display(df1)
display(df)


Unnamed: 0,s1,s2
0,1,1
1,1,1
2,1,1
3,1,1
4,1,0
5,1,0
6,1,1
7,1,1
8,1,1


Unnamed: 0,s1,s2
0,1,1
1,1,1
2,1,0
3,0,1
4,1,0
5,1,0
6,1,1
7,1,0
8,1,1
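
As an alternative to looping, `DataFrame.shift` moves every column at once, so the whole frame can be handled in one step. The sketch below assumes every column holds 0/1 activity values, as in the test data; the names `surrounded` and `df_clean` are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "s1": [1, 1, 1, 0, 1, 1, 1, 1, 1],
    "s2": [1, 1, 0, 1, 0, 0, 1, 0, 1],
})

# True wherever both the row above and the row below hold a 1, for every column at once.
surrounded = (df.shift(1) == 1) & (df.shift(-1) == 1)

# mask() returns a new frame with the flagged cells set to 1; df itself is unchanged.
df_clean = df.mask(surrounded, 1)

print(df_clean["s1"].tolist())  # [1, 1, 1, 1, 1, 1, 1, 1, 1]
print(df_clean["s2"].tolist())  # [1, 1, 1, 1, 0, 0, 1, 1, 1]
```

In `s2`, the two consecutive 0s at rows 4 and 5 stay as they are because neither has a 1 on both sides.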


## Final Code

In [90]:
file_names = "Metazygia wittfeldae Monitor 1 activity"

df_DD = pd.read_csv(file_names + '_DD_binary.csv', index_col = 0)
df_DD["Date_Time"] = pd.to_datetime(df_DD.Date + ' ' + df_DD.Time)
df_DD = df_DD.set_index("Date_Time")

display(df_DD)

Date_Time,Date,Time,lights,s1,s2,s3,s4,s5,s6,s7,...,s23,s24,s25,s26,s27,s28,s29,s30,s31,s32
2017-06-19 00:00:00,19-Jun-17,0:00:00,0,0,1,0,0,0,1,0,...,0,0,0,0,0,0,0,0,1,0
2017-06-19 00:01:00,19-Jun-17,0:01:00,0,0,1,0,0,0,1,0,...,0,0,0,0,0,0,0,0,1,0
2017-06-19 00:02:00,19-Jun-17,0:02:00,0,0,1,0,0,0,1,0,...,0,1,0,0,0,0,0,0,0,0
2017-06-19 00:03:00,19-Jun-17,0:03:00,0,0,1,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
2017-06-19 00:04:00,19-Jun-17,0:04:00,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2017-06-27 23:55:00,27-Jun-17,23:55:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,1,0
2017-06-27 23:56:00,27-Jun-17,23:56:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
2017-06-27 23:57:00,27-Jun-17,23:57:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,1
2017-06-27 23:58:00,27-Jun-17,23:58:00,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0


In [91]:
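# Only the sensor columns (s1, s2, ...) hold activity data; Date, Time and lights are left alone.
col_DD = [c for c in df_DD.columns if c.startswith('s') and c[1:].isdigit()]
for x in col_DD:
    # Set a value to 1 when both the previous and the next value are 1.
    surrounded = (df_DD[x].shift(1) == 1) & (df_DD[x].shift(-1) == 1)
    df_DD.loc[surrounded, x] = 1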
    
df_DD.to_csv(file_names + "_DD_binary_noiseReduction.csv")
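
Before relying on the saved file, it can help to check how many values the noise reduction changed in each sensor column. The sketch below uses a small made-up stand-in for `df_DD` (the values are invented for illustration) and assumes the sensor columns are exactly those named `s` followed by digits, as in the table above.

```python
import pandas as pd

# Small stand-in for df_DD; the values are made up for illustration.
demo = pd.DataFrame({
    "Date": ["19-Jun-17"] * 5,
    "Time": ["0:00:00", "0:01:00", "0:02:00", "0:03:00", "0:04:00"],
    "lights": [0, 0, 0, 0, 0],
    "s1": [1, 0, 1, 0, 0],
    "s2": [0, 1, 0, 1, 1],
})

# Assumption: sensor columns are exactly those named "s" followed by digits.
sensor_cols = demo.filter(regex=r"^s\d+$").columns

before = demo[sensor_cols]
surrounded = (before.shift(1) == 1) & (before.shift(-1) == 1)
after = before.mask(surrounded, 1)

# Every change should be a 0 that became a 1; count the changes per column.
changed = (after != before).sum()
print(changed)
```

A column with an unexpectedly large count may be worth inspecting by hand before the cleaned data is used further.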