### Statistical Analyses
#### List of tasks accomplished in this Jupyter Notebook:
- Print out the n values for each treatment for reference
- Compare physiological (length) differences between males, females, and fed vs starved larvae
- Compare larval activity to time of day
- Compare differences between fed and starved larvae in behavioral metrics
- Calculate the preference value for each stimulus for both fed and starved larvae
- Compare behavioral metrics for neutral, aversive, and appetitive cues
- Compare the effect of larval presence on dye diffusion over time
- Assess fit of the regression line used to fit distance to concentration for simulations
- Adjust all tests using the Holm-Bonferroni correction

In [1]:
R.version # print R version for reference

               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          5.1                         
year           2018                        
month          07                          
day            02                          
svn rev        74947                       
language       R                           
version.string R version 3.5.1 (2018-07-02)
nickname       Feather Spray               

In [2]:
library(broom) # provides glance() for extracting model-level statistics

setwd("A:/gitrepos/behavior_paper_final_code/data/")
data <- read.csv('./trajectories/summary/cleaned_animal_analyses.csv')

data$A_starved <- as.factor(data$A_starved)
data$A_treatment_odor <- as.factor(data$A_treatment_odor)
data$A_sex <- as.factor(data$A_sex)

method_n <- "holm" # p.adjust() method applied to all collected p-values at the end
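
The `method_n` setting above selects Holm's step-down procedure in `p.adjust`. As a minimal sketch of what that adjustment does, the block below recomputes it by hand on four made-up p-values and compares the result with the built-in function. The values and variable names are illustrative assumptions, not taken from the analysis.

```r
# Holm step-down adjustment written out by hand (illustrative values only)
p_raw <- c(0.010, 0.040, 0.030, 0.005)        # made-up raw p-values
m <- length(p_raw)

ord <- order(p_raw)                           # indices of p-values, smallest first
scaled <- (m - seq_len(m) + 1) * p_raw[ord]   # multiply by m, m-1, ..., 1
adj_sorted <- pmin(1, cummax(scaled))         # keep adjusted values non-decreasing, cap at 1

p_manual <- numeric(m)
p_manual[ord] <- adj_sorted                   # restore the original order

p_builtin <- p.adjust(p_raw, method = "holm")
print(all.equal(p_manual, p_builtin))
```

Because each adjusted value depends on the total number of tests, the adjustment is only meaningful once every p-value has been collected, which is why it is applied in the last cell.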

- Print out the n values for each treatment for reference

In [3]:
# Starved larvae: n per treatment, then n per sex
cat("Starved larvae:")
sub <- subset(data, A_starved=="1day")
data.frame(table(sub$A_treatment_odor))
data.frame(table(sub$A_sex))

# Fed larvae: n per treatment, then n per sex
cat("Fed larvae:")
sub <- subset(data, A_starved=="no")
data.frame(table(sub$A_treatment_odor))
data.frame(table(sub$A_sex))

Starved larvae:

Var1,Freq
naive_100ul_left_food_05percent,32
naive_100ul_left_food_extract,19
naive_100ul_left_indole_100uM,20
naive_100ul_left_indole_10mM,19
naive_100ul_left_milliQ_water,16
naive_100ul_left_o-cresol_100uM,25
naive_100ul_left_quinine_10mM,19
naive_100ul_left_yeastRNA_1gL,18


Var1,Freq
f,79
m,89


Fed larvae:

Var1,Freq
naive_100ul_left_food_05percent,57
naive_100ul_left_food_extract,19
naive_100ul_left_indole_100uM,36
naive_100ul_left_indole_10mM,17
naive_100ul_left_milliQ_water,39
naive_100ul_left_o-cresol_100uM,36
naive_100ul_left_quinine_10mM,24
naive_100ul_left_yeastRNA_1gL,20


Var1,Freq
f,120
m,128


- Compare physiological (length) differences between males, females, and fed vs starved larvae

In [4]:
# starved females vs fed females
sub <- subset(data, A_sex=="f")
sub1 <- subset(sub, A_starved=="1day")
sub2 <- subset(sub, A_starved=="no")
resp <- wilcox.test(sub1$A_larvae_length_mm, sub2$A_larvae_length_mm)
p <- resp$p.value
t <- "f starved vs f fed size"

# starved males vs fed males
sub <- subset(data, A_sex=="m")
sub1 <- subset(sub, A_starved=="1day")
sub2 <- subset(sub, A_starved=="no")
resp <- wilcox.test(sub1$A_larvae_length_mm, sub2$A_larvae_length_mm)
p <- c(p, resp$p.value)
t <- c(t, "m starved vs m fed size")

# starved females vs starved males
sub <- subset(data, A_starved=="1day")
sub1 <- subset(sub, A_sex=="f")
sub2 <- subset(sub, A_sex=="m")
resp <- wilcox.test(sub1$A_larvae_length_mm, sub2$A_larvae_length_mm)
p <- c(p, resp$p.value)
t <- c(t, "m starved vs f starved size")

# fed females vs fed males
sub <- subset(data, A_starved=="no")
sub1 <- subset(sub, A_sex=="f")
sub2 <- subset(sub, A_sex=="m")
resp <- wilcox.test(sub1$A_larvae_length_mm, sub2$A_larvae_length_mm)
p <- c(p, resp$p.value)
t <- c(t, "f fed vs m fed size")

cat("Number of tests run:", length(p))

Number of tests run: 4
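
The four comparisons above repeat the same pattern: subset, run `wilcox.test`, append the p-value and a label. One possible way to factor that pattern out is sketched below. The helper `add_wilcox`, the results data frame, and the synthetic lengths are all hypothetical and are not part of the original notebook.

```r
# Hypothetical helper: run a two-sample Wilcoxon rank-sum test and
# append its p-value and label to a running results table
add_wilcox <- function(results, x, y, label) {
  res <- wilcox.test(x, y)
  rbind(results, data.frame(test = label, p = res$p.value,
                            stringsAsFactors = FALSE))
}

set.seed(1)
# Synthetic lengths in mm (not the study data)
toy <- data.frame(
  length_mm = c(rnorm(20, 5.0, 0.4), rnorm(20, 5.6, 0.4)),
  starved   = rep(c("1day", "no"), each = 20)
)

results <- data.frame(test = character(0), p = numeric(0),
                      stringsAsFactors = FALSE)
results <- add_wilcox(results,
                      toy$length_mm[toy$starved == "1day"],
                      toy$length_mm[toy$starved == "no"],
                      "starved vs fed size (toy)")
print(results)
```

Keeping the label and the p-value in the same row avoids the two parallel vectors drifting out of step if a test is added or removed.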

- Compare larval activity to time of day

In [5]:
resp <- lm(data$A_time_move~data$A_minutes_past_L)
r <- glance(resp)$p.value
p <- c(p, r)
t <- c(t, "time moving by daylight regression")

resp <- lm(data$A_time_wall~data$A_minutes_past_L)
r <- glance(resp)$p.value
p <- c(p, r)
t <- c(t, "time next to wall by daylight regression")

resp <- lm(data$A_median_speed~data$A_minutes_past_L)
r <- glance(resp)$p.value
p <- c(p, r)
t <- c(t, "median speed by daylight regression")

cat("Number of tests run:", length(p))

Number of tests run: 7
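
For a linear model, `glance(resp)$p.value` is the p-value of the overall F-test. With a single predictor, as in the three regressions above, that value coincides with the p-value of the slope's t-test. The sketch below checks this in base R on synthetic data; the variable names and numbers are assumptions for illustration only.

```r
set.seed(2)
# Synthetic data (not the study data): an activity measure vs minutes past lights-on
minutes  <- runif(50, 0, 600)
activity <- 10 + 0.002 * minutes + rnorm(50)

fit <- lm(activity ~ minutes)
s   <- summary(fit)

# Overall F-test p-value, computed from the F statistic and its degrees of freedom
f <- s$fstatistic
p_overall <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# p-value of the slope's t-test
p_slope <- coef(s)["minutes", "Pr(>|t|)"]

print(c(overall = p_overall, slope = p_slope))
```

With more than one predictor the two values differ, because the F-test then assesses all predictors jointly.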

- Compare differences between fed and starved larvae in behavioral metrics

In [6]:
# Split once by starvation state and reuse both subsets for each metric
sub1 <- subset(data, A_starved=="1day")
sub2 <- subset(data, A_starved=="no")

# time spent moving
resp <- wilcox.test(sub1$A_time_move, sub2$A_time_move)
p <- c(p, resp$p.value)
t <- c(t, "time moving by starvation state")

# time spent next to the wall
resp <- wilcox.test(sub1$A_time_wall, sub2$A_time_wall)
p <- c(p, resp$p.value)
t <- c(t, "time next to wall by starvation state")

cat("Number of tests run:", length(p))

Number of tests run: 9

- Calculate the preference value for each stimulus for both fed and starved larvae

In [7]:
treatments <- c("naive_100ul_left_food_05percent", "naive_100ul_left_food_extract", 
                "naive_100ul_left_yeastRNA_1gL", "naive_100ul_left_quinine_10mM",
                "naive_100ul_left_indole_100uM", "naive_100ul_left_o-cresol_100uM", 
                "naive_100ul_left_milliQ_water", "naive_100ul_left_indole_10mM")

# For each starvation state and treatment, compare each row's A_ and E_
# median concentrations with a two-sided paired t-test
for (fed in c("1day", "no")) {
    for (treatment in treatments) {
        ss <- subset(data, A_treatment_odor==treatment & A_starved==fed)
        resp <- t.test(ss$A_median_conc, ss$E_median_conc, paired=TRUE, alternative="two.sided")
        p <- c(p, resp$p.value)
        t <- c(t, paste(treatment, fed, "paired test of median concentration", sep=" "))
    }
}

cat("Number of tests run:", length(p))

Number of tests run: 25
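
The paired t-test used above is equivalent to a one-sample t-test on the within-pair differences, so it asks whether the mean difference between the two columns departs from zero. A minimal sketch on synthetic values follows; `a` and `e` merely stand in for the two paired columns and are not the study data.

```r
set.seed(3)
# Synthetic paired values (not the study data): one pair per animal
a <- rnorm(15, mean = 1.0, sd = 0.3)
e <- a + rnorm(15, mean = 0.2, sd = 0.2)

paired  <- t.test(a, e, paired = TRUE, alternative = "two.sided")
one_smp <- t.test(a - e, mu = 0)   # same test, expressed on the differences

print(c(paired = paired$p.value, one_sample = one_smp$p.value))
```

The sign of `paired$estimate` (the mean of `a - e`) gives the direction of the shift, which the p-value alone does not show.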

- Compare behavioral metrics for neutral, aversive, and appetitive cues

In [8]:
data3 <- read.csv('./trajectories/summary/cleaned_animal_analyses_stimuli_groups.csv')
data3$stimulus <- as.factor(data3$stimulus)

resp <- kruskal.test(data3$median_conc_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "median concentration by odor")

resp <- kruskal.test(data3$cd_move_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "cd_move_diff by odor")

resp <- kruskal.test(data3$discovery_time_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "discovery_time by odor")

resp <- kruskal.test(data3$c_speed_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "c_speed_diff by odor")

resp <- kruskal.test(data3$cd_speed_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "cd_speed_diff by odor")

resp <- kruskal.test(data3$c_turn_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "c_turn_diff by odor")

resp <- kruskal.test(data3$cd_turn_diff~data3$stimulus)
p <- c(p, resp$p.value)
t <- c(t, "cd_turn_diff by odor")

cat("Number of tests run:", length(p))

Number of tests run: 32
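
The seven Kruskal-Wallis tests above differ only in the column being tested, so they could also be written as a loop over column names. The sketch below shows that pattern on synthetic data; the data frame, its column names, and the group labels are hypothetical.

```r
set.seed(4)
# Synthetic data (not the study data): two metrics for three stimulus groups
toy <- data.frame(
  stimulus = factor(rep(c("neutral", "aversive", "appetitive"), each = 12)),
  metric_a = rnorm(36),
  metric_b = rnorm(36)
)
# Shift one group so that metric_a differs between groups
is_app <- toy$stimulus == "appetitive"
toy$metric_a[is_app] <- toy$metric_a[is_app] + 1.5

# One Kruskal-Wallis test per metric, returned as a named vector of p-values
metrics <- c("metric_a", "metric_b")
kw_p <- sapply(metrics, function(m) kruskal.test(toy[[m]], toy$stimulus)$p.value)
print(kw_p)
```

The names of `kw_p` can serve directly as test labels, which keeps labels and p-values aligned.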

- Compare the effect of larval presence on dye diffusion over time

In [9]:
# Note: this overwrites the larval `data` frame loaded earlier
data <- read.csv("./fluorescein/larvae_no_larvae_comparison.csv")
data$larva_presence <- as.factor(data$larva_presence)

resp <- lm(data$perc_over_50 ~ data$time+data$larva_presence)
summary(resp)

# Coefficient p-values transcribed by hand from the summary output below;
# the intercept is reported only as "< 2e-16", so 2e-16 is entered as an upper bound
p <- c(p, 2e-16)
t <- c(t, "regression larval presence time concentration: intercept")
p <- c(p, 0.000777)
t <- c(t, "regression larval presence time concentration: time")
p <- c(p, 7.23e-12)
t <- c(t, "regression larval presence time concentration: presence")


Call:
lm(formula = data$perc_over_50 ~ data$time + data$larva_presence)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.12474 -0.02777  0.00568  0.02734  0.10630 

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                   0.1415961  0.0054842  25.819  < 2e-16 ***
data$time                    -0.0018729  0.0005519  -3.394 0.000777 ***
data$larva_presenceno_larvae -0.0362318  0.0050881  -7.121 7.23e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.04551 on 317 degrees of freedom
Multiple R-squared:  0.1641,	Adjusted R-squared:  0.1588 
F-statistic: 31.11 on 2 and 317 DF,  p-value: 4.603e-13
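
The three coefficient p-values in this cell were typed in from the printed summary. As an alternative, they can be read from the coefficient matrix returned by `coef(summary(...))`, which holds the unrounded values. The sketch below uses synthetic data; the data frame, the level name `larvae`, and the effect sizes are assumptions for illustration.

```r
set.seed(5)
# Synthetic data (not the study data)
toy <- data.frame(
  time     = rep(1:10, times = 6),
  presence = factor(rep(c("larvae", "no_larvae"), each = 30))
)
toy$y <- 0.14 - 0.002 * toy$time - 0.04 * (toy$presence == "no_larvae") +
  rnorm(60, sd = 0.03)

fit <- lm(y ~ time + presence, data = toy)

# Matrix with one row per coefficient; the "Pr(>|t|)" column holds the p-values
coef_table <- coef(summary(fit))
coef_p <- coef_table[, "Pr(>|t|)"]
print(coef_p)
```

Values that print as `< 2e-16` in the summary are still stored as ordinary numbers in this matrix, so no upper bound needs to be substituted.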


- Assess fit of the regression line used to fit distance to concentration for simulations

In [10]:
# Check the fit of the regression line. The regression itself was done in Python.

data2 <- read.csv("./fluorescein/distance_concentration_map_fitted.csv")
resp <- lm(data2$ln_conc~data2$distance_mm)

summary(resp)
r <- glance(resp)$p.value
p <- c(p, r)
t <- c(t, "Regression of distance vs concentration for 10 minutes")


Call:
lm(formula = data2$ln_conc ~ data2$distance_mm)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5566 -0.5366 -0.2004  0.4153  2.0556 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)        4.8971198  0.0341109   143.6   <2e-16 ***
data2$distance_mm -0.0804699  0.0007013  -114.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7115 on 2388 degrees of freedom
Multiple R-squared:  0.8465,	Adjusted R-squared:  0.8464 
F-statistic: 1.316e+04 on 1 and 2388 DF,  p-value: < 2.2e-16
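
The fitted line relates `ln_conc` to distance, so a concentration is recovered by exponentiating the linear predictor. The sketch below does this with the coefficients printed above, assuming `ln_conc` is the natural logarithm of concentration in whatever units the source file uses. The function name `predict_conc` is hypothetical.

```r
# Coefficients copied from the summary output above
intercept <- 4.8971198
slope     <- -0.0804699

# Predicted concentration at a given distance (mm), back-transformed from the log scale
predict_conc <- function(distance_mm) exp(intercept + slope * distance_mm)

# Distance over which the predicted concentration halves
half_distance_mm <- log(2) / abs(slope)

print(predict_conc(c(0, 10, 50)))
print(half_distance_mm)
```

Under this fit the predicted concentration halves roughly every 8.6 mm, independent of the starting distance, which is a property of any exponential decay.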


- Adjust all tests using the Holm-Bonferroni correction

In [11]:
# Holm-adjust all collected p-values, then flag those below 0.05
p <- p.adjust(p, method=method_n)
d <- data.frame(t, p, significant=p<0.05)
d

t,p,significant
f starved vs f fed size,1.050464e-06,True
m starved vs m fed size,0.0153372,True
m starved vs f starved size,0.007891945,True
f fed vs m fed size,3.622364e-10,True
time moving by daylight regression,1.0,False
time next to wall by daylight regression,1.0,False
median speed by daylight regression,1.0,False
time moving by starvation state,2.112323e-14,True
time next to wall by starvation state,2.156391e-16,True
naive_100ul_left_food_05percent 1day paired test of median concentration,2.932855e-09,True
