In [None]:
# importing libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import datetime as dt
import scipy.stats as stats

In [None]:
# loading the SQL magic extension and connecting to an in-memory DuckDB database
%load_ext sql

%config SqlMagic.autopandas = True
%config SqlMagic.feedback = False
%config SqlMagic.displaycon = False

%sql duckdb:///:memory:

### Exploring snow levels and observation counts for the Common Redpoll  
As a secondary research question, we investigated whether Common Redpolls (_Acanthis flammea_) appear at feeders in groups of different sizes depending on how much snow is present. Redpolls are small, common feeder birds that are known to appear in flocks that can number in the hundreds.

In [None]:
redpoll_df = pd.read_csv("redpoll_df.csv")
# Each snow-depth bin is a one-hot column named after the bin's lower bound
snow_bins = {'0.0': "0.0", '0.001': "0.001 - 5.0", '5.0': "5.0 - 15.001", '15.001': "15.001+"}
for col, label in snow_bins.items():
    # Mean flock size over the observations that fall in this bin
    mean_count = redpoll_df[redpoll_df[col] == 1]['how_many'].mean()
    print(f"Mean of redpoll count when snow is {label}: {mean_count}")
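The per-bin means above can also be computed in one pass by collapsing the indicator columns into a single label and grouping on it. The sketch below uses made-up rows rather than the redpoll data; `toy_df`, `snow_cols`, and `snow_bin` are hypothetical names, and it assumes exactly one snow-depth indicator equals 1 in each row.

```python
import pandas as pd

# Made-up rows for illustration only; the real redpoll_df is assumed to share
# the one-hot snow-depth columns and the 'how_many' count column.
toy_df = pd.DataFrame({
    "how_many": [2, 4, 10, 6, 20, 30],
    "0.0":      [1, 1, 0, 0, 0, 0],
    "0.001":    [0, 0, 1, 0, 0, 0],
    "5.0":      [0, 0, 0, 1, 0, 0],
    "15.001":   [0, 0, 0, 0, 1, 1],
})
snow_cols = ["0.0", "0.001", "5.0", "15.001"]

# For each row, idxmax returns the name of the column that holds the 1
toy_df["snow_bin"] = toy_df[snow_cols].idxmax(axis=1)

# Observation count and mean flock size per bin, kept in snow-depth order
summary = (
    toy_df.groupby("snow_bin")["how_many"]
    .agg(["count", "mean"])
    .reindex(snow_cols)
)
print(summary)
```

Reporting the count alongside the mean makes it easier to see how many observations support each bin's average.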

At first glance, the mean count of redpolls per observation appears to increase with snow depth. To check whether the difference between the two extremes could plausibly be due to chance, we use Welch's two-sample t-test, which does not assume equal variances, to compare the mean count for observations at feeders with no snow against the mean count for observations at feeders with over 15 cm of snow.
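As a rough illustration of what Welch's test computes, the sketch below derives the t statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with `scipy.stats.ttest_ind(..., equal_var=False)`. The helper `welch_t` and the two small samples are hypothetical and made up for illustration; they are not the redpoll data.

```python
import numpy as np
from scipy import stats

def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Variance of each sample mean: sample variance divided by sample size
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    return t, df

# Made-up flock counts, for illustration only
no_snow = [2, 3, 1, 4, 2, 5, 3]
deep_snow = [6, 12, 4, 20, 9, 15]

t_manual, df_manual = welch_t(no_snow, deep_snow)
# Two-sided p-value from the t distribution with the Welch degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df_manual)

t_scipy, p_scipy = stats.ttest_ind(no_snow, deep_snow, equal_var=False)
print(t_manual, p_manual)
print(t_scipy, p_scipy)
```

Because each group contributes its own variance term, the test remains usable when one group's counts are far more spread out than the other's.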

In [None]:
# Side-by-side boxplots of flock size for the two extreme snow-depth bins
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 8))
ax1.boxplot(redpoll_df[redpoll_df['0.0'] == 1]['how_many'])
ax2.boxplot(redpoll_df[redpoll_df['15.001'] == 1]['how_many'])
ax1.set_title("Counts per observation with 0 cm of snow")
ax2.set_title("Counts per observation with over 15 cm of snow")
ax1.set_ylabel("Redpolls per observation")

In [None]:
# Welch's t-test: equal_var=False drops the equal-variance assumption
stats.ttest_ind(redpoll_df[redpoll_df['0.0'] == 1]['how_many'], redpoll_df[redpoll_df['15.001'] == 1]['how_many'], equal_var=False)

The t-test produces a very low p-value, so we reject the null hypothesis that the mean redpoll flock size observed at feeders is the same when there is no snow on the ground as when there is more than 15 cm of snow. This result may reflect behavioral changes under harsher winter conditions, with larger groups of individuals congregating around, and relying more heavily on, specific feeders.