<a href="https://colab.research.google.com/github/comparativechrono/microscoPi/blob/main/papers/Hearn_et_al_2025/statistical_tests.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Photoperiodic and Intrinsic Circadian Regulation of Heart Rate in *Daphnia pulex*

**Tim Hearn\***<sup>1,2</sup>, **Millicenta Ampiah**<sup>1</sup>, **Linda King**<sup>2</sup>, **David Whitmore**<sup>3</sup>  
<sup>1</sup>Department of Genomic Medicine, University of Cambridge, UK  
<sup>2</sup>School of Life Sciences, Anglia Ruskin University, UK  
<sup>3</sup>Australian Institute of Tropical Health and Medicine, James Cook University, Australia  

\*Corresponding author  

---

This Jupyter notebook contains the statistical analysis accompanying the study: paired comparisons of heart rate measurements across photoperiods and diel time points. For each comparison, the test is chosen from the distribution of the paired differences (a paired t-test when a Shapiro-Wilk test does not reject normality, a Wilcoxon signed-rank test otherwise), and results are summarized with p-values and significance levels.


In [7]:
import sys
import scipy
print("Python version:", sys.version)
print("SciPy version:", scipy.__version__)


Python version: 3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
SciPy version: 1.14.1


In [8]:
import numpy as np
import pandas as pd
import scipy.stats as stats

In [None]:
%pip install scikit-posthocs

The next cell creates a dictionary called `entrained`, which stores peak and trough heart rate measurements for diel cycle comparisons under long-day and short-day conditions. Each key names a time point comparison and maps to two lists of values, one per time point. The lists are paired by position, so entries at the same index form one pair.


In [10]:
# Long day and short day diel cycles peak and troughs to test
entrained = {
    "Time 9 vs Time 15": {
        "Time 9": [90, 94, 78, 72, 72, 96, 72, 66, 78, 98],
        "Time 15": [87, 102, 114, 80, 96, 114, 82, 90, 96, 102]
    },
    "Time 33 vs Time 39": {
        "Time 33": [85, 79, 74, 76, 68, 95, 77, 70, 76, 107],
        "Time 39": [80, 104, 112, 65, 98, 110, 83, 90, 99, 102]
    },
    "Time 3 vs Time 9": {
        "Time 3": [83, 79, 76, 66, 71, 92, 68, 60, 74, 105],
        "Time 9": [84, 97, 108, 77, 95, 92, 81, 89, 93, 101]
    },
    "Time 27 vs Time 33": {
        "Time 27": [85, 88, 74, 66, 72, 97, 69, 62, 75, 101],
        "Time 33": [79, 106, 95, 73, 101, 114, 75, 85, 99, 96]
    }
}
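Because the tests that follow are paired, each comparison needs exactly two lists of equal length. The sketch below shows one way to check this before testing. It assumes a hypothetical helper, `check_paired`, that is not part of the original notebook, and it uses a one-entry excerpt of `entrained` so that it runs on its own.

```python
# Hypothetical helper, not part of the original notebook: verify that every
# comparison holds exactly two equal-length lists, as a paired test requires,
# and return the paired differences (first time point minus second).
def check_paired(data_pairs):
    summary = {}
    for comparison, times in data_pairs.items():
        if len(times) != 2:
            raise ValueError(f"{comparison}: expected 2 time points, got {len(times)}")
        first, second = times.values()
        if len(first) != len(second):
            raise ValueError(f"{comparison}: lists differ in length")
        summary[comparison] = [a - b for a, b in zip(first, second)]
    return summary


# One-entry excerpt of `entrained`, copied from the cell above
excerpt = {
    "Time 9 vs Time 15": {
        "Time 9": [90, 94, 78, 72, 72, 96, 72, 66, 78, 98],
        "Time 15": [87, 102, 114, 80, 96, 114, 82, 90, 96, 102],
    }
}

paired_differences = check_paired(excerpt)
print(paired_differences["Time 9 vs Time 15"])
```

Passing the full `entrained` dictionary to the helper would run the same check on all four comparisons.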

The next cell defines `perform_stat_test`, which runs a paired analysis on each time point comparison. It first tests the paired differences for normality with the Shapiro-Wilk test. If normality is not rejected (p > 0.05), it applies a paired t-test (parametric); otherwise it applies the Wilcoxon signed-rank test (non-parametric). The function returns a summary DataFrame with the test used, the test statistic, the p-values, and the significance level for each comparison.


In [11]:
# Function to perform paired stats analysis on pairs
def perform_stat_test(data_pairs):
    """Run a paired test on each comparison and return a summary DataFrame.

    data_pairs maps a comparison label to a dict holding two equal-length
    lists of measurements, one per time point, paired by position.
    """
    results = []

    for comparison, times in data_pairs.items():
        group1, group2 = times.keys()
        data1, data2 = np.array(times[group1]), np.array(times[group2])

        # Normality test on the paired differences
        differences = data1 - data2
        shapiro_diff = stats.shapiro(differences)

        # Choose the test: parametric if normality is not rejected at 0.05
        if shapiro_diff.pvalue > 0.05:
            test_result = stats.ttest_rel(data1, data2)  # Paired t-test
            test_used = "Paired t-test"
        else:
            test_result = stats.wilcoxon(data1, data2)  # Wilcoxon signed-rank test (non-parametric alternative)
            test_used = "Wilcoxon signed-rank test"

        # Assign significance level asterisks
        if test_result.pvalue <= 0.0001:
            significance = "****"
        elif test_result.pvalue <= 0.001:
            significance = "***"
        elif test_result.pvalue <= 0.01:
            significance = "**"
        elif test_result.pvalue <= 0.05:
            significance = "*"
        else:
            significance = "n.s."  # Not significant

        # Store results
        results.append({
            "Comparison": comparison,
            "Test Used": test_used,
            "Shapiro-Wilk p-value (Differences)": shapiro_diff.pvalue,
            "Test Statistic": test_result.statistic,
            "p-value": test_result.pvalue,
            "Significance": significance
        })

    return pd.DataFrame(results)
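The asterisk thresholds inside `perform_stat_test` can also be read as a small lookup table. The sketch below restates them in a hypothetical standalone helper, `p_to_stars`, which is not part of the original notebook. It is meant only to make the cut-offs easy to inspect, and the p-values passed to it are arbitrary illustrative numbers.

```python
# Hypothetical helper, not part of the original notebook: the same cut-offs
# as the if/elif chain in perform_stat_test, checked from strictest to loosest.
def p_to_stars(p_value):
    thresholds = [(0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")]
    for cutoff, label in thresholds:
        if p_value <= cutoff:
            return label
    return "n.s."  # Not significant


# Arbitrary illustrative p-values, one per significance band
for p in (0.00005, 0.0005, 0.005, 0.03, 0.2):
    print(p, p_to_stars(p))
```

The cut-offs are inclusive, so a p-value of exactly 0.05 is labelled `*`.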



Run the paired statistical tests on all time point comparisons in the `entrained` dataset using the `perform_stat_test` function, and print the resulting summary DataFrame.


In [12]:
# Perform stats test on all comparisons
stat_test_results = perform_stat_test(entrained)
print(stat_test_results)

           Comparison      Test Used  Shapiro-Wilk p-value (Differences)  Test Statistic   p-value Significance
0   Time 9 vs Time 15  Paired t-test                            0.882579       -4.045872  0.002903           **
1  Time 33 vs Time 39  Paired t-test                            0.578389       -2.590594  0.029185            *
2    Time 3 vs Time 9  Paired t-test                            0.664056       -3.641220  0.005391           **
3  Time 27 vs Time 33  Paired t-test                            0.269806       -3.459353  0.007168           **
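As a cross-check on the output above, the paired t statistic can be recomputed by hand as the mean of the paired differences divided by its standard error, t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom. The sketch below does this for the "Time 9 vs Time 15" comparison using only the standard library. The variable names are illustrative and not part of the original notebook.

```python
import math
import statistics

# Values copied from the "Time 9 vs Time 15" entry of `entrained`
time_9 = [90, 94, 78, 72, 72, 96, 72, 66, 78, 98]
time_15 = [87, 102, 114, 80, 96, 114, 82, 90, 96, 102]

# Paired differences, in the same order as perform_stat_test (first minus second)
differences = [a - b for a, b in zip(time_9, time_15)]
n = len(differences)

mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)  # sample SD (n - 1 denominator)
t_statistic = mean_diff / (sd_diff / math.sqrt(n))

print(f"mean difference = {mean_diff:.1f}, t = {t_statistic:.6f}, df = {n - 1}")
```

The result should agree with the test statistic reported for row 0. The negative sign reflects that heart rate was lower at Time 9 than at Time 15 in most pairs.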
