# Manual vs Automatic Sleep Stage Scoring

This notebook analyzes and compares manual and automatic sleep stage scoring.

## 1. Data Preparation

Data are read from 17 CSV files. Rows whose manual or automatic score is not one of the valid states (0 = Wake, 1 = NonREM, 2 = REM) are removed.
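The filtering step can be sketched on a small in-memory table. The helper name `keep_valid_epochs` and the example values (including the codes 3 and 9 standing in for invalid scores) are hypothetical; only the column names and the valid states come from the notebook.

```python
import pandas as pd

VALID_STATES = [0, 1, 2]  # Wake, NonREM, REM, as defined in the notebook


def keep_valid_epochs(df, valid_states=VALID_STATES):
    """Keep rows where both scores are one of the valid sleep states."""
    mask = (df["manual_score"].isin(valid_states)
            & df["automatic_score"].isin(valid_states))
    return df[mask].reset_index(drop=True)


# Hypothetical recording: the codes 3 and 9 stand in for invalid scores.
example = pd.DataFrame({
    "manual_score":    [0, 1, 2, 3, 1, 0],
    "automatic_score": [0, 1, 1, 2, 9, 0],
})
cleaned = keep_valid_epochs(example)
print(f"kept {len(cleaned)} of {len(example)} rows")
```

A row is dropped if either score is invalid, so the manual and automatic columns stay aligned row by row.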

## 2. Data Visualization

Bar plots show the median count of each valid stage under manual and automatic scoring, with standard-error bars and the individual per-file counts overlaid.
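The two summary values behind each bar can be sketched with the standard library alone. The function name and the example counts are hypothetical; the standard error uses the sample standard deviation (`ddof=1`), which is also the default of `scipy.stats.sem`.

```python
import math
import statistics


def median_and_sem(counts):
    """Return the median and the standard error of the mean (ddof=1)."""
    median = statistics.median(counts)
    sem = statistics.stdev(counts) / math.sqrt(len(counts))
    return median, sem


# Hypothetical counts for one stage across five recordings.
example_counts = [120, 135, 128, 150, 142]
median, sem = median_and_sem(example_counts)
print(f"median = {median}, SEM = {sem:.2f}")
```

Note that the standard error describes uncertainty around the mean, while the bar height is a median; an interquartile range may be a more consistent companion if the medians are kept.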

## 3. Statistical Analysis

**Shapiro-Wilk Test**: Tests whether the data follow a normal distribution.

**Levene Test**: Checks whether the two groups have equal variances.

**Mann-Whitney U Test**: Non-parametric test for comparing distributions.

**T-test**: Compares mean scores, assuming normality and equal variances.
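One way to chain the four tests above is sketched below. The function `compare_groups`, the 0.05 threshold, and the synthetic data are assumptions for illustration; the notebook itself runs every test on every stage and chooses which one to report in the conclusion.

```python
import numpy as np
from scipy import stats


def compare_groups(manual, auto, alpha=0.05):
    """Pick a t-test or a Mann-Whitney U test from the assumption checks."""
    manual = np.asarray(manual, dtype=float)
    auto = np.asarray(auto, dtype=float)

    # Normality of each group (Shapiro-Wilk)
    _, manual_p = stats.shapiro(manual)
    _, auto_p = stats.shapiro(auto)
    normal = manual_p > alpha and auto_p > alpha

    # Equality of variances (Levene)
    _, levene_p = stats.levene(manual, auto)
    equal_var = levene_p > alpha

    if normal and equal_var:
        _, p_value = stats.ttest_ind(manual, auto)
        return "t-test", p_value
    _, p_value = stats.mannwhitneyu(manual, auto)
    return "Mann-Whitney U", p_value


# Synthetic counts for 17 recordings (not the notebook's data).
rng = np.random.default_rng(0)
manual_demo = rng.normal(loc=100, scale=10, size=17)
auto_demo = rng.normal(loc=102, scale=10, size=17)
test_name, p_value = compare_groups(manual_demo, auto_demo)
print(f"{test_name}: p = {p_value:.4f}")
```

This sketch falls back to the non-parametric test whenever either assumption fails; a Welch t-test would be another option when only the variances differ.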

The notebook concludes with analysis interpretations.

In [37]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
from scipy.stats import mannwhitneyu
from scipy.stats import levene

file_paths = [
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_129.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_130.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_132.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_139.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_227.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_229.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_236.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_237.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_238.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_241.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_365.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_366.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_369.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_373.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_382.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_404.csv',
    '/Users/valentinreateguirangel/Python/Manual_and_auto_score/manual_auto_433.csv'
]

# Define the valid sleep states and their labels
valid_states = [0, 1, 2]
state_names = ['Wake', 'NonREM', 'REM']

# Read and process the data
all_data = []
for file_path in file_paths:
    df = pd.read_csv(file_path)
    # Filter out rows with invalid sleep states
    df = df[df['manual_score'].isin(valid_states) & df['automatic_score'].isin(valid_states)]
    all_data.append(df)

# Initialize arrays for storing medians and standard errors for each state
manual_medians = np.zeros(len(valid_states))
auto_medians = np.zeros(len(valid_states))
manual_errors = np.zeros(len(valid_states))
auto_errors = np.zeros(len(valid_states))

# Initialize arrays for storing individual manual and automatic counts
individual_manual_counts = []
individual_auto_counts = []

# Loop through each sleep state
for state in valid_states:
    state_manual_counts = []
    state_auto_counts = []

    # Calculate manual and automatic counts for each sleep state
    for df in all_data:
        manual_count = (df['manual_score'] == state).sum()
        auto_count = (df['automatic_score'] == state).sum()
        state_manual_counts.append(manual_count)
        state_auto_counts.append(auto_count)

    # Calculate median and standard error for each sleep state
    manual_medians[state] = np.median(state_manual_counts)
    auto_medians[state] = np.median(state_auto_counts)
    manual_errors[state] = stats.sem(state_manual_counts)
    auto_errors[state] = stats.sem(state_auto_counts)

    # Store individual manual and automatic counts for later use in plotting
    individual_manual_counts.append(state_manual_counts)
    individual_auto_counts.append(state_auto_counts)

# Create the bar plot
x = np.arange(len(valid_states))
width = 0.35

fig, ax = plt.subplots(figsize=(8, 6), dpi=100)

# Set the appearance of error bars
errorbar_width = 15
cap_size = 25
error_kw = dict(ecolor='black', elinewidth=errorbar_width, capsize=cap_size)

# Define colors for the bars
manual_bar_color = 'cornflowerblue'
auto_bar_color = 'peachpuff'

# Plot the bars with medians and error bars
manual_bars = ax.bar(x - width / 2, manual_medians, width, yerr=manual_errors, label='Manual', color=manual_bar_color, error_kw=error_kw)
auto_bars = ax.bar(x + width / 2, auto_medians, width, yerr=auto_errors, label='Automatic', color=auto_bar_color, error_kw=error_kw)

# Set the size of individual data points (markers) and connecting lines
marker_size = 8
line_width = 1

# Loop through the sleep states and their corresponding index
for i, state in enumerate(valid_states):
    # Calculate the x-coordinates for individual manual and automatic data points
    manual_points_x = [x[i] - width / 2] * len(individual_manual_counts[state])
    auto_points_x = [x[i] + width / 2] * len(individual_auto_counts[state])

    # Plot the individual manual data points with the specified marker size and color
    ax.plot(manual_points_x, individual_manual_counts[state], 'o', markersize=marker_size, color='steelblue')

    # Plot the individual automatic data points with the specified marker size and color
    ax.plot(auto_points_x, individual_auto_counts[state], 'o', markersize=marker_size, color='darkorange')

    # Loop through the pairs of manual and automatic counts for the current state
    for manual_count, auto_count in zip(individual_manual_counts[state], individual_auto_counts[state]):
        # Plot a dashed line connecting the manual and automatic data points for each pair
        ax.plot([x[i] - width / 2, x[i] + width / 2], [manual_count, auto_count], 'k--', linewidth=line_width)

# Configure the x-axis ticks and labels
ax.set_xticks(x)
ax.set_xticklabels(state_names)

# Set the y-axis label and the plot title
ax.set_ylabel('Counts')
ax.set_title('Manual vs Automatic Scoring for Sleep States (Median)')

# Display the legend
ax.legend()

# Adjust the layout of the plot and save it to disk
fig.tight_layout()
plt.savefig("Manual_vs_Automatic_Scoring.png")
plt.close(fig)

In [38]:
####### Function that plots a specific brain state #############
def plot_sleep_stage(state_index):
    fig, ax = plt.subplots(figsize=(8, 6), dpi=100)

    # Plot the bars with medians and error bars for the specific sleep state
    manual_bar = ax.bar(state_index - width / 2, manual_medians[state_index], width, yerr=manual_errors[state_index], label='Manual', color=manual_bar_color, error_kw=error_kw)
    auto_bar = ax.bar(state_index + width / 2, auto_medians[state_index], width, yerr=auto_errors[state_index], label='Automatic', color=auto_bar_color, error_kw=error_kw)

    # Plot the individual manual data points with the specified marker size and color for the specific sleep state
    manual_points_x = [state_index - width / 2] * len(individual_manual_counts[state_index])
    ax.plot(manual_points_x, individual_manual_counts[state_index], 'o', markersize=marker_size, color='steelblue')

    # Plot the individual automatic data points with the specified marker size and color for the specific sleep state
    auto_points_x = [state_index + width / 2] * len(individual_auto_counts[state_index])
    ax.plot(auto_points_x, individual_auto_counts[state_index], 'o', markersize=marker_size, color='darkorange')

    # Loop through the pairs of manual and automatic counts for the specific sleep state
    for manual_count, auto_count in zip(individual_manual_counts[state_index], individual_auto_counts[state_index]):
        # Plot a dashed line connecting the manual and automatic data points for each pair
        ax.plot([state_index - width / 2, state_index + width / 2], [manual_count, auto_count], 'k--', linewidth=line_width)

    # Configure the x-axis ticks and labels
    ax.set_xticks([state_index])
    ax.set_xticklabels([state_names[state_index]])

    # Set the y-axis label and the plot title
    ax.set_ylabel('Counts')
    ax.set_title(f'Manual vs Automatic Scoring for {state_names[state_index]} Sleep Stage (Median)')

    # Display the legend
    ax.legend()

    # Adjust the layout of the plot and save it, naming the file after the plotted stage
    fig.tight_layout()
    plt.savefig(f"{state_names[state_index]}_only_Manual_vs_Automatic_Scoring.png")
    plt.close(fig)

In [39]:
# Call the function with the index corresponding to the REM sleep stage
plot_sleep_stage(valid_states.index(2))  # REM sleep stage is at index 2

In [20]:
# Shapiro-Wilk test for normality
print("Shapiro-Wilk Test for Normality")

# Iterate through valid sleep states and their corresponding names
for state, state_name in zip(valid_states, state_names):
    # Extract the manual and automatic counts for the current sleep state
    manual_counts = np.array(individual_manual_counts[state])
    auto_counts = np.array(individual_auto_counts[state])

    # Perform the Shapiro-Wilk test on both manual and automatic counts
    manual_stat, manual_p = stats.shapiro(manual_counts)
    auto_stat, auto_p = stats.shapiro(auto_counts)

    # Print the results of the Shapiro-Wilk test for the current sleep state
    # W-statistic and p-value for both manual and automatic counts
    print(f"{state_name} (Manual): W = {manual_stat:.4f}, p-value = {manual_p:.4f}")
    print(f"{state_name} (Automatic): W = {auto_stat:.4f}, p-value = {auto_p:.4f}")
    print()

Shapiro-Wilk Test for Normality
Wake (Manual): W = 0.9695, p-value = 0.8097
Wake (Automatic): W = 0.9539, p-value = 0.5203

NonREM (Manual): W = 0.8328, p-value = 0.0059
NonREM (Automatic): W = 0.8097, p-value = 0.0028

REM (Manual): W = 0.9141, p-value = 0.1173
REM (Automatic): W = 0.9386, p-value = 0.3018



In [21]:
##############Levene########################
from scipy.stats import levene

print("Levene Test for Equal Variances")

# Iterate through valid sleep states and their corresponding names
for state, state_name in zip(valid_states, state_names):
    # Extract the manual and automatic counts for the current sleep state
    manual_counts = np.array(individual_manual_counts[state])
    auto_counts = np.array(individual_auto_counts[state])

    # Perform the Levene test on both manual and automatic counts
    stat, p = levene(manual_counts, auto_counts)

    # Print the results of the Levene test for the current sleep state
    # W-statistic and p-value for the manual vs automatic comparison
    print(f"{state_name}: W = {stat:.4f}, p-value = {p:.4f}")

Levene Test for Equal Variances
Wake: W = 0.7806, p-value = 0.3836
NonREM: W = 0.0701, p-value = 0.7929
REM: W = 0.3348, p-value = 0.5669


In [22]:
############Mann.whitney####################
from scipy.stats import mannwhitneyu
print("Mann-Whitney U Test")

# Iterate through valid sleep states and their corresponding names
for state, state_name in zip(valid_states, state_names):
    # Extract the manual and automatic counts for the current sleep state
    manual_counts = np.array(individual_manual_counts[state])
    auto_counts = np.array(individual_auto_counts[state])

    # Perform the Mann-Whitney U test on both manual and automatic counts
    u_stat, u_p = mannwhitneyu(manual_counts, auto_counts)

    # Print the results of the Mann-Whitney U test for the current sleep state
    # U-statistic and p-value
    print(f"{state_name}: U = {u_stat:.4f}, p-value = {u_p:.4f}")

Mann-Whitney U Test
Wake: U = 142.0000, p-value = 0.9451
NonREM: U = 190.0000, p-value = 0.1212
REM: U = 44.0000, p-value = 0.0006
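The U statistics above can be turned into an effect size. The sketch below computes the rank-biserial correlation, assuming that the reported U belongs to the first sample (manual), as in recent SciPy versions, and that each group holds 17 counts (one per CSV file). The function name is hypothetical.

```python
def rank_biserial(u_stat, n1, n2):
    """Rank-biserial correlation from the U statistic of the first sample.

    Ranges from -1 (first sample always lower) to +1 (first sample always higher).
    """
    return 2.0 * u_stat / (n1 * n2) - 1.0


n_recordings = 17  # one count per CSV file in each group
reported_u = {"Wake": 142.0, "NonREM": 190.0, "REM": 44.0}
effect_sizes = {
    stage: rank_biserial(u, n_recordings, n_recordings)
    for stage, u in reported_u.items()
}
for stage, r in effect_sizes.items():
    print(f"{stage}: r = {r:.4f}")
```

Under these assumptions, a negative value means that manual counts tend to be lower than automatic counts for that stage.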


In [23]:
######T-test#######

# T-test
print("T-test for Manual vs Automatic Scoring")

# Iterate through valid sleep states and their corresponding names
for state, state_name in zip(valid_states, state_names):
    # Extract the manual and automatic counts for the current sleep state
    manual_counts = np.array(individual_manual_counts[state])
    auto_counts = np.array(individual_auto_counts[state])

    # Perform the t-test on both manual and automatic counts
    # stats.ttest_ind performs an independent two sample t-test
    t_stat, t_p = stats.ttest_ind(manual_counts, auto_counts)

    # Print the results of the t-test for the current sleep state
    # t-statistic and p-value for the manual vs automatic comparison
    print(f"{state_name}: t = {t_stat:.4f}, p-value = {t_p:.4f}")
    print()

T-test for Manual vs Automatic Scoring
Wake: t = -0.1403, p-value = 0.8893

NonREM: t = 0.8665, p-value = 0.3927

REM: t = -4.0647, p-value = 0.0003
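The tests above treat manual and automatic counts as independent samples. If each file contributes one manual and one automatic count for the same recording, which the dashed connecting lines in the plots suggest, paired tests may be a better fit. The following is a sketch of that alternative, not the analysis the notebook ran; the function name and the example counts are hypothetical.

```python
import numpy as np
from scipy import stats


def paired_comparison(manual, auto):
    """Return p-values of a paired t-test and a Wilcoxon signed-rank test."""
    manual = np.asarray(manual, dtype=float)
    auto = np.asarray(auto, dtype=float)
    _, t_p = stats.ttest_rel(manual, auto)   # parametric, paired
    _, w_p = stats.wilcoxon(manual, auto)    # non-parametric, paired
    return t_p, w_p


# Hypothetical REM counts for eight recordings, scored both ways.
manual_rem = [40, 52, 47, 61, 55, 49, 58, 44]
auto_rem = [48, 61, 50, 72, 59, 55, 70, 51]
t_p, w_p = paired_comparison(manual_rem, auto_rem)
print(f"paired t-test p = {t_p:.4f}, Wilcoxon p = {w_p:.4f}")
```

Paired tests work on the per-recording differences, so variation between recordings does not mask a consistent offset between the two scoring methods.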



# Conclusion

The focus of this study was to investigate the potential use of sleep recording and imaginary coherence data from GRIN2B-mutated rats as a diagnostic biomarker for GRIN2B neurodevelopmental disorders. The investigation had two main components: a comparison of automatic and manual sleep scoring methods, and an exploration of imaginary coherence levels in GRIN2B-mutated rats.

Our automatic sleep scorer produced Wake and NonREM counts that did not differ significantly from manual scoring, but it performed less robustly in classifying REM sleep, which calls for further calibration.
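The tests in this notebook compare per-file stage counts, so they do not measure row-by-row agreement directly. A minimal sketch of how that agreement could be checked with a confusion table and Cohen's kappa follows; the function names and the example labels are hypothetical, and the sketch assumes the two score lists are aligned row by row.

```python
from collections import Counter


def confusion_counts(manual, auto):
    """Count how often each (manual, automatic) label pair occurs."""
    return Counter(zip(manual, auto))


def cohens_kappa(manual, auto):
    """Chance-corrected agreement between two label sequences."""
    n = len(manual)
    observed = sum(m == a for m, a in zip(manual, auto)) / n
    manual_freq = Counter(manual)
    auto_freq = Counter(auto)
    # Agreement expected if the two scorers labeled independently
    expected = sum(manual_freq[s] * auto_freq[s] for s in manual_freq) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical labels: 0 = Wake, 1 = NonREM, 2 = REM
manual_labels = [0, 0, 1, 1, 1, 2, 2, 1, 0, 2]
auto_labels = [0, 0, 1, 1, 1, 1, 2, 1, 0, 1]
table = confusion_counts(manual_labels, auto_labels)
kappa = cohens_kappa(manual_labels, auto_labels)
print(f"REM scored as NonREM: {table[(2, 1)]} rows, kappa = {kappa:.4f}")
```

The confusion table shows which stage a disputed row was assigned to, which equal totals can hide when errors in opposite directions cancel out.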

A comparison was made between manual and automatic sleep EEG scoring methods, specifically for Wake, NonREM, and REM sleep stages. The intent of this comparison was to assess the accuracy and reliability of the software, with an eye towards potential future applications in sleep research and clinical environments.

The Shapiro-Wilk test gave no evidence against normality for the Wake and REM counts (all p-values > 0.05), but rejected normality for the NonREM counts (manual: p-value = 0.0059; automatic: p-value = 0.0028). Levene's test found no significant difference in variance between manual and automatic scoring for any of the three brain states (all p-values ≥ 0.38).

Because the NonREM counts were not normally distributed, we compared them with the Mann-Whitney U test and found no significant difference between the manual and automatic scoring methods (U = 190, p-value = 0.1212). This is consistent with the two methods producing similar NonREM counts.

The same held for the Wake state: the comparison showed no significant difference between the manual and automatic scoring methods (Student's t-test: t = -0.1403, p-value = 0.8893), so we found no evidence that the two methods differ in how much Wake they score.

For the REM sleep stage, however, the two methods differed significantly (Student's t-test: t = -4.0647, p-value = 0.0003; Mann-Whitney U test: p-value = 0.0006). The negative t-statistic indicates that the automatic scorer counted more REM on average than the manual scorer did. This discrepancy warrants further investigation into its causes and into its implications for REM sleep classification in future research.
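Since three stages were tested, a correction for multiple comparisons is worth considering. The sketch below applies a Bonferroni threshold to the three p-values cited in this conclusion; the function name and the 0.05 family-wise level are assumptions.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that stay below the Bonferroni-corrected threshold."""
    threshold = alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}


# p-values cited in the conclusion (Wake and REM: t-test; NonREM: Mann-Whitney U)
reported = {"Wake": 0.8893, "NonREM": 0.1212, "REM": 0.0003}
flags = bonferroni_significant(reported)
for stage, significant in flags.items():
    print(f"{stage}: significant after correction = {significant}")
```

With three comparisons the corrected threshold is 0.05 / 3, so the REM difference remains significant and the other two stages remain non-significant.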
