## Resting-State EEG Correlates of Sustained Attention in Healthy Ageing: Cross-Sectional Findings from the LEISURE Study

Created by Alicia J. Campbell, pertaining to the analysis for:

Campbell, A. J., Anijärv, T. E., Pace, T., Treacy, C., Lagopoulos, J., Hermens, D. F., Levenstein, J. M., & Andrews, S. C. (2024). [Resting-state EEG correlates of sustained attention in healthy ageing: Cross-sectional findings from the LEISURE study](https://doi.org/10.1016/j.neurobiolaging.2024.09.005). *Neurobiology of Aging*, 144, 68–77.

### EEG preprocessing and spectral analysis

EEG preprocessing and spectral analysis were conducted in a separate repository. Please see: [EEG-pyline/studies Campbell_Resting_EEG_Sustained_Attention_Healthy_Ageing_Cross_Sectional_LEISURE.ipynb](https://github.com/teanijarv/EEG-pyline/blob/main/studies/Campbell_Resting_EEG_Sustained_Attention_Healthy_Ageing_Cross_Sectional_LEISURE.ipynb)

### SART d prime calculation

Participants completed the Sustained Attention to Response Task (SART) using E-Prime 2.0.10 software. The calculation of the d prime measure can be found in this study's folder.
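
For orientation, d prime is conventionally the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The sketch below is a generic signal-detection version and not the script from the study folder. The function name `d_prime`, the argument names, and the log-linear correction for rates of 0 or 1 are assumptions made for illustration. Which SART responses count as hits or false alarms follows the study's own scoring, which is not restated here.

```python
from scipy.stats import norm


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Generic d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count and 1 to each
    trial total) keeps the rates away from 0 and 1, where the z-transform
    would be infinite. The study's own script may handle this differently.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)


# Hypothetical counts, for illustration only
example_d = d_prime(hits=20, misses=5, false_alarms=5, correct_rejections=20)
print(round(example_d, 3))
```

Larger values indicate better discrimination between target and non-target trials; a value of 0 means the two rates are equal.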

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import statsmodels.api as sm
import numpy as np
import seaborn as sns
import pingouin as pg
import warnings
from scipy.stats import spearmanr
from HLR import HierarchicalLinearRegression
from statsmodels.stats.multitest import multipletests

warnings.simplefilter(action='ignore', category=FutureWarning)

### Sample descriptives

Descriptives are reported in:
- Table 1 in Supplementary Materials

Prior to this, gender was dummy-coded (0 = male, 1 = female) and outliers were corrected using the z-score standard deviation transformation method. Specifically, values with z-scores greater than 3.29 or less than -3.29 were set to one unit above or below the nearest acceptable value. In this sample, an outlier correction was applied to a single data point in the education variable.
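
The outlier rule described above can be sketched as follows. This is an illustration and not the code used to prepare the spreadsheet: the function name `correct_outliers`, the use of the sample standard deviation, the reading of "one unit" as 1 in the variable's own units, and the toy education values are all assumptions.

```python
import pandas as pd


def correct_outliers(values: pd.Series, z_cut: float = 3.29, unit: float = 1.0) -> pd.Series:
    """Move values with |z| > z_cut to one `unit` beyond the nearest acceptable value."""
    values = values.astype(float)
    # Assumption: z-scores use the sample standard deviation (ddof=1)
    z = (values - values.mean()) / values.std(ddof=1)
    acceptable = values[z.abs() <= z_cut]
    corrected = values.copy()
    corrected[z > z_cut] = acceptable.max() + unit   # high outliers
    corrected[z < -z_cut] = acceptable.min() - unit  # low outliers
    return corrected


# Hypothetical years of education with one extreme value
education = pd.Series([12.0] * 10 + [13.0] * 9 + [60.0])
corrected = correct_outliers(education)
print(corrected.iloc[-1])
```

In the toy data only the last value exceeds the cut-off, so it is replaced by the largest acceptable value plus one unit and all other values are left unchanged.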

In [None]:
fullsample_df = pd.read_excel('data/revision_2/LEISURE_T1_SART_CANTAB_rsEEG_data_revisions_2.xlsx')
fullsample_df.describe()

In [None]:
# Filter the DataFrame to include only ages between 50 and 85
filtered_manu_sample_df = fullsample_df[(fullsample_df['Age'] >= 50) & (fullsample_df['Age'] <= 85)]

# Create bins for the histogram
bins = range(50, 86, 5)  # You can adjust the step to your preference

sns.set_style("white")

plt.figure(figsize=(8, 6))
sns.histplot(filtered_manu_sample_df['Age'], bins=bins, kde=False, edgecolor=None)
plt.xlabel('Age (years)', fontsize=12)
plt.ylabel('Frequency', fontsize=12)
plt.xticks(bins)
plt.grid(False)

sns.despine()

plt.show()

### Spearman's correlations

Correlations reported in:

3.0 Results 
- 3.2 Bivariate correlations of demographics, sustained attention, and resting-state EEG measures 
- Figure 2.

In [None]:
selected_columns = ['Age', 'Education_years_OA', 'Dprime', 'RVPA', 'Alpha_CF_parietooccipital', 'Alpha_absoluteBP_parietooccipital', 'Exponent_global', 'Offset_global']

corr = pg.rcorr(fullsample_df[selected_columns], method='spearman', stars=False)
corr

In [None]:
# Initialize the correlation and p-value matrices
correlation_matrix = pd.DataFrame(index=selected_columns, columns=selected_columns)
p_value_matrix = pd.DataFrame(index=selected_columns, columns=selected_columns)

# Store all p-values in a list to apply FDR correction later
all_p_values = []
p_value_locs = []

# Compute pairwise Spearman correlation and p-values
for col1 in selected_columns:
    for col2 in selected_columns:
        if col1 == col2:
            # Perfect correlation for diagonal
            correlation_matrix.loc[col1, col2] = 1
            p_value_matrix.loc[col1, col2] = np.nan  # No p-value for diagonal
        else:
            # Drop NaNs pairwise for the two columns
            valid_data = fullsample_df[[col1, col2]].dropna()
            
            if not valid_data.empty:  # Check if there's valid data left
                corr, p_value = spearmanr(valid_data[col1], valid_data[col2])
                correlation_matrix.loc[col1, col2] = round(corr, 2)
                p_value_matrix.loc[col1, col2] = p_value
                all_p_values.append(p_value)
                p_value_locs.append((col1, col2))  # Keep track of the location
            else:
                correlation_matrix.loc[col1, col2] = np.nan
                p_value_matrix.loc[col1, col2] = np.nan

# Apply FDR correction to the list of p-values
rejected, pvals_corrected, _, _ = multipletests(all_p_values, alpha=0.05, method='fdr_bh')

# Replace the original p-values in the matrix with the FDR-corrected p-values
for idx, (col1, col2) in enumerate(p_value_locs):
    p_value_matrix.loc[col1, col2] = pvals_corrected[idx]

# Create a DataFrame to hold both the correlation and p-values in the format you want
annot_matrix = correlation_matrix.copy()

# Iterate through the p_value_matrix and format the annotations
for col1 in selected_columns:
    for col2 in selected_columns:
        if pd.notna(p_value_matrix.loc[col1, col2]):
            if p_value_matrix.loc[col1, col2] < 0.001:
                # Replace p-values < 0.001 with the formatted string "(p<0.001)"
                annot_matrix.loc[col1, col2] = f"{correlation_matrix.loc[col1, col2]}\n" + r"($\it{p}$<0.001)"
            else:
                # For other p-values, include "(p=value)" with p in italics
                annot_matrix.loc[col1, col2] = f"{correlation_matrix.loc[col1, col2]}\n" + r"($\it{p}$=" + f"{round(p_value_matrix.loc[col1, col2], 3)})"

# Rename columns and index for better readability
renaming_dict = {
    'Education_years_OA': "Education", 
    'RVPA': r"RVP_$\it{A'}$",
    'Dprime': r"SART_$\it{d'}$",
    'Alpha_CF_parietooccipital': "IAF", 
    'Alpha_absoluteBP_parietooccipital': "aIAP", 
    'Exponent_global': "Exponent", 
    'Offset_global': "Offset"
}

# Apply renaming
correlation_matrix.rename(columns=renaming_dict, index=renaming_dict, inplace=True)
annot_matrix.rename(columns=renaming_dict, index=renaming_dict, inplace=True)

# Create a mask for the upper triangle, including the diagonal
mask = np.triu(np.ones_like(correlation_matrix, dtype=bool))

# Set up the Seaborn theme and color palette
fntsize = 11
sns.set_theme(style='white')
cmap = plt.get_cmap("coolwarm")
newcolors = cmap(np.linspace(0, 1, 100))
newcolors[99] = mpl.colors.to_rgb('#D85A5A') + (1,)
newcmap = mpl.colors.ListedColormap(newcolors)

# Generate the heatmap
plt.figure(figsize=(8, 8), dpi=300)
ax = sns.heatmap(correlation_matrix.astype(float), cmap=newcmap, fmt="", cbar=False, mask=mask, annot=annot_matrix, annot_kws={"size": fntsize}, linewidths=.5)

# Customize tick labels
xticklabels = ax.get_xticklabels()
yticklabels = ax.get_yticklabels()

xticklabels[-1] = ''
yticklabels[0] = ''

# Apply the rotation, alignment, and font size
ax.set_xticklabels(xticklabels, rotation=30, ha='right', fontsize=fntsize)
ax.set_yticklabels(yticklabels, rotation=0, fontsize=fntsize)

plt.tight_layout()
plt.show()

### Hierarchical linear regressions

Regressions described in:

3.0 Results
- 3.3 Hierarchical linear regression of resting-state EEG measures and sustained attention
- Table 1
- Table 2 in Supplementary Materials

- Each regression model treated an EEG measure as the main predictor and each sustained attention performance metric, RVP_A’ and SART_d’, as the dependent variables. 

- In constructing the hierarchical regression models, age (at the time of EEG recording), gender, and years of education were controlled for in the first step. The EEG measure of interest was introduced in the second step, while the third step included the interaction between age and the EEG measure of interest. 

- Separate regression models were conducted to specifically investigate the interaction between IAF and the exponent on each sustained attention performance metric, RVP_A’ and SART_d’. Age, gender, and years of education were controlled for in the first step, with IAF and the exponent introduced in the second step, and the interaction between IAF and the exponent included in the third step. 

- Tests were considered significant at an alpha level of 0.05 (p < 0.05).

- Interaction terms were calculated beforehand by z-scoring each variable and multiplying the two z-scores together.
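
The interaction columns used below (for example `Age_Exponent_global_interaction`) are already present in the spreadsheet. As a sketch of how such a column could be derived, assuming sample-standard-deviation z-scores and the hypothetical helpers `zscore` and `add_interaction`:

```python
import pandas as pd


def zscore(series: pd.Series) -> pd.Series:
    # Assumption: standardise with the sample standard deviation (ddof=1)
    return (series - series.mean()) / series.std(ddof=1)


def add_interaction(df: pd.DataFrame, first: str, second: str) -> pd.DataFrame:
    """Return a copy of df with a '<first>_<second>_interaction' column."""
    out = df.copy()
    out[f"{first}_{second}_interaction"] = zscore(out[first]) * zscore(out[second])
    return out


# Toy values, for illustration only
toy = pd.DataFrame({"Age": [50.0, 60.0, 70.0], "Exponent_global": [1.0, 1.5, 2.0]})
toy = add_interaction(toy, "Age", "Exponent_global")
print(toy)
```

Standardising before multiplying centres both variables, which reduces the collinearity between the product term and its component predictors.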

#### IAF

In [None]:
X = {
    1: ['Age', 'Gender_F', 'Education_years_OA'],
    2: ['Age', 'Gender_F', 'Education_years_OA', 'Alpha_CF_parietooccipital'], 
    3: ['Age', 'Gender_F', 'Education_years_OA', 'Alpha_CF_parietooccipital', 'Age_Alpha_CF_parietooccipital_interaction']
}

# List of DVs
target_vars = ['Dprime', 'RVPA']

# Dictionary to store summaries for each DV
summaries = {}

# Loop over each DV
for y in target_vars:
    # Extract all predictor columns from X (dict of lists)
    predictor_columns = set()
    for predictors in X.values():
        predictor_columns.update(predictors)
    
    # Add the current DV (y) to the list of columns to check for NaN
    all_columns = list(predictor_columns) + [y]
    
    # Drop rows with NaN values in the predictor or DV columns
    clean_df = fullsample_df.dropna(subset=all_columns)
    
    # Run Hierarchical Linear Regression
    model = HierarchicalLinearRegression(clean_df, X, y)
    summary_df = model.summary()
    
    # Store the summary dataframe for later display
    summaries[y] = summary_df

# Display each summary using pandas display function
for y, summary_df in summaries.items():
    print(f"Displaying summary for {y}:")
    display(summary_df)  # Display the summary_df for each DV

#### IAF x exponent interaction

In [None]:
X = {
    1: ['Age', 'Gender_F', 'Education_years_OA'],
    2: ['Age', 'Gender_F', 'Education_years_OA', 'Alpha_CF_parietooccipital', 'Exponent_global'], 
    3: ['Age', 'Gender_F', 'Education_years_OA', 'Alpha_CF_parietooccipital', 'Exponent_global', 'Alpha_CF_parietooccipital_Exponent_global_interaction']
}

target_vars = ['Dprime', 'RVPA']

summaries = {}

for y in target_vars:
    predictor_columns = set()
    for predictors in X.values():
        predictor_columns.update(predictors)
    
    all_columns = list(predictor_columns) + [y]
    
    clean_df = fullsample_df.dropna(subset=all_columns)
    
    model = HierarchicalLinearRegression(clean_df, X, y)
    summary_df = model.summary()
    
    summaries[y] = summary_df

for y, summary_df in summaries.items():
    print(f"Displaying summary for {y}:")
    display(summary_df)

#### Aperiodic-adjusted alpha power

In [None]:
X = {
    1: ['Age', 'Gender_F', 'Education_years_OA'],
    2: ['Age', 'Gender_F', 'Education_years_OA', 'Alpha_absoluteBP_parietooccipital'], 
    3: ['Age', 'Gender_F', 'Education_years_OA', 'Alpha_absoluteBP_parietooccipital', 'Age_Alpha_absoluteBP_parietooccipital_interaction'] 
}

target_vars = ['Dprime', 'RVPA']

summaries = {}

for y in target_vars:
    predictor_columns = set()
    for predictors in X.values():
        predictor_columns.update(predictors)
    
    all_columns = list(predictor_columns) + [y]
    
    clean_df = fullsample_df.dropna(subset=all_columns)
    
    model = HierarchicalLinearRegression(clean_df, X, y)
    summary_df = model.summary()
    
    summaries[y] = summary_df

for y, summary_df in summaries.items():
    print(f"Displaying summary for {y}:")
    display(summary_df)

#### Exponent

In [None]:
X = {
    1: ['Age', 'Gender_F', 'Education_years_OA'],
    2: ['Age', 'Gender_F', 'Education_years_OA', 'Exponent_global'], 
    3: ['Age', 'Gender_F', 'Education_years_OA', 'Exponent_global', 'Age_Exponent_global_interaction'] 
}

target_vars = ['Dprime', 'RVPA']

summaries = {}

for y in target_vars:
    predictor_columns = set()
    for predictors in X.values():
        predictor_columns.update(predictors)
    
    all_columns = list(predictor_columns) + [y]
    
    clean_df = fullsample_df.dropna(subset=all_columns)
    
    model = HierarchicalLinearRegression(clean_df, X, y)
    summary_df = model.summary()
    
    summaries[y] = summary_df

for y, summary_df in summaries.items():
    print(f"Displaying summary for {y}:")
    display(summary_df)

#### Offset

In [None]:
X = {
    1: ['Age', 'Gender_F', 'Education_years_OA'],
    2: ['Age', 'Gender_F', 'Education_years_OA', 'Offset_global'], 
    3: ['Age', 'Gender_F', 'Education_years_OA', 'Offset_global', 'Age_Offset_global_interaction']
}

target_vars = ['Dprime', 'RVPA']

summaries = {}

for y in target_vars:
    predictor_columns = set()
    for predictors in X.values():
        predictor_columns.update(predictors)
    
    all_columns = list(predictor_columns) + [y]
    
    clean_df = fullsample_df.dropna(subset=all_columns)
    
    model = HierarchicalLinearRegression(clean_df, X, y)
    summary_df = model.summary()
    
    summaries[y] = summary_df

for y, summary_df in summaries.items():
    print(f"Displaying summary for {y}:")
    display(summary_df)

### Regression Plots

#### Regression plot for Figure 3 of significant HLR

In [None]:
# Regress Dprime on covariates and get residuals (independent variable)
X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  # Add constant term for the intercept
x = fullsample_df['Dprime']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  # Residuals of Dprime

# Use raw Alpha_CF_parietooccipital (dependent variable)
y_raw = fullsample_df['Alpha_CF_parietooccipital']

# Create a DataFrame with residuals of Dprime and raw Alpha_CF_parietooccipital
residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

# Plot the residuals of Dprime against raw Alpha_CF_parietooccipital
sns.set_style("white")
sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    # Set color of dots
            line_kws={'color': '#D85A5A'})                # Set color of regression line

plt.xlabel(r"SART_$\it{d'}$ (residuals)", fontsize=12)
plt.ylabel('IAF', fontsize=12)
sns.despine()

plt.show()

#### Regression plots for Supplementary Materials

In [None]:
y_raw = fullsample_df['Exponent_global']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['RVPA']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"RVP_$\it{A'}$ (residuals)", fontsize=12)
plt.ylabel('Aperiodic exponent', fontsize=12)
plt.text(-0.1, 1.1, 'A.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

In [None]:
y_raw = fullsample_df['Exponent_global']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['Dprime']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.set_style("white")
sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"SART_$\it{d'}$ (residuals)", fontsize=12)
plt.ylabel('Aperiodic exponent', fontsize=12)
plt.text(-0.1, 1.1, 'B.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

In [None]:
y_raw = fullsample_df['Offset_global']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['RVPA']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"RVP_$\it{A'}$ (residuals)", fontsize=12)
plt.ylabel('Aperiodic offset', fontsize=12)
plt.text(-0.1, 1.1, 'C.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

In [None]:
y_raw = fullsample_df['Offset_global']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['Dprime']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"SART_$\it{d'}$ (residuals)", fontsize=12)
plt.ylabel('Aperiodic offset', fontsize=12)
plt.text(-0.1, 1.1, 'D.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

In [None]:
y_raw = fullsample_df['Alpha_absoluteBP_parietooccipital']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['RVPA']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.set_style("white")
sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"RVP_$\it{A'}$ (residuals)", fontsize=12)
plt.ylabel('aIAP', fontsize=12)
plt.text(-0.1, 1.1, 'E.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

In [None]:
y_raw = fullsample_df['Alpha_absoluteBP_parietooccipital']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['Dprime']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.set_style("white")
sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"SART_$\it{d'}$ (residuals)", fontsize=12)
plt.ylabel('aIAP', fontsize=12)
plt.text(-0.1, 1.1, 'F.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

In [None]:
y_raw = fullsample_df['Alpha_CF_parietooccipital']

X = fullsample_df[['Age', 'Education_years_OA', 'Gender_F']]
X = sm.add_constant(X)  
x = fullsample_df['RVPA']
model_x = sm.OLS(x, X).fit()
x_resid = model_x.resid  

residuals_df = pd.DataFrame({'x_resid': x_resid, 'y_raw': y_raw})

sns.set_style("white")
sns.regplot(x='x_resid', y='y_raw', data=residuals_df, 
            scatter_kws={'s': 50, 'color': '#6180e9'},    
            line_kws={'color': '#D85A5A'})                
plt.xlabel(r"RVP_$\it{A'}$ (residuals)", fontsize=12)
plt.ylabel('IAF', fontsize=12)
plt.text(-0.1, 1.1, 'G.', transform=plt.gca().transAxes, fontsize=14, fontweight='bold', va='top', ha='left')

sns.despine()

plt.show()

### Mediation analysis

Mediation analysis described in:

3.0 Results
- 3.4 Mediation analysis

The mediation analysis was completed in R. The code for this is included in the folder for this study.