# Final Model for Paper 2 - INTERVENTION ONLY!

## UPDATED Analysis Plan
Throughout the notebook, we complete the following tasks:
1. Fit group spectra
2. Create a .csv export for the following metrics:
    - All peak metrics (4 columns: cfs, pws, bws, model index)
        - Assign frequency bands based on each peak's cf value
    - All model metrics
        - Aperiodic parameters (2 columns: offsets, exps, in model order)
        - Errors
        - R-squared
    - **Make sure you use the model index to pull relevant information from `df_final`**
3. Plot mean spectra with standard deviation for all 11 clusters (although we will only use a select few).
    - Plot with aperiodic fit as dotted line (just for record keeping)
    - Plot as flattened spectrum
        - For these spectra, we will plot the average across all participants, with the following splits:
            - Baseline
            - All Executive Function
            - Shoulder spectra averaged and Tandem spectra averaged (all participants should have data from all trials)
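
As a rough illustration of the "flattened spectrum" in step 3, one common approach is to subtract the aperiodic fit from the log10 power. The sketch below assumes linear power values and the fixed (no-knee) aperiodic form `offset - exponent * log10(f)`. It uses synthetic data, and `flatten_spectrum` is a hypothetical helper, not a function from `create_intervention_specparam_plots`.

```python
import numpy as np

def flatten_spectrum(freqs, power, offset, exponent):
    """Return log10 power minus a fixed-mode aperiodic fit (hypothetical helper)."""
    aperiodic_fit = offset - exponent * np.log10(freqs)
    return np.log10(power) - aperiodic_fit

# Synthetic spectrum: a 1/f background plus a Gaussian bump centered at 10 Hz
freqs = np.arange(1, 56)
offset, exponent = 1.0, 1.5
bump = 0.5 * np.exp(-((freqs - 10) ** 2) / (2 * 2.0 ** 2))
power = 10 ** (offset - exponent * np.log10(freqs) + bump)

# After flattening, only the periodic bump should remain
flat = flatten_spectrum(freqs, power, offset, exponent)
peak_freq = freqs[np.argmax(flat)]
```

In this toy case the flattened spectrum recovers the bump exactly because the aperiodic parameters are known; with real data they would come from the fitted model for each spectrum.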

In [1]:
# Core data libraries (used below as np and pd)
import numpy as np
import pandas as pd

# Spectral parameterization imports
from specparam import SpectralGroupModel

# Import custom plotting functions
from create_intervention_specparam_plots import *

### Load Datasets

In [2]:
df = pd.read_pickle('E:/Tasnim_Dissertation_Analysis/specparam_analysis/Paper 2/intervention/results/Paper2_Intervention_df_final.pkl')

### Fit Spectra

In [3]:
# Extract spectra from the final dataframe as a 2D array (one row per spectrum)
spectra = np.array([spec for spec in df['spectra']])
# Frequency axis: assumes each spectrum has 251 bins at 1 Hz resolution (0-250 Hz)
freqs = np.arange(251)
# Initialize and fit a new SpectralGroupModel on the cleaned data
fg = SpectralGroupModel(peak_width_limits=[2, 8], min_peak_height=0.2, peak_threshold=2,
                        max_n_peaks=6, verbose=False)
freq_range = [1, 55]
fg.fit(freqs, spectra, freq_range)
fg.print_results()
fg.save_report('E:/Tasnim_Dissertation_Analysis/specparam_analysis/Paper 2/intervention/results/final_model/final_intervention_groupModel_for_Paper2.pdf')

                                                                                                  
                                          GROUP RESULTS                                           
                                                                                                  
                            Number of power spectra in the Group: 229                             
                                                                                                  
                        The model was run on the frequency range 1 - 55 Hz                        
                                 Frequency Resolution is 1.00 Hz                                  
                                                                                                  
                              Power spectra were fit without a knee.                              
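
Because `freqs` is built by hand rather than read from the data, it can be worth confirming that it lines up with the spectra before fitting. The following is a minimal sketch of such a check using synthetic data; `check_fit_inputs` is a hypothetical helper, and it assumes the spectra hold linear (not log) power.

```python
import numpy as np

def check_fit_inputs(freqs, spectra):
    """Hypothetical pre-fit sanity check for a 2D array of linear power spectra."""
    freqs = np.asarray(freqs, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    if spectra.ndim != 2:
        raise ValueError("spectra must be 2D: (n_spectra, n_freqs)")
    if spectra.shape[1] != freqs.shape[0]:
        raise ValueError(
            f"{spectra.shape[1]} power values per spectrum but {freqs.shape[0]} frequencies"
        )
    if not np.all(np.isfinite(spectra)):
        raise ValueError("spectra contain NaN or infinite values")
    # Only check positivity at 1 Hz and above, since the fit range starts at 1 Hz
    if np.any(spectra[:, freqs >= 1] <= 0):
        raise ValueError("non-positive power values cannot be log-transformed")
    return freqs, spectra

# Synthetic example with the same frequency axis as above
freqs_demo = np.arange(251)
spectra_demo = np.random.default_rng(0).uniform(0.1, 1.0, size=(3, 251))
freqs_ok, spectra_ok = check_fit_inputs(freqs_demo, spectra_demo)
```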
                                                                                                  
          

### Export Summary Data as .csv files
These files will be analyzed further to summarize our data for the manuscript.

#### Extracting Parameters

In [4]:
# Extract peak parameters
peaks = fg.get_params('peak_params')  # 4 columns: cfs, pws, bws, model index
# Extract aperiodic parameters
aps = fg.get_params('aperiodic_params')  # 2 columns: offsets, exps (one row per model, in model order)
# Extract goodness-of-fit metrics
errors = fg.get_params('error')
r2s = fg.get_params('r_squared')
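
The peak array has one row per detected peak rather than one row per model, so models without any detected peaks are expected to be absent from it. As a small sketch of how the model index column can be used to count peaks per model, the example below uses a synthetic stand-in array; `peaks_demo` and `n_models` are illustrative names, not values from this analysis.

```python
import numpy as np

# Synthetic stand-in for the peak array: columns are cf, pw, bw, model index
peaks_demo = np.array([
    [11.9, 0.36, 3.8, 2],
    [10.3, 0.55, 3.9, 4],
    [12.5, 0.34, 6.7, 6],
    [30.6, 0.36, 2.0, 6],
])
n_models = 8  # assumed number of fitted spectra in this toy example

# Count how many peaks each model contributed (0 for models with no rows)
peak_counts = np.bincount(peaks_demo[:, 3].astype(int), minlength=n_models)
models_without_peaks = np.flatnonzero(peak_counts == 0)
```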

##### Peaks Export
For each peak that we export, we also need the following values from `df` for the model that the peak belongs to:
1. subject
2. session
3. experience
4. component
5. cluster

In [5]:
peaks_export = pd.DataFrame(peaks, columns=['cf', 'pw', 'bw', 'model_ind'])

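# Convert model indices to integers so they can be matched against df's index, then merge.
# Note: this assumes df has a default 0..n-1 index, so that row position equals model index.
peaks_export['model_ind'] = peaks_export['model_ind'].astype(int)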
peaks_merged = peaks_export.merge(
    df[['subject', 'session', 'experience', 'component', 'cluster']],
    left_on='model_ind',
    right_index=True,
    how='left'
)
print("Method 1 - Using pandas merge:")
print(peaks_merged.head())

# Create a column for frequency bands
def assign_freq_band(cf_value):
    """Map a peak center frequency (Hz) to a band label, using left-closed ranges."""
    if 1 <= cf_value < 4:
        return "delta"
    elif 4 <= cf_value < 8:
        return "theta"
    elif 8 <= cf_value < 12:
        return "alpha"
    elif 12 <= cf_value < 20:
        return "low_beta"
    elif 20 <= cf_value < 30:
        return "high_beta"
    elif 30 <= cf_value < 55:
        return "gamma"
    else:
        # cf falls outside the specified ranges (note: exactly 55 Hz is excluded)
        return None


peaks_merged['freq_band'] = peaks_merged['cf'].apply(assign_freq_band)

# Save as a .csv file for analysis in R
peaks_merged.to_csv('E:/Tasnim_Dissertation_Analysis/specparam_analysis/Paper 2/intervention/results/final_model/peaks.csv', index=False)

Method 1 - Using pandas merge:
          cf        pw        bw  model_ind  subject session    experience  \
0  11.902783  0.361351  3.815842          2  exgm005      s2  intervention   
1  10.313496  0.549711  3.939707          4  exgm012      s2  intervention   
2  15.397515  0.329851  7.899401          5  exgm016      s2  intervention   
3  12.514357  0.341992  6.668084          6  exgm018      s2  intervention   
4  30.564168  0.361610  2.000000          6  exgm018      s2  intervention   

   component  cluster  
0          1        3  
1          5        3  
2          1        3  
3          7        3  
4          7        3  
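
As an alternative to applying the band function row by row, the same left-closed ranges can be expressed with `pd.cut`. This is only a sketch on a few example center frequencies; `band_edges`, `band_labels`, and `cf_demo` are illustrative names.

```python
import numpy as np
import pandas as pd

band_edges = [1, 4, 8, 12, 20, 30, 55]
band_labels = ["delta", "theta", "alpha", "low_beta", "high_beta", "gamma"]

# A few example center frequencies, including one at the upper edge
cf_demo = pd.Series([11.902783, 10.313496, 15.397515, 30.564168, 3.9, 55.0])

# right=False gives left-closed bins [1, 4), [4, 8), ..., matching the if/elif chain;
# values outside every bin (such as exactly 55 Hz) become NaN
bands = pd.cut(cf_demo, bins=band_edges, labels=band_labels, right=False)
```

The result is a categorical column, which keeps the band order fixed when the data are summarized or plotted later.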


##### Model Metrics Export
As above, we need to export the following values from `df` for each model:
1. subject
2. session
3. experience
4. component
5. cluster

In [6]:
# Ensure all arrays are 2D so they can be stacked column-wise
r2s = r2s.reshape(-1, 1)  # Reshape r2s to (n_models, 1)
errors = errors.reshape(-1, 1)  # Reshape errors to (n_models, 1)

# Horizontally stack aperiodic params, R^2, and errors
model_metrics = pd.DataFrame(
    data=np.hstack([aps, r2s, errors]),  # Horizontally stack arrays
    columns=["offset", "exp", "r2", "error"]  # Column names
)

# Add a model index column to join on (model results are in the same row order as df)
model_metrics["model_ind"] = np.arange(len(df), dtype=int)

# Add model details to the model_metrics dataframe
model_metrics_merged = model_metrics.merge(
    df[['subject', 'session', 'experience', 'component', 'cluster']],
    left_on='model_ind',
    right_index=True,
    how='left'
)

# Write .csv to results folder
model_metrics_merged.to_csv('E:/Tasnim_Dissertation_Analysis/specparam_analysis/Paper 2/intervention/results/final_model/model_metrics.csv', index=False)
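
Both merges join `model_ind` against the index of `df`, which only lines up if that index runs from 0 to n-1. The sketch below shows one way to guard against a stale index after filtering, using toy data. `df_demo` and `metrics_demo` are illustrative stand-ins, while `validate` and `indicator` are standard `DataFrame.merge` options.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df with a non-default index, as can happen after filtering rows
df_demo = pd.DataFrame(
    {"subject": ["exgm005", "exgm012", "exgm016"], "cluster": [3, 3, 4]},
    index=[10, 11, 12],
)
metrics_demo = pd.DataFrame({"exp": [1.2, 0.9, 1.5], "model_ind": np.arange(3)})

# Merging against the stale index silently leaves every row unmatched
stale = metrics_demo.merge(
    df_demo, left_on="model_ind", right_index=True, how="left", indicator=True
)
n_stale_unmatched = int((stale["_merge"] == "left_only").sum())

# Resetting the index makes row position and index label agree
meta = df_demo.reset_index(drop=True)
merged = metrics_demo.merge(
    meta, left_on="model_ind", right_index=True,
    how="left", validate="one_to_one", indicator=True,
)
n_unmatched = int((merged["_merge"] != "both").sum())
```

Checking that `n_unmatched` is zero before writing the .csv is a cheap way to catch rows whose subject or cluster labels failed to attach.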

## Creating Plots for each cluster

In [7]:
create_all_plots(df, fg)  # Note: this function must be edited before analyzing two sessions

Loading intervention assignments from ../demographicsPsych/data/intervention_assignments.xlsx...
Loaded 63 intervention assignments

Filtered to 229 rows for 'intervention' task in session 2

Starting plot generation...
Clusters: [np.int64(3), np.int64(4), np.int64(5), np.int64(6), np.int64(7)]

Processing Cluster 3 (49 observations):
  Music_Listening: 19 observations
  Biking: 16 observations
  Dance_Exergaming: 14 observations
  Created intervention comparison plots for cluster 3

Processing Cluster 4 (47 observations):
  Biking: 19 observations
  Dance_Exergaming: 16 observations
  Music_Listening: 12 observations
  Created intervention comparison plots for cluster 4

Processing Cluster 5 (43 observations):
  Music_Listening: 18 observations
  Dance_Exergaming: 14 observations
  Biking: 11 observations
  Created intervention comparison plots for cluster 5

Processing Cluster 6 (54 observations):
  Biking: 21 observations
  Music_Listening: 19 observations
  Dance_Exergaming: 14 obs