# Introduction

Hearing loss is becoming an increasingly common complaint, not solely due to aging populations, but also as a result of the increasing incidence of noise-induced hearing loss in younger individuals @niskar2001estimated and @shargorodsky2010change. One of the major consequences is reduced speech comprehension ability, especially in acoustically challenging conditions such as reverberant or noisy environments. Individual differences in (dis)satisfaction with both hearing aids and cochlear implants call for more investigation of interventions that could enhance speech comprehension abilities by exploiting existing cognitive mechanisms. To this end, this project aims at gaining fundamental knowledge on the neurocognitive basis of speech comprehension in acoustically challenging situations.

Our experiment uses electroencephalography (EEG) to investigate patterns of electrophysiological activity associated with better speech-in-noise comprehension. This pilot study includes data from 15 healthy adult participants performing a sentence comprehension task with noise-vocoded speech and speech in speech-shaped noise stimuli. We use a multivariate pattern analysis approach focused on using EEG information preceding the stimuli to decode accuracy of performance.

This pilot study is aimed at specifying targets and experimental design for subsequent studies focused on potential neurofeedback interventions.

::: {.callout-important collapse="true"}
# 👁️ Note on the links to scripts on Github \>\>

This is ongoing work and contains links to scripts in the project's Github repository. Here we often use the **URL** of a specific folder in the scripts' repository, instead of using Github's **permalink** to a script, which would link to the script version at the time of writing this report. This was done because it is possible that some scripts were changed or debugged after this report. Based on the description, filenames and comments in the code, it should be easy to identify which scripts performed the reported actions. In other instances, like for stimuli generation and presentation of the experiment, we use the permalink to point at the script version used at the time of the experiments.

If you find any broken link or cannot find a script described in this report, please contact the authors.
:::

# Methods

## Participants and procedures

We recruited 15 adult participants, but the task data from one participant were discarded due to issues during recording, which resulted in an initial sample of n = 14 for the main experimental task. Participants were recruited using a mailing list of the Faculty of Psychology (University of Zurich) and flyers. Inclusion criteria were: age 18-35 years, normal hearing without hearing aids, normal vision (corrections allowed), German as first language, and right-handedness. Exclusion criteria were being fluent in more than two languages, having hearing aids or a history of hearing impairments, or having any psychological or psychiatric condition that may impede adequate task performance (e.g., attentional problems, speech comprehension issues). All participants signed an informed consent form before participation, which included a 60 CHF reward for participation. The study was approved by the ethics commission of the canton of Zurich.

Before selection for the experiment, participants were screened via phone call or email on the main inclusion criteria for the study. Upon arrival at the lab, participants first filled in a short questionnaire with additional questions on language, hearing and any relevant psychological or psychiatric condition that may impact task performance. The questionnaire was presented and filled in using the web application [Redcap](https://www.project-redcap.org/) to ensure that any personal data was secured. After completion of the study, the survey data was downloaded as a table with identifying variables like date of birth masked. The file downloaded from Redcap with the **participants' information** can be found in @sec-Supplement1.

After completing the questionnaire, participants were informed about the specific content of the session. Electrode placement and signal check took between 20 and 40 minutes. Then, a few example trials were presented before starting the session, which consisted of the following: a 4-minute resting-state recording with eyes closed, the main task (with 4 blocks of approximately 15 minutes each) and, finally, another 4 minutes of eyes-closed resting state.

## Resting-state recordings

The resting-state recordings before and after the task were identical. Subjects were instructed to close their eyes and remain still for 4 minutes. The start and end of the 4-minute block were indicated by a beep sound (1 second long).

## Main Task

### Trial design

We used a version of the Coordinate Response Measure (CRM) task used in @brungart2001evaluation. In the task, participants listen to different versions of a sentence with a fixed structure and three target words that change in every trial. There are 4 possible alternatives for each of the targets, thus there are 64 possible variations of the sentence. The task was originally designed to investigate intelligibility of speech commands (in a military context) in audio transmissions with various types of background competition. The German sentence used in this study was: "*Vorsicht* \[call sign\], *geh sofort zum* \[color\] *Feld von der Spalte* \[number\]" (translated with a small variation from the original article in English: "*Ready \[call sign\] go to \[color\] \[number\] now*"). The 4 possible call sign items were: Adler, Drossel, Tiger or Kröte (eagle, thrush, tiger or toad). The possible colors were gelben, gruenen, roten or weissen (yellow, green, red or white). The numbers were Ein, Zwei, Drei or Vier (one to four).
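The 64 variations follow from crossing the three target sets (4 x 4 x 4). A minimal sketch of that enumeration is shown here; the variable names and the plain-text template are illustrative assumptions and are not taken from the project's scripts.

```python
from itertools import product

# Target items as listed in the text above
CALL_SIGNS = ["Adler", "Drossel", "Tiger", "Kröte"]
COLORS = ["gelben", "gruenen", "roten", "weissen"]
NUMBERS = ["Ein", "Zwei", "Drei", "Vier"]

# Hypothetical plain-text template mirroring the German carrier sentence
TEMPLATE = "Vorsicht {call_sign}, geh sofort zum {color} Feld von der Spalte {number}"

# Cross the three target sets to obtain every sentence variation
sentences = [
    TEMPLATE.format(call_sign=call_sign, color=color, number=number)
    for call_sign, color, number in product(CALL_SIGNS, COLORS, NUMBERS)
]

print(len(sentences))
print(sentences[0])
```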

![Figure 1. Overview of the experimental session](images/Experiment_outline.jpg){.lightbox}

In each trial the sentence was aurally presented while a fixation cross was shown in the center of the screen. After playing the sentence, a 4 x 3 array was shown with drawings representing the target items. The array was always in the same order, and columns 1-3 represented targets 1-3 in the same order as they appeared in the sentence, that is, first the column with call signs, then colors and then numbers. Participants had to click with the mouse on the picture representing the item they heard in each of the columns before moving on to the next trial. The trials were response-terminated, but there was a response time limit of 10 seconds before moving on to the next trial. In the example trials, presented with 3 different levels of auditory difficulty, the researchers conducting the experiment ensured that participants understood that they should click on each of the three columns in all trials, even if they were not sure what item they heard.

### Blocks design

The sentences were presented in blocks of two **audio manipulation conditions**: noise vocoded (NV) or with a background speech-shaped noise (SSN) and there were three possible **levels of difficulty** (easy, medium, hard). Half of the sentences were synthesized with a female voice and half with a male voice. The *Stimuli* section provides more details on these manipulations.

The task consisted of 4 blocks of trials: two blocks of NV and two blocks of SSN stimuli presented in alternating order. There were four different block orders (all possible block sequences without consecutive blocks of the same condition, e.g., NV1-SSN1-NV2-SSN2). Participants were randomly assigned to these sequences. In each block there were 96 trials: 16 sentences x 2 voices x 3 degradation levels (shuffled within the block and all unique trials). Thus in total 192 NV and 192 SSN unique stimuli were presented in the experiment.
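As a rough illustration of the block composition, the sketch below crosses 16 sentence identifiers with the two voices and three difficulty levels and shuffles the result. How sentences were actually assigned to blocks is not described here, so the sentence IDs, the fixed seed and the helper name `build_block` are assumptions made only for this example.

```python
import random
from itertools import product

VOICES = ["female", "male"]
LEVELS = ["easy", "medium", "hard"]

def build_block(sentence_ids, seed=0):
    """Cross sentences x voices x levels and shuffle (hypothetical helper)."""
    trials = [
        {"sentence": s, "voice": v, "level": lv}
        for s, v, lv in product(sentence_ids, VOICES, LEVELS)
    ]
    # A fixed seed is used here only to make the example reproducible
    random.Random(seed).shuffle(trials)
    return trials

block = build_block(sentence_ids=range(16))
print(len(block))  # 16 sentences x 2 voices x 3 levels
```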

## Stimuli

The stimuli for the main task were presented using the external audio interface Fireface UFX II and [ER-1 tubal in-ear headphones](https://shop.neurospec.com/er2-tubal-insert-ear-phones).

### Speech generation

We used the [Google Cloud Text-to-speech API](https://cloud.google.com/text-to-speech/) to generate the 64 possible variations of our sentence in a female and a male speaker voice. Each sentence was output as a separate .wav file with a 44.1 kHz sampling rate. After trying several of the freely available voices, we chose the [neural2 voices](https://cloud.google.com/text-to-speech/docs/voice-types#neural2_voices) 'de-DE-Neural2-D' and 'de-DE-Neural2-F' for male and female speech, respectively, as they sounded more natural (this was verified by native speakers). We used a Python script to communicate with the text-to-speech API (see the [sentence generation script](https://github.com/Neuroling/SPINCO_SINEEG/blob/9d0c5f89611f521f7e93dd9b2087706180e1036d/Gen_stimuli/Gen_speech/TTS_sentenceGenerator.py) in our project repository).

::: {.callout-note collapse="true"}
# Note on stimuli metadata \>\>

In the 'Stimuli' folder of the data repository (named 'tts-golang-44100Hz'), filenames follow the convention `<language>_<algorithm>-<voice>_<1stToken>-<2ndTarget>-<3rdTarget>.wav`. For example, `DE_Neural2-D_Ad-ge-Dr.wav` is a German sentence synthesized with the Neural2 voice D, containing the target items Adler, gelben, Drei. The .wav files are accompanied by .txt files with the sentences in text. See the corresponding README files for details.
:::
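The filename convention described in the note above can be parsed with a regular expression. The following is a minimal sketch; the pattern, the field names and the helper `parse_stimulus_name` are assumptions for illustration and are not part of the project's scripts.

```python
import re

# Pattern for <language>_<algorithm>-<voice>_<1st>-<2nd>-<3rd>.wav
FILENAME_RE = re.compile(
    r"^(?P<language>[^_]+)_(?P<algorithm>[^_-]+)-(?P<voice>[^_]+)"
    r"_(?P<call_sign>[^-]+)-(?P<color>[^-]+)-(?P<number>[^.]+)\.wav$"
)

def parse_stimulus_name(filename):
    """Split a stimulus filename into its metadata fields (hypothetical helper)."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"Filename does not follow the convention: {filename}")
    return match.groupdict()

info = parse_stimulus_name("DE_Neural2-D_Ad-ge-Dr.wav")
print(info)
```

The abbreviated target tokens (e.g., `Ad`, `ge`, `Dr`) are returned as they appear in the filename; mapping them back to full words would require a lookup table.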

##### Onset and offset times of target words

The onset and offset times of the target items, as well as the first and last sound of the sentence, were detected following a semiautomatic approach. We used the free web service for phonetics and speech processing [Webmaus](https://clarin.phonetik.uni-muenchen.de/BASWebServices/interface/WebMAUSBasic) to compute a word segmentation, phonetic segmentation and labeling. The input consisted of the wav files with the speech signal together with a text file with the orthographic transcripts. The output consisted of .TextGrid files with those analyses. They were then used in the phonetics tool [Praat](https://www.fon.hum.uva.nl/praat/) to visually check the detected onsets and offsets of the target words and of the first and last sound (the segmentation boundaries were displayed overlaid on the waveforms of the sentences' speech signal). The boundaries indicating onsets and offsets of interest were then manually adjusted in all sentences by two research assistants. Finally, the values averaged across both researchers were used in the experiment scripts to send the corresponding triggers.

## Audio manipulations

### Noise vocoding

With the noise vocoding technique we manipulate and degrade the spectral information of speech stimuli while preserving slowly varying temporal cues. This approach is used as a crude simulation of the loss of frequency information experienced by individuals with cochlear implants. In short, it consists of filtering the signal into a range of frequency bands, extracting the smoothed amplitude profile of each band, using this profile to modulate some noise (typically white noise) and then reconstructing the signal. The fewer frequency bands we use, the fewer channels we have to reconstruct the signal and the less intelligible the output will be (16 channels yield intelligible speech, 3 yield practically unintelligible speech). A relevant parameter is how the frequency bands are spaced, e.g., they can be spaced linearly, logarithmically or following other functions. The Greenwood frequency-position function (see @greenwood1990cochlear for a review) is often used to estimate how frequencies are represented in cochlear implant electrodes. It relates the position of hair cells in the inner ear to the frequencies that stimulate their auditory neurons.
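To make the band-spacing idea concrete, the sketch below computes band edges that are equally spaced along the cochlea according to the Greenwood function, using the commonly cited human parameters (A = 165.4, a = 2.1, k = 0.88, with position expressed as a proportion of cochlear length from apex to base). This only illustrates Greenwood spacing in general; it is not the spacing used in this study, and the frequency range and number of bands are arbitrary choices for the example.

```python
import math

# Commonly cited human parameters of the Greenwood function; x is the
# proportion of cochlear length measured from the apex (0) to the base (1)
A, ALPHA, K = 165.4, 2.1, 0.88

def greenwood_hz(x):
    """Characteristic frequency (Hz) at relative cochlear position x."""
    return A * (10 ** (ALPHA * x) - K)

def greenwood_position(freq_hz):
    """Inverse mapping: relative cochlear position for a frequency in Hz."""
    return math.log10(freq_hz / A + K) / ALPHA

def band_edges(f_low, f_high, n_bands):
    """Band edges equally spaced in cochlear position between two frequencies."""
    x_low, x_high = greenwood_position(f_low), greenwood_position(f_high)
    step = (x_high - x_low) / n_bands
    return [greenwood_hz(x_low + i * step) for i in range(n_bands + 1)]

# Frequency range and band count chosen only for illustration
edges = band_edges(70.0, 5000.0, 16)
print([round(e, 1) for e in edges])
```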

In our study we used the vocoding approach described in @aller2022differential, and modified from @zoefel2023intelligibility. The scripts are in the [Gen_speech_noise folder](https://github.com/Neuroling/SPINCO_SINEEG/tree/main/Gen_stimuli/Gen_speech_noise) of our repository. A wrapper [*run* script](https://github.com/Neuroling/SPINCO_SINEEG/blob/9d0c5f89611f521f7e93dd9b2087706180e1036d/Gen_stimuli/Gen_speech_noise/Run_vocoder_aller.m) is used to call the associated functions for this manipulation. The main function used was [vocode_aller.m](https://github.com/Neuroling/SPINCO_SINEEG/blob/9d0c5f89611f521f7e93dd9b2087706180e1036d/Gen_stimuli/Gen_speech_noise/functions/vocode_aller.m).

In the approach of @aller2022differential the signal was filtered into 16 logarithmically spaced frequency bands. Then the envelope ($\mathrm{env}$) of each frequency band ($b$) was mixed with the broadband envelope according to a proportion ($p$): $\mathrm{env}_{final}(b) = \mathrm{env}(b) \cdot p + \mathrm{env}(\mathrm{broadband}) \cdot (1 - p)$. The proportion ranges from 0 to 1. If $p = 1$, it is equivalent to applying a 16-channel vocoder (i.e., fully intelligible). Smaller $p$ values (e.g., 0.8) lead to different levels of degradation. This approach allows somewhat finer tuning of the intelligibility levels than changing the number of channels in the 'conventional' vocoding approach. Internal pilot tests in our lab confirmed this advantage for our stimuli.
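The envelope mixing step can be sketched in a few lines. This is a simplified illustration of the formula only, operating on plain lists of envelope samples; it is not the project's `vocode_aller.m` implementation, and the function name and toy values are assumptions.

```python
def mix_envelopes(env_band, env_broadband, p):
    """Mix a band envelope with the broadband envelope.

    Implements env_final(b) = env(b) * p + env(broadband) * (1 - p),
    sample by sample (hypothetical helper).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return [eb * p + ebb * (1.0 - p) for eb, ebb in zip(env_band, env_broadband)]

# Toy envelopes (arbitrary values)
env_band = [0.0, 0.2, 0.8, 0.4]
env_broadband = [0.1, 0.3, 0.5, 0.3]

for p in (1.0, 0.5, 0.0):
    print(p, mix_envelopes(env_band, env_broadband, p))
```

With `p = 1.0` the band envelope is returned unchanged, while with `p = 0.0` every band carries the broadband envelope, so no band-specific spectral detail remains.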

We used the broadband proportion levels `0.5, 0.4 and 0.3` for the easy, medium and difficult levels, respectively. This choice was based on previous reports (e.g., @HervaisAdelman2012, @Davis2005), in-house expertise and internal pilots. However, there are no previous reports examining intelligibility with this vocoding approach in this paradigm and with these stimuli. The current pilot analysis should help refine the values used in this manipulation.

The resulting files had a 44.1 kHz sampling rate and 24 bits per sample. The files were normalized to -23 LUFS (Loudness Units relative to Full Scale) using Matlab's function [integratedLoudness](https://ch.mathworks.com/help/audio/ref/integratedloudness.html). To facilitate offline adjustment of the EEG triggers for audio delays, we created a two-channel copy in which one of the channels had a click (100 milliseconds) inserted at the start of the file (see [script here](https://github.com/Neuroling/SPINCO_SINEEG/blob/63d9c14fb229944d95a814f2eb1aa8844700787f/Gen_stimuli/Gen_speech_noise/utils/Add_click.m)).
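Loudness normalization amounts to applying a single gain to the whole file once its integrated loudness has been measured. The sketch below shows only the gain arithmetic, assuming the measured loudness is already available (in the project it comes from Matlab's `integratedLoudness`); the function names are hypothetical.

```python
TARGET_LUFS = -23.0

def gain_to_target(measured_lufs, target_lufs=TARGET_LUFS):
    """Linear gain that shifts a measured integrated loudness to the target."""
    gain_db = target_lufs - measured_lufs
    return 10 ** (gain_db / 20.0)

def apply_gain(samples, gain):
    """Scale every sample by the same linear gain."""
    return [s * gain for s in samples]

# Example: a file measured at -17 LUFS must be attenuated by 6 dB
gain = gain_to_target(-17.0)
print(round(gain, 4))
```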

### Speech in speech-shaped noise

Another auditory manipulation in this study was embedding the sentences in background noise. This is a frequent acoustically challenging circumstance, for example, when having to understand someone with the noise of traffic in the background. As a control in this manipulation, the background noise was created to have similar spectral properties to speech, that is, we used speech-shaped noise.

The [*runner* script](https://github.com/Neuroling/SPINCO_SINEEG/blob/9d0c5f89611f521f7e93dd9b2087706180e1036d/Gen_stimuli/Gen_speech_noise/Run_speechShapeNoise.m) called an [in-house function](https://github.com/Neuroling/SPINCO_SINEEG/blob/9d0c5f89611f521f7e93dd9b2087706180e1036d/Gen_stimuli/Gen_speech_noise/functions/speechshapednoise.m) to filter white noise with the long-term average speech spectrum (LTASS) derived from the signal, which was the concatenated sentences (filtered with a 70 to 5000 Hz Butterworth filter, see scripts for details). The LTASS calculation used a function from [SoundZone_Tools](https://github.com/JacobD10/SoundZone_Tools/archive/master.zip). After generating the speech-shaped noise from the concatenated sentences, chunks of noise selected at random from the concatenated signal were added as background to each sentence, with 0.5-second in and out ramps.

We used the signal-to-noise ratio (SNR) levels `-7 dB, -9 dB and -11 dB` for the easy, medium and difficult levels, respectively. As in the case of the noise vocoding, whether these values are optimal for our difficulty manipulation is still a matter of discussion.
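Mixing speech and noise at a given SNR comes down to rescaling the noise relative to the speech level. The sketch below uses an RMS-based SNR definition, which is an assumption: the project's scripts may compute levels differently (e.g., over a specific portion of the sentence), and the toy signals and function names are illustrative only.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def scale_noise_to_snr(speech, noise, snr_db):
    """Rescale noise so that 20*log10(rms(speech)/rms(noise)) equals snr_db."""
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20.0))
    factor = target_noise_rms / rms(noise)
    return [v * factor for v in noise]

def mix_at_snr(speech, noise, snr_db):
    """Add the rescaled noise to the speech, sample by sample."""
    scaled = scale_noise_to_snr(speech, noise, snr_db)
    return [s + n for s, n in zip(speech, scaled)]

# Toy signals: a 220 Hz tone as 'speech' and Gaussian white noise
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 44100) for t in range(4410)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4410)]

for snr_db in (-7, -9, -11):
    scaled = scale_noise_to_snr(speech, noise, snr_db)
    achieved = 20 * math.log10(rms(speech) / rms(scaled))
    print(snr_db, round(achieved, 2))

mixed = mix_at_snr(speech, noise, -7)
```

Negative SNR values mean the noise is more intense than the speech, which is why intelligibility drops from -7 dB to -11 dB.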

The resulting files had a 44.1 kHz sampling rate and 24 bits per sample. They were normalized, and a two-channel copy with one of the channels containing a click was generated following the same procedure as in the vocoded condition.

## EEG

### Signal acquisition

The EEG recordings were conducted using the Biosemi Active Two MK2hS-System with a sampling rate of 2048 Hz. We used 64 scalp electrodes positioned according to the 10-20 system. In addition, six external Flat-Type Active electrodes were used: four electrodes for recording the vertical and horizontal electro-oculogram (EOG) and two electrodes were placed at mastoids for off-line reference. The Biosemi system uses two additional electrodes (Common Mode Sense \[CMS\] and Driven Right Leg \[DRL\]) creating a feedback loop to replace the conventional ground electrode (see www.biosemi.com/faq/cms&drl.htm for details). The CMS electrode served as online reference. The offset of the electrodes was kept between +20 and -20 µV.

### Minimal preprocessing of source data into 'raw' data

During the experimental session a single file was generated containing the two resting-state recordings (before and after the task) as well as the main task. The file was in Biosemi's [bdf](https://www.biosemi.com/faq/file_format.htm) format, which is a 24-bit version of the popular [EDF](https://www.edfplus.info) format. The file has an additional channel labeled 'ergo1' with the audio output signal (also sent to the headphones) to help correct for audio delay.

The [first script for importing data](https://github.com/Neuroling/SPINCO_SINEEG/blob/main/Analysis/SiN/EEG/1_source_to_raw/Import_01_importBDF_loadLocs_split.m) performs the following:

-   *Event trigger adjustment*. The clicks inserted at the start of the *audio signal* are detected for each trial in the task. The onset of the clicks is used to adjust the event triggers (target word onsets and start/end of the sentence), accounting for any audio output delay
-   *Split file* into three datasets: `s001_task-rest-pre.set`, `s001_task-rest-post.set`, `s001_task-sin.set` (event triggers are used to determine the split segments)
-   *Load channel locations*
-   *Create metadata*. This includes `.json` metadata files and `.tsv` channel location and event table files. **NOTE**: we add the trial accuracy values to the *event table file*; these values are read from the experimental file (in the behavioral folder 'beh')

The resulting datasets are considered 'raw' for subsequent preprocessing and analysis.
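The event trigger adjustment listed above can be sketched as a simple threshold detection on the audio channel followed by a shift of the event latencies. This is a schematic illustration in Python with toy data; the actual import script is written in Matlab, and the threshold, sample indices and function names used here are assumptions.

```python
def detect_click_onset(audio, threshold, start=0):
    """Index of the first sample at or after `start` whose absolute
    amplitude exceeds the threshold, or None if there is none."""
    for i in range(start, len(audio)):
        if abs(audio[i]) > threshold:
            return i
    return None

def adjust_triggers(event_samples, trigger_sample, click_sample):
    """Shift all events of a trial by the measured audio output delay."""
    delay = click_sample - trigger_sample
    return [s + delay for s in event_samples]

# Toy trial: the click appears in the audio channel 10 samples after
# the nominal sentence-start trigger
audio = [0.0] * 50 + [1.0] * 10 + [0.0] * 40
trigger_sample = 40
events = [40, 60, 80]  # sentence start and two later triggers (samples)

click = detect_click_onset(audio, threshold=0.5, start=trigger_sample)
adjusted = adjust_triggers(events, trigger_sample, click)
print(click, adjusted)
```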

::: {.callout-note collapse="false"}
### Why was minimally preprocessed raw saved as .set and not .bdf ?

The .set format was preferred after finding several issues when exporting to formats such as bdf or edf using the EEGlab and MNE toolboxes. When the 'Ergo1' channel with the recorded digital audio was loaded, there were (what appeared to be) sampling errors in the bdf or edf files exported from EEGlab (using *writeeg* from Biosig) and in the EDF files exported with [mne export raw](https://mne.tools/stable/generated/mne.export.export_raw.html). These problems were not present when using the Biosemi Actitools software (however, this is not a desired step as it requires manual input in a GUI and could not be integrated in the scripts).
:::

### Preprocessing

#### Main Pipeline (automatic with Automagic)

For our main analysis we opted for a fully automated pipeline implemented with the open-source Matlab toolbox [automagic](https://github.com/methlabUZH/automagic), see (@pedroni2019automagic). The toolbox is actually a wrapper that runs different functions from EEGlab (@delorme2004eeglab), probably the most widely used open-source software for EEG analysis (also Matlab-based). Automagic facilitates selection of multiple options for data cleaning and artifact rejection available in different plugins for EEGlab. In addition, it provides its own data quality summary measures. We chose Automagic as our main preprocessing approach first to facilitate automation, potentially iterating through different pipeline variations and second, to obtain more *objective* descriptors of data quality, constraining the researchers' degrees of freedom and facilitating transparency on subsequent decisions and explorations on data quality levels vis-a-vis main analyses results.

##### Quality assessment

::: {.callout-warning collapse="false"}
# ⏳ This work is pending

Automagic provides .txt logs and pictures to assess quality. The toolbox GUI also provides a utility to inspect and rate quality (note that the version of Automagic should be the same as the one that generated the input mat file for the Quality Assessment window to work). Additional efforts could summarize the automated quality ratings, and further work could iterate through different preprocessing pipelines in Automagic and compare results.
:::

## Code organization

#### Code Repository

The code repository of this project is <https://github.com/Neuroling/SPINCO_SINEEG>. The repository contains a wiki with some information (less detailed than this report and some for internal usage).

#### Main scripts folders

-   Analysis/SiN: all required for preprocessing (including import of source data) and analysis. Includes the scripts to generate this report
-   Experiments: psychopy scripts. The audio files are not pushed into github (too many). The files are in the 'Stimuli' folder of the data storage repository \~\Projects\Spinco\SINEEG\Stimuli\AudioGens\selected\_audio_psychoPy_click
-   Gen_stimuli: generate stimuli sentences and add noise / vocode
-   Misc: some tests and mixed files. Including \_fromRobertBecker: collection of scripts shared by R.B
-   Utils: various mixed utilities

#### Relative paths

Analysis scripts use paths relative to the scripts folder: the parent scripts folder is assumed to be at the same level as the parent data folder. This scheme was not implemented in the scripts for stimuli generation.

#### Metadata

Check the README files in the code repository subfolders to get oriented. The main scripts and their folders are numbered to indicate the order in which they should be run.

## Analysis

::: {.callout-warning collapse="false"}
# ⏳ This work is in progress...
:::

# Results

## Behavioral analysis

### Read gathered table

Read the gathered table and perform minor format adjustments.

In [124]:
# Library installations: 
#conda install -c plotly plotly 
#conda install -c conda-forge itables
# ------------------------------------------------------------------
import sys
import os
import glob
import re
import pandas as pd
import numpy as np
import plotly.express as px 
from itables import init_notebook_mode, show 
from datetime import date
today = date.today()
#print("Run date:", today)

# File paths
thisScriptDir = os.getcwd()
baseDir = thisScriptDir[:thisScriptDir.find("Scripts")]
dirinput = os.path.join(baseDir + 'Data','SiN','analysis','beh','task-sin')
diroutput = os.path.join(baseDir + 'Data','SiN','analysis','beh','task-sin')

# Find file read data set 
fileinput =  [os.path.join(dirinput, f) for f in os.listdir(dirinput) if f.startswith('Gathered_')]
df = pd.read_csv(fileinput[0], index_col=None)

# Relabel noise levels
levelmapping = {'0.6p': '1_easy','0.4p': '2_mid','0.2p': '3_hard','-7db': '1_easy','-9db': '2_mid','-11db': '3_hard'}
df['levels'] = df['levels'].replace(levelmapping) 

 #  ---------------
print(' Table read with ' + str(len(df)) + ' rows and ' + str(len(df.columns)) + ' columns')
#show(df, scrollY="400px", scrollCollapse=True, paging=False)

 Table read with 5376 rows and 16 columns


*(Interactive table output. Columns: subj, noise, block, voice, sentence, levels, callSign, colour, number, callSign_resp, col_resp, numb_resp, callSignCorrect, colourCorrect, numberCorrect, n_cor_items)*


### Transform to long 

In [116]:
df_long= pd.melt(df, id_vars=['subj','noise','block','voice','levels','sentence','callSign','colour','number'],value_vars=['callSignCorrect', 'colourCorrect','numberCorrect'],
                 value_name='accu', var_name='position')

 
df_long['accu'] = df_long['accu'].astype(int)
show(df_long)

*(Interactive table output. Columns: subj, noise, block, voice, levels, sentence, callSign, colour, number, position, accu)*


#### Summarize accuracy score

In [125]:
# Count the unique values of each design factor (printed for reference)
unique_levels = {col: df_long[col].nunique() for col in ['block','levels','noise','position','voice']}
print(unique_levels)

# Summarize accuracies: proportion of correct responses in each design cell
# (mean of the 0/1 scores across the trials of that cell)
df_long_accu = df_long.groupby(['subj','noise', 'block', 'voice', 'levels','position'])['accu'].mean().reset_index()


show(df_long_accu)


{'block': 2, 'levels': 3, 'noise': 2, 'position': 3, 'voice': 2}


*(Interactive table output. Columns: subj, noise, block, voice, levels, position, accu)*


#### Linear mixed models

First, make sure the variables are of the right type:

In [118]:

# Define variable types ! 
df_long_accu["block"] = df_long_accu["block"].astype("category")
df_long_accu["noise"] = df_long_accu["noise"].astype("category")
df_long_accu["levels"] = df_long_accu["levels"].astype("category")
# Caution: casting to int truncates proportion scores below 1 to 0
df_long_accu["accu"] = df_long_accu["accu"].astype("int")


##### Model including noise type
Just for orientation. Note that the two noise types are distinct. EEG analysis will be performed separately for each noise type.


In [122]:
import statsmodels.api as sm
import statsmodels.formula.api as smf


# Specify models 
md = smf.mixedlm("accu ~  noise * levels + block", df_long_accu, groups = df_long_accu["subj"]) 
mdf = md.fit()
print(mdf.summary())

                  Mixed Linear Model Regression Results
Model:                   MixedLM       Dependent Variable:       accu     
No. Observations:        1008          Method:                   REML     
No. Groups:              14            Scale:                    0.1603   
Min. group size:         72            Log-Likelihood:           -532.9194
Max. group size:         72            Converged:                Yes      
Mean group size:         72.0                                             
--------------------------------------------------------------------------
                                Coef.  Std.Err.   z    P>|z| [0.025 0.975]
--------------------------------------------------------------------------
Intercept                        0.438    0.042 10.376 0.000  0.356  0.521
noise[T.SiSSN]                   0.101    0.044  2.317 0.021  0.016  0.187
levels[T.2_mid]                 -0.220    0.044 -5.042 0.000 -0.306 -0.135
levels[T.3_hard]                -0.345    0.



##### Models per noise type 

In [123]:
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Empty list to store the fitted models
fitted_models = []

# Group the DataFrame by 'noise' column
grouped = df_long_accu.groupby('noise')

# Iterate over each group
for noise, group_df in grouped:
    # Fit the mixed-effects model for the current group
    model = sm.MixedLM.from_formula("accu ~ levels + block", group_df, groups=group_df["subj"])
    result = model.fit()
    
    # Store the fitted model
    fitted_models.append((noise, result))

# Now, fitted_models contains the fitted model for each group defined by 'noise'
for noise, result in fitted_models:
    print("Noise:", noise)
    print(result.summary())




Noise: NV
           Mixed Linear Model Regression Results
Model:              MixedLM  Dependent Variable:  accu     
No. Observations:   504      Method:              REML     
No. Groups:         14       Scale:               0.1565   
Min. group size:    36       Log-Likelihood:      -263.8810
Max. group size:    36       Converged:           Yes      
Mean group size:    36.0                                   
-----------------------------------------------------------
                 Coef.  Std.Err.   z    P>|z| [0.025 0.975]
-----------------------------------------------------------
Intercept         0.427    0.045  9.542 0.000  0.339  0.514
levels[T.2_mid]  -0.220    0.043 -5.103 0.000 -0.305 -0.136
levels[T.3_hard] -0.345    0.043 -7.999 0.000 -0.430 -0.261
block[T.2]        0.016    0.035  0.450 0.652 -0.053  0.085
Group Var         0.011    0.015                           

Noise: SiSSN
           Mixed Linear Model Regression Results
Model:               MixedLM  Dependen

NameError: name 'warnings' is not defined

# Discussion

## Potential secondary analyses

### Exploring data qualities and minimal preprocessing

-   Neurofeedback studies deal with very noisy data in real time, and this requires some basic preprocessing to ensure we are targeting the intended neural processes, or at least approximating them as much as possible. Thus, a potential question is how different preprocessing pipelines may affect our main analysis results (e.g., multivariate pattern analysis and classification)

-   A related question is how individual differences in data quality before/after preprocessing may affect performance of classifiers

### Utilizing the resting-state recordings

-   Explore individual alpha frequency (IAF) peaks in resting-state recordings. Use an IAF-based band as target for decoding. If neurofeedback settings are to work out, these individual differences would be important. For example, we may not be able to adequately capture alpha activity in all subjects, due to SNR or to actual differences in oscillatory activity. Here the emphasis is on alpha as an example of a typically clear feature, but this may apply to other features or frequency ranges.

-   Explore consistency of IAF (before and after task). This can give an impression of robustness of features like individual alpha peak and by extension of using power averaged by the classical frequency bands.

-   Explore correlations between resting state power and task performance


# Appendix

## Appendix 1 - Additional methods information

### Participant information {#sec-Supplement1}

This file was downloaded from Redcap. Any potentially identifying information, like date of birth, was masked in the file by Redcap.


In [None]:
import os
import pandas as pd
from itables import show
# Read the participant information table
thisScriptDir = os.getcwd()
baseDir = thisScriptDir[:thisScriptDir.find("Scripts")]
fileinput = os.path.join(baseDir + 'Data','SiN','sourcedata','Participant_info','SINEEG_MASKED_DATA_LABELS_2023-10-06_1101.csv')

# Find file read data set 
df = pd.read_csv(fileinput)
df = df[df['Study subject ID'].str.startswith('s')].reset_index(drop=True) # filters out the 2 pilot subjects (starting with p) which are not usable

# render table 
show(df, search = True, 
     scrollY='400px', paging=False,  
     columnDefs=[{"width": "100px", "targets": "_all"}],
     scrollX=True,
     style="width:1200px", autoWidth=False)

### Manual preprocessing: bad channel selection {#sec-Supplement2}

The raters Sybille and Sam checked the data and marked bad channels. This was compared with the automatic bad channel selection from Automagic (Note there was some preprocessing before manual inspection that was not done exactly the same as in automagic, see the corresponding method section).


In [None]:
# Reading quality assessment tab
thisScriptDir = os.getcwd()
baseDir = thisScriptDir[:thisScriptDir.find("Scripts")]
fileinput = os.path.join(baseDir + 'Data','SiN','derivatives_SM','QualityAssessment.xlsx')
# Find file read data set 
df = pd.read_excel(fileinput, 'QA')
# render table 
show(df, search = True, 
     scrollY='400px', paging=False,  
     columnDefs=[{"width": "100px", "targets": "_all"}],
     scrollX=True,
     style="width:1200px", autoWidth=False)

## Appendix 2: Alternative pipeline for a MSc Thesis

The following preprocessing and analysis were performed for the Master Thesis of Sybille Meyer.

### Semiautomatic preprocessing

A semiautomatic pipeline (find scripts [here \>\>](https://github.com/Neuroling/SPINCO_SINEEG/tree/main/Analysis/SiN/EEG/2_preprocessing/Semi_manual); scripts are numbered to indicate run sequence) was set up for the Master thesis of Sybille Meier. For this pipeline we did **not** use Automagic. This semimanual approach is largely (but loosely) based on the [EEGlab tutorials](https://eeglab.org/tutorials/). The data folder for this pipeline was named **derivatives_SM**.

The main steps were:

##### 1. (Script) Initial filters and downsampling for Filter

-   EEGlab raw data sets were loaded; the mastoids and the external Ergo1 channel (with the audio signal) were excluded.
-   A 1 Hz high-pass filter was applied (slightly aggressive, for ICA) together with a 50 Hz notch filter (using pop_cleanline).
-   Data were re-referenced to the average of the 64 scalp channels
-   Data were downsampled to 128 Hz (the original datasets were large, containing the entire experimental task, \~ 1 h of recording)
-   Files were saved before and after downsampling

##### 2. (Manual step) Visual inspection for bad channel identification and rejection

Researchers were instructed to search for channels that were persistently unstable, noisy or flat. For further information, the [eeglab Channel_rejection tutorial](https://eeglab.org/tutorials/06_RejectArtifacts/Channel_rejection.html) was recommended. Two raters performed this inspection and the bad channels identified by them were compared with those selected by Automagic. Researchers were also asked to describe any potential quality issue in the data. The channels that had been selected by both raters and by Automagic were selected for rejection. The table with this information can be found in @sec-Supplement2.
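The selection rule (reject only channels flagged by both raters and by Automagic) is a set intersection. A minimal sketch with made-up channel labels follows; the helper name and the example selections are hypothetical and do not reflect the actual ratings.

```python
def consensus_bad_channels(*selections):
    """Channels present in every selection (hypothetical helper)."""
    sets = [set(selection) for selection in selections]
    if not sets:
        return []
    return sorted(set.intersection(*sets))

# Made-up selections for illustration only
rater_1 = ["T7", "Fp1", "P9"]
rater_2 = ["T7", "P9", "Iz"]
automagic = ["P9", "T7", "AF8"]

print(consensus_bad_channels(rater_1, rater_2, automagic))
```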

##### 3. (Script) Exclude bad channels and run Independent Component Analysis-ICA

Channels flagged as bad were excluded and the ICA decomposition was run.

##### 4. (Manual step) Visual inspection of Independent Components

The Master student was instructed to visually inspect the ICs and mark bad components using the information from the automatic labeling provided by EEGlab. The suggestion was to flag for rejection only the components:

-   Labeled with a probability \> 90 % as EYE artifacts

-   Labeled with a \> 90 % probability of being MUSCLE

These were rather arbitrary thresholds proposed to reject only those components that were very clearly artifactual and to avoid rejecting too many mixed components. More detailed guidelines on criteria for rejecting ICs are provided in the official [eeglab documentation](https://eeglab.org/tutorials/06_RejectArtifacts/RunICA.html). However, within the time constraints of a Master thesis we decided to apply just these two basic criteria. A [short screen recording](https://clipchamp.com/watch/s31smXGqWwJ) performing this action was provided.
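The two rejection criteria can be expressed as a simple threshold rule over per-component class probabilities. The sketch below assumes the probabilities are available as one dictionary per component; the data layout, class names and values are hypothetical, and in the thesis pipeline this step was performed manually in the EEGlab GUI.

```python
THRESHOLD = 0.90
ARTIFACT_CLASSES = ("eye", "muscle")

def flag_components(class_probs, threshold=THRESHOLD, classes=ARTIFACT_CLASSES):
    """Indices of components whose probability for any artifact class
    of interest exceeds the threshold."""
    flagged = []
    for idx, probs in enumerate(class_probs):
        if any(probs.get(c, 0.0) > threshold for c in classes):
            flagged.append(idx)
    return flagged

# Made-up probabilities for four components
example = [
    {"brain": 0.95, "eye": 0.01, "muscle": 0.02},
    {"brain": 0.03, "eye": 0.96, "muscle": 0.00},
    {"brain": 0.40, "eye": 0.10, "muscle": 0.45},
    {"brain": 0.02, "eye": 0.01, "muscle": 0.93},
]

print(flag_components(example))
```

Mixed components, such as the third one in the example, stay in the data under this rule, which is the conservative behavior the thresholds were meant to achieve.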

::: {.callout-warning collapse="false"}
### Potential issues with this pipeline

A potential issue is computing the average reference before ICA, although this doesn't seem to be a major issue [as discussed here](https://www.fieldtriptoolbox.org/faq/should_I_rereference_prior_to_or_after_ica_for_artifact_removal/). Another issue for further analysis is the 1 Hz high-pass filter, which is rather 'heavy'. This may benefit ICA, as it doesn't deal well with slow drifts, but may be problematic for, e.g., ERP analysis. An alternative pipeline could run ICA on data filtered like this and then, after rejecting the bad components, apply the IC matrix to the unfiltered data or to data filtered less strongly.
:::

### Analysis

::: {.callout-warning collapse="false"}
# ⏳ This is work in progress
:::

#### Behavioral data

#### EEG data

# References