## Background
Most of the simulation/plotting work so far has relied on `mne` in some way. Because this is meant to be a generalizable `element workflow`, I want to strip out MNE as much as possible; the MNE-specific bits can go into a contextualized `community workflow` (mne-eeg-viewer).

## Tasks: 
1. generate data
    - create an eeg data generator that returns eeg data as a numpy array (channels, times)
        - Right now, it's not essential to simulate EEG data that looks convincingly real. The priority is to generate EEG-like time series that embody the specifications that could influence development and visualization. For this reason, variable-frequency sine waves with some level of noise might suffice, as long as they are produced with relevant properties such as sampling rate, dtype, and signal scale.
        - :point_right: update - [neurodsp](https://github.com/neurodsp-tools/neurodsp) seems to have a simple power-law EEG data simulator. I may try it.
    - also generate a times array and channel names
    - maybe: generate events - condition, epoch arrays
    - maybe maybe: generate - locations
    - next: other data types (meg, eog)
    - don't create anything on-disk (yet)... the most generalizable starting point is an in-memory numpy array and some info objects.
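
The sine-wave idea from the first task could look something like the sketch below. This is only an illustration, not the generator the notebook ends up using: the function name `generate_sine_eeg`, the 1-40 Hz frequency range, and the default amplitude and noise levels are all assumptions chosen for the example.

```python
import numpy as np

def generate_sine_eeg(n_channels, n_seconds, fs, amplitude=50.0,
                      noise_sd=5.0, dtype=np.float64, seed=0):
    """Hypothetical helper: one noisy sine wave per channel, in microvolts."""
    rng = np.random.default_rng(seed)
    n_samples = int(n_seconds * fs)
    time = np.arange(n_samples) / fs

    # One random frequency (assumed 1-40 Hz) and phase per channel
    freqs = rng.uniform(1.0, 40.0, size=n_channels)
    phases = rng.uniform(0.0, 2 * np.pi, size=n_channels)

    # Broadcast (n_channels, 1) against (n_samples,) -> (n_channels, n_samples)
    data = amplitude * np.sin(2 * np.pi * freqs[:, None] * time + phases[:, None])
    data += rng.normal(0.0, noise_sd, size=data.shape)

    # Simple in-memory "info": channel names only
    ch_names = [f"EEG {i + 1}" for i in range(n_channels)]
    return data.astype(dtype), time, ch_names

data, time, ch_names = generate_sine_eeg(n_channels=4, n_seconds=2, fs=250,
                                         dtype=np.float32)
print(data.shape, data.dtype, time.shape, ch_names)
```

Everything stays in memory as a `(channels, times)` array plus a times array and a list of channel names, which matches the starting point described in the task list.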


## Generate pink noise EEG data

The term "pink noise" refers to a noise signal whose power spectral density (PSD) follows a power law, PSD ∝ 1/f, which is a slope of approximately -1 on log-log axes. Pink noise carries equal energy per octave (per constant ratio of frequencies), unlike white noise, which carries equal energy per fixed-width frequency interval. This 1/f-like spectrum roughly captures the statistical, fractal-like (scale-free) properties observed in the background activity of the brain.
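
One common way to obtain such a spectrum is to shape white noise in the frequency domain. The sketch below is a swapped-in illustration of that technique, not the approach used in the cells that follow; the helper name `pink_noise_fft` and its parameters are assumptions made for the example.

```python
import numpy as np

def pink_noise_fft(n_channels, n_samples, fs, exponent=-1.0, seed=0):
    """Hypothetical helper: shape white noise so that PSD ~ f**exponent."""
    rng = np.random.default_rng(seed)
    white = rng.normal(0.0, 1.0, size=(n_channels, n_samples))

    spectrum = np.fft.rfft(white, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)

    # Power ~ f**exponent means amplitude ~ f**(exponent / 2)
    scale = np.zeros_like(freqs)          # DC bin stays at 0 (removes the mean)
    scale[1:] = freqs[1:] ** (exponent / 2.0)

    shaped = np.fft.irfft(spectrum * scale, n=n_samples, axis=1)

    # Normalize each channel to unit variance
    return shaped / shaped.std(axis=1, keepdims=True)

fs = 500
pink = pink_noise_fft(n_channels=4, n_samples=5000, fs=fs)

# Rough check: fit a line to the channel-averaged spectrum on log-log axes
freqs = np.fft.rfftfreq(pink.shape[1], d=1.0 / fs)
mean_power = (np.abs(np.fft.rfft(pink, axis=1)) ** 2).mean(axis=0)
slope = np.polyfit(np.log10(freqs[1:]), np.log10(mean_power[1:]), 1)[0]
print(f"fitted log-log slope: {slope:.2f}")
```

The fitted slope should land close to the requested exponent, which gives a quick sanity check on whether a generated signal is actually pink.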

In [None]:
import numpy as np
from scipy import signal

def generate_eeg_pinknoise(channels: int, duration: float, sampling_rate: int,
                           highpass: float = 1.0) -> tuple[np.ndarray, np.ndarray]:
    """
    Generate synthetic EEG data with pink noise characteristics.

    Args:
        channels (int): Number of EEG channels.
        duration (float): Duration of the EEG data in seconds.
        sampling_rate (int): Sampling rate of the EEG data in Hz.
        highpass (float, optional): High-pass cutoff frequency in Hz. Frequencies lower than
            this value will be attenuated. Defaults to 1.0. Must be greater than 0.

    Returns:
        tuple[np.ndarray, np.ndarray]: Synthetic EEG data as a NumPy array of shape (channels, total_samples),
                                       and time array as a NumPy array of shape (total_samples,).

    """
    total_samples = int(duration * sampling_rate)
    time = np.arange(total_samples) / sampling_rate

    # Generate white noise
    white_noise = np.random.normal(0, 1, (channels, total_samples))

    # High-pass filter the white noise to remove slow drift.
    # NOTE: a Butterworth high-pass does not impose a 1/f spectrum; above the
    # cutoff the result is still spectrally flat (white), not pink.
    b, a = signal.butter(1, highpass / (sampling_rate / 2), btype='highpass')
    pink_noise = signal.filtfilt(b, a, white_noise, axis=1)

    # Scale the noise by the desired amplitude
    amplitude = 100  # EEG signals are typically within plus or minus 100 microvolts
    scaled_noise = pink_noise * amplitude

    # Check dimensions of the generated data
    assert scaled_noise.shape == (channels, total_samples), "Incorrect dimensions for scaled_noise array"

    # Return the scaled data (the unscaled array was returned here by mistake)
    return scaled_noise, time


In [None]:
channels = 50
duration = 10  # seconds
sampling_rate = 500  # Hz
highpass = 1  # Hz (default)

data, time = generate_eeg_pinknoise(channels, duration, sampling_rate, highpass)

In [None]:
data.shape

## Visualize EEG

In [None]:
import holoviews as hv
hv.extension('bokeh')

def view_eeg(data, time, ch_names=None, spacing=2):

    n_channels = data.shape[0]
    
    if ch_names is None:
        # Create a channel names list
        ch_names = [f'EEG {i+1}' for i in range(n_channels)]

    # Calculate the offset between channels to avoid visual overlap
    offset = np.max(np.abs(data)) * spacing

    # Create a hv.Curve element per chan
    channel_curves = {}
    for i, channel_data in enumerate(data):
        channel_curves[ch_names[i]] = hv.Curve((time, channel_data + (i * offset)), 'Time').opts(color='black', line_width=1, tools=['hover'])

    # Create mapping from yaxis location to ytick for each channel
    yticks = [(i * offset, channel_name) for i, channel_name in enumerate(ch_names)]

    # Create hv overlay of curves
    eeg_viewer = hv.NdOverlay(channel_curves, kdims='Channel').opts(
        width=600, height=600, padding=.01, xlabel='Time', ylabel='Channel', yticks=yticks, show_legend=False)

    return eeg_viewer

In [None]:
view_eeg(data, time)

## That still does not look great. Try brown noise:

In [None]:
import numpy as np
from scipy import signal

def generate_eeg_brownnoise(channels: int, duration: float, sampling_rate: int,
                            highpass: float = 2.0, amplitude: float = 100.0) -> tuple[np.ndarray, np.ndarray]:
    """
    Generate synthetic EEG data with brown noise characteristics.

    Args:
        channels (int): Number of EEG channels.
        duration (float): Duration of the EEG data in seconds.
        sampling_rate (int): Sampling rate of the EEG data in Hz.
        highpass (float, optional): High-pass cutoff frequency in Hz. Frequencies lower than
            this value will be attenuated. Must be greater than 0. Defaults to 2.0.
        amplitude (float, optional): Amplitude scaling factor for the generated EEG data.
            Defaults to 100.0 microvolts.

    Returns:
        tuple[np.ndarray, np.ndarray]: Synthetic EEG data as a NumPy array of shape (channels, total_samples),
                                       and time array as a NumPy array of shape (total_samples,).

    """
    desired_samples = int(duration * sampling_rate)
    total_samples = int(desired_samples * 1.2)  # Generate a longer time series

    time = np.arange(total_samples) / sampling_rate

    # Generate white noise
    white_noise = np.random.normal(0, 1, (channels, total_samples))

    # High-pass filter the white noise so the integrated signal does not drift without bound
    b, a = signal.butter(2, highpass / (sampling_rate / 2), btype='highpass')
    filtered_noise = signal.filtfilt(b, a, white_noise, axis=1)

    # Integrate (cumulative sum) the filtered white noise; integration adds a
    # 1/f^2 roll-off to the power spectrum, which is the brown noise characteristic
    integrated_noise = np.cumsum(filtered_noise, axis=1)

    # Extract the desired duration from the center of the time series
    start_idx = int((total_samples - desired_samples) / 2)
    end_idx = start_idx + desired_samples
    extracted_noise = integrated_noise[:, start_idx:end_idx]

    # Scale the extracted noise by the desired amplitude
    scaled_noise = extracted_noise * amplitude

    # Adjust the time vector to match the dimensions of the scaled noise
    time = time[start_idx:end_idx]

    # Check dimensions of the generated data
    assert scaled_noise.shape == (channels, desired_samples), "Incorrect dimensions for scaled_noise array"
    assert time.shape == (desired_samples,), "Incorrect dimensions for time array"

    return scaled_noise, time


In [None]:
channels = 50
duration = 10  # seconds
sampling_rate = 500  # Hz
highpass = 2  # Hz (default)

data, time = generate_eeg_brownnoise(channels, duration, sampling_rate, highpass)

In [None]:
view_eeg(data, time)

## That still looks pretty bad. Try neurodsp instead

In [None]:
import numpy as np
from neurodsp.sim import sim_powerlaw

def generate_eeg_brownnoise(n_channels: int, n_seconds: float, fs: int,
                            highpass: float = 2.0, amplitude: float = 50.0) -> tuple[np.ndarray, np.ndarray]:
    """
    Generate synthetic EEG data with brown noise characteristics.

    Args:
        n_channels (int): Number of EEG channels.
        n_seconds (float): Duration of the EEG data in seconds.
        fs (int): Sampling rate of the EEG data in Hz.
        highpass (float, optional): High-pass cutoff frequency in Hz. Frequencies lower than
            this value will be attenuated. Must be greater than 0. Defaults to 2.0.
        amplitude (float, optional): Amplitude scaling factor for the generated EEG data.
            Defaults to 50.0 microvolts.

    Returns:
        tuple[np.ndarray, np.ndarray]: Synthetic EEG data as a NumPy array of shape (channels, total_samples),
                                       and time array as a NumPy array of shape (total_samples,).

    """
    # Use the function arguments (fs, n_channels), not notebook-level globals
    total_samples = int(n_seconds * fs)

    # Generate high-passed brown noise for each channel
    scaled_noise = np.empty((n_channels, total_samples))
    for ch in range(n_channels):
        # exponent=-2.0 (the sim_powerlaw default) gives a 1/f^2 (brown) spectrum;
        # f_range=(highpass, None) applies a high-pass filter at `highpass` Hz
        brown_noise = sim_powerlaw(n_seconds, fs, exponent=-2.0, f_range=(highpass, None))
        scaled_noise[ch] = brown_noise * amplitude

    time = np.arange(total_samples) / fs

    # Check dimensions of the generated data
    assert scaled_noise.shape == (n_channels, total_samples), "Incorrect dimensions for scaled_noise array"
    assert time.shape == (total_samples,), "Incorrect dimensions for time array"

    return scaled_noise, time


In [None]:
channels = 50
duration = 10  # seconds
sampling_rate = 500  # Hz
highpass = 2  # Hz (default); defined here so the cell does not rely on an earlier one

data, time = generate_eeg_brownnoise(channels, duration, sampling_rate, highpass)

In [None]:
data.shape

In [None]:
view_eeg(data, time)

## That looks good enough!

In [None]:
import pandas as pd
import hvplot.pandas
df = pd.DataFrame(data.T)
df.hvplot.hist()

In [None]:
import scipy.signal as signal

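# Compute one periodogram per channel (along the time axis), then average across channels.
# Flattening the array would concatenate channels and add artificial discontinuities.
frequencies, psd = signal.periodogram(data, fs=sampling_rate, axis=-1)
psd = psd.mean(axis=0)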

In [None]:
df_psd = pd.DataFrame({'Frequency': frequencies, 'PSD': psd})
df_psd.hvplot.line(x='Frequency', y='PSD', logx=True, logy=True, xlim=[.1,100], ylim=[.001, 10000])

In [None]:
from neurodsp.spectral import compute_spectrum
from neurodsp.plts.spectral import plot_power_spectra

freqs, psd = compute_spectrum(data[0,:], sampling_rate)
plot_power_spectra(freqs, psd)
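
Beyond eyeballing the spectrum, the power-law exponent can be estimated with a straight-line fit on log-log axes. The sketch below is one rough way to do that with `scipy.signal.welch`; the helper name `estimate_spectral_exponent`, the 2-40 Hz fit range, and the 2-second Welch segments are assumptions for illustration. It runs on a stand-in random walk (integrated white noise) rather than the notebook's `data`, so it is self-contained.

```python
import numpy as np
from scipy import signal

def estimate_spectral_exponent(sig, fs, f_range=(2.0, 40.0)):
    """Hypothetical helper: slope of log10(PSD) vs log10(frequency)."""
    # Welch PSD with 2-second segments (0.5 Hz frequency resolution)
    freqs, psd = signal.welch(sig, fs=fs, nperseg=int(2 * fs))

    # Fit only inside the chosen range, above any high-pass cutoff
    mask = (freqs >= f_range[0]) & (freqs <= f_range[1])
    slope, _intercept = np.polyfit(np.log10(freqs[mask]), np.log10(psd[mask]), 1)
    return slope

# Stand-in signal: integrated white noise, which should be roughly brown (1/f^2)
rng = np.random.default_rng(0)
fs = 500
brown = np.cumsum(rng.normal(size=30 * fs))

exponent = estimate_spectral_exponent(brown, fs)
print(f"estimated exponent: {exponent:.2f}")
```

For brown noise the estimate should sit near -2. Applying the same helper to a single channel, e.g. `estimate_spectral_exponent(data[0], sampling_rate)`, would give a number to compare against the exponent requested from the simulator.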

## From script

In [None]:
from neurodatagen.eeg import generate_eeg_brown

n_channels = 50
n_seconds = 10
fs = 500

data, time = generate_eeg_brown(n_channels, n_seconds, fs)

In [None]:
view_eeg(data, time)