# Augmentation analysis

The goal is to align the augmentations in the GitHub repo with those described in the Soltanieh et al. paper:
1.	Gaussian noise
2.	Channel scaling
3.	Negation
4.	Baseline wander
5.	EMG noise
6.	Masking
7.	Time warping
8.	Combination of augmentations


Remarks
- Classes prefixed with 'T' are the Torch implementations. They should be the default.

### 1. Gaussian Noise - Changes 
The noise should be scaled by the standard deviation (std = scale), with the variances passed as argument:
$$\sigma_{G} = [0.01, 0.1, 1]$$

Changes:
- The class now accepts a variance and scales the noise by the std (square root of the variance).

In [None]:
import math

import torch


class TGaussianNoise(Transformation):
    """Add Gaussian noise with variance `var` to the sample.
    """

    def __init__(self, var=0.01):
        super(TGaussianNoise, self).__init__(var=var)
        self.var = var

    def __call__(self, sample):
        if self.var == 0:
            return sample
        else:
            # scale unit-variance noise by the standard deviation
            std = math.sqrt(self.var)
            data, label = sample
            # randn_like matches the shape, dtype and device of data
            data = data + std * torch.randn_like(data)
            return data, label
        
    def __str__(self):
        return "GaussianNoise"
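
A quick way to sanity-check the variance-to-std conversion is to add noise to an all-zero signal and compare the empirical variance with the requested one. The sketch below is standalone: `add_gaussian_noise` is a hypothetical helper that mirrors `TGaussianNoise.__call__` without the repo's `Transformation` base class, and the (1000, 12) shape is only an assumed example.

```python
import math
import torch


def add_gaussian_noise(data: torch.Tensor, var: float = 0.01) -> torch.Tensor:
    """Return a noisy copy of `data`; `var` is the noise variance (hypothetical helper)."""
    if var == 0:
        return data
    std = math.sqrt(var)
    # randn_like draws unit-variance noise with the same shape, dtype and device
    return data + std * torch.randn_like(data)


torch.manual_seed(0)
signal = torch.zeros(1000, 12)  # assumed (timesteps, channels) layout
for var in (0.01, 0.1, 1.0):
    noisy = add_gaussian_noise(signal, var)
    # the empirical variance should land close to the requested variance
    print(var, noisy.var().item())
```

If the printed values track `var` rather than `var ** 2` or `sqrt(var)`, the scaling is wired up as intended.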

### 2.	Channel scaling - Correct

The paper's scaling factor corresponds to `magnitude_range`. Values to use:

$$S_{ch} = [(0.33, 3), (0.33, 1), (0.5, 2)]$$

In [None]:
import numpy as np


class ChannelResize(Transformation):
    """Scale amplitude of sample (per channel) by random factor in given magnitude range"""

    def __init__(self, magnitude_range=(0.5, 2)):
        super(ChannelResize, self).__init__(magnitude_range=magnitude_range)
        self.log_magnitude_range = np.log(magnitude_range)

    def __call__(self, sample):
        data, label = sample
        timesteps, channels = data.shape
        resize_factors = np.exp(np.random.uniform(
            *self.log_magnitude_range, size=channels))
        resize_factors_same_shape = np.tile(
            resize_factors, timesteps).reshape(data.shape)
        data = np.multiply(resize_factors_same_shape, data)
        return data, label

    def __str__(self):
        return "ChannelResize"

class TChannelResize(Transformation):
    """Scale amplitude of sample (per channel) by random factor in given magnitude range"""
    
    def __init__(self, magnitude_range=(0.33, 3)):
        super(TChannelResize, self).__init__(magnitude_range=magnitude_range)
        self.log_magnitude_range = torch.log(torch.tensor(magnitude_range))
        
        
    def __call__(self, sample):
        data, label = sample
        timesteps, channels = data.shape
        resize_factors = torch.exp(torch.empty(channels).uniform_(*self.log_magnitude_range))
        resize_factors_same_shape = resize_factors.repeat(timesteps).reshape(data.shape)
        data = resize_factors_same_shape * data
        return data, label
    
    def __str__(self):
        return "ChannelResize"
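
The same per-channel scaling can be written with broadcasting instead of `repeat` and `reshape`. The sketch below is a standalone illustration, not the repo's class: `channel_resize` is a hypothetical function name, and the (timesteps, channels) layout is assumed from the code above. It samples the factors uniformly in log space, as both implementations do.

```python
import math
import torch


def channel_resize(data: torch.Tensor, magnitude_range=(0.33, 3.0)) -> torch.Tensor:
    """Scale each channel of a (timesteps, channels) tensor by its own random factor."""
    log_low = math.log(magnitude_range[0])
    log_high = math.log(magnitude_range[1])
    channels = data.shape[1]
    # uniform in log space, so a factor of 1/2 is as likely as a factor of 2
    factors = torch.exp(torch.empty(channels).uniform_(log_low, log_high))
    # broadcasting applies factors[c] to every timestep of channel c
    return data * factors


torch.manual_seed(0)
x = torch.ones(1000, 12)
for rng in [(0.33, 3.0), (0.33, 1.0), (0.5, 2.0)]:
    y = channel_resize(x, rng)
    # with an all-ones input, each row of y is just the sampled factors
    print(rng, y[0].min().item(), y[0].max().item())
```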

### 3.	Negation - Correct

Vertical flip of the signal about the time axis (multiplication by -1).

In [None]:
class Negation(Transformation):
    """Flip signal vertically (multiply by -1)"""

    def __init__(self):
        super(Negation, self).__init__()

    def __call__(self, sample):
        data, label = sample
        return -1*data, label

    def __str__(self):
        return "Negation"

class TNegation(Transformation):
    """Flip signal vertically (multiply by -1)"""
    
    def __init__(self):
        super(TNegation, self).__init__()
    
    
    def __call__(self, sample):
        data, label = sample 
        return -1*data, label
    
    def __str__(self):
        return "Negation"

### 4.	Baseline wander - NOT suited
Add a low-frequency sinusoid with frequency $f_w$ to the signal. The paper specifies frequency $f_w=100$ and scales $$S_{bw}=[0.1, 0.7, 1]$$

**NOTE:** A frequency of 100 is certainly incorrect; use the standard value of 0.5 Hz. Soltanieh et al. appear to have confused it with the sampling frequency of 100 Hz.

Required changes:
- The implementation draws a scale uniformly between 0 and Cmax; the scale should instead be set to the $S_{bw}$ value directly.
    - Remove ```C = random.uniform(0,self.params["Cmax"])``` and take a C argument directly instead of Cmax.
- Call with fdelta=False so that $\text{fdelta} = f_s / N$.
- Keep $f_c$ at 0.5 Hz.

In [None]:
class TBaselineWander(Transformation):
    """Adds baseline wander to the sample.
    """

    def __init__(self, fs=100, C=0.1, fc=0.5, fdelta=0.01,independent_channels=False):
        super(TBaselineWander, self).__init__(fs=fs, C=C, fc=fc, fdelta=fdelta, independent_channels=independent_channels)
        self.C = C
        self.fs = fs
        self.fc = fc
        self.fdelta = fdelta
        self.independent_channels = independent_channels
        
    def __call__(self, sample):
        data, label = sample
        timesteps, channels = data.shape
        noise = Tnoise_baseline_wander(fs=self.fs, N=len(data), C=self.C, fc=self.fc, fdelta=self.fdelta, channels=channels, independent_channels=self.independent_channels)
        # out-of-place add so the caller's tensor is not modified
        data = data + noise.repeat(channels).reshape(channels, timesteps).T
        return data, label

    def __str__(self):
        return "BaselineWander"
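
As a rough illustration of the fixed-scale behaviour asked for in the required changes, the sketch below adds a single 0.5 Hz sinusoid whose amplitude is exactly `C`. It is not the repo's `Tnoise_baseline_wander`, whose internals are not shown in this notebook; the function names, the single-sinusoid model and the random phase are all assumptions made for the example.

```python
import math
import torch


def sinusoidal_baseline_wander(n: int, fs: float = 100.0, C: float = 0.1,
                               fc: float = 0.5) -> torch.Tensor:
    """One sinusoid of amplitude C and frequency fc (Hz) with a random phase."""
    t = torch.arange(n) / fs                  # time axis in seconds
    phase = 2 * math.pi * torch.rand(1)       # random starting phase
    return C * torch.sin(2 * math.pi * fc * t + phase)


def add_baseline_wander(data: torch.Tensor, fs: float = 100.0, C: float = 0.1,
                        fc: float = 0.5) -> torch.Tensor:
    """Add the same wander to every channel of a (timesteps, channels) tensor."""
    timesteps, _ = data.shape
    noise = sinusoidal_baseline_wander(timesteps, fs=fs, C=C, fc=fc)
    # unsqueeze gives shape (timesteps, 1), which broadcasts over the channels
    return data + noise.unsqueeze(1)


torch.manual_seed(0)
x = torch.zeros(1000, 12)
for C in (0.1, 0.7, 1.0):
    y = add_baseline_wander(x, C=C)
    print(C, y.abs().max().item())            # peak amplitude stays at or below C
```

Because `C` is passed in directly, each of the three $S_{bw}$ values produces a wander of that exact amplitude instead of a random amplitude below it.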

### 5.	EMG - Changes
The original implementation accepts Cmax and K.

EMG noise is modelled here as Gaussian noise and should be tested with stds $$\sigma_{EMG} = [0.1, 0.7, 1]$$

Changes:
- Remove the uniform sampling over the range up to Cmax and pass C directly.

In [None]:
def Tnoise_electromyographic(N=1000, std=1, channels=1):
    '''electromyographic (hf) noise inspired by https://ieeexplore.ieee.org/document/43620
    N: length of the signal (timesteps)
    std: standard deviation of the noise (default: 1)
    channels: number of output channels
    '''
    # draw independent zero-mean Gaussian noise for every timestep and channel
    signal = torch.empty((N, channels)).normal_(0.0, std)

    return signal

import math


class TEMNoise(Transformation):
    """Adds electromyographic hf noise to the sample.
    """

    def __init__(self, var=0.1, K=3):
        # NOTE: K is accepted but currently unused
        super(TEMNoise, self).__init__(var=var)
        self.var = var

    def __call__(self, sample):
        data, label = sample 
        std = math.sqrt(self.var)
        data = data + Tnoise_electromyographic(N=len(data), std=std, channels=len(data[0]))
        return data, label

    def __str__(self):
        return "EMNoise"

### 6.	Masking 
- Choose a crop ratio c from the range [a, b].
- Set a contiguous segment covering c% of the signal's timesteps to 0 at a random location.

The implementation aligns with this description:
- It accepts crop_ratio_range, which is [a, b].

Values to use:
[10%, 20%], [0%, 50%], [40%, 50%].

In [None]:
import random

class TTimeOut(Transformation):
    """ Replace random crop by zeros
    """

    def __init__(self, crop_ratio_range=(0.0, 0.5)):   # tuple avoids a shared mutable default
        super(TTimeOut, self).__init__(crop_ratio_range=crop_ratio_range)
        self.crop_ratio_range = crop_ratio_range

    def __call__(self, sample):
        data, label = sample
        data = data.clone()     # clone data to avoid in-place modification
        timesteps, channels = data.shape
        crop_ratio = random.uniform(*self.crop_ratio_range)     # random ratio from [a, b]
        crop_timesteps = int(crop_ratio * timesteps)    # number of timesteps to crop
        start_idx = random.randint(0, timesteps - crop_timesteps)   # random start index (randint bounds are inclusive)
        data[start_idx:start_idx + crop_timesteps, :] = 0   # set that segment to zero
        return data, label

    def __str__(self):
        return "TimeOut"
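
To confirm that the masked fraction falls inside the requested range, the zeroed share of an all-ones signal can be measured directly. The sketch below is standalone: `mask_segment` is a hypothetical function that follows the same steps as `TTimeOut.__call__`, and the (1000, 12) input is an assumed example shape.

```python
import random
import torch


def mask_segment(data: torch.Tensor, crop_ratio_range=(0.0, 0.5)) -> torch.Tensor:
    """Zero one contiguous segment of a (timesteps, channels) tensor."""
    data = data.clone()                                    # keep the input untouched
    timesteps, _ = data.shape
    crop_ratio = random.uniform(*crop_ratio_range)         # ratio drawn from [a, b]
    crop_timesteps = int(crop_ratio * timesteps)
    # randint is inclusive on both ends, so the segment may reach the last sample
    start = random.randint(0, timesteps - crop_timesteps)
    data[start:start + crop_timesteps, :] = 0
    return data


random.seed(0)
x = torch.ones(1000, 12)
for rng in [(0.1, 0.2), (0.0, 0.5), (0.4, 0.5)]:
    masked = mask_segment(x, rng)
    zero_fraction = (masked[:, 0] == 0).float().mean().item()
    print(rng, round(zero_fraction, 3))                    # should lie within rng
```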

### 7.	Time warping

We start this augmentation by dividing $x(t)$ into $w$ segments, denoted as $x_1(t), \dots, x_w(t)$. Next, we randomly select half of the segments and apply time warping to stretch them by a scaling factor of $r\%$ while simultaneously squeezing the other half by the same amount. Finally, we concatenate the segments in the original order to produce the augmented signal, denoted as $\tilde{x}(t)$. We test the following list of parameters for this augmentation: $$(w, r) = [(1, 10), (3, 5), (3, 10)]$$

In [None]:
class TDynamicTimeWarp(Transformation):
    """Stretch and squeeze signal randomly along time axis"""
    
    def __init__(self, warps=3, radius=10, step=2):
        super(TDynamicTimeWarp, self).__init__(
            warps=warps, radius=radius, step=step)
        self.warps=warps
        self.radius = radius
        self.step = step
        self.min_center = self.radius*(self.step+1)
    
    
    def __call__(self, sample):
        data, label = sample 
        timesteps, channels = data.shape
        for _ in range(self.warps):
            center = random.randint(self.min_center, timesteps-self.min_center-self.step-1)
            data = Tsqueeze(data, center, self.radius, self.step)
            data = Trefill(data, center, self.radius, self.step)
            data = Tinterpolate(data, float("inf"))
        return data, label
    
    def __str__(self):
        return "DynamicTimeWarp"

In [None]:

class DynamicTimeWarp(Transformation):
    """Stretch and squeeze signal randomly along time axis as described in the instructions."""
    
    def __init__(self, w=3, r=10):
        """
        Parameters:
        - w: Number of segments to divide the signal into. 
        - r: Percentage to stretch (positive) or squeeze (negative) the segments.
        """
        super(DynamicTimeWarp, self).__init__(w=w, r=r)
        # store the parameters explicitly; __call__ reads self.w and self.r
        self.w = w
        self.r = r

    def __call__(self, sample):
        data, label = sample
        timesteps, _ = data.shape

        # divide the signal into w segments; the last segment absorbs any remainder
        seg_len = timesteps // self.w
        bounds = [i * seg_len for i in range(self.w)] + [timesteps]
        segments = [data[bounds[i]:bounds[i + 1]] for i in range(self.w)]

        # warp even segments with +r and odd segments with -r
        # (a fixed alternation, not the random half described in the paper)
        for i in range(self.w):
            segments[i] = self.time_warp(segments[i], self.r if i % 2 == 0 else -self.r)

        # Concatenate the segments back in the original order to produce the augmented signal
        return torch.cat(segments, dim=0), label

    def time_warp(self, segment, r):
        timesteps = segment.shape[0]

        # step between consecutive read positions: 1 + r/100 (r may be negative)
        factor = torch.ones(timesteps) * (1 + r / 100)

        # read positions are the cumulative sum of the steps, truncated to integers;
        # steps above 1 skip samples, steps below 1 repeat samples
        warped_time = torch.cumsum(factor, dim=0).long()

        # clamp so the read positions stay inside the segment; the segment length is unchanged
        return segment[torch.clamp(warped_time, 0, timesteps - 1)]

    def __str__(self):
        return "DynamicTimeWarp"
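
For comparison, the paper's description (random half stretched, other half squeezed, then concatenated) can be sketched with linear interpolation so that stretched segments really get longer and squeezed ones shorter. Everything below is an assumption made for illustration: the function name `segment_time_warp`, rounding "half" down to `w // 2`, and the final crop or zero-padding back to the original length are not specified in the text above.

```python
import random
import torch
import torch.nn.functional as F


def segment_time_warp(data: torch.Tensor, w: int = 3, r: float = 10) -> torch.Tensor:
    """Stretch a random half of w segments by r% and squeeze the rest by r%."""
    timesteps, channels = data.shape
    segments = torch.tensor_split(data, w, dim=0)          # keeps every timestep
    stretch_ids = set(random.sample(range(w), w // 2))     # assumption: half rounded down
    warped = []
    for i, seg in enumerate(segments):
        scale = 1 + r / 100 if i in stretch_ids else 1 - r / 100
        new_len = max(1, round(seg.shape[0] * scale))
        # F.interpolate expects (batch, channels, length)
        seg = F.interpolate(seg.T.contiguous().unsqueeze(0), size=new_len,
                            mode="linear", align_corners=True)
        warped.append(seg.squeeze(0).T)
    out = torch.cat(warped, dim=0)
    # assumption: crop or zero-pad at the end so the output length matches the input
    if out.shape[0] >= timesteps:
        return out[:timesteps]
    pad = torch.zeros(timesteps - out.shape[0], channels, dtype=out.dtype)
    return torch.cat([out, pad], dim=0)


random.seed(0)
x = torch.randn(1000, 12)
for w, r in [(1, 10), (3, 5), (3, 10)]:
    print((w, r), tuple(segment_time_warp(x, w, r).shape))
```

How the length mismatch is resolved (crop, pad, or resample) changes the result noticeably for $w = 1$, so that choice should be checked against the paper's code before relying on it.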


### 8.	Combination of augmentations

To further analyze the impact of augmentations on our pre-training process, we conduct additional experiments where we apply a combination of four augmentations simultaneously. This allows us to observe how different augmentations interact with each other and how this affects the performance of the models. In each iteration of pre-training, we randomly select four augmentations from our previously described list of augmentations. We select the parameters through experimentation with the goal of maximizing performance. The parameters for the augmentations when combined are as follows: Gaussian noise ($\sigma_G = 1$), channel scaling ($S_{ch} = (0.33, 3)$), baseline wander ($S_{bw} = 1$), EMG noise ($\sigma_{EMG} = 0.01$), masking ([40%, 50%]), and time warping ($(w, r) = (1, 10)$).

Find out how they are applied normally and how to apply a combination. 
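
One possible way to apply such a combination is a wrapper that draws four distinct augmentations on every call and applies them in sequence. This is only a sketch: `RandomCombination` and `make_stub` are hypothetical names, the stubs merely record which augmentation would have run, and applying the four in the sampled order is an assumption, since the paper excerpt does not state an order.

```python
import random
import torch

applied = []  # log of which stand-in augmentations ran, for illustration only


def make_stub(name):
    """Hypothetical stand-in for a real augmentation; it only records its name."""
    def stub(sample):
        applied.append(name)
        return sample
    return stub


class RandomCombination:
    """Apply k augmentations drawn at random (without replacement) on every call."""

    def __init__(self, transforms, k=4):
        self.transforms = list(transforms)
        self.k = k

    def __call__(self, sample):
        # random.sample returns k distinct items in random order
        for transform in random.sample(self.transforms, self.k):
            sample = transform(sample)
        return sample


names = ["gaussian_noise", "channel_scaling", "baseline_wander",
         "emg_noise", "masking", "time_warping"]
combo = RandomCombination([make_stub(n) for n in names], k=4)

random.seed(0)
data, label = combo((torch.zeros(1000, 12), 0))
print(applied)  # four distinct names; which ones depends on the seed
```

In practice the stubs would be replaced by the 'T' classes above, each constructed with the combined-setting parameters quoted from the paper.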

Also, how are the different parameter values tested? Does each value need its own run? Hopefully not.
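
To get a feel for the cost, the parameter lists collected in this notebook can be enumerated. The sketch assumes one pre-training run per parameter setting, without repeated seeds; whether the paper actually proceeds this way is exactly the open question above. The dictionary keys are hypothetical labels.

```python
# parameter values as listed in the sections above
grid = {
    "gaussian_noise": [0.01, 0.1, 1],
    "channel_scaling": [(0.33, 3), (0.33, 1), (0.5, 2)],
    "negation": [None],                      # no parameter to vary
    "baseline_wander": [0.1, 0.7, 1],
    "emg_noise": [0.1, 0.7, 1],
    "masking": [(0.1, 0.2), (0.0, 0.5), (0.4, 0.5)],
    "time_warping": [(1, 10), (3, 5), (3, 10)],
}

# one entry per (augmentation, parameter) pair
runs = [(name, p) for name, params in grid.items() for p in params]
print(len(runs))       # 19 single-augmentation settings
print(len(runs) + 1)   # 20 when the combined setting is counted as one more run
```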
