**<center><font size=5>sEMG Data Analysis</font></center>**
***<center>Calculation of sEMG Atomic Metrics</center>***
***
**author**: Daniel Tse

**date**: 27th June, 2024

**[GitHub Repository](https://github.com/Xiezhibin/Neurodata-Analysis)**

#### Table of Contents
- <a href='#intro'>1. Project Overview</a> 
- <a href='#pre'>2. Data Preprocessing</a>
- <a href='#sample'>3. sEMG Data Analysis</a> 
- <a href='#s1'>3.1. Time-domain analysis</a>
- <a href='#s2m'>3.2. Frequency-domain analysis</a>
- <a href='#s2nm'>3.3. Fuzzy entropy</a>

# <a id='intro'>1. Project Overview</a>

The primary goal of this analysis is to investigate how neural and muscular activities are related and how these relationships can be used to improve rehabilitation outcomes for stroke survivors. The sEMG data measures the electrical activity produced by skeletal muscles, while NIRS data provides insights into the neural activity by measuring blood flow and oxygenation in the brain.

We aim to determine if there are significant correlations between the neural and muscular activities and to identify which specific muscles and brain regions are involved. This understanding can potentially lead to more effective rehabilitation protocols and improved quality of life for stroke survivors.

# <a id='pre'>2. Data Preprocessing</a>

In this study, raw sEMG signals from BIC and TRI were first filtered with a fourth-order Butterworth bandpass filter (20–450 Hz) to remove motion artifacts, low-frequency drift, and high-frequency noise. A notch filter (adjustable bandwidth, centered at 50 Hz) was then applied to suppress power-line interference and its harmonics from environmental sources. The filtered data were then smoothed and normalized to facilitate comparison. Because the signal is unstable during the acceleration phase at exercise onset, the first 20 seconds were excluded, and a stable 40-second segment was retained for subsequent analysis.
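
The filtering steps described above can be sketched with SciPy as follows. This is a minimal illustration rather than the exact pipeline used in the study: the function name `preprocess_semg`, the notch quality factor `Q = 30`, the zero-phase (forward-backward) filtering, and the synthetic test signal are all assumptions, and the 1000 Hz sampling rate is taken from the `sample_rate` value used later in this notebook.

```python
import numpy as np
from scipy.signal import butter, iirnotch, sosfiltfilt, filtfilt


def preprocess_semg(signal, fs=1000.0, band=(20.0, 450.0), notch_hz=50.0, notch_q=30.0):
    """Band-pass and notch filter a 1-D sEMG signal (illustrative sketch)."""
    # For a band-pass design, butter(N=2) yields a fourth-order filter overall.
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase shift (an assumption here).
    band_passed = sosfiltfilt(sos, signal)
    # Narrow notch at the power-line frequency; Q = 30 is an assumed value.
    b, a = iirnotch(notch_hz, notch_q, fs=fs)
    return filtfilt(b, a, band_passed)


# Synthetic check: 5 Hz drift + 50 Hz hum + 100 Hz "muscle" component
fs = 1000.0
t = np.arange(0, 10, 1 / fs)
raw = np.sin(2 * np.pi * 5 * t) + np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 100 * t)
clean = preprocess_semg(raw, fs=fs)
```

In this sketch the 5 Hz and 50 Hz components are strongly attenuated while the 100 Hz component passes almost unchanged. Harmonics of 50 Hz would need additional notches, which are not shown here.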

In [None]:
import pandas as pd
from io import StringIO
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
import EntropyHub as EH
import numpy as np
import math


# File path
file_path = "xxx.txt"  # Replace with your file path

# Open the file
with open("../" + file_path, "r") as file:
    # Skip the first 10 lines
    for _ in range(10):
        next(file)

    # Read the remaining content
    remaining_content = file.read()

    # Wrap the string in a file-like object with StringIO
    file_like_object = StringIO(remaining_content)

    # Load the whitespace-delimited data into a pandas DataFrame
    df = pd.read_csv(file_like_object, delimiter=r"\s+")

# Use the 'Frame' column as the index
df.set_index('Frame', inplace=True)


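# Column to analyze
column = df.columns[1]


# Positional indices of the rows where the signal is NaN
# (positions rather than 'Frame' labels, so that they match df.iloc below)
nan_indices = np.flatnonzero(df[column].isna()).tolist()

# List that will hold the split DataFrames
# (defined up front so the loops below still run if the check fails)
split_dfs = []

# Check that there are enough NaN values to split the data into five segments
if len(nan_indices) < 6:
    print("Not enough NaN values to split the data into five segments.")
else:
    # Pointer into nan_indices
    start_index = 0

    # Split the data into five segments
    for i in range(5):
        # Stop if there are no NaN runs left
        if start_index >= len(nan_indices):
            break

        # Advance to the last NaN of the current run of consecutive NaNs
        while start_index < len(nan_indices) - 1 and nan_indices[start_index + 1] - nan_indices[start_index] == 1:
            start_index += 1

        # Compute the start and end positions of the current segment
        act_start = nan_indices[start_index] + 1
        if start_index < len(nan_indices) - 1:
            act_end = nan_indices[start_index + 1] - 1
        else:
            act_end = len(df) - 1

        print(f"Segment {i + 1} start index: {act_start}, end index: {act_end}")

        # Slice the segment and append it to the list of split DataFrames
        split_df = df.iloc[act_start:act_end + 1]
        split_dfs.append(split_df)

        # Move on to the next run of NaN values
        start_index += 1

# Print each split segment (example)
for i, split_df in enumerate(split_dfs):
    print(f"Split segment {i + 1}:\n{split_df}")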


# List that will hold the RMS value of each segment
split_rms = []

# Compute the root mean square (RMS) of each segment
for split_df in split_dfs:
    rms = float(np.sqrt(np.mean(np.square(split_df[column].to_numpy()))))
    split_rms.append(rms)



# Min-max normalization with MinMaxScaler
scaler_minmax = MinMaxScaler()

# List that will hold the normalized segments
split_normal_dfs = []

for split_df in split_dfs:
    # The scaler is refit on each segment, so every segment is scaled to [0, 1] independently
    split_normal_dfs.append(scaler_minmax.fit_transform(split_df[[column]]))




for i, split_df in enumerate(split_normal_dfs):

    # Plot the normalized segment as a time series
    plt.figure(figsize=(10, 6))  # Adjust the figure size as needed
    plt.plot(split_df, marker='o', linestyle='-')  # Circle markers joined by lines
    plt.xlabel("Frame")
    plt.ylabel(column)
    plt.title(f"Segment {i + 1} Time Series")
    plt.grid(True)  # Add grid lines
    plt.show()


# <a id='sample'>3. sEMG Data Analysis</a> 
# <a id='s1'>3.1. Time-domain analysis</a>

Integrated Electromyography (iEMG) quantifies the total activation of a muscle over a specified time interval. It is computed by integrating the absolute amplitude of the electromyography (EMG) signal over time. In this notebook, it is computed for each segment by trapezoidal integration of the rectified signal.
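
As a quick sanity check of this definition, the sketch below integrates a rectified sine wave, for which the result is known analytically (the integral of |sin| over a whole number of cycles lasting 1 s is 2/π). The helper name `integrated_emg` and the synthetic signal are assumptions made for illustration only.

```python
import numpy as np
from scipy.integrate import trapezoid


def integrated_emg(signal, time):
    """iEMG: trapezoidal integral of the rectified signal over time."""
    return trapezoid(np.abs(signal), x=time)


# Synthetic signal: 10 Hz sine, amplitude 1, sampled at 1000 Hz for 1 s
fs = 1000
time = np.linspace(0.0, 1.0, fs + 1)  # includes the end point at 1 s
signal = np.sin(2 * np.pi * 10 * time)

iemg = integrated_emg(signal, time)
analytic = 2 / np.pi  # exact value for a rectified unit sine over 1 s
print(f"iEMG = {iemg:.4f}, analytic = {analytic:.4f}")
```

The unit of iEMG is the signal unit multiplied by the time unit, so the time axis passed to the integration should be in seconds if values are to be compared across recordings.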

In [None]:
# List that will hold the iEMG value of each segment
split_iEMG = []
sample_rate = 1000  # Sampling rate (Hz)
x_column = df.columns[0]  # Assumed to hold the time axis; verify this for your file


for i, split_df in enumerate(split_dfs):
    emg = split_df[column]

    # Compute iEMG: trapezoidal integral of the rectified signal.
    # np.trapz ignores dx when x is given, so only x is passed.
    split_iEMG.append(np.trapz(y=np.abs(emg), x=split_df[x_column]))


# Print iEMG
for i, iEMG_value in enumerate(split_iEMG):
    print(f"Segment {i + 1} iEMG:\n{iEMG_value}")


# <a id='s2m'>3.2. Frequency-domain analysis</a>


Mean Power Frequency (MPF) and Median Frequency (MF) are commonly used parameters in the frequency-domain analysis of surface electromyography (sEMG) signals. MPF is the power-weighted average frequency of the power spectrum, while MF is the frequency at which half of the signal's power lies above and half below. Both serve as metrics for assessing muscle fatigue and activation patterns, and they provide insights into muscle recruitment strategies and the spectral characteristics of muscle activity during various tasks. This makes them valuable in fields such as sports science, rehabilitation, and ergonomics, where understanding muscle function and fatigue dynamics is essential for optimizing performance and preventing injury.
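
To make the two definitions concrete, the sketch below computes both metrics from a Welch power spectral density estimate of a synthetic two-tone signal. The helper name `mean_and_median_frequency` and the test signal are assumptions for illustration; the `nperseg=1024` setting mirrors the value used in this notebook.

```python
import numpy as np
from scipy.signal import welch


def mean_and_median_frequency(signal, fs, nperseg=1024):
    """Return (MPF, MF) from a Welch power spectral density estimate."""
    freq, psd = welch(signal, fs=fs, nperseg=nperseg)
    # MPF: power-weighted mean of the frequency axis
    mpf = np.sum(freq * psd) / np.sum(psd)
    # MF: first frequency where cumulative power reaches half of the total
    cumulative = np.cumsum(psd)
    mf = freq[np.argmax(cumulative >= 0.5 * cumulative[-1])]
    return mpf, mf


# Two tones: 80 Hz (amplitude 1.0) and 200 Hz (amplitude 0.5).
# Power is proportional to amplitude squared, so the ratio is 1 : 0.25.
fs = 1000
t = np.arange(0, 10, 1 / fs)
signal = np.sin(2 * np.pi * 80 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)

mpf, mf = mean_and_median_frequency(signal, fs)
print(f"MPF = {mpf:.1f} Hz, MF = {mf:.1f} Hz")
```

For this signal the expected MPF is close to (80 × 1 + 200 × 0.25) / 1.25 = 104 Hz, whereas MF stays near 80 Hz because that tone alone carries 80% of the power. The two metrics can therefore differ noticeably when the spectrum is skewed.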

In [None]:
# Lists that will hold the spectral metrics of each segment
sample_rate = 1000  # Sampling rate (Hz)
split_MPF = []
split_MF = []
x_column = df.columns[0]

from scipy.signal import welch


for i, split_df in enumerate(split_dfs):
    emg = split_df[column]

    # Estimate the power spectral density with Welch's method
    freq, power_density = welch(emg, fs=sample_rate, nperseg=1024)

    # MPF: power-weighted mean frequency
    weighted_freq = np.sum(freq * power_density) / np.sum(power_density)
    split_MPF.append(weighted_freq)

    # MF: first frequency at which the cumulative power reaches half of the total power
    cumulative_power = np.cumsum(power_density)
    MF = freq[np.argmax(cumulative_power >= 0.5 * cumulative_power[-1])]
    split_MF.append(MF)


# Print MPF
for i, mpf_value in enumerate(split_MPF):
    print(f"Segment {i + 1} MPF:\n{mpf_value}")

# Print MF
for i, mf_value in enumerate(split_MF):
    print(f"Segment {i + 1} MF:\n{mf_value}")

# <a id='s2nm'>3.3. Fuzzy entropy</a>
This section calculates the Root Mean Square (RMS), normalizes the data using MinMaxScaler, and computes the Fuzzy Entropy of each segment with `EH.FuzzEn`. Adjustments may be necessary depending on your specific data structures and functions.
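
For intuition about what fuzzy entropy measures, here is a simplified NumPy sketch of the idea: compare mean-removed templates of length m and m + 1 using a smooth (exponential) similarity function and take the log ratio of their average similarities. It is not EntropyHub's implementation; the function name `fuzzy_entropy`, the membership function `exp(-d**n / tol)`, and the tolerance `tol = r * std(x)` are assumptions chosen for illustration.

```python
import numpy as np


def fuzzy_entropy(x, m=2, r=0.2, n=2):
    """Simplified fuzzy entropy of a 1-D signal (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)   # tolerance scaled by the signal's standard deviation
    n_vec = len(x) - m    # same number of templates for both lengths

    def mean_similarity(dim):
        # Build templates and remove each template's own mean (local baseline)
        templates = np.array([x[i:i + dim] for i in range(n_vec)])
        templates = templates - templates.mean(axis=1, keepdims=True)
        # Chebyshev distance between every pair of templates
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Exponential membership: close templates score near 1, distant ones near 0
        sim = np.exp(-(dist ** n) / tol)
        np.fill_diagonal(sim, 0.0)  # exclude self-matches
        return sim.sum() / (n_vec * (n_vec - 1))

    return np.log(mean_similarity(m)) - np.log(mean_similarity(m + 1))


# A regular signal should score lower than an irregular one
rng = np.random.default_rng(0)
t = np.arange(500) / 500
fe_sine = fuzzy_entropy(np.sin(2 * np.pi * 5 * t))
fe_noise = fuzzy_entropy(rng.standard_normal(500))
print(f"sine: {fe_sine:.3f}, noise: {fe_noise:.3f}")
```

This pairwise version needs memory proportional to the square of the signal length, so it is only practical for short signals. For segments of tens of thousands of samples, a library routine such as `EH.FuzzEn` is the more realistic choice.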

In [None]:
# Print each split segment dataframe (example)
for i, split_df in enumerate(split_dfs):
    print(f"Split Segment {i + 1}:\n{split_df}")

# Calculate Root Mean Square (RMS)
split_rms = []
for i, split_df in enumerate(split_dfs):
    rms = math.sqrt(sum(x ** 2 for x in split_df[column]) / len(split_df))
    split_rms.append(rms)

# Normalize using MinMaxScaler
scaler_minmax = MinMaxScaler()
split_normal_dfs = []
for i, split_df in enumerate(split_dfs):
    split_normal_dfs.append(scaler_minmax.fit_transform(pd.DataFrame(split_df[column])))

# Calculate Fuzzy Entropy
split_entropyhub = []
r = 0.2
n = 2

print(f"{column}\n")
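for i, split_df in enumerate(split_normal_dfs):
    # Last 35,000 samples, flattened to 1-D; adjust according to your specific needs
    raw = split_df[-35000:].ravel()
    th = r * np.std(raw)  # Tolerance scaled by the segment's standard deviation
    Fuzz, Ps1, Ps2 = EH.FuzzEn(raw, m=3, r=(th, n))
    split_entropyhub.append(Fuzz[-1])  # Keep the estimate for the largest embedding dimension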

average_entropy = np.mean(split_entropyhub)
print(f"{file_path} + {column} Results:")
print("\nFuzzy Entropy Values:", split_entropyhub)
print("Average Fuzzy Entropy:", average_entropy)
print("Root Mean Square (RMS):", split_rms)
