**REMEMBER:** Google/Wikipedia are your friends!! Search for any term you are not familiar with!
Try to get a rough idea of what each term is about. Be brave!


# An excuse to "play" with data

We have seen in class that *Michele Giugliano* (MG) is not so different from... a *frog*. His heart and muscles also generate electrical signals! This was demonstrated in class, using an analog electronic *amplifier*.

### Biological signals were corrupted by noise

You probably remember that the signal was corrupted by **50 Hz electrical noise** (also called *hum*), due to electromagnetic interference from the room's lights and from the electrical cables running inside the walls and carrying the 220 V, 50 Hz AC power.

There was also another source of *faster* noise, presumably coming from the HDMI video projector, which corrupted the *biological signals* even more.

### At MG's home the noise was slightly lower

As MG is paranoid, the night before the class he rehearsed the same experiment at home, where the noise was not so strong. He used a (cheap) analog-to-digital converter, that is an electronic device that 1) samples an otherwise continuously varying signal at a sequence of discrete time steps and 2) makes the numerical value of these samples compatible with the digital architecture of computers (e.g. with the 16-bit, 32-bit, or 64-bit resolution available today).
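
The two steps performed by an analog-to-digital converter can be illustrated with a toy model. This is only a sketch, not a description of the device MG used: the function names `sample` and `quantize`, the 1 Hz sine wave, the 20 Hz sampling rate, the ±1 V input range and the 4-bit resolution are all assumptions, chosen so that the rounding is easy to see.

```python
import math

def sample(signal_fn, fs, duration):
    """Step 1: evaluate a continuous-time signal at the discrete times k / fs."""
    n = int(round(duration * fs))
    times = [k / fs for k in range(n)]
    return times, [signal_fn(t) for t in times]

def quantize(values, n_bits, v_min=-1.0, v_max=1.0):
    """Step 2: map each sample to the nearest of 2**n_bits integer codes."""
    levels = 2 ** n_bits
    step = (v_max - v_min) / (levels - 1)   # voltage spacing between two codes
    codes = []
    for v in values:
        v = min(max(v, v_min), v_max)       # clip to the assumed input range
        codes.append(int(round((v - v_min) / step)))
    return codes, step

def analog(t):
    """Hypothetical 'analog' input: a 1 Hz sine wave of 1 V amplitude."""
    return math.sin(2 * math.pi * 1.0 * t)

times, samples = sample(analog, fs=20.0, duration=1.0)
codes, step = quantize(samples, n_bits=4)

print("number of samples:", len(samples))
print("first integer codes:", codes[:6])
print("voltage step between codes (V):", step)
```

Try increasing `n_bits`: the step between codes shrinks, and the integer codes follow the sine wave more and more closely.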

### An encouragement to get your hands "dirty" and play with MG's data

If you discover something wrong in MG's heart or muscle behavior, speak up! You might save his life! :-)

Data are in the course repository, so here I show you a cool way to bring the data into Google Colab. I am using what is called *shell escaping*: a line starting with `!` is launched as a shell command, instead of being run as a Python instruction.

In [1]:
# Let's download with "curl" a zipped data file, dataECG_EMG_Oct2024.zip, that is in our Course public repository...
!curl -L https://github.com/mgiugliano/ePhysSignals/raw/refs/heads/main/data/dataECG_EMG_Oct2024.zip -o data.zip

# Note that the file is downloaded and named data.zip, for simplicity

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 1289k  100 1289k    0     0  1289k      0  0:00:01  0:00:01 --:--:-- 1289k


In [2]:
# The file is now unzipped (while overwriting files, in case they already exist)
!unzip -o data.zip

# Let's now list all the content of the folder we are in, displaying the size of files in "human-readable" form...
!ls -lh

Archive:  data.zip
  inflating: ECG_MG_long.txt         
  inflating: __MACOSX/._ECG_MG_long.txt  
  inflating: ECG_MG_short.txt        
  inflating: __MACOSX/._ECG_MG_short.txt  
  inflating: EMG_MG1.txt             
  inflating: __MACOSX/._EMG_MG1.txt  
  inflating: EMG_MG2.txt             
  inflating: __MACOSX/._EMG_MG2.txt  
  inflating: EMG_MG3.txt             
  inflating: __MACOSX/._EMG_MG3.txt  
total 11M
-rw-r--r-- 1 root root 1.3M Oct  2 20:40 data.zip
-rw-r--r-- 1 root root 2.2M Sep 30 20:49 ECG_MG_long.txt
-rw-r--r-- 1 root root 2.1M Sep 30 20:50 ECG_MG_short.txt
-rw-r--r-- 1 root root 2.2M Sep 30 20:53 EMG_MG1.txt
-rw-r--r-- 1 root root 2.2M Sep 30 20:54 EMG_MG2.txt
-rw-r--r-- 1 root root 447K Sep 30 20:56 EMG_MG3.txt
drwxr-xr-x 2 root root 4.0K Oct  2 20:40 __MACOSX
drwxr-xr-x 1 root root 4.0K Oct  1 16:12 sample_data


In [3]:
# It contains 5 (text) files, each containing the analog-to-digital conversion
# of the ECG and EMG of MG..

!ls *.txt

ECG_MG_long.txt  ECG_MG_short.txt  EMG_MG1.txt	EMG_MG2.txt  EMG_MG3.txt
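
If you prefer to stay in Python, the unzip-and-list steps can also be done with the standard library alone. The sketch below is one possible way, not the method used in this notebook: the helper name `unzip_txt` is made up, and skipping the `__MACOSX/` entries is a choice (they only hold macOS metadata, not data).

```python
import os
import zipfile

def unzip_txt(zip_path, dest="."):
    """Extract the archive, skipping macOS metadata, and return the .txt names."""
    with zipfile.ZipFile(zip_path) as zf:
        members = [m for m in zf.namelist() if not m.startswith("__MACOSX/")]
        zf.extractall(dest, members=members)
    return sorted(m for m in members if m.endswith(".txt"))

if os.path.exists("data.zip"):   # only runs where data.zip has been downloaded
    for name in unzip_txt("data.zip"):
        size_kb = os.path.getsize(name) / 1024
        print(f"{name}: {size_kb:.0f} KB")
```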


## Let's now write some Python!

Now that the files are in the current working directory, and we know their names, we can use Python to load and plot the time series...

Before starting, why don't you click on the "folder" icon in the vertical bar on the left side of the window? You will see the files displayed in a GUI (graphical user interface) that might be more familiar to you. If you double-click a text file, it will open and you will see that it contains floating-point values organised in 2 columns and many rows.

Values on the same line are separated by a Tab, and successive lines by a "new line" character. There is also a "header", containing some information, in the first three lines of each file.
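
You can also check this layout from code, without the GUI. The sketch below prints the header lines and the first couple of data rows. The helper name `peek` and its parameters are made up for illustration, and the "3 header lines, Tab-separated columns" layout is taken from the description above.

```python
import os
from itertools import islice

def peek(path, n_header=3, n_data=2):
    """Return the first n_header lines, and the next n_data rows split on Tabs."""
    with open(path, "r", errors="replace") as f:
        header = [line.rstrip("\r\n") for line in islice(f, n_header)]
        rows = [line.rstrip("\r\n").split("\t") for line in islice(f, n_data)]
    return header, rows

filename = "ECG_MG_short.txt"    # one of the unzipped files
if os.path.exists(filename):     # only runs where the data were downloaded
    header, rows = peek(filename)
    for line in header:
        print("HEADER:", line)
    for row in rows:
        print("DATA:  ", row)
```

Each data row should come back as a list of two strings, one per column.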

In [4]:
# Let's now use Plotly as a graphical library and NumPy as a numerical library

import plotly.express as px
import numpy as np

In [17]:
# Let's load the data, from the "disk" into the "memory" (i.e. into a 2D array)
filename = 'ECG_MG_short.txt'
#filename = 'ECG_MG_long.txt'
#filename = 'EMG_MG1.txt'
#filename = 'EMG_MG2.txt'
#filename = 'EMG_MG3.txt'

data = np.loadtxt(filename, skiprows=3)  # Skip the first 3 header lines

# Extract the time and signal columns from all rows. ":" means ALL rows
time = data[:, 0]
signal = data[:, 1]

In [18]:
# Let's plot the data as an x-y plot...
mylabels = {"x": "time (s)", "y": "Electrical Potential (mV)"}

fig = px.line(x = time, y = signal, title = filename, labels = mylabels)

fig.update_traces(line=dict(color='red', width=3))

fig.show()

In [None]:
# It is your turn now..
#
#
# ...write some code and play with the data!





# Some ideas or questions for you to answer (with increasing difficulty)

To answer these questions, you have to write code.

- what was the sampling interval of the acquisition system used?
- what was the sampling rate?
- what was the average value of each signal?
- what happens to the variance through time (say, estimated in chunks of 1 s each)?
- what was the average heart rate of MG?
- can you remove the 50 Hz noise from the trace?
- by spending no more than 5 min researching ECG on pubmed.com or Google, can you point out in MG's ECG traces the (conventionally chosen) *fiducial* points known as Q, R, S, from which a doctor locates the P waves, the QRS complexes and the T waves?
- can you compute the typical parameters of interest: the **height** and **interval** of each wave, such as the **R-R interval**, the **P-R interval**, the **QT interval** and the **S-T segment**?
- what does the power spectrum of the signal look like?
- what would a spectrogram look like for these signals?
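
If you do not know where to start, here is a possible sketch for the first few questions. It assumes that the time column is in seconds and uniformly sampled, which you should verify on the real files. The helper names `sampling_info` and `chunk_variance` are made up for this sketch. It runs on a synthetic trace (not on MG's data), with an assumed sampling rate of 1000 Hz, so that the answers can be checked against known values.

```python
import numpy as np

def sampling_info(time):
    """Estimate the sampling interval (s) and the sampling rate (Hz) from a time axis."""
    dt = np.median(np.diff(time))    # median: robust to a few irregular steps
    return dt, 1.0 / dt

def chunk_variance(signal, fs, chunk_s=1.0):
    """Variance of the signal in consecutive, non-overlapping chunks of chunk_s seconds."""
    n = int(round(chunk_s * fs))     # number of samples per chunk
    n_chunks = len(signal) // n      # the incomplete last chunk is dropped
    chunks = np.reshape(signal[:n_chunks * n], (n_chunks, n))
    return chunks.var(axis=1)

# Synthetic trace (NOT MG's data): 5 s sampled at an assumed 1000 Hz,
# a 50 Hz "hum" whose amplitude doubles after 2 s, plus a constant offset of 0.5.
fs_true = 1000.0
t = np.arange(5000) / fs_true
amplitude = np.where(t < 2, 1.0, 2.0)
x = 0.5 + amplitude * np.sin(2 * np.pi * 50 * t)

dt, fs = sampling_info(t)
print("sampling interval (s): ", dt)
print("sampling rate (Hz):    ", fs)
print("average value:         ", x.mean())
print("variance per 1 s chunk:", chunk_variance(x, fs))
```

Once the numbers make sense on the synthetic trace, replace `t` and `x` with the `time` and `signal` arrays loaded earlier.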
