
## Data Analytics in Health Care and Connected Care/Health Information Systems
 *Practical Session III: Data Analytics in Healthcare and Connected Care/Health Information Systems - working with data from wearable devices*

---
---
The aims of this practical session are: (a) to explore and analyze in detail data derived from a consumer wearable device; (b) to understand what ambient intelligence is and what its main properties are. One of the group members must submit the completed notebook of the second practical session in Canvas by March 27th, 23:59:59 (GMT+1).

---


---

*Lecturer/s: Bart Jansen (bjansen@etrovub.be),  Jef Vandemeulebroucke (jefvdmb@etrovub.be), Nikolaos Deligiannis (ndeligia@etrovub.be)*

*Teaching assistant/s: Joris Wuts (jwuts@etrovub.be), Jasabanta Patro (jpatro@etrovub.be), Redona Brahimetaj (rbrahime@etrovub.be)*




In [None]:
# Collect the names and student IDs of both group members.
student1_full_name = input('Enter the full name of student 1: ')
student1_id = int(input('Enter the student ID of student 1: '))
student2_full_name = input('Enter the full name of student 2: ')
student2_id = int(input('Enter the student ID of student 2: '))

---
---
# Data Description
---

---


> The data used in this lab session were collected with an Apple Watch and are accompanied by sleep stages scored from gold-standard polysomnography (PSG). Data were collected from 31 subjects in total at the University of Michigan from June 2017 to March 2019. Subjects wore an Apple Watch for 7 to 14 days to record their ambulatory steps. On the last day, they spent the night in the lab for an eight-hour sleep opportunity, during which acceleration and heart rate were recorded from their Apple Watch. Sample code showing how to access these sensors on the Apple Watch can be found [here](https://github.com/ojwalch/sleep_accel). The following types of data are provided:

*   **motion (acceleration):**  date (in seconds since PSG start), x acceleration (in g), y acceleration, z acceleration.
*   **heart rate (bpm):** date (in seconds since PSG start), heart rate (bpm)
*   **steps (count):** date (in seconds since PSG start), steps (total in bin from this timestamp to next timestamp)
*   **labeled sleep:** date (in seconds since PSG start), stage (0-5, wake = 0, N1 = 1, N2 = 2, N3 = 3, REM = 5)

For your information, the Apple Watch uses a triaxial MEMS accelerometer that measures acceleration in the x, y, and z directions, in units of g (9.8 m/s²). Heart rate is measured with photoplethysmography (PPG), an optical technique that measures blood volume changes, on the dorsal aspect of the wrist. The raw acceleration signal and heart rate data are obtained from the device by creating a “Workout Session” and using functions built into the iOS WatchKit and HealthKit frameworks. For a more detailed description of the dataset, please see [here](https://academic.oup.com/sleep/article/42/12/zsz180/5549536).
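A minimal sketch of how one of these files could be read into pandas and how the three acceleration axes could be combined into a single magnitude. The headerless layout, the space-or-comma delimiter, the column names, and the helper names `read_signal` and `acceleration_magnitude` are assumptions made for illustration; check them against the actual files before relying on them.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical column names, following the column order of the data description.
COLUMNS = {
    "motion": ["t", "x", "y", "z"],
    "hr": ["t", "bpm"],
    "counts": ["t", "steps"],
    "psg": ["t", "stage"],
}

G = 9.8  # m/s^2 per g, the value quoted in the text


def read_signal(source, kind):
    """Read one headerless file whose columns are separated by spaces or commas."""
    return pd.read_csv(source, sep=r"[,\s]+", engine="python",
                       header=None, names=COLUMNS[kind])


def acceleration_magnitude(motion, in_si=False):
    """Euclidean norm of the three axes, in g or (optionally) in m/s^2."""
    mag = np.sqrt(motion["x"] ** 2 + motion["y"] ** 2 + motion["z"] ** 2)
    return mag * G if in_si else mag


# Tiny in-memory stand-in for a motion file (the values are made up).
sample = io.StringIO("0.0 0.0 0.0 -1.0\n0.02 0.6 0.0 -0.8\n")
motion = read_signal(sample, "motion")
magnitude_g = acceleration_magnitude(motion)
```

A wrist at rest should show a magnitude close to 1 g, which makes the magnitude a convenient sanity check on the units of a freshly loaded file.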


---
---

## Downloading and Preparing the Data


* To use the data, click [here](https://drive.google.com/drive/folders/19DAPxpjTlXy3ZEJpO05BiG7O4CxiTjEU), click on the folder named 'cropped' and then click on 'Add shortcut to Drive'. Select the folder you want to add the shortcut to (to avoid issues with folder paths, we suggest you add it to the 'My Drive' directory). After doing so, run the two lines of code below.

*  The data we have shared consist of multiple files ('cleaned_counts', 'cleaned_hr', 'cleaned_motion', 'cleaned_psg') per subject. As a first task, you need to read all these files and create a dataframe: rows -> all the subject IDs; columns -> all the saved values we have for each type of data provided.


Keywords: [list files in a directory](https://docs.python.org/3/library/os.html#os.listdir), [list subject IDs](https://docs.python.org/3/library/re.html#re.findall), [list unique values from a list](https://numpy.org/doc/stable/reference/generated/numpy.unique.html)

In [1]:
import shutil, os, re, pandas as pd, numpy as np, matplotlib.pyplot as plt

In [4]:
## -- PART 0 -- ##
#create the dataframe with all subjects ids and their corresponding types of data (the headings should look like below)
data_dir = '/Users/aneruthmohanasundaram/Documents/GitHub/Data-Analytics/Practical Session 3/cropped'
# re.findall returns, for every file name in the directory, a list of the digit runs it contains.
ids_per_file = [re.findall(r'\d+', name) for name in os.listdir(data_dir)]
# Flatten the nested list, keep only the unique IDs with a set, and store them in a dataframe.
flat_list = [subject_id for ids in ids_per_file for subject_id in ids]
df = pd.DataFrame(list(set(flat_list)), columns=["Subject_ID"])
df.head(10)

|   | Subject_ID |
|---|------------|
| 0 | 46343 |
| 1 | 5132496 |
| 2 | 4426783 |
| 3 | 1066528 |
| 4 | 3997827 |
| 5 | 1360686 |
| 6 | 8692923 |
| 7 | 781756 |
| 8 | 844359 |
| 9 | 8000685 |
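The dataframe requested in the task (one row per subject, one column per data type) could be assembled along the following lines. This is only a sketch: the file-name pattern `<id>_cleaned_<type>.<ext>`, the `.out` extension used in the demonstration, and the helper name `index_files` are assumptions. The cells hold file paths here; they could equally hold the loaded values.

```python
import os
import re
import tempfile

import pandas as pd


def index_files(folder):
    """Return a DataFrame with one row per subject ID and one column per data type.

    Assumes file names of the form '<id>_cleaned_<type>.<ext>' (an assumption
    based on the file types named in the text).
    """
    pattern = re.compile(r"^(\d+)_(cleaned_[a-z]+)")
    rows = {}
    for name in sorted(os.listdir(folder)):
        match = pattern.match(name)
        if match:
            subject, kind = match.groups()
            rows.setdefault(subject, {})[kind] = os.path.join(folder, name)
    table = pd.DataFrame.from_dict(rows, orient="index")
    table.index.name = "Subject_ID"
    return table.sort_index(axis=0).sort_index(axis=1)


# Demonstration on a throwaway folder filled with empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for subject in ("46343", "781756"):
        for kind in ("counts", "hr", "motion", "psg"):
            open(os.path.join(demo_dir, f"{subject}_cleaned_{kind}.out"), "w").close()
    files = index_files(demo_dir)
```

A missing file for a subject shows up as `NaN` in the corresponding cell, which makes incomplete subjects easy to spot with `files.isna().sum()`.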


## Getting some visuals and an understanding of the dataset

*   To get an overview of the data distribution, plot the number of records per subject for each type of data. Interpret the results. If the number of records stored per subject differs, list some potential reasons why.

*   If you would like to use this dataset for classification tasks, a crucial step is knowing how many records we have for each sleep stage. Show/plot the total number of records per sleep stage.

*   Choose one random subject id and plot: 

     --> (a) in one graph, all three acceleration axes values per timestamp;

     --> (b) number of steps per timestamp;

     --> (c) heart rate values per timestamp;

     --> (d) sleep stages per timestamp;
     
  For each of these graphs, interpret the results (focusing more on the abnormal behaviors observed)! 

In [None]:
## -- PART I -- ##
#plot amount of records per subject, interpret the results, list potential reasons/opinions why the amount of records recorded per subject are not the same.
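One possible way to count the records per subject and data type is sketched below. The nested-dictionary layout of `signals` and the helper names `records_per_subject` and `plot_counts` are hypothetical choices for this sketch, not something the assignment prescribes.

```python
import pandas as pd


def records_per_subject(signals):
    """Count rows per subject and data type.

    `signals` maps subject ID -> {data type -> DataFrame}; this nested-dict
    layout is a hypothetical convention chosen for the sketch.
    """
    counts = {subject: {kind: len(frame) for kind, frame in kinds.items()}
              for subject, kinds in signals.items()}
    return pd.DataFrame.from_dict(counts, orient="index").sort_index()


def plot_counts(counts):
    """One bar chart per data type (matplotlib is imported only when plotting)."""
    import matplotlib.pyplot as plt
    axes = counts.plot.bar(subplots=True, sharex=True, legend=False, figsize=(10, 8))
    plt.tight_layout()
    return axes


# Made-up data for two subjects, just to show the shape of the result.
demo = {
    "A": {"hr": pd.DataFrame({"t": [0, 5, 10]}), "psg": pd.DataFrame({"t": [0, 30]})},
    "B": {"hr": pd.DataFrame({"t": [0, 5]}), "psg": pd.DataFrame({"t": [0, 30, 60]})},
}
counts = records_per_subject(demo)
```

Separate subplots are used because the data types are recorded at very different rates, so a shared y-axis would flatten the smaller counts.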

In [None]:
## -- PART II -- ##
#show/plot the total amount of records per sleep stage (for  all subjects together)
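A short sketch of pooling the labels of all subjects and counting them per stage. The column name `stage` and the helper `records_per_stage` are assumptions; the stage codes are those listed in the data description, and any other code simply keeps its number.

```python
import pandas as pd

# Stage codes as listed in the data description; other codes keep their number.
STAGE_NAMES = {0: "wake", 1: "N1", 2: "N2", 3: "N3", 5: "REM"}


def records_per_stage(psg_frames):
    """Pool the 'stage' column of several PSG tables and count each stage."""
    stages = pd.concat([frame["stage"] for frame in psg_frames], ignore_index=True)
    counts = stages.value_counts().sort_index()
    counts.index = [STAGE_NAMES.get(code, f"stage {code}") for code in counts.index]
    return counts


# Made-up labels for two subjects.
demo = [pd.DataFrame({"stage": [0, 0, 2, 2, 2, 5]}),
        pd.DataFrame({"stage": [0, 1, 2, 3, 5, 5]})]
stage_counts = records_per_stage(demo)
```

Calling `stage_counts.plot.bar()` then gives the requested plot. Strongly unequal bars indicate class imbalance, which matters when choosing a classifier and an evaluation metric.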


In [None]:
## -- PART III -- ##
#using  the data collected from a  randomly chosen subject id plot: 


In [None]:
######   all three acceleration axes per timestamp (in one graph) ######
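Raw acceleration can contain a very large number of samples, so a plot may be easier to handle after thinning. The sketch below is one way to do this; `thin` and `plot_acceleration` are hypothetical helpers, and the sampling interval and noise level of the synthetic data are made up.

```python
import numpy as np


def thin(t, values, max_points=5000):
    """Keep every k-th sample so that at most `max_points` remain."""
    step = max(1, int(np.ceil(len(t) / max_points)))
    return t[::step], values[::step]


def plot_acceleration(t, xyz):
    """Draw the x, y and z axes (columns of `xyz`, in g) on one set of axes."""
    import matplotlib.pyplot as plt  # imported here so `thin` works without it
    t_small, xyz_small = thin(t, xyz)
    fig, ax = plt.subplots(figsize=(12, 4))
    for column, label in enumerate(("x", "y", "z")):
        ax.plot(t_small, xyz_small[:, column], label=label, linewidth=0.6)
    ax.set_xlabel("time since PSG start (s)")
    ax.set_ylabel("acceleration (g)")
    ax.legend()
    return fig


# Synthetic stand-in: 12,000 samples of three noisy axes.
rng = np.random.default_rng(0)
t = np.arange(12_000) * 0.02
xyz = rng.normal(0.0, 0.01, size=(12_000, 3)) + np.array([0.0, 0.0, -1.0])
t_small, xyz_small = thin(t, xyz)
```

Thinning by plain striding can hide very short spikes, so when looking for abnormal behavior it is worth re-plotting any suspicious interval at full resolution.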

In [None]:
######   heart rate values per timestamp ######

In [None]:
#####    sleep stages per timestamp ######
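Sleep stages are categorical, so a step plot (a hypnogram) tends to read better than a line through the raw codes. The sketch below uses a vertical ordering that is common for hypnograms (wake on top, N3 at the bottom); that ordering and the helpers `to_levels` and `plot_hypnogram` are assumptions of this sketch.

```python
import numpy as np

# Vertical order often used for hypnograms: wake on top, deep sleep at the bottom.
# Keys are the stage codes of the data description, values are plot levels.
PLOT_LEVEL = {0: 4, 5: 3, 1: 2, 2: 1, 3: 0}
LEVEL_LABELS = ["N3", "N2", "N1", "REM", "wake"]


def to_levels(stages):
    """Map stage codes to plot levels; unknown codes become NaN (left blank)."""
    return np.array([PLOT_LEVEL.get(int(s), np.nan) for s in stages], dtype=float)


def plot_hypnogram(t, stages):
    """Step plot of sleep stage against time."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(figsize=(12, 3))
    ax.step(t, to_levels(stages), where="post")
    ax.set_yticks(range(len(LEVEL_LABELS)))
    ax.set_yticklabels(LEVEL_LABELS)
    ax.set_xlabel("time since PSG start (s)")
    return fig


levels = to_levels([0, 1, 2, 3, 5, 4])
```

Codes that are not in the mapping are left as gaps in the plot rather than being drawn at an arbitrary height, which keeps unexpected labels visible.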

In [None]:
######   nr of steps per timestamp ######

In [None]:
## Interpret all the results carefully. Focus more on the abnormal behaviors observed.  
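One kind of abnormal behavior that can be checked programmatically is a gap in the recording. The sketch below flags intervals whose spacing is much larger than the typical spacing; the helper `find_gaps` and the threshold of five times the median spacing are arbitrary assumptions chosen for illustration.

```python
import numpy as np


def find_gaps(t, factor=5.0):
    """Return (start, end) pairs where the spacing exceeds `factor` x the median."""
    t = np.asarray(t, dtype=float)
    dt = np.diff(t)
    threshold = factor * np.median(dt)
    idx = np.flatnonzero(dt > threshold)
    return [(t[i], t[i + 1]) for i in idx]


# Made-up heart-rate timestamps (in seconds) with one 60-second hole.
timestamps = [0, 5, 10, 15, 75, 80, 85]
gaps = find_gaps(timestamps)
```

Gaps found this way can be marked on the plots (for example with `ax.axvspan`) so that the interpretation distinguishes missing data from genuinely flat signals.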

---
---
# Ambient Intelligence in Health Care
---

---



The following paragraph is an excerpt from [[1]](https://www.researchgate.net/publication/259767867_A_Survey_on_Ambient_Intelligence_in_Health_Care).


“Imagine a day when a small tricorder-like device monitors your health status in a continuous manner, diagnoses any possible health conditions, has a conversation with you to persuade you to change your lifestyle for maintaining better health, and communicates with your doctor, if needed. The device might even be embedded into your regular clothing fibers in the form of very tiny sensors and it might communicate with other devices around you, including the variety of sensors embedded into your home to monitor your lifestyle. For example, you might be alarmed about the lack of a healthy diet based on the items present in your fridge and based on what you are eating outside regularly. This might seem like science fiction for now, but many researchers in the field of “Ambient Intelligence” expect such scenarios to be part of our daily life in not so far future.”

Based on the following papers ([1](https://www.researchgate.net/publication/259767867_A_Survey_on_Ambient_Intelligence_in_Health_Care), [2](https://link.springer.com/article/10.1007/s11042-015-2745-8), [3](https://www.researchgate.net/publication/312025777_Internet_of_Things_for_Measuring_Human_Activities_in_Ambient_Assisted_Living_and_e-Health)), answer the following questions:




1.   What is ambient intelligence and which are its main properties?
2.   What is ambient assisted living?
3.   What is a Body Area Network?
4.   What is the difference between inter-BAN and intra-BAN communication?
5.   Describe five body sensors.
6.   Describe the application of heart monitoring with the use of smartphones
7.   What is the IoT and how does it relate with ambient assisted living?





Write your answers here:

* Answer for Question 1: Ambient intelligence is the notion that our environment interacts intelligently with the people in it in order to promote their well-being. Ambient intelligence systems are characterized by the following properties: they are context-aware and embedded in our everyday environment, so they can continuously collect data from it; they are personalized to each individual; they can anticipate the needs of the individual (this can be guided by the individual); and they can adapt when necessary. On top of that, they should do all this in an unobtrusive manner.

* Answer for Question 2: Elderly and disabled people often have special needs and are especially prone to accidents in the household. This danger is amplified for those who live alone or do not possess a mobile phone, which makes it harder to report a serious injury. Often a choice has to be made between the liberty of the person in question and their safety. Ambient assisted living aims to resolve this trade-off. Certain characteristics of the individual, such as vital signs, location and behaviour, are monitored so that help can be provided when necessary. Sensors in or on the body as well as in the home can be used for this. This promotes an active lifestyle for the individual without sacrificing their independence.

* Answer for Question 3: A body area network (BAN) is one of the important innovations for AmI that have been made possible by large improvements in sensor networks. The main idea is that sensors attached to our body and/or clothing collect data on a multitude of vital body functions such as brainwaves, heartbeat, temperature, blood pressure, etc. These data can be recorded in databases and consulted by medical professionals, who can intervene if needed (for example, in emergency situations) or use them to improve the general well-being of the patient.

* Answer for Question 4: Body area networks can be integrated into existing networks to streamline communication. The architecture of BAN communication can be split into intra-BAN communication, inter-BAN communication and beyond-BAN communication. Intra-BAN communication is restricted to about 2 m around the body; radio waves connect the different BAN sensors to a portable personal server. Inter-BAN communication, on the other hand, connects the body sensors to an access point that can easily reach other networks (e.g. the internet and/or telecommunication networks, to store information in medical databases or alert healthcare providers). Inter-BAN communication could also be used to alert an on-site electronic aid.

* Answer for Question 5: Body sensors can measure a vast array of characteristics. Five are described here.
    * **Blood glucose:** can be measured with non-invasive techniques such as infrared measurements and optical sensing.
    * **Blood pressure:** measured with the oscillometric technique. This is the classic technique in which the pressure around the arm is increased until blood flow stops, after which the blood pressure can be read.
    * **Humidity and temperature:** these sensors describe the humidity/temperature in the immediate surroundings or the body temperature itself.
    * **Pulse oximetry:** describes oxygen saturation. It works in the way that has become familiar during the COVID-19 pandemic: a small clip is attached to the finger and sends a light signal through the skin. Since the optical densities of oxygenated and de-oxygenated hemoglobin differ, the oxygen saturation can be estimated from the ratio of the two.
    * **EEG (electroencephalography):** electrodes are attached to the person's scalp and measure brain activity from outer brain cells that give off a signal strong enough to be measured. The signal is generally very weak and needs to be amplified and filtered. Because the signal is relatively weak, noise from other bodily functions (such as blinking) can be detrimental to it, and monitoring cannot happen in a continuous way.

* Answer for Question 6: Heart rate monitoring with smartphones relies on the principle that skin color changes with each heartbeat. These changes are caused by changes in blood volume within the fingertip, which follow the heart rate. Placing the finger over both the flashlight and the camera of the phone allows the camera to measure skin color fairly accurately, and the variation can afterwards be converted to a heart rate in beats per minute. For this, the flash and the camera have to be within one fingertip's distance of each other.

* Answer for Question 7: The Internet of Things (IoT) is a fairly recent development that refers to networks formed by everyday objects with unique serial numbers that can identify each other. These objects form so-called "Low-power and Lossy Networks" and are characterized by low power consumption, fairly simple operating systems, etc. The technology that enables the IoT can be used for the sensors that enable ambient assisted living, since such systems will largely be built up of IoT devices.
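
The camera-based heart rate measurement described in the answer to Question 6 can be illustrated with a small sketch: given the average brightness of each video frame, the dominant frequency within a plausible heart rate band is converted to beats per minute. The function `estimate_bpm`, the 0.7 to 3.5 Hz band, the frame rate and the synthetic brightness trace are all assumptions made for illustration; this is not the method of any particular app or of the cited papers.

```python
import numpy as np


def estimate_bpm(signal, fs, low_hz=0.7, high_hz=3.5):
    """Dominant frequency of `signal` within a plausible heart-rate band, in bpm."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()            # remove the constant brightness level
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    peak = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak


# Synthetic "brightness" trace: 30 frames/s for 20 s, pulsing at 1.2 Hz (72 bpm).
fs = 30.0
t = np.arange(600) / fs
rng = np.random.default_rng(1)
brightness = 120 + 2.0 * np.sin(2 * np.pi * 1.2 * t) + rng.normal(0, 0.5, t.size)
bpm = estimate_bpm(brightness, fs)
```

Restricting the search to a physiological band keeps slow drifts (finger pressure, ambient light) and high-frequency camera noise from being mistaken for the pulse. The frequency resolution is `fs / len(signal)`, so longer recordings give finer bpm estimates.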