# Data Loading Example:

## DataFrame Loading:

In [1]:
from data_loader import TIHM # import the TIHM loader


In [2]:
# Change this to the path of your dataset
DPATH = '../Dataset/'

In [3]:
tihm = TIHM(root=DPATH)

Data can be accessed through attributes of the TIHM class, for example:

In [4]:
tihm.activity_raw.shape # raw activity data

(1030559, 3)

In [5]:
activity = tihm.activity # aggregated activity data
activity.shape

(2722, 10)

In [6]:
activity.describe()

location_name,Back Door,Bathroom,Bedroom,Fridge Door,Front Door,Hallway,Kitchen,Lounge
count,2135.0,2496.0,2526.0,2472.0,2453.0,2523.0,2556.0,2281.0
mean,22.597658,32.863381,51.913302,44.014159,23.761517,76.886247,88.932707,79.249014
std,25.593749,22.500095,35.156386,26.63597,22.906572,40.08337,41.767526,45.354658
min,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
25%,7.0,16.0,30.0,23.0,10.0,48.5,59.0,46.0
50%,14.0,30.0,44.0,40.0,18.0,73.0,86.0,74.0
75%,28.0,46.0,63.0,60.0,30.0,102.0,114.0,108.0
max,314.0,250.0,264.0,173.0,244.0,264.0,255.0,259.0
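
The per-location columns above look like daily event counts derived from the raw activity records. The sketch below shows one plausible way such an aggregation could be computed. It is not taken from `data_loader`: the record layout (patient ID, location name, timestamp), the patient IDs and the dates are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical raw events: (patient_id, location_name, ISO timestamp).
# The real raw table has 3 columns, but their names and order are assumed here.
raw_events = [
    ("p1", "Kitchen", "2019-04-01T08:00:00"),
    ("p1", "Kitchen", "2019-04-01T08:05:00"),
    ("p1", "Lounge", "2019-04-01T09:00:00"),
    ("p1", "Kitchen", "2019-04-02T07:30:00"),
    ("p2", "Bedroom", "2019-04-01T23:10:00"),
]

# Count events per (patient, calendar day, location); ts[:10] is the date part.
counts = Counter((pid, ts[:10], loc) for pid, loc, ts in raw_events)

# Pivot into one row per (patient, day) with one entry per location.
daily = defaultdict(dict)
for (pid, day, loc), n in counts.items():
    daily[(pid, day)][loc] = n

for key, row in sorted(daily.items()):
    print(key, row)
```

Locations with no events on a given day are simply absent from that row, which would be consistent with the differing `count` values per column in the summary above.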


Similarly, the raw and aggregated sleep and physiology data can be loaded:

In [7]:
physiology_raw = tihm.physiology_raw
physiology = tihm.physiology

In [8]:
sleep_raw = tihm.sleep_raw
sleep = tihm.sleep

In [9]:
sleep.describe()

Unnamed: 0,heart_rate_mean,heart_rate_std,respiratory_rate_mean,respiratory_rate_std
count,835.0,834.0,835.0,834.0
mean,60.684513,4.699769,15.080796,1.661542
std,5.887328,1.691474,1.626671,0.523421
min,46.479691,0.730949,11.649194,0.767287
25%,56.099285,3.59562,13.924752,1.342108
50%,60.907692,4.349723,14.678947,1.546222
75%,64.889231,5.408334,15.682249,1.822515
max,77.882353,15.47092,20.598055,5.791331


Finally, a combined aggregated version of the data is available, along with the targets:

In [10]:
data = tihm.data
target = tihm.target

In [11]:
target.describe()

Unnamed: 0,Blood pressure,Agitation,Body water,Pulse,Weight,Body temperature
count,251.0,114.0,63.0,76.0,4.0,1.0
mean,1.215139,1.175439,1.079365,1.25,1.0,1.0
std,0.53808,0.425838,0.326348,0.613732,0.0,
min,1.0,1.0,1.0,1.0,1.0,1.0
25%,1.0,1.0,1.0,1.0,1.0,1.0
50%,1.0,1.0,1.0,1.0,1.0,1.0
75%,1.0,1.0,1.0,1.0,1.0,1.0
max,4.0,3.0,3.0,5.0,1.0,1.0


Similarly, the demographics can be loaded using the dataset attributes `demographic_raw` and `demographic`:

In [12]:
tihm.demographic_raw.head()

Unnamed: 0,patient_id,age,sex
0,b9d58,"(70, 80]",Female
1,c55f8,"(80, 90]",Female
2,16f4b,"(80, 90]",Male
3,fd100,"(90, 110]",Female
4,1fbe4,"(80, 90]",Male


In [13]:
tihm.demographic.head()

Unnamed: 0,patient_id,age,sex
49,0697d,"(80, 90]",Male
35,099bc,"(80, 90]",Female
16,0cda9,"(70, 80]",Female
7,0d5ef,"(70, 80]",Male
20,0efe8,"(70, 80]",Female


In [14]:
tihm.demographic_types

['age', 'sex']
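
The `age` values shown above are bracket labels such as `(70, 80]` rather than numbers. If numeric bounds are needed, a small parser along the following lines may help. It assumes the labels are plain strings in exactly this format (they could instead be stored as a pandas interval or categorical type), and `parse_age_bracket` is a hypothetical helper, not part of `data_loader`.

```python
import re

def parse_age_bracket(label):
    """Parse a label such as '(70, 80]' into integer (low, high) bounds."""
    match = re.fullmatch(r"\((\d+),\s*(\d+)\]", label.strip())
    if match is None:
        raise ValueError(f"Unrecognised age bracket: {label!r}")
    low, high = map(int, match.groups())
    return low, high

# Bracket labels copied from the demographic tables above.
brackets = ["(70, 80]", "(80, 90]", "(90, 110]"]
bounds = [parse_age_bracket(b) for b in brackets]
print(bounds)
```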

The documentation for this data loading class is:

In [15]:
help(TIHM)

Help on class TIHM in module data_loader:

class TIHM(builtins.object)
 |  TIHM(root: 'str' = './')
 |  
 |  Methods defined here:
 |  
 |  __init__(self, root: 'str' = './')
 |      The TIHM dataset
 |      as is here: https://github.com/PBarnaghi/TIHM1.5-Data.
 |      
 |      This class allows you to load the different csvs
 |      using the attributes of the class. If the data
 |      is not at the :code:`root` given, it will be
 |      downloaded.
 |      
 |      
 |      Examples
 |      ---------
 |      
 |      The following are examples for loading the data
 |      using this class.
 |      
 |      .. code-block::
 |      
 |          >>> dataset = TIHM(root='./data/')
 |          >>> activity_data = dataset.activity
 |          >>> all_data = dataset.data
 |          >>> len(dataset)
 |          2802
 |      
 |      
 |      Arguments
 |      ---------
 |      
 |      - root: str, optional:
 |          The file path to where the TIHM
 |          data files are stored.
 |

## PyTorch Dataset (Compatible with DataLoader):

In [16]:
import torch
from torch.utils import data as torchdata

from data_loader import TIHMDataset # import the pytorch dataset TIHM loader

In [17]:
# Change this to the path of your dataset
DPATH = '../Dataset/'

In [18]:
tihm_torch = TIHMDataset(root=DPATH, train=True, n_days=7, normalise='id')

Data is normalised using scikit-learn's `StandardScaler` unless `normalise=None` is given. Normalisation is applied before the data is rolled into sequences of `n_days` days. It can use either global statistics (`normalise='global'`, the default) or the statistics of each patient ID (`normalise='id'`).
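
As a rough illustration of the difference between the two normalisation modes, and of scaling before rolling, the sketch below works on a single toy feature in plain Python. It is not the `TIHMDataset` implementation: the data, the helper names (`scale`, `normalise`, `roll`) and the choice to build windows separately per patient are assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical toy data: (patient_id, daily feature value), ordered by day.
records = [
    ("p1", 10.0), ("p1", 12.0), ("p1", 14.0), ("p1", 16.0),
    ("p2", 100.0), ("p2", 110.0), ("p2", 120.0), ("p2", 130.0),
]

def scale(values, mu, sigma):
    # Like StandardScaler, divide by 1 instead of 0 for a constant feature.
    sigma = sigma or 1.0
    return [(v - mu) / sigma for v in values]

def normalise(records, mode):
    """Return {patient_id: values}, scaled according to 'global', 'id' or None."""
    by_patient = {}
    for pid, value in records:
        by_patient.setdefault(pid, []).append(value)
    if mode is None:
        return by_patient
    if mode == "global":
        # One mean and (population) standard deviation for all patients.
        pooled = [v for _, v in records]
        mu, sigma = mean(pooled), pstdev(pooled)
        return {pid: scale(vals, mu, sigma) for pid, vals in by_patient.items()}
    if mode == "id":
        # Separate statistics for each patient ID.
        return {pid: scale(vals, mean(vals), pstdev(vals))
                for pid, vals in by_patient.items()}
    raise ValueError(f"Unknown mode: {mode!r}")

def roll(values, n_days):
    """Sliding windows of n_days consecutive days."""
    return [values[i:i + n_days] for i in range(len(values) - n_days + 1)]

global_scaled = normalise(records, "global")
id_scaled = normalise(records, "id")

# Scaling happens first; windows are built from the scaled values.
windows = {pid: roll(vals, n_days=2) for pid, vals in id_scaled.items()}
```

With global statistics, the two toy patients stay on different levels (all of `p1` below zero, all of `p2` above). With per-ID statistics, each patient is centred on their own mean, so differences in baseline between patients are removed.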

The dataset can then be used like any torch dataset:

In [19]:
train_dl = torchdata.DataLoader(
    tihm_torch,
    batch_size=100,
    shuffle=True,
    )

In [20]:
for x, y in train_dl:
    print(x.shape)

torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([100, 7, 28])
torch.Size([90, 7, 28])
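
The printed shapes can be read as (batch size, `n_days`, number of values per day), where the last dimension is presumably the feature count. The short calculation below recovers the number of training sequences from the batch sizes printed above: 16 full batches of 100 and a final batch of 90.

```python
import math

batch_size = 100
# First dimension of each shape printed above.
batch_sizes_seen = [100] * 16 + [90]

n_samples = sum(batch_sizes_seen)
n_batches = math.ceil(n_samples / batch_size)
# A remainder of 0 would mean the final batch is a full one.
last_batch = n_samples % batch_size or batch_size

print(n_samples, n_batches, last_batch)  # 1690 17 90
```

If a fixed batch size is required, `DataLoader` accepts `drop_last=True`, which would discard the final batch of 90 and leave 16 batches.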


The documentation for this dataset class is:

In [21]:
help(TIHMDataset)

Help on class TIHMDataset in module data_loader:

class TIHMDataset(torch.utils.data.dataset.Dataset)
 |  TIHMDataset(root: 'str' = './', train=True, imputer=SimpleImputer(), n_days: 'int' = 1, normalise: 'typing.Union[str, None]' = 'global')
 |  
 |  Method resolution order:
 |      TIHMDataset
 |      torch.utils.data.dataset.Dataset
 |      typing.Generic
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __getitem__(self, index: 'int')
 |  
 |  __init__(self, root: 'str' = './', train=True, imputer=SimpleImputer(), n_days: 'int' = 1, normalise: 'typing.Union[str, None]' = 'global')
 |      A pytorch dataset which wraps the TIHM data. If 
 |      the TIHM data is not downloaded, then it will be
 |      in the directory given.
 |      
 |      This can be used with dataloaders or on its own.     
 |      
 |      
 |      Examples
 |      ---------
 |      
 |      .. code-block::
 |      
 |          >>> dataset = TIHMDataset(root='./data/')
 |      
 |      Arguments
 