# Manipulating light exposure data with pyActigraphy

![Light manipulation](img/ryunosuke-kikuno-nJJo1vD9Xug-unsplash.jpg)

Photo by <a href="https://unsplash.com/@ryunosuke_kikuno?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Ryunosuke Kikuno</a> on <a href="https://unsplash.com/s/photos/light-bulb-repair?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>

## Disclaimer

The development of the pyActigraphy module for analysing light exposure data was led and financially supported by members of the Daylight Academy Project *The role of daylight for humans* (led by Mirjam Münch, Manuel Spitschan). The module is part of the Human Light Exposure Database. For more information about the project, please see
https://daylight.academy/projects/state-of-light-in-humans/.

## Introduction

Just like any recording, light exposure data recordings often require some preprocessing before analysis. These steps may include cleaning, resampling, filtering, etc.

The light exposure data analysis module of *pyActigraphy* allows users to perform many of these steps easily.

This tutorial presents how to:

* truncate or mask spurious light exposure periods;
* resample, binarize or filter light exposure data.

## Imports and input data

As usual, let's import the necessary packages:

In [None]:
import pyActigraphy

In [None]:
import pandas as pd

In [None]:
import plotly.graph_objects as go

In [None]:
import os

As in the [introduction tutorial](https://ghammad.github.io/pyActigraphy/pyLight-Intro.html), we will use as input data a sample file recorded by an ActTrust device (Condor Instrument), located in the test directory of the pyActigraphy package itself.

In [None]:
fpath = os.path.join(
    os.path.dirname(pyActigraphy.__file__),
    'tests', 'data',
    'test_sample_atr.txt'
)

In [None]:
raw = pyActigraphy.io.read_raw_atr(fpath)

## Truncation and data masking

The period of recording often exceeds the period during which the light recording device was actually worn by the participant: the recording might have started before the device was given to the participant, or the participant might have removed the device after a certain number of days while it was still recording. Or, you might simply want to analyse a fixed number of days for all your participants.
Additionally, the device might have been removed temporarily by the participant.

In any case, it is mandatory to truncate and/or mask these periods in the recording prior to analysis.

Let's first inspect our recording:

In [None]:
raw.light.data.head(1)

In [None]:
raw.light.data.tail(1)

The recording starts at 9:00 on the fictional date of the 1st of January 1918 and ends exactly 4 days later.
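
To make this kind of inspection more concrete, here is a minimal sketch that derives the start, stop and duration of a recording from its index using plain pandas. It does not read the sample file: the series below is synthetic, and its 1-min sampling period and values are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a light channel: 4 days sampled every minute (assumed).
index = pd.date_range("1918-01-01 09:00:00", periods=4 * 24 * 60, freq="min")
light = pd.Series(np.linspace(0.0, 100.0, len(index)), index=index, name="LIGHT")

start = light.index[0]
stop = light.index[-1]
epoch = light.index[1] - light.index[0]  # sampling period
duration = stop - start + epoch          # count the last epoch as well

print(start, stop, duration)
```

With a regular index, the same three quantities are usually enough to check whether a recording covers the period you intend to analyse.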

### Truncation

To truncate the recording, there are two ways:

1. Specify a start and stop datetime on the light data:

In [None]:
raw.light.start_time = '1918-01-02 09:00:00'

In [None]:
raw.light.stop_time = '1918-01-04 09:00:00'

In [None]:
raw.light.data.head(1)

In [None]:
raw.light.data.tail(1)

2. Specify a start datetime and a period when reading the input recording:

In [None]:
raw_trunc = pyActigraphy.io.read_raw_atr(
    fpath,
    start_time='1918-01-02 09:00:00',
    period='2D'  # restrict input data to 2 days
)

In [None]:
raw_trunc.light.data.head(1)

In [None]:
raw_trunc.light.data.tail(1)

Both methods are equivalent.
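
The idea behind this equivalence can be sketched with plain pandas on a synthetic series: selecting data between a start and a stop datetime gives the same result as selecting data from a start datetime over a given period. This is only an illustration of the principle, not pyActigraphy's implementation. In particular, the inclusive end point used by pandas label slicing is a pandas behaviour and may differ from what pyActigraphy does.

```python
import numpy as np
import pandas as pd

# Synthetic 4-day series sampled every minute (assumed for the example).
index = pd.date_range("1918-01-01 09:00:00", periods=4 * 24 * 60, freq="min")
light = pd.Series(np.arange(len(index), dtype=float), index=index)

start = pd.Timestamp("1918-01-02 09:00:00")

# Option 1: explicit start and stop datetimes.
stop = pd.Timestamp("1918-01-04 09:00:00")
trunc_start_stop = light.loc[start:stop]

# Option 2: start datetime plus a period.
period = pd.Timedelta("2D")
trunc_period = light.loc[start:start + period]

print(trunc_start_stop.equals(trunc_period))
```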

It is possible to reset these start and stop times at any moment:

In [None]:
raw.light.reset_times()

The start and stop times have been set to `None`: 

In [None]:
raw.light.start_time

In [None]:
raw.light.stop_time

### Masking

The 'machinery' to mask some periods of light data acquisition in the recording is similar to the one used for activity: the light data are masked dynamically, upon access. The underlying original data are kept intact and the mask can easily be turned on and off.

For more info, please see this [tutorial](https://ghammad.github.io/pyActigraphy/pyActigraphy-Masking.html).

Let's first visualize the white light channel ('LIGHT') as well as the activity:

In [None]:
layout = go.Layout(
    xaxis=dict(title="Date time"),
    yaxis=dict(title="Activity counts/period"),
    yaxis2=dict(title='Light intensity',overlaying='y',side='right'),
    showlegend=True
)

In [None]:
fig1 = go.Figure([
    go.Scatter(
        x=raw_trunc.data.index.astype(str),
        y=raw_trunc.data,
        name='Activity'),
    go.Scatter(
        x=raw_trunc.light.get_channel('LIGHT').index.astype(str),
        y=raw_trunc.light.get_channel('LIGHT'),
        yaxis='y2', opacity=0.5,
        name='Light')
], layout=layout)

In [None]:
fig1.show()

First, create a simple mask (i.e. a series of 1s, meant to be edited further by the user):

In [None]:
raw.light.create_light_mask()

To simply mask a period of light data acquisition on all channels:

In [None]:
raw.light.add_light_mask_period(
    start='1918-01-03 06:00:00',
    stop='1918-01-03 10:00:00'
)

It is also possible to apply such a mask period to a specific channel:

In [None]:
raw.light.get_channel_list()

In [None]:
raw.light.add_light_mask_period(
    start='1918-01-03 05:00:00',
    stop='1918-01-03 11:00:00',
    channel='RED LIGHT'
)

However, as long as the `apply_mask` boolean is not set to True, the data are not actually masked:

In [None]:
raw.light.apply_mask

In [None]:
raw.light.get_channel(
    'LIGHT'
)['1918-01-03 05:58:00':'1918-01-03 10:02:00']

In [None]:
raw.light.get_channel(
    'RED LIGHT'
)['1918-01-03 04:58:00':'1918-01-03 11:02:00']

In [None]:
raw.light.apply_mask = True

In [None]:
raw.light.get_channel(
    'LIGHT'
)['1918-01-03 05:58:00':'1918-01-03 10:02:00']

In [None]:
raw.light.get_channel(
    'RED LIGHT'
)['1918-01-03 04:58:00':'1918-01-03 11:02:00']

The masked data have been replaced with NaN, *on the fly*.
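
For readers curious about how masking upon access can work, here is a minimal pandas sketch of the principle. It is not pyActigraphy's implementation: the `get_data` helper, the synthetic values and the 1-min sampling period are all hypothetical, chosen only to mirror the mask periods used above.

```python
import pandas as pd

# Synthetic two-channel recording, one sample per minute (assumed).
index = pd.date_range("1918-01-03 04:00:00", periods=8 * 60, freq="min")
data = pd.DataFrame({"LIGHT": 10.0, "RED LIGHT": 5.0}, index=index)

# Mask of 1s (keep) for every channel, then set the periods to hide to 0.
mask = pd.DataFrame(1, index=data.index, columns=data.columns)
mask.loc["1918-01-03 06:00:00":"1918-01-03 10:00:00", :] = 0            # all channels
mask.loc["1918-01-03 05:00:00":"1918-01-03 11:00:00", "RED LIGHT"] = 0  # one channel


def get_data(apply_mask):
    """Hypothetical accessor: return the data, masked only when requested."""
    return data.where(mask > 0) if apply_mask else data


masked = get_data(apply_mask=True)
print(masked.loc["1918-01-03 05:58:00":"1918-01-03 06:01:00"])
```

Because the mask is applied when the data are read, the original values in `data` are never modified and switching the mask off simply returns them unchanged.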

To visualize this, let's superimpose the light exposure data and the mask:

In [None]:
layout = go.Layout(
    xaxis=dict(title="Date time"),
    yaxis=dict(title="Light intensity"),
    yaxis2=dict(title='Mask',overlaying='y',side='right'),
    showlegend=True
)

In [None]:
fig2 = go.Figure([
    go.Scatter(
        x=raw.light.get_channel('LIGHT').index.astype(str),
        y=raw.light.get_channel('LIGHT'),
        name='Light'),
    go.Scatter(
        x=raw.light.mask.index.astype(str),
        y=raw.light.mask,
        yaxis='y2', opacity=0.5,
        name='Mask')
], layout=layout)

In [None]:
fig2.show()

## Data resampling & binarization

In order to analyse the light exposure data, it is sometimes more convenient to work with data resampled at a lower frequency than the acquisition frequency. It can also be useful to binarize the light exposure data: light data are replaced with '1' if they are above a certain threshold and with '0' otherwise.

Both functionalities are readily available within the light exposure data analysis module of *pyActigraphy*.

### Resampling

The `resampled_data` function resamples the light data, for example to a 5-min period. Its docstring lists the available options:

In [None]:
help(raw.light.resampled_data)

It is possible to specify the aggregation function used during the resampling. By default, the data are summed over each resampled period. However, it is possible to use a mean instead, here with a 5-min resampling period:

In [None]:
raw.light.resampled_data(rsfreq='5min', agg='mean').head(5)

The resulting time series now has an index with an epoch length of 5 min.
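
To see how the choice of aggregation function changes the result, here is a small pandas-only sketch on synthetic data (10 samples, one per minute, which is an assumption made for the example). It uses `pandas.Series.resample` directly rather than the pyActigraphy function.

```python
import numpy as np
import pandas as pd

# Ten 1-min samples with values 0, 1, ..., 9.
index = pd.date_range("1918-01-01 09:00:00", periods=10, freq="min")
light = pd.Series(np.arange(10, dtype=float), index=index)

summed = light.resample("5min").sum()     # 09:00 -> 10.0, 09:05 -> 35.0
averaged = light.resample("5min").mean()  # 09:00 -> 2.0,  09:05 -> 7.0

print(summed)
print(averaged)
```

A sum scales with the number of samples per bin, whereas a mean stays on the scale of the original measurements, which is often what is wanted for light intensities.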

### Binarization

In [None]:
help(raw.light.binarized_data)

In [None]:
raw.light.data.head(5)

To binarize data with a specific threshold:

In [None]:
raw.light.binarized_data(threshold=2.5).head(5)

As expected, light data timepoints with a value below 2.5 are set to '0' and to '1' otherwise.
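
The thresholding itself can be sketched in one line of pandas. The values below are synthetic and deliberately avoid the threshold itself, since how pyActigraphy treats values exactly equal to the threshold is not shown here. The strict comparison used in this sketch is therefore an assumption.

```python
import pandas as pd

light = pd.Series([0.0, 1.2, 2.4, 2.6, 100.0])
threshold = 2.5

# 1 where the value is above the threshold, 0 otherwise (assumed convention).
binarized = (light > threshold).astype(int)

print(binarized.tolist())
```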

### Resampling and binarization at the same time

With the binarization function, it is also possible to resample the light exposure data before binarization.

To resample to a 5-min period and then binarize the data:

In [None]:
raw.light.binarized_data(
    threshold=2.5,
    rsfreq='5min',
    agg='mean'
).head(5)
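
The order of the two operations matters. As a rough illustration, the pandas-only sketch below compares resampling followed by binarization (the order described above) with the reverse order, on a synthetic series containing a single bright sample. The data and the strict comparison with the threshold are assumptions made for the example.

```python
import pandas as pd

# Ten 1-min samples: a dark period with one bright sample, then dim light.
index = pd.date_range("1918-01-01 09:00:00", periods=10, freq="min")
light = pd.Series([0, 0, 0, 0, 20, 3, 3, 3, 3, 3], index=index, dtype=float)
threshold = 2.5

# Resample (mean over 5 min), then binarize.
resampled_then_binarized = (
    light.resample("5min").mean() > threshold
).astype(int)

# Binarize, then resample: yields the fraction of epochs above threshold.
binarized_then_resampled = (
    (light > threshold).astype(int).resample("5min").mean()
)

print(resampled_then_binarized.tolist())
print(binarized_then_resampled.tolist())
```

In the first 5-min bin, a single bright sample is enough to push the mean above the threshold, whereas binarizing first shows that only one epoch out of five was actually above it.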

## Data filtering

Instead of resampling the data, it is possible to retain the original sampling frequency while getting rid of the high-frequency fluctuations by simply filtering the data.

The light exposure data analysis module of *pyActigraphy* provides an easy way to instantiate and apply a Butterworth filter to the light data. This is essentially a wrapper around SciPy's [scipy.signal.butter](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.butter.html) function.

In [None]:
raw.light.apply_mask = False

In [None]:
help(raw.light.filter_butterworth)

To filter the light exposure data with a 4th order low-pass Butterworth filter, with a cut-off frequency set to a 100th of the original sampling frequency:

In [None]:
wlight_filtered = raw.light.filter_butterworth(
    fc_low=None,
    fc_high=(1/30)/100,
    N=4,
    channels=['LIGHT']
)

In [None]:
fig3 = go.Figure(
    data=[
        go.Scatter(
            x=raw.light.get_channel('LIGHT').index.astype(str),
            y=raw.light.get_channel('LIGHT'),name='Raw light data'),
        go.Scatter(
            x=wlight_filtered.index.astype(str),
            y=wlight_filtered.loc[:,'LIGHT'],
            name='Filtered data'
        ),
    ]
)

In [None]:
fig3.show()
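
For reference, a comparable low-pass filter can be sketched directly with SciPy on synthetic data. This is not the code run by `filter_butterworth`: the 30-s sampling period, the synthetic signal and the use of zero-phase filtering (`sosfiltfilt`) are assumptions made for this example.

```python
import numpy as np
from scipy import signal

fs = 1 / 30   # sampling frequency in Hz (one sample every 30 s, assumed)
fc = fs / 100 # cut-off frequency: a 100th of the sampling frequency
order = 4

# Synthetic signal: a slow 24-h oscillation plus fast random fluctuations.
rng = np.random.default_rng(0)
t = np.arange(0, 2 * 24 * 3600, 30)  # 2 days, in seconds
slow = 100 * (1 + np.sin(2 * np.pi * t / (24 * 3600)))
noisy = slow + rng.normal(scale=20, size=t.size)

# Design the filter as second-order sections and apply it forward and
# backward, so that the filtered signal is not shifted in time.
sos = signal.butter(order, fc, btype="lowpass", fs=fs, output="sos")
filtered = signal.sosfiltfilt(sos, noisy)

print(np.std(noisy - slow), np.std(filtered - slow))
```

The printed values compare the spread of the residuals around the slow oscillation before and after filtering.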

Et voilà! For now...