# Read GT3X Files with PAAT

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/trybnetic/paat/HEAD?urlpath=%2Fdoc%2Ftree%2Fdocs%2Fsource%2Fexample_notebooks%2Fread_gt3x_files.ipynb)

In this tutorial, we show how to use PAAT to load GT3X files that you downloaded from ActiLife. We also show a few tricks that are useful when working with study data, for example when you have a large number of GT3X files to process.

## Import PAAT

In the first step, we import PAAT:

In [1]:
%%capture
import os
import glob2

import paat

## Load a single GT3X file

To load a single GT3X file, PAAT provides the `read_gt3x()` function, which returns a [Pandas](https://pandas.pydata.org/) DataFrame and an integer with the sampling frequency at which the data was recorded. The sampling frequency is relevant for several methods, but depending on your use case, you can also ignore it. It can also be calculated later from the DataFrame's index, which holds the timestamps at which the data was recorded. In the following, we load a GT3X file containing a 10-minute recording made for testing the package. As the following cell shows, the recording started on 3 January 2022 at 10:20 and stopped ten minutes later at 10:30, with the last data point recorded 10 ms before that (the data was recorded at 100 Hz, that is, one sample every 10 ms).

In [2]:
data, sample_freq = paat.read_gt3x("data/10min_recording.gt3x")
data

| | X | Y | Z |
|---|---|---|---|
| 2022-01-03 10:20:00.000 | 0.804688 | 0.621094 | 0.085938 |
| 2022-01-03 10:20:00.010 | 0.804688 | 0.597656 | 0.085938 |
| 2022-01-03 10:20:00.020 | 0.804688 | 0.585938 | 0.078125 |
| 2022-01-03 10:20:00.030 | 0.804688 | 0.582031 | 0.074219 |
| 2022-01-03 10:20:00.040 | 0.800781 | 0.585938 | 0.074219 |
| ... | ... | ... | ... |
| 2022-01-03 10:29:59.950 | 0.289062 | 0.960938 | -0.050781 |
| 2022-01-03 10:29:59.960 | 0.289062 | 0.960938 | -0.054688 |
| 2022-01-03 10:29:59.970 | 0.285156 | 0.957031 | -0.054688 |
| 2022-01-03 10:29:59.980 | 0.289062 | 0.957031 | -0.054688 |
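
The remark that the sampling frequency can be recovered from the index can be sketched as follows. This is a minimal illustration, not part of PAAT: it builds a small stand-in DataFrame with the same kind of timestamp index (the `date_range` call and the all-zero columns are assumptions made for the example) and estimates the frequency from the median spacing between consecutive timestamps.

```python
import pandas as pd

# Stand-in for a loaded recording (assumption): one second of data at 100 Hz
# with a timestamp index like the one shown above.
index = pd.date_range("2022-01-03 10:20:00", periods=100, freq="10ms")
data = pd.DataFrame({"X": 0.0, "Y": 0.0, "Z": 0.0}, index=index)

# Median time difference between consecutive samples
sample_period = data.index.to_series().diff().median()

# Samples per second, rounded to the nearest integer
estimated_freq = round(pd.Timedelta(seconds=1) / sample_period)
print(estimated_freq)
```

Using the median rather than the mean keeps the estimate stable if a recording contains occasional gaps.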


## Load multiple GT3X files

When working with GT3X files, you rarely deal with just one file. Most often, there are dozens, hundreds, or even thousands of GT3X files to process. A programming language like Python makes it easy to process them all in a batch. To do so, first create a list with all the file paths.

If you are lucky, all files are already in one directory. In this case, you can use the `os` module to list all files in that directory:

In [3]:
base_path = "data/"

gt3x_files = [os.path.join(base_path, file_path) for file_path in os.listdir(base_path)]
gt3x_files

['data/10min_recording.gt3x', 'data/nwt_recording.gt3x']

If the directory does not contain only GT3X files, or you are unsure whether it does, you can restrict the list to files that end with ".gt3x":

In [4]:
gt3x_files = [os.path.join(base_path, file_path) for file_path in os.listdir(base_path) if file_path.endswith(".gt3x")]
gt3x_files

['data/10min_recording.gt3x', 'data/nwt_recording.gt3x']

If your files are stored in multiple subdirectories, you can use the `glob2` library to find all GT3X files in the subdirectories:

In [5]:
gt3x_files = glob2.glob(os.path.join('**', '*.gt3x'))
gt3x_files

['data/10min_recording.gt3x', 'data/nwt_recording.gt3x']
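
As an alternative that needs only the standard library, the same recursive search can be sketched with `pathlib`. The example below is self-contained: it creates a temporary directory tree with hypothetical file names (`site_a`, `p01.gt3x`, and so on are made up for illustration) and then collects every `.gt3x` file below the root.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Hypothetical study layout: recordings spread over nested subdirectories
    (root / "site_a").mkdir()
    (root / "site_b" / "wave_1").mkdir(parents=True)
    (root / "site_a" / "p01.gt3x").touch()
    (root / "site_b" / "wave_1" / "p02.gt3x").touch()
    (root / "site_b" / "notes.txt").touch()  # not a GT3X file, must be skipped

    # rglob() searches the root and all of its subdirectories
    gt3x_files = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.gt3x"))

print(gt3x_files)
```

Sorting the result gives a reproducible processing order, which helps when comparing runs.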

Once you have a list of all GT3X files you want to process, you can iterate over it and define how to process each file. Note that the following example only loads the files and does not process them further:

In [6]:
for file_path in gt3x_files:
    # Load the file of the current iteration, not a hard-coded path
    data, sample_freq = paat.read_gt3x(file_path)
    print(f"Loaded DataFrame from {file_path} with shape {data.shape[0]}x{data.shape[1]} "
          f"sampled at {sample_freq}hz.")

Loaded DataFrame from data/10min_recording.gt3x with shape 60000x3 sampled at 100hz.
Loaded DataFrame from data/nwt_recording.gt3x with shape 60000x3 sampled at 100hz.
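
With hundreds of files, a single unreadable file should not abort the whole batch. The following is a minimal sketch of one way to handle that, not something PAAT provides: `load_all` and `fake_reader` are hypothetical helpers written for this example. In real use you would pass `paat.read_gt3x` as the `reader` argument instead of the stand-in.

```python
import os


def load_all(file_paths, reader):
    """Load every file with `reader`, collecting failures instead of stopping."""
    loaded, failed = {}, {}
    for file_path in file_paths:
        try:
            # Key the results by file name so they are easy to look up later
            loaded[os.path.basename(file_path)] = reader(file_path)
        except Exception as error:
            failed[file_path] = error
    return loaded, failed


def fake_reader(file_path):
    """Stand-in reader (assumption): mimics a (data, sample_freq) return value."""
    if "broken" in file_path:
        raise ValueError("could not parse file")
    return ("dataframe placeholder", 100)


loaded, failed = load_all(["data/a.gt3x", "data/broken.gt3x"], fake_reader)
print(sorted(loaded), sorted(failed))
```

Keeping the failed paths together with their errors makes it straightforward to inspect or retry those files after the batch has finished.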
