In [None]:
%pip install --upgrade firebase-admin
%pip install pandas
%pip install scipy
%pip install numpy
%pip install seaborn
%pip install matplotlib
%pip install statsmodels
%pip install scikit-learn
%pip install shap

# ***Note:*** 

This notebook is optional. The developers of this notebook series have already provided a sample of the haptic touch dataset used in the data analysis and machine learning stages of the thesis, *Integrating Haptic Touch on Android Devices for Emotion Recognition in an Emotionally Stimulated Environment*. 

# Welcome to Data Collection From Firebase

This notebook shows how to retrieve the data from the GetEmotion application and convert it into an organized, readable DataFrame. At the end of the notebook, we export the dataset for use in the Data Preprocessing stage.

The helper function below displays a DataFrame, limiting the output to 20 rows while showing every column.

In [None]:
def print_df(df):
    # Print the DataFrame
    with pd.option_context('display.max_rows', 20, 'display.max_columns', None): 
        display(df)

## Firebase Firestore Import

In [None]:
import firebase_admin
from firebase_admin import credentials
from firebase_admin import firestore
import pandas as pd

First, we need access to our Firebase Firestore database. To get it, we need a JSON key file for the **Firebase Admin SDK**, which is downloaded from your project in Firebase. Follow these instructions to download the file:

1. Go to your project in [Firebase](https://console.firebase.google.com/).
2. Go to **Project Settings**
3. Click **Service Accounts**
4. Under **Firebase Admin SDK**, make sure that you have selected *Python* as the *Admin SDK configuration snippet*.
5. Click **Generate new private key**.
6. Move the private key to this folder.

Store the filename of the downloaded private key in `key`.

In [None]:
key = ""
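Before passing `key` to the SDK, it can help to confirm that the file exists and looks like a service-account key. The sketch below uses only the standard library. `check_service_account_key` is a hypothetical helper, not part of `firebase_admin`, and the list of required fields is an assumption based on the usual layout of a downloaded key file.

```python
import json
import os

# Fields normally present in a downloaded service-account key (assumed layout).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path):
    """Return a list of problems found with the key file (empty if none)."""
    if not path:
        return ["no filename given; set `key` to the downloaded JSON file"]
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as fh:
            content = json.load(fh)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(content, dict):
        return ["top-level JSON value is not an object"]
    # Report every expected field that is absent.
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in content
    ]
    # A present but unexpected `type` suggests the wrong kind of credential file.
    if content.get("type") not in (None, "service_account"):
        problems.append("field 'type' is not 'service_account'")
    return problems
```

Calling `check_service_account_key(key)` returns an empty list when the file passes these checks. Any other result describes what to fix before continuing.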

Call `credentials.Certificate()` with `key` as its argument to load the credentials, and store the result in `cred`.

In [None]:
cred = credentials.Certificate(key)

Then call `firebase_admin.initialize_app()` and pass `cred` as its parameter.

In [None]:
firebase_admin.initialize_app(cred)

You can now call `firestore.client()` and store it in a variable called `db`.

In [None]:
db = firestore.client()

## Creating a DataFrame from Firebase Firestore

First, get the collection of interactions by calling `db.collection('Interactions')` and store the result in `collection_ref`.

In [None]:
collection_ref = db.collection('Interactions')

Then collect the documents by calling `collection_ref.get()`, and store them in `docs`.

In [None]:
docs = collection_ref.get()

Initialize an empty list called `data_list` to hold the data from each document.

In [None]:
data_list = []

The features *acceleration*, *velocity*, and *coordinates* all have nested structures. Because their values are central to our data analysis and machine learning stages, they should be split into atomic key-value pairs.

To do this, we flatten each nested feature into multiple columns. **Run the code below** to create one column per key-value pair.

In [None]:
for doc in docs:
    data = doc.to_dict()

    # Flatten acceleration data. The loop variable is named `axis` so that it
    # does not overwrite `key`, which holds the private key filename.
    acceleration_data = data.get('acceleration')
    if isinstance(acceleration_data, dict):
        for axis, value in acceleration_data.items():
            data[f'acceleration_{axis}_min'] = value.get('min')
            data[f'acceleration_{axis}_mean'] = value.get('mean')
            data[f'acceleration_{axis}_max'] = value.get('max')

    # Flatten velocity data
    velocity_data = data.get('velocity')
    if isinstance(velocity_data, dict):
        for axis, value in velocity_data.items():
            data[f'velocity_{axis}_min'] = value.get('min')
            data[f'velocity_{axis}_mean'] = value.get('mean')
            data[f'velocity_{axis}_max'] = value.get('max')

    # Flatten coordinates data (only the start and end points are kept)
    coordinates_data = data.get('coordinates')
    if isinstance(coordinates_data, dict):
        for point, value in coordinates_data.items():
            if point in ['start', 'end']:
                for sub_key, sub_value in value.items():
                    data[f'coordinates_{point}_{sub_key}'] = sub_value

    data_list.append(data)
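To see what the flattening produces without connecting to Firestore, the sketch below applies the same naming scheme to a single made-up document. `flatten_interaction` is a hypothetical helper that restructures the loop above into a function. The sample document, including the `emotion` field and the `x`/`y` keys, is an assumption about the data layout rather than the actual schema.

```python
def flatten_interaction(data):
    """Return a copy of one document dict with nested features flattened."""
    flat = dict(data)

    # acceleration / velocity: one min/mean/max triple per nested key
    for feature in ("acceleration", "velocity"):
        nested = data.get(feature)
        if isinstance(nested, dict):
            for axis, stats in nested.items():
                for stat in ("min", "mean", "max"):
                    flat[f"{feature}_{axis}_{stat}"] = stats.get(stat)

    # coordinates: only the start and end points are kept
    coordinates = data.get("coordinates")
    if isinstance(coordinates, dict):
        for point in ("start", "end"):
            for sub_key, sub_value in coordinates.get(point, {}).items():
                flat[f"coordinates_{point}_{sub_key}"] = sub_value

    return flat


# Hypothetical document; the real field names and axes may differ.
sample = {
    "emotion": "happy",
    "acceleration": {"x": {"min": 0.1, "mean": 0.5, "max": 0.9}},
    "velocity": {"x": {"min": 1.0, "mean": 2.0, "max": 3.0}},
    "coordinates": {"start": {"x": 10, "y": 20}, "end": {"x": 30, "y": 40}},
}

flat = flatten_interaction(sample)
# List only the newly created flattened column names.
print(sorted(name for name in flat if "_" in name))
```

The original nested entries (`acceleration`, `velocity`, `coordinates`) are still present in `flat`, which is why they are dropped from the DataFrame in a later step.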

Once done, instantiate a pandas DataFrame by calling `pd.DataFrame()`, passing `data_list` as its parameter. Store the result to `df`.

In [None]:
df = pd.DataFrame(data_list)
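When `pd.DataFrame()` receives a list of dictionaries, it creates one column per distinct key, and a record that lacks a key gets a missing value (`NaN`) in that column. As a rough way to spot such gaps before building the DataFrame, the sketch below reports, in plain Python, which keys each record lacks. `missing_keys` and the sample records are hypothetical.

```python
def missing_keys(records):
    """Map record index -> sorted list of keys found elsewhere but absent here."""
    all_keys = set()
    for record in records:
        all_keys.update(record)

    report = {}
    for index, record in enumerate(records):
        absent = sorted(all_keys - set(record))
        if absent:
            report[index] = absent
    return report


# Hypothetical flattened records; the second one has no velocity value.
records = [
    {"emotion": "happy", "velocity_x_mean": 2.0},
    {"emotion": "sad"},
]
print(missing_keys(records))
```

In this notebook, the same check could be run as `missing_keys(data_list)`.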

You can now inspect the dataset stored in the DataFrame.

In [None]:
print_df(df)

Remove the now-redundant columns `velocity`, `coordinates`, and `acceleration`, since their values have already been flattened into separate columns.

In [None]:
drop_columns = ['velocity', 'coordinates', 'acceleration']
df.drop(columns=drop_columns, inplace=True)

## Exporting the dataset for the next stage!

To export the dataset, use `df.to_csv()` to write the DataFrame to a CSV file, with the following parameters:

- *path_or_buf*: `'collected-haptic-dataset.csv'` (the filename)
- *index*: `False`

In [None]:
df.to_csv("collected-haptic-dataset.csv", index=False)
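As an optional sanity check after exporting, the file can be read back to confirm its header and row count. The sketch below uses the standard-library `csv` module rather than pandas. `summarize_csv` is a hypothetical helper, and it assumes the file is UTF-8 encoded with the column names on the first row.

```python
import csv


def summarize_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count
```

For this notebook, `summarize_csv("collected-haptic-dataset.csv")` should return the same column names as `list(df.columns)` and a row count equal to `len(df)`.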