# MongoDB Handling

After installing the MongoDB server on your machine, you can use this notebook to set up the database and load the initial data.

Specifically, this step uses Python's `pymongo` library to interact with the MongoDB server.

**Important Note: Be sure that the MongoDB server is up and running as a service in the background.**

For example, on macOS, to run MongoDB (i.e. the `mongod` process) as a service, run:

* `brew services start mongodb-community`

To stop a `mongod` process that is running as a macOS service, run:

* `brew services stop mongodb-community`
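
Before running the cells below, a quick reachability check can save a confusing timeout later. The following is a minimal sketch that uses only the standard library. The helper name `port_is_open` is hypothetical, and the URI is assumed to be the default local one used in this notebook's configuration. It only confirms that something accepts TCP connections on that host and port, not that the listener is actually a MongoDB server.

```python
import socket
from urllib.parse import urlsplit


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


# Assumed default URI; replace it with the value from your own config.yml
uri = "mongodb://localhost:27017/"
parts = urlsplit(uri)
host, port = parts.hostname, parts.port or 27017

if port_is_open(host, port):
    print(f"Something is listening on {host}:{port}")
else:
    print(f"Nothing reachable on {host}:{port}; is the mongod service started?")
```

If the check fails, starting the service with the command shown above is the first thing to try.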

To install MongoDB on your system, follow the instructions here:

* https://www.mongodb.com/docs/manual/administration/install-community/


**Note:** You can modify any of the processes below, but you have to explain your reasoning.

In [131]:
# import library for various processes with the OS
import os

## Load configuration

In [132]:
# import library for yaml handling
import yaml

In [133]:
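# Build the path to config.yml, expected in the current working directory
config_path = os.path.join(os.getcwd(), "config.yml")

# safe_load parses plain YAML without constructing arbitrary Python objects
with open(config_path) as file:
    config = yaml.safe_load(file)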

## MongoDB database instantiation

The relevant information for the MongoDB client connection, the database name, and collection name is located in the configuration file.

```
# DB Connection with the uri (host)
client: "mongodb://localhost:27017/"

# db name
db: "aiot_course"

# db collection
col: "NAME YOUR COLLECTION"
```
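
Because every later cell reads these three keys, it can help to fail early when one of them is missing. The sketch below is one possible check, not part of the course material: the helper name `check_config` and the example collection name `"gestures"` are hypothetical, and the URI prefix test is only a rough sanity check.

```python
REQUIRED_KEYS = ("client", "db", "col")


def check_config(config):
    """Raise ValueError if a required key is missing or empty; return config otherwise."""
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config.yml is missing values for: {', '.join(missing)}")
    # Rough sanity check on the connection string scheme
    if not str(config["client"]).startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("'client' should be a MongoDB connection URI")
    return config


# Hypothetical configuration mirroring the structure of config.yml
example = {"client": "mongodb://localhost:27017/", "db": "aiot_course", "col": "gestures"}
check_config(example)
```

Calling `check_config(config)` right after loading the file would surface a forgotten collection name before any connection is attempted.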

In [134]:
# import library for handling the MongoDB client
import pymongo
# import library for retrieving datetime
from datetime import datetime

### Create the database

To create a database in MongoDB, start by creating a `MongoClient` object with a connection URL that contains the correct IP address, then select the database you want to create by name.

MongoDB creates the database on the first write if it does not already exist.

In [135]:
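# Override the values loaded from config.yml with this project's settings
config = {
    "client": "mongodb://localhost:27017/",
    "db": "AIoT_project",
    "col": "AIoT_project"
}

# Create the client for the MongoDB server at the configured URI
client = pymongo.MongoClient(config["client"])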

In [136]:
db = client[config["db"]]

### Instantiate the collection

To create a collection in MongoDB, use the database object and specify the name of the collection you want to create.

MongoDB creates the collection on the first insert if it does not already exist.

In [137]:
col = db[config["col"]]

Note that the collection will not appear in MongoDB until the first document is inserted.
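
This lazy behaviour can be observed directly. The helper below is a small sketch with a hypothetical name, `collection_exists`. It assumes `db` is the `pymongo` database object created above and relies on its `list_collection_names()` method.

```python
def collection_exists(db, name):
    """Return True if a collection called `name` is already stored in `db`."""
    # list_collection_names() only reports collections that hold data or indexes
    return name in db.list_collection_names()
```

With the objects from this notebook, `collection_exists(db, config["col"])` should return `False` on a fresh database and `True` once at least one document has been inserted.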

## Create the data collection

This section uploads the gathered data to the MongoDB collection. The data directory structure should be as follows:

```
.
└── data/
    ├── class_A/
    │   ├── data_A_01.csv
    │   ├── data_A_02.csv
    │   └── ...
    ├── class_B/
    │   ├── data_B_01.csv
    │   ├── data_B_02.csv
    │   └── ...
    └── class ...
```
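
A layout like this can be walked so that each CSV file is paired with the name of its class folder. The sketch below is one way to do it with `pathlib`. The helper names `natural_key` and `iter_labelled_files` are hypothetical, and the natural sort is an added convenience (so that file 2 comes before file 10), not a requirement of the assignment.

```python
import re
from pathlib import Path


def natural_key(name):
    """Sort key that orders 'data_A_2.csv' before 'data_A_10.csv'."""
    # Splitting on digit runs alternates text and number parts
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def iter_labelled_files(data_dir):
    """Yield (label, csv_path) pairs, one per CSV file, in a stable order."""
    data_dir = Path(data_dir)
    class_dirs = sorted((p for p in data_dir.iterdir() if p.is_dir()),
                        key=lambda p: natural_key(p.name))
    for class_dir in class_dirs:
        csv_files = sorted(class_dir.glob("*.csv"),
                           key=lambda p: natural_key(p.name))
        for csv_path in csv_files:
            # The folder name doubles as the class label
            yield class_dir.name, csv_path
```

Iterating with `for label, path in iter_labelled_files("data"):` then gives the label and file path needed to build one document per recording.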

In [138]:
# import library for handling the csv data and transformations
import pandas as pd
import json

Get data path:

In [139]:
data_path = os.path.join(os.getcwd(), "data")
print(data_path)

g:\Other computers\My Computer\8 Εξάμηνο\Αλγοριθμικές Θεμελιώσεις Δικτύων Αισθητήρων\Human Gesture Recognition Project\data


List all files in a path:

In [140]:
classes_folders_list = [f for f in os.listdir(data_path) if os.path.isdir(os.path.join(data_path, f))]
print(classes_folders_list)

['class_A', 'class_B']


In [141]:
# print files in folder
folder_path = os.path.join(data_path, classes_folders_list[0])
files_in_folder = [f for f in os.listdir(folder_path) if os.path.isfile(os.path.join(folder_path, f))]
print(files_in_folder)

['data_A_1.csv', 'data_A_2.csv', 'data_A_3.csv', 'data_A_4.csv', 'data_A_5.csv', 'data_A_6.csv', 'data_A_7.csv', 'data_A_8.csv', 'data_A_9.csv', 'data_A_10.csv', 'data_A_11.csv', 'data_A_12.csv', 'data_A_13.csv', 'data_A_14.csv', 'data_A_15.csv', 'data_A_16.csv', 'data_A_17.csv', 'data_A_18.csv', 'data_A_19.csv', 'data_A_20.csv', 'data_A_21.csv', 'data_A_22.csv', 'data_A_23.csv', 'data_A_24.csv', 'data_A_25.csv', 'data_A_26.csv', 'data_A_27.csv', 'data_A_28.csv', 'data_A_29.csv', 'data_A_30.csv', 'data_A_31.csv', 'data_A_32.csv', 'data_A_33.csv', 'data_A_34.csv', 'data_A_35.csv', 'data_A_36.csv', 'data_A_37.csv', 'data_A_38.csv', 'data_A_39.csv', 'data_A_40.csv', 'data_A_41.csv', 'data_A_42.csv', 'data_A_43.csv', 'data_A_44.csv', 'data_A_45.csv', 'data_A_46.csv', 'data_A_47.csv', 'data_A_48.csv', 'data_A_49.csv', 'data_A_50.csv', 'data_A_51.csv', 'data_A_52.csv', 'data_A_53.csv', 'data_A_54.csv', 'data_A_55.csv', 'data_A_56.csv', 'data_A_57.csv', 'data_A_58.csv', 'data_A_59.csv', 'data

Each document in the MongoDB database should have the following schema:

```json
{
  "data": {
    "acc_x": ["array", "of", "values"],
    "acc_y": ["array", "of", "values"],
    "acc_z": ["array", "of", "values"]
  },
  "label": "The label of the instance",
  "datetime": "MongoDB datetime object (it can be generated with the datetime.datetime.now() function)"
}
```

If you are using the gyroscope, or both the accelerometer and the gyroscope, name and order the sensor keys as follows:

* for gyroscope: `gyr_x`, `gyr_y`, `gyr_z` for the three axes
* for accelerometer and gyroscope: `acc_x`, `acc_y`, `acc_z`, `gyr_x`, `gyr_y`, `gyr_z` for the six axes

**Note: Every document must follow the schema above. The later steps (data engineering, plotting, etc.) depend on it.**
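
Since later steps depend on this schema, a document can be checked before it is inserted. The following is a minimal sketch, not an official validator: the function name `schema_problems` is hypothetical, and treating `label` as a string is an assumption based on the schema description above.

```python
from datetime import datetime

ACC_KEYS = ["acc_x", "acc_y", "acc_z"]
GYR_KEYS = ["gyr_x", "gyr_y", "gyr_z"]
# The three key layouts described above, in the required order
ALLOWED_KEY_ORDERS = [ACC_KEYS, GYR_KEYS, ACC_KEYS + GYR_KEYS]


def schema_problems(doc):
    """Return a list of human-readable schema problems (empty if none found)."""
    problems = []
    for key in ("data", "label", "datetime"):
        if key not in doc:
            problems.append(f"missing top-level key '{key}'")
    data = doc.get("data")
    if "data" in doc and not isinstance(data, dict):
        problems.append("'data' should map sensor keys to arrays of values")
    elif isinstance(data, dict) and list(data) not in ALLOWED_KEY_ORDERS:
        problems.append(f"unexpected sensor keys or order: {list(data)}")
    # Assumption: labels are plain strings such as the class folder name
    if "label" in doc and not isinstance(doc["label"], str):
        problems.append("'label' should be a string")
    if "datetime" in doc and not isinstance(doc["datetime"], datetime):
        problems.append("'datetime' should be a datetime.datetime object")
    return problems


# Illustrative document with made-up values
example = {
    "data": {"acc_x": [0.01, 0.02], "acc_y": [0.98, 0.97], "acc_z": [0.10, 0.11]},
    "label": "class_A",
    "datetime": datetime.now(),
}
print(schema_problems(example))
```

An empty list means no problem was found by these particular checks; it does not guarantee that the values themselves are meaningful.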

In [142]:
from utils import df_rebase

## Provide the code to upload the data to MongoDB

In [143]:
def upload_data_to_mongodb(data_path, collection):
    """Insert one document per CSV file found under data_path/<class_folder>/."""
    # Each sub-folder of data_path is a class; its name becomes the label
    classes_folders_list = [f for f in os.listdir(data_path) if os.path.isdir(os.path.join(data_path, f))]

    for class_folder in classes_folders_list:
        folder_path = os.path.join(data_path, class_folder)
        files_in_folder = [f for f in os.listdir(folder_path) if os.path.isfile(os.path.join(folder_path, f))]

        for file_name in files_in_folder:
            file_path = os.path.join(folder_path, file_name)
            df = pd.read_csv(file_path)

            # Build the document following the required schema:
            # CSV columns carry units, schema keys do not
            data = {
                "data": {
                    "acc_x": df['acc_x (g)'].tolist(),
                    "acc_y": df['acc_y (g)'].tolist(),
                    "acc_z": df['acc_z (g)'].tolist(),
                    "gyr_x": df['gyr_x (deg/s)'].tolist(),
                    "gyr_y": df['gyr_y (deg/s)'].tolist(),
                    "gyr_z": df['gyr_z (deg/s)'].tolist(),
                },
                "label": class_folder,
                "datetime": datetime.now()
            }

            # One document per recording file
            collection.insert_one(data)
            print(f"Uploaded {file_name} from {class_folder}")

# Run the function to upload data
upload_data_to_mongodb(data_path, col)

Uploaded data_A_1.csv from class_A
Uploaded data_A_2.csv from class_A
Uploaded data_A_3.csv from class_A
Uploaded data_A_4.csv from class_A
Uploaded data_A_5.csv from class_A
Uploaded data_A_6.csv from class_A
Uploaded data_A_7.csv from class_A
Uploaded data_A_8.csv from class_A
Uploaded data_A_9.csv from class_A
Uploaded data_A_10.csv from class_A
Uploaded data_A_11.csv from class_A
Uploaded data_A_12.csv from class_A
Uploaded data_A_13.csv from class_A
Uploaded data_A_14.csv from class_A
Uploaded data_A_15.csv from class_A
Uploaded data_A_16.csv from class_A
Uploaded data_A_17.csv from class_A
Uploaded data_A_18.csv from class_A
Uploaded data_A_19.csv from class_A
Uploaded data_A_20.csv from class_A
Uploaded data_A_21.csv from class_A
Uploaded data_A_22.csv from class_A
Uploaded data_A_23.csv from class_A
Uploaded data_A_24.csv from class_A
Uploaded data_A_25.csv from class_A
Uploaded data_A_26.csv from class_A
Uploaded data_A_27.csv from class_A
Uploaded data_A_28.csv from class_A
U