# How to Use stdatalog_core Package - \[Data Format Conversion Features\]
---
<br>
<p>In this notebook, we will explore how to use the <code>stdatalog_core</code> package, focusing on its data format conversion features. This guide walks you through importing the necessary modules, initializing objects, and converting data into various formats.</p>

## Important Notice
Check the names of the exported files before running a conversion, so that you do not overwrite existing files. This matters most when you run multiple conversions or work with different datasets.
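
One way to reduce the risk of overwriting earlier exports is to pick a fresh output folder for every run. The sketch below is only an illustration: `unique_output_folder` is a hypothetical helper, not part of `stdatalog_core`, and the demonstration runs inside a temporary directory so no real data is touched.

```python
from pathlib import Path
import tempfile


def unique_output_folder(base: Path) -> Path:
    """Create and return `base`, or base_1, base_2, ... if it already exists.

    Hypothetical helper for illustration; it is not an stdatalog_core function.
    """
    candidate = base
    counter = 1
    while candidate.exists():
        # Append an increasing suffix until an unused folder name is found
        candidate = base.with_name(f"{base.name}_{counter}")
        counter += 1
    candidate.mkdir(parents=True)
    return candidate


# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    first = unique_output_folder(Path(tmp) / "HSD_Exported_data_folder")
    second = unique_output_folder(Path(tmp) / "HSD_Exported_data_folder")
    print(first.name)   # HSD_Exported_data_folder
    print(second.name)  # HSD_Exported_data_folder_1
```

The returned path can then be passed (as a string) wherever an `output_folder` argument is expected.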

In [None]:
import sys
import os

# Add the STDatalog SDK root directory to the sys.path to access the SDK packages
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '../..')))

from stdatalog_core.HSD.HSDatalog import HSDatalog

<span style="font-style: italic; color:#909090;"><span style="color:cyan;">*[Notebook utils] </span> -> Update this string to select the right acquisition folder</span>

In [5]:
#acquisition_folder = "path/to/your/acquisition_folder"
acquisition_folder = "C:\\00_PROJECTs\\00_STDATALOG_PYSDK_DEV\\STDATALOG_PYSDK\\20241127_18_03_04"

## Initialize HSDatalog Object and Validate Acquisition Folder
We will initialize the `HSDatalog` object and validate the acquisition folder. This step ensures that the folder contains the necessary data and is in the correct format.
<span style="color:#909090;">[FP-SNS-DATALOG1 and FP-SNS-DATALOG2 acquisition formats are supported]</span>

In [6]:
hsd = HSDatalog()
hsd_version = hsd.validate_hsd_folder(acquisition_folder)
print(f"HSD Version: {hsd_version}\n")

hsd_instance = hsd.create_hsd(acquisition_folder=acquisition_folder)

HSD Version: HSDVersion.V2

2024-11-28 11:44:31,878 - HSDatalogApp.stdatalog_pnpl.DTDL.device_template_manager - INFO - dtmi found locally in base supported models
2024-11-28 11:44:31,878 - HSDatalogApp.stdatalog_pnpl.DTDL.device_template_manager - INFO - dtmi: dtmi/appconfig/steval_stwinbx1/FP_SNS_DATALOG2_Datalog2-7.json


## Extract DataFrames from HSDatalog Instance
The created `HSDatalog` instance will be used to extract Pandas dataframes from an HSDatalog acquisition folder. Change the `sensor_name` string to a sensor name that is available in your acquisition folder.

In [7]:
sensor_name = "ism330dhcx_acc"
try:
    sensor = hsd.get_sensor(hsd_instance, sensor_name)
    dataframe = hsd.get_dataframe(hsd_instance, sensor)
    print(dataframe)
except Exception as e:
    # Any failure lands here; the most common cause is a sensor name that is not in the acquisition
    print(f"Error: could not load [{sensor_name}] from your selected acquisition folder ({e}). Please check the sensor name and try again.")

[             Time   A_x [g]   A_y [g]   A_z [g]
0        0.037796  0.402112  0.110776  0.899384
1        0.037940  0.401624  0.109312  0.896456
2        0.038084  0.399184  0.107360  0.893040
3        0.038229  0.396744  0.111264  0.895968
4        0.038373  0.395768  0.102480  0.895968
...           ...       ...       ...       ...
131995  18.990073  0.441152 -0.219600  0.811544
131996  18.990217  0.467504 -0.212280  0.807640
131997  18.990361  0.489952 -0.206424  0.807152
131998  18.990504  0.494344 -0.202032  0.807640
131999  18.990648  0.497272 -0.199104  0.806176

[132000 rows x 4 columns],           Time   A_x [g]   A_y [g]   A_z [g]
0    18.990791  0.496296 -0.203496  0.801784
1    18.990935  0.487024 -0.201544  0.800808
2    18.991078  0.494832 -0.205448  0.792512
3    18.991222  0.501664 -0.206912  0.785192
4    18.991366  0.481168 -0.208864  0.793488
..         ...       ...       ...       ...
995  19.133648  0.597800 -0.200080  0.795928
996  19.133792  0.595360 -0.200568 

# Data Conversion
---
<p>Using the <code>HSDatalog</code> module, it is possible to convert the data of an acquisition into various formats. The supported formats include:</p>
<ul>
<li> CSV, TSV, Apache Parquet, HDF5, TXT (also split into different folders by tags)</li>
<li> Nanoedge format</li>
<li> ST UNICO format (with variants)</li>
<li> WAV</li>
</ul>

First, create an output folder to host the converted files.

In [8]:
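import os

# Build the path with os.path.join so that it also works outside Windows
output_folder = os.path.join(".", "HSD_Exported_data_folder")
# exist_ok=True makes the call safe when the folder already exists
os.makedirs(output_folder, exist_ok=True)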

## CSV, TSV, Apache Parquet, HDF5 and TXT Conversion
In this section, we will demonstrate how to convert data into CSV, TSV, Apache Parquet and HDF5 formats. We will also show how to handle TXT or CSV conversion by tags.

### CSV and TSV Conversion Examples
Below are examples of how to convert data to CSV and TSV formats. Note that the `convert_dat_to_xsv` function is used for these conversions with the `file_format` parameter set to either `CSV` or `TSV`.

In [9]:
from stdatalog_core.HSD_utils.exceptions import MissingTagsException

exported_sensor_path = os.path.join(output_folder, sensor_name)

print(f"To CSV conversion started...")
hsd.convert_dat_to_xsv(hsd_instance, sensor, start_time=0, end_time=-1, labeled=False, raw_data=False, output_folder=output_folder, file_format="CSV")
print(f"To CSV conversion completed.")
print(f"To TSV conversion started...")
try:
    hsd.convert_dat_to_xsv(hsd_instance, sensor, start_time=0, end_time=-1, labeled=True, raw_data=True, output_folder=output_folder, file_format="TSV")
    print(f"To TSV conversion completed.")
except MissingTagsException as e:
    print(f"Error: No tags are present in your input acquisition folder. Please use labeled=False parameter or select a labeled acquisition.")

To CSV conversion started...
To CSV conversion completed.
To TSV conversion started...
To TSV conversion completed.
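
Every conversion call in this notebook passes `start_time=0, end_time=-1`. The sketch below illustrates one plausible reading of that convention, under the assumption that `end_time=-1` means "up to the end of the acquisition". `select_time_window` is a hypothetical helper working on plain `(time, value)` pairs, not an SDK function, and the sample values are made up.

```python
def select_time_window(samples, start_time=0.0, end_time=-1.0):
    """Keep the (time, value) pairs with start_time <= time <= end_time.

    Assumption for this sketch: end_time == -1 means "no upper bound".
    """
    return [
        (t, v)
        for t, v in samples
        if t >= start_time and (end_time == -1 or t <= end_time)
    ]


# Made-up samples: (time in seconds, value)
samples = [(0.0, 0.40), (0.5, 0.41), (1.0, 0.39), (1.5, 0.44), (2.0, 0.47)]

whole = select_time_window(samples)                                  # default window
middle = select_time_window(samples, start_time=0.5, end_time=1.5)   # restricted window
print(len(whole), len(middle))  # 5 3
```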


### Apache Parquet Conversion Example
Below is an example of how to convert data to Apache Parquet format. Note that the `convert_dat_to_xsv` function is used for this conversion with the `file_format` parameter set to 'PARQUET'.

In [10]:
print(f"To Apache Parquet conversion started...")
hsd.convert_dat_to_xsv(hsd_instance, sensor, start_time=0, end_time=-1, labeled=False, raw_data=False, output_folder=output_folder, file_format="PARQUET")
print(f"To Apache Parquet conversion completed.")

To Apache Parquet conversion started...
To Apache Parquet conversion completed.


### HDF5 Format Conversion Example
Below is an example of how to convert data to HDF5 format. Note that the `convert_acquisition_to_hdf5` function is used for this conversion.
The output file is saved in the specified output folder as `acquisition.h5`. You can also pass other parameters to customize the converted output file.
- *Please refer to the `convert_acquisition_to_hdf5` docstring for more information.*

In [12]:
print(f"Converting acquisition to HDF5 format started...")
components = hsd.get_sensor_list(hsd_instance, only_active=True)
hsd.convert_acquisition_to_hdf5(hsd_instance, components, start_time=0, end_time=-1, labeled=False, output_folder=output_folder, raw_data=False, which_tags=[])
print(f"Converting acquisition to HDF5 format completed.")

Converting acquisition to HDF5 format started...
Converting acquisition to HDF5 format completed.


### TXT, CSV Conversion by Tags
When an acquisition has been recorded with labels, it is possible to convert .dat files considering those labels. Below are examples of how to convert data to TXT and CSV formats, filtered by tags.

In [None]:
print(f"To TXT Conversion filtered by tags started...")
hsd.convert_dat_to_txt_by_tags(hsd_instance, sensor, start_time=0, end_time=-1, output_folder=output_folder, with_untagged = False, out_format="TXT")
print(f"To TXT Conversion filtered by tags completed.")
print(f"To CSV Conversion filtered by tags (with untagged folder) started...")
hsd.convert_dat_to_txt_by_tags(hsd_instance, sensor, start_time=0, end_time=-1, output_folder=output_folder, with_untagged = True, out_format="CSV")
print(f"To CSV Conversion filtered by tags (with untagged folder) completed.")

## Nanoedge Format Conversion
Below is an example of how to convert .dat files to Nanoedge format.

In [None]:
print(f"To Nanoedge format conversion started...")
hsd.convert_dat_to_nanoedge(hsd_instance, sensor, signal_length=1000, signal_increment=500, start_time=0, end_time=-1, raw_data=False, output_folder=output_folder)
print(f"To Nanoedge format conversion completed.")
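
The call above uses `signal_length=1000` and `signal_increment=500`. A reasonable way to picture these two parameters is as a sliding window: each signal holds `signal_length` samples and the next one starts `signal_increment` samples later, so consecutive signals overlap by half here. The sketch below only illustrates that arithmetic; it assumes this interpretation and does not reproduce the SDK's actual implementation. `window_starts` is a hypothetical helper, and 132000 is the row count of the first dataframe printed earlier.

```python
def window_starts(n_samples, signal_length, signal_increment):
    """Start index of every full window of signal_length samples.

    Sketch of a sliding-window segmentation; assumed, not taken from the SDK.
    """
    if signal_length <= 0 or signal_increment <= 0:
        raise ValueError("signal_length and signal_increment must be positive")
    # The last valid start is n_samples - signal_length (inclusive)
    return list(range(0, n_samples - signal_length + 1, signal_increment))


starts = window_starts(132000, signal_length=1000, signal_increment=500)
overlap = 1 - 500 / 1000  # fraction of samples shared by consecutive windows

print(len(starts))   # 263
print(starts[:3])    # [0, 500, 1000]
print(starts[-1])    # 131000
print(overlap)       # 0.5
```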

## ST UNICO Format Conversion
Below are examples of how to convert .dat files to ST UNICO format for both unlabeled and labeled acquisitions.

### Unlabeled Acquisition
Even if present, the tag data in the acquisition is not used.

In [None]:
print(f"To Unico format conversion started...")
hsd.convert_dat_to_unico(hsd_instance, [sensor], start_time=0, end_time=-1, use_datalog_tags=False, output_folder=output_folder, out_format="TXT")
print(f"To Unico format conversion completed.")

### Labeled Acquisition
If available, the tag data are read and used to split the converted files.

In [None]:
print(f"To Unico format conversion started...")
hsd.convert_dat_to_unico(hsd_instance, [sensor], start_time=0, end_time=-1, use_datalog_tags=True, output_folder=output_folder, out_format="CSV")
print(f"To Unico format conversion completed.")

### Aggregated Data
Aggregated data can be saved in different files (one per tag group) or as a single file.

- ##### Single File Aggregated Data
A single aggregated file is generated as output.

In [None]:
print(f"To Unico (\"single_file\" Aggregated) format conversion started...")
hsd.convert_dat_to_unico_aggregated(hsd_instance, aggregation="single_file", start_time=0, end_time=-1, use_datalog_tags=False, output_folder=output_folder, with_untagged = True, out_format="CSV")
print(f"To Unico (\"single_file\" Aggregated) format conversion completed.")

- ##### Split Per Tags Aggregated Data
Aggregated data are saved in different files (one per tag group, organized in one folder per tag class).

In [None]:
print(f"To Unico (\"split_per_tags\" Aggregated) format conversion started...")
hsd.convert_dat_to_unico_aggregated(hsd_instance, aggregation="split_per_tags", start_time=0, end_time=-1, use_datalog_tags=True, output_folder=output_folder, with_untagged = False, out_format="CSV")
print(f"To Unico (\"split_per_tags\" Aggregated) format conversion completed.")
print(f"To Unico (\"split_per_tags\" Aggregated, with untagged folder) format conversion started...")
hsd.convert_dat_to_unico_aggregated(hsd_instance, aggregation="split_per_tags", start_time=0, end_time=-1, use_datalog_tags=True, output_folder=output_folder, with_untagged = True, out_format="TXT")
print(f"To Unico (\"split_per_tags\" Aggregated, with untagged folder) format conversion completed.")
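
After a conversion that splits its output into folders, it can help to check how many files ended up in each one. The sketch below uses only the standard library. `count_files_per_folder` is a hypothetical helper, and the folder and file names in the demonstration ("walking", "running", and so on) are invented; the real layout depends on the tags in your acquisition.

```python
import os
import tempfile
from pathlib import Path


def count_files_per_folder(root):
    """Map each subfolder of root (relative path) to the number of files it holds."""
    counts = {}
    for folder, _subfolders, files in os.walk(root):
        if files:
            counts[os.path.relpath(folder, root)] = len(files)
    return counts


# Demonstration on an invented folder tree inside a temporary directory
with tempfile.TemporaryDirectory() as root:
    layout = {"walking": ["a.csv", "b.csv"], "running": ["c.csv"]}
    for tag, names in layout.items():
        Path(root, tag).mkdir()
        for name in names:
            Path(root, tag, name).touch()
    summary = count_files_per_folder(root)

print(summary)
```

Calling `count_files_per_folder(output_folder)` on the real export folder gives a quick overview of what each conversion produced.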

## WAV Conversion
The WAV format is used for audio data. Below is an example of how to convert data to WAV format.

In [None]:
sensor_name = "imp23absu_mic"
try:
    sensor = hsd.get_sensor(hsd_instance, sensor_name)
except Exception as e:
    print(f"Error: No [{sensor_name}] sensor available in your selected acquisition folder. Please check the sensor name and try again.")
else:
    # Convert only if the microphone was found; otherwise `sensor` would still be the one selected earlier
    print(f"To Wav conversion started...")
    hsd.convert_dat_to_wav(hsd_instance, sensor, start_time=0, end_time=-1, output_folder=output_folder)
    print(f"To Wav conversion completed.")
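
Once a WAV file has been exported, its header can be checked with Python's standard `wave` module. The sketch below is self-contained: since the name of the exported file is not shown here, it first writes one second of silence to a temporary file (`demo.wav`, a made-up name) and then reads the header back. `describe_wav` is a hypothetical helper, not an SDK function.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return basic header information for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    # Write one second of 16-bit mono silence at 16 kHz as stand-in data
    with wave.open(demo_path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 16000)
    info = describe_wav(demo_path)

print(info)
```

Pointing `describe_wav` at a file in `output_folder` shows whether the sample rate and duration match what you expect from the microphone configuration.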