If you are running this Jupyter notebook in Google Colab, uncomment the lines in the cell below and run it.
Otherwise skip it, since the files are already available to you locally.

In [None]:
# !wget -P data/ https://raw.githubusercontent.com/paramm-team/data_processing/main/src/input/data/Digatron.csv
# !wget -P data/ https://raw.githubusercontent.com/paramm-team/data_processing/main/src/input/data/Gamry.DTA
# !wget -P data/ https://raw.githubusercontent.com/paramm-team/data_processing/main/src/input/data/Maccor.csv
# !wget -P data/ https://raw.githubusercontent.com/paramm-team/data_processing/main/src/input/data/Novonix.csv

Running this command in a Jupyter notebook makes pip clone the data_processing repository from GitHub and install it into your Python environment. Installing straight from a repository is common for packages that are still in development, or when the version you want is not available on PyPI.

In [None]:
%pip install git+https://github.com/paramm-team/data_processing.git

The line import pbdp is a Python statement that imports the pbdp package into the current namespace, so you can use its functions, classes, and variables within your notebook.

In [1]:
import pbdp

The cell below builds the path to the data directory. Because it uses pathlib, the path is platform-independent: it uses the correct separator for Unix (/) or Windows (\).

**Files available in this folder for testing purposes are: Digatron.csv, Maccor.csv, Novonix.csv, Gamry.DTA**

In [2]:
from pathlib import Path

# In Colab the files were downloaded to data/ by the wget cell above;
# otherwise they ship inside the installed pbdp package.
if 'google.colab' in str(get_ipython()):
    path = Path('data/').absolute()
else:
    path = Path(pbdp.__path__[0], "input", "data").absolute()
print(path)

/Users/pipgrylls/Code/data_processing/pbdp/input/data
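
Before calling the parser, it can help to confirm that the four test files listed above are actually present in the folder. The following is a minimal sketch using only pathlib; the helper name find_test_files is hypothetical and not part of pbdp. It returns the found paths as strings, since the parser call further down fails when given a Path object.

```python
from pathlib import Path

# File names listed in this notebook as available for testing.
EXPECTED_FILES = ["Digatron.csv", "Maccor.csv", "Novonix.csv", "Gamry.DTA"]


def find_test_files(folder, expected=EXPECTED_FILES):
    """Split the expected names into found paths (as strings) and missing names.

    Hypothetical helper for this notebook, not part of the pbdp package.
    """
    folder = Path(folder)
    found = {}
    missing = []
    for name in expected:
        candidate = folder / name
        if candidate.is_file():
            found[name] = str(candidate)
        else:
            missing.append(name)
    return found, missing
```

Calling find_test_files(path) with the path from the cell above returns a dictionary of usable file paths and a list of any file names that were not found.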


**The following example shows the most you can do with the data_importer method, which returns a dataframe. A simpler example follows, along with a breakdown of each option and an example for it.**

parser = pbdp.Parser(): This line creates an instance of the Parser class from the pbdp package.

**path_or_file**=path / "Digatron.csv": This is the path to the file that the data_importer method will process, built from the data directory found above and the file name. The method treats this value as a string, so convert a Path object with str() before passing it.

**print_option**="diff": This is the package's custom plotting functionality. It creates interactive plots of Current, Voltage, Temperature, and Steps over time.

**file_type**='csv': This sets the format of the saved file to CSV. The other options are parquet, pickle, and feather.

**save_option**="save all": This tells the data_importer method to save all the processed data, with the metadata stored separately from the pre-processed data. If you want to save only the data itself, remove this option.

**state_option**="yes": This is the simplest form of data processing the package offers. It adds a column named "Battery State" to your data, which states for every row whether the battery is charging, discharging, or resting. If you don't need this, remove the option.

In [5]:
parser = pbdp.Parser()
# data_importer calls string methods such as .endswith on path_or_file,
# so pass the path as a string rather than a Path object.
data = parser.data_importer(path_or_file=str(path / "Digatron.csv"), print_option="diff", file_type="csv", save_option="save all", state_option="yes")

**This example shows the bare minimum of preprocessing the method does for you: it saves only the data from the files as parquet, removing the metadata, adjusting the column names and units, removing empty rows, and returning the final dataframe.**

parser = pbdp.Parser(): This line creates an instance of the Parser class from the pbdp package.

**path_or_file**=path: This is the path to the folder that the data_importer method will process. To process a single file instead, append the file name to the folder path found above.

In [None]:
parser = pbdp.Parser()
data = parser.data_importer(path_or_file=str(path))

**This example focuses on saving options available after preprocessing**

**save_option=""** the available options are **"save"** **(default)** and **"save all"**. The first keeps only the data and the latter saves the metadata in a separate file in case it's needed.

**file_type=""** the available options are **"parquet" (default), "csv", "feather", and "pickle"**

In [None]:
parser = pbdp.Parser()
data = parser.data_importer(path_or_file=str(path / "Digatron.csv"), save_option="save", file_type="pickle")

**This example focuses on displaying options available after preprocessing**

**print_option=""** the available options are **"yes" or "diff"**. This is the package's custom plotting functionality. It creates interactive plots of Current, Voltage, Temperature, and Steps over Time, automatically adjusting to the columns that are available (the data must record time).

**"yes"** plots up to the four graphs mentioned, plus a preview of the columns and first rows of data, while **"diff"** adds two extra plots of Current and Voltage with the derivative (the difference between consecutive points) overlaid, making changes over time easier to track.

*This method will display the graphs in Google Colab or open a browser if you're running the code locally.*
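
To make the derivative overlay described for "diff" concrete, here is a minimal sketch of a first difference between consecutive points. It is written in plain Python for illustration only; how the package computes its overlay internally is not shown in this notebook, and the readings below are made-up values.

```python
def point_differences(values):
    """Return the difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Illustrative current readings in mA (made-up values, not from the test files).
current_ma = [0, 500, 500, -250]
print(point_differences(current_ma))  # [500, 0, -750]
```

Flat stretches give differences of zero, so the steps where the current or voltage changes stand out clearly.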

In [None]:
parser = pbdp.Parser()
data = parser.data_importer(path_or_file=str(path / "Digatron.csv"), print_option="yes")

**This example focuses on a simple processing of Battery State based on Current**

This method adds a column named "Battery State" which labels every row as charging, discharging, or resting, based on a threshold applied to the current values.

**state_option="yes"** this option is not required and can be omitted.

In [None]:
parser = pbdp.Parser()
data = parser.data_importer(path_or_file=str(path / "Digatron.csv"), state_option="yes")
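
The threshold-based labelling described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the function name battery_state, the 0.01 A threshold, and the sign convention (positive current means charging) are all hypothetical.

```python
def battery_state(current, threshold=0.01):
    """Label one current reading as charging, discharging, or resting.

    Assumptions (not taken from the package): positive current means
    charging, and readings within +/- threshold amps count as resting.
    """
    if current > threshold:
        return "charging"
    if current < -threshold:
        return "discharging"
    return "resting"


# Illustrative current readings in amps (made-up values).
currents = [0.0, 1.5, 1.5, 0.005, -2.0]
states = [battery_state(i) for i in currents]
print(states)
# ['resting', 'charging', 'charging', 'resting', 'discharging']
```

The small dead band around zero keeps measurement noise during rest periods from being labelled as charging or discharging.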