
### TLDR:
1. Download and install the software.
2. Import your data.
3. Train your model.
4. See the test accuracy.

The following example walks through a Human Activity Recognition use case. If you get stuck anywhere, Lightscline is happy to help; feel free to e-mail or text us anytime.

## 1. Download and Install the software

### 1.1 Download the software 


1. <a href="https://calendly.com/lightscline/lightscline-ai-demo" target="_blank">Book a meeting here </a>.
2. Within 24 hours of the meeting, you will receive a link to download our software.

### 1.2 Install our software

Our current software works with **Python 3.9, 3.10, and 3.11**. Please let us know if you need support for other versions.

0. (Optional) Create a virtual environment

 - Run `python -m venv env` to create a new virtual environment. This creates a folder named `env`.
 - Run `.\env\Scripts\activate` to activate the virtual environment (this is the Windows form of the command).
 - (Optional) Install Jupyter Notebook for an interactive session: `pip install jupyter`
 
1. Download and install the Microsoft Visual C++ Redistributable: https://aka.ms/vs/16/release/vc_redist.x64.exe  
2. Install the lightscline package with `pip install <path to the .whl file>`
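
Before installing, it can help to confirm that the active interpreter is one of the versions listed above. The following is a minimal sketch; `is_supported` and `SUPPORTED_MINORS` are hypothetical names introduced here for illustration and are not part of the lightscline package.

```python
import sys

# Versions listed as supported in this guide.
SUPPORTED_MINORS = {(3, 9), (3, 10), (3, 11)}

def is_supported(version_info=None):
    """Return True if the (major, minor) pair is one of the supported versions."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_MINORS

if is_supported():
    print("Python version is supported.")
else:
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is not listed as supported.")
```

Run it inside the virtual environment (if you created one) so that it checks the interpreter that `pip` will install into.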


### 1.3 Import the package to check if it is correctly installed

In [1]:
from lightscline.lightscline import LightsclineCompute

This is a pilot version with limited capabilities.
Please contact us at info@lightscline.com for the full version. Our team will guide you through the complete suite of functionalities, tailored for your specific needs.
License is valid till  20 November 2024
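
If the import above fails, a quick way to tell "not installed" apart from other errors is to ask the import system whether it can locate the package at all. This sketch uses only the standard library; `is_installed` is a hypothetical helper name, and the check says nothing about whether the license is valid.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the top-level package can be located by the import system."""
    return importlib.util.find_spec(package_name) is not None

print("lightscline installed:", is_installed("lightscline"))
```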


## 2. Import your data

You can either import your own data or use a dummy dataset to see how it works. 

Code for dummy data:
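```
fs = [1200, 1800]                   # sampling frequency of each channel
time_for_each_class = [10, 16, 18]  # duration of data for each class

no_of_classes = len(time_for_each_class)
no_of_channels = len(fs)

# data[i][j] is the list of samples for class i, channel j
data = []
for i in range(no_of_classes):
    data_class = []
    for j in range(no_of_channels):
        # constant signal whose value encodes the class and the channel
        value = float((i+1)*10+(j+1))+0.1
        # number of samples = sampling frequency * duration
        data_channel = [value] * int(fs[j]*time_for_each_class[i])
        data_class.append(data_channel)
    data.append(data_class)
```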

This creates data for 3 classes with unequal amounts of data (durations of 10, 16, and 18 in the code above). Each class has 2 channels with different sampling frequencies (1200 and 1800). The rest of this guide uses the <a href="https://archive.ics.uci.edu/dataset/231/pamap2+physical+activity+monitoring" target="_blank"> PAMAP2 Dataset </a> as its example.
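
To make the structure of the dummy data concrete, the sketch below rebuilds it inside a function and prints the number of samples per channel for each class. `make_dummy_data` is a hypothetical wrapper introduced here; it follows the same logic as the dummy-data code above.

```python
def make_dummy_data(fs, time_for_each_class):
    """Build data[class][channel] -> list of samples, as in the dummy-data code."""
    data = []
    for i, duration in enumerate(time_for_each_class):
        data_class = []
        for j, rate in enumerate(fs):
            value = float((i + 1) * 10 + (j + 1)) + 0.1
            data_class.append([value] * int(rate * duration))
        data.append(data_class)
    return data

data = make_dummy_data(fs=[1200, 1800], time_for_each_class=[10, 16, 18])

# Samples per channel for each class
lengths = [[len(channel) for channel in data_class] for data_class in data]
print(lengths)  # [[12000, 18000], [19200, 28800], [21600, 32400]]
```

The nesting is class, then channel, then samples, so channels within one class may have different lengths.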

### 2.1 Data Preparation for ingestion


1. Download the dataset <a href="https://archive.ics.uci.edu/static/public/231/pamap2+physical+activity+monitoring.zip" target="_blank">here</a>

2. Unzip the files

Alternatively, uncomment the commands below to perform both steps. 

You can also use your own dataset here instead of the example dataset.

In [2]:
# !wget https://archive.ics.uci.edu/static/public/231/pamap2+physical+activity+monitoring.zip
# !tar -xf "pamap2+physical+activity+monitoring.zip"
# !tar -xf "PAMAP2_Dataset.zip"
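
If `wget` or `tar` is not available in your environment, the same two steps can be sketched with the Python standard library. The helper names `download` and `extract_zip` are hypothetical; the URL is the one given above, and the nested `PAMAP2_Dataset.zip` is assumed from the commands above.

```python
import urllib.request
import zipfile
from pathlib import Path

DATASET_URL = "https://archive.ics.uci.edu/static/public/231/pamap2+physical+activity+monitoring.zip"

def download(url, dest):
    """Download url to dest unless the file already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest

def extract_zip(zip_path, out_dir="."):
    """Extract a zip archive into out_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()

# Usage (uncomment to run; the download may be large):
# outer = download(DATASET_URL, "pamap2+physical+activity+monitoring.zip")
# extract_zip(outer)                 # outer archive
# extract_zip("PAMAP2_Dataset.zip")  # inner archive, as in the commands above
```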

In [3]:
import glob
import pandas as pd
import numpy as np

### 2.2 Data Ingestion

In [4]:
## Replace <path> with the folder where you unzipped the dataset
files = glob.glob(r'<path>/PAMAP2_Dataset/Protocol/*')
files.sort()

In [5]:
## Combine the data from all files into one DataFrame
df_temp = []
for file in files:
    t = pd.read_csv(file, header = None, delimiter=' ')
    t.ffill(inplace=True) ## forward-fill gaps, because the sampling rate may not be the same for every sensor
    df_temp.append(t)
df = pd.concat(df_temp, ignore_index=True)
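
The forward fill in the cell above copies the last known reading into the rows where a slower sensor has no value. The pure-Python sketch below illustrates the idea on a made-up list; `forward_fill` is a hypothetical helper, and the sample values are for illustration only.

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value seen so far."""
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# A slower sensor reports a value only every few rows; None marks the gaps.
readings = [None, 104, None, None, 105, None]
print(forward_fill(readings))  # [None, 104, 104, 104, 105, 105]
```

Note that a gap at the very start has no earlier value to copy, so it stays empty. The same holds for pandas `ffill`, and because the fill is applied per file, each file can begin with missing values.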

In [6]:
## Data cleaning: in PAMAP2, column 1 is the activity ID and 0 marks transient periods between activities, so those rows are dropped.
df = df[df[1]!=0].reset_index(drop=True)

In [7]:
labels = df[1].unique()

In [8]:
reverse_label = {labels[i]:i for i in range(len(labels))}
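
Because `reverse_label` assigns class indices in order of appearance, it is useful to keep a way back from a class index to a readable activity name. The sketch below assumes the activity IDs and names of the PAMAP2 protocol as described in the dataset's readme (worth verifying against your copy); `build_label_maps` and `example_labels` are hypothetical names introduced here.

```python
# Activity IDs and names for the protocol activities, per the PAMAP2 readme (assumed).
ACTIVITY_NAMES = {
    1: "lying", 2: "sitting", 3: "standing", 4: "walking", 5: "running",
    6: "cycling", 7: "Nordic walking", 12: "ascending stairs",
    13: "descending stairs", 16: "vacuum cleaning", 17: "ironing",
    24: "rope jumping",
}

def build_label_maps(labels):
    """Map raw activity IDs to class indices (order of appearance) and indices to names."""
    reverse_label = {label: i for i, label in enumerate(labels)}
    index_to_name = {i: ACTIVITY_NAMES.get(label, f"activity {label}")
                     for label, i in reverse_label.items()}
    return reverse_label, index_to_name

# Hypothetical order of appearance; in the notebook this comes from df[1].unique().
example_labels = [1, 2, 3, 17, 16]
reverse_label, index_to_name = build_label_maps(example_labels)
print(reverse_label[17], index_to_name[3])  # 3 ironing
```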

In [9]:
## Separate the data by label. Each continuous run of a single class is kept as its own segment,
## so a class that appears in several discontinuous stretches produces several segments.
breakpoints = df[df[1].shift() != df[1]].index  ## row indices where the class changes
label_data = []
c_label = []
for i in range(len(breakpoints)-1):  ## note: the run after the last breakpoint is not visited by this loop
    temp_df = df.iloc[breakpoints[i]:breakpoints[i+1]]
    if (temp_df.shape[0])<1000:
        continue  ## skip segments that are too short
    assert(temp_df[1].nunique()==1) ## make sure only one class is present
    label_data.append(temp_df.loc[:,3:].to_numpy().T.tolist())  ## columns 3 onward, transposed to (channels, samples)
    t_c_label = temp_df[1].iloc[0]
    c_label.append(reverse_label[t_c_label])
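
The segmentation step can be illustrated on a toy label sequence without pandas. This is a sketch of the general technique (splitting a sequence into runs of equal consecutive labels), not the notebook's exact code; `contiguous_runs` is a hypothetical helper and, unlike a loop over `range(len(breakpoints)-1)`, it also considers the final run.

```python
def contiguous_runs(labels, min_len=1):
    """Return (label, start, end) for each run of equal consecutive labels.

    end is exclusive. Runs shorter than min_len are skipped.
    """
    runs = []
    start = 0
    for i in range(1, len(labels) + 1):
        # A run ends at the end of the list or where the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            if i - start >= min_len:
                runs.append((labels[start], start, i))
            start = i
    return runs

toy = [4, 4, 4, 1, 1, 4, 4, 4, 4, 2]
print(contiguous_runs(toy, min_len=2))
# [(4, 0, 3), (1, 3, 5), (4, 5, 9)]
```

Label 4 appears in two separate runs and therefore yields two segments, which is why the number of segments can exceed the number of classes.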

In [10]:
len(label_data), len(c_label), np.unique(c_label)

(104, 104, array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]))

In [11]:
remove_idx = []

In [12]:
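## Sanity check: no remaining segment should be shorter than 1000 samples
for i in label_data:
    if len(i[0]) < 1000:
        print(len(i), len(i[0]))  ## label_data holds plain lists, which have no .shape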

In [13]:
len(label_data), len(label_data[0]), len(label_data[0][0])

(104, 51, 27187)
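
The 51 channels in the output above correspond to the columns kept by `loc[:,3:]`. As a sketch of where that number comes from, the block below builds column names following the layout described in the PAMAP2 readme (three leading columns, then 17 values for each of three IMUs). The names themselves are hypothetical labels chosen here, and the layout is worth checking against your copy of the readme.

```python
def pamap2_column_names():
    """Column names for a PAMAP2 protocol file, following the readme layout (assumed)."""
    imu_fields = (
        ["temperature"]
        + [f"acc16_{a}" for a in "xyz"]   # 3D acceleration, +-16 g range
        + [f"acc6_{a}" for a in "xyz"]    # 3D acceleration, +-6 g range
        + [f"gyro_{a}" for a in "xyz"]
        + [f"mag_{a}" for a in "xyz"]
        + [f"orientation_{k}" for k in range(4)]
    )
    names = ["timestamp", "activity_id", "heart_rate"]
    for location in ("hand", "chest", "ankle"):
        names += [f"{location}_{field}" for field in imu_fields]
    return names

names = pamap2_column_names()
print(len(names), len(names[3:]))  # 54 51
```

Under this layout, selecting columns 3 onward drops the timestamp, the activity ID, and the heart rate, leaving only IMU channels.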

### 2.3 Ingest your data into Lightscline Compute


In [14]:
ls = LightsclineCompute(data=label_data,fs = 100,labels = c_label)  ## Labels will be decided based on column number

License is valid till  20 November 2024


In [15]:
ls.reduce_and_preprocess_data(per_reduction=90, window_time=1, data_aug_multiplier=10) 
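
To get a feel for `window_time`, the sketch below counts how many windows a segment would yield at `fs = 100`. It assumes complete, non-overlapping windows, which is an assumption about the library's behavior rather than something documented here; `count_windows` is a hypothetical helper, and `per_reduction` and `data_aug_multiplier` are not modelled.

```python
def count_windows(n_samples, fs, window_time):
    """Number of complete, non-overlapping windows in a segment (assumed windowing)."""
    samples_per_window = int(fs * window_time)
    return n_samples // samples_per_window

print(count_windows(27187, fs=100, window_time=1))  # 271
print(count_windows(300, fs=100, window_time=1))    # 3
```

The second case is consistent with the prediction example later in this guide, where 300 samples per channel produce three predictions.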

## 3. Train your model

In [16]:
ls.train_model(layers=(50,30),verbose=True,n_iters = 1000, learning_rate=0.001)  

epoch:  1 loss:  2.515644073486328
epoch:  51 loss:  1.7156381607055664
epoch:  101 loss:  1.3024115562438965
epoch:  151 loss:  1.0106568336486816
epoch:  201 loss:  0.8239051103591919
epoch:  251 loss:  0.6998270153999329
epoch:  301 loss:  0.6100802421569824
epoch:  351 loss:  0.5407705903053284
epoch:  401 loss:  0.4867238700389862
epoch:  451 loss:  0.44622424244880676
epoch:  501 loss:  0.4141707122325897
epoch:  551 loss:  0.38581231236457825
epoch:  601 loss:  0.3604455292224884
epoch:  651 loss:  0.33924534916877747
epoch:  701 loss:  0.32002148032188416
epoch:  751 loss:  0.30491337180137634
epoch:  801 loss:  0.2901945412158966
epoch:  851 loss:  0.27855536341667175
epoch:  901 loss:  0.2674615979194641
epoch:  951 loss:  0.25677353143692017


## 4. See the test accuracy

In [17]:
ls.test_model()

Accuracy:  0.93
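
For reference, accuracy in the usual sense is the fraction of predictions that match the true labels. The sketch below assumes `test_model` reports this standard metric, which this guide does not confirm; the labels in the example are toy values, not results from the model above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Toy labels for illustration only.
print(accuracy([0, 1, 2, 2, 1], [0, 1, 2, 1, 1]))  # 0.8
```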


In [18]:
## Take the first 300 samples (3 seconds at fs = 100) of every channel from the first segment
X_test = [label_data[0][ch][:300] for ch in range(51)]
ls.predict(X_test)

array([0, 0, 0], dtype=int64)

In [19]:
## To get a single value, take the mode (most frequent value) of the predictions.
import statistics
statistics.mode(ls.predict(X_test))

0
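
When the per-window predictions disagree, it can be useful to know how strong the majority is. The sketch below is one way to do that with the standard library; `majority_vote` is a hypothetical helper, and the second input is a toy example rather than model output.

```python
from collections import Counter

def majority_vote(predictions):
    """Return (label, share) for the most common prediction.

    share is the fraction of windows that voted for the winning label.
    Ties are broken in favour of the label that appears first.
    """
    preds = [int(p) for p in predictions]
    if not preds:
        raise ValueError("no predictions to vote on")
    label, votes = Counter(preds).most_common(1)[0]
    return label, votes / len(preds)

print(majority_vote([0, 0, 0]))     # (0, 1.0)
print(majority_vote([3, 3, 1, 3]))  # (3, 0.75)
```

A low share suggests the input spans more than one activity or that the model is uncertain, so a longer input may be worth trying.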