# Data processing

This example walks through the basics of processing data and adding metrics.

## Concepts

Devices in the framework hold their _raw readings_ in a pandas DataFrame, accessed in the cells below as `device.data`. The raw metrics of a device's sensors are listed in `device.sensors`.

Devices can also contain processed values, called _metrics_. A metric is first added to a device by specifying a processing function and its arguments, and is then computed by processing the device or the whole test.
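The add-then-process idea can be sketched in plain pandas, independently of scdata. The `SketchMetric` class and the `apply_metric` and `double_channel` helpers are hypothetical names introduced only for illustration; they are not part of the framework and only mimic the pattern of declaring a metric first and computing it later.

```python
from dataclasses import dataclass, field
from typing import Callable

import pandas as pd


@dataclass
class SketchMetric:
    """Hypothetical stand-in for a metric: a name, a callable and its kwargs."""
    name: str
    function: Callable[..., pd.Series]
    kwargs: dict = field(default_factory=dict)


def apply_metric(df: pd.DataFrame, metric: SketchMetric) -> pd.DataFrame:
    """Compute the metric and store it as a new column next to the raw readings."""
    out = df.copy()
    out[metric.name] = metric.function(out, **metric.kwargs)
    return out


def double_channel(df: pd.DataFrame, channel: str) -> pd.Series:
    """Toy processing function: doubles one raw channel."""
    return 2 * df[channel]


# Made-up raw readings, not real device data
readings = pd.DataFrame({"TEMP": [20.0, 21.0, 22.0]})
metric = SketchMetric(name="TEMP_X2", function=double_channel, kwargs={"channel": "TEMP"})
processed = apply_metric(readings, metric)
print(processed)
```

The raw readings stay untouched; the metric only appears as an extra column once it has been applied.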

In [None]:
from scdata.test import Test
from scdata.device import Device
from scdata._config import config

In [None]:
config._log_level='INFO'

In [None]:
test = Test(name='EXAMPLE')

In [None]:
await test.load()

## Processing basics

In [None]:
for device in test.devices:
    print(device.id)

In [None]:
# The readings for each device are accessible via
test.get_device(16784).data

## Basic example calculation

In [None]:
df = test.get_device(16871).data

In [None]:
df['METRIC'] = 8 * df['TEMP'] + 25 * df['PRESS']

In [None]:
df[['TEMP', 'PRESS', 'METRIC']]
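For readers without access to the test data, the same column arithmetic can be tried on a small made-up DataFrame. The values below are invented for illustration and `df_demo` is a hypothetical name; only the formula comes from the cell above.

```python
import pandas as pd

# Made-up readings standing in for a device dataframe
df_demo = pd.DataFrame(
    {"TEMP": [20.0, 21.5, 23.0], "PRESS": [101.2, 101.3, 101.1]},
    index=pd.date_range("2024-01-01", periods=3, freq="min"),
)

# Column arithmetic is element-wise and aligned on the index
df_demo["METRIC"] = 8 * df_demo["TEMP"] + 25 * df_demo["PRESS"]
print(df_demo[["TEMP", "PRESS", "METRIC"]])
```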

## Making it repeatable

In [None]:
# The metrics for each device are accessible via
test.get_device(16784).metrics

In [None]:
help(Test.process)

In [None]:
d = test.get_device(16871)

In [None]:
# Process the default metrics
test.process()

Now the processed metrics appear as columns in each device's `data` dataframe:

In [None]:
for device in test.devices:
    print(device.data.columns)

## Add metrics

In [None]:
help(Device.add_metric)

In [None]:
help(Device.process)

In [None]:
import scdata
help(scdata.device.process.timeseries)
# help(scdata.device.process.alphasense)
# help(scdata.device.process.regression)

### Basic polynomial

In [None]:
help(scdata.device.process.timeseries.poly_ts)

In [None]:
from scdata.models import Metric

In [None]:
metric = Metric(name='TP_Poly',
                description='Basic Polynomial calculation',
                function='poly_ts',
                kwargs= {'channels': ['TEMP', 'PRESS'], 'coefficients': [8, 25]}
               )

test.get_device(16871).add_metric(metric)

In [None]:
test.get_device(16871).process(lmetrics=['TP_Poly'])

In [None]:
test.get_device(16871).data.loc[:,['TEMP', 'PRESS', 'TP_Poly', 'METRIC']]
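The table above puts `TP_Poly` next to the hand-computed `METRIC` so the two can be compared. The sketch below reproduces that comparison on made-up data, assuming that a polynomial with coefficients `[8, 25]` and no other arguments reduces to a weighted sum of the channels. `poly_sketch` is a hypothetical helper, not the library's `poly_ts`; check `help(scdata.device.process.timeseries.poly_ts)` for the actual behaviour and optional arguments.

```python
import numpy as np
import pandas as pd


def poly_sketch(df, channels, coefficients):
    """Hypothetical helper: sum of coefficient * channel over the given channels."""
    result = pd.Series(0.0, index=df.index)
    for channel, coefficient in zip(channels, coefficients):
        result = result + coefficient * df[channel]
    return result


# Made-up readings, not real device data
demo = pd.DataFrame({"TEMP": [20.0, 21.5, 23.0], "PRESS": [101.2, 101.3, 101.1]})
demo["METRIC"] = 8 * demo["TEMP"] + 25 * demo["PRESS"]
demo["TP_Poly"] = poly_sketch(demo, channels=["TEMP", "PRESS"], coefficients=[8, 25])

# The two columns should agree up to floating-point rounding
columns_match = np.allclose(demo["METRIC"], demo["TP_Poly"])
print(columns_match)
```

Comparing with `np.allclose` rather than `==` avoids false mismatches caused by rounding differences between the two computation paths.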

In [None]:
traces = {1: {'devices': 16871,
              'channel': 'TP_Poly',
              'subplot': 2},
          2: {'devices': 16871,
              'channel': 'TEMP',
              'subplot': 1},
          3: {'devices': 16871,
              'channel': 'PRESS',
              'subplot': 1},            
         }

options = {'frequency': '1H'}
formatting = {'width': 800, 'height': 200, 'padding-bottom': 400}
test.ts_uplot(traces=traces, options=options, formatting=formatting)
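The `frequency` option presumably resamples the data before plotting; that is an assumption, since the notebook does not say how the option is applied. As a rough illustration of what hourly resampling does, here is a plain pandas sketch on made-up one-minute data. The names `minute_data` and `hourly` are hypothetical.

```python
import pandas as pd

# Made-up one-minute readings covering two hours
index = pd.date_range("2024-01-01 00:00", periods=120, freq="min")
minute_data = pd.DataFrame({"TEMP": [float(i) for i in range(120)]}, index=index)

# Downsample to hourly means; "60min" is used instead of an hour alias
# because the spelling of that alias differs across pandas versions
hourly = minute_data.resample("60min").mean()
print(hourly)
```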

### Basic smoothing

In [None]:
help(scdata.device.process.timeseries.rolling_avg)

In [None]:
metric = Metric(name='NOISE_A_SMOOTH',
                description='Basic smoothing calculation',
                function='rolling_avg',
                kwargs= {'name': ['NOISE_A'], 'window_size': 5}
               )
test.get_device(16871).add_metric(metric)

In [None]:
test.get_device(16871).process(lmetrics=['NOISE_A_SMOOTH'])

In [None]:
metric = Metric(name='NOISE_A_SMOOTH_10',
                description='Basic smoothing calculation',
                function='rolling_avg',
                kwargs= {'name': ['NOISE_A'], 'window_size': 10}
               )
test.get_device(16871).add_metric(metric)

In [None]:
test.get_device(16871).process(lmetrics=['NOISE_A_SMOOTH_10'])

In [None]:
metric = Metric(name='NOISE_A_SMOOTH_60',
                description='Basic smoothing calculation',
                function='rolling_avg',
                kwargs= {'name': ['NOISE_A'], 'window_size': 60}
               )
test.get_device(16871).add_metric(metric)

In [None]:
test.get_device(16871).process(lmetrics=['NOISE_A_SMOOTH_60'])

In [None]:
test.get_device(16871).data.columns
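To see how the window size changes the result, the sketch below applies a trailing moving average with two window sizes to a short made-up series. It assumes `rolling_avg` behaves like a plain trailing mean; whether the library centres the window or handles edges differently is not stated here, so treat this as an approximation of the idea rather than the library's implementation.

```python
import pandas as pd

# Made-up noise readings with one spike
noise = pd.Series([40.0, 42.0, 41.0, 80.0, 43.0, 42.0, 41.0, 40.0], name="NOISE_A")

smoothed = pd.DataFrame({"NOISE_A": noise})
for window in (2, 4):
    # Trailing window: the first (window - 1) values are NaN
    smoothed[f"NOISE_A_SMOOTH_{window}"] = noise.rolling(window=window).mean()

print(smoothed)
```

A larger window flattens the spike more but also delays and blurs real changes, which is the trade-off visible between the 10 and 60 sample windows used in this notebook.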

In [None]:
traces = {1: {'devices': 16871,
              'channel': 'NOISE_A',
              'subplot': 1},
          2: {'devices': 16871,
              'channel': 'NOISE_A_OUTLIERS',
              'subplot': 1},
          3: {'devices': 16871,
              'channel': 'NOISE_A_SMOOTH_10',
              'subplot': 1},
          4: {'devices': 16871,
              'channel': 'NOISE_A_SMOOTH_60',
              'subplot': 1},
          5: {'devices': 16871,
              'channel': 'TEMP',
              'subplot': 2} 
         }

options = {'frequency': '1Min'}
formatting = {'width': 800, 'height': 400}
test.ts_uplot(traces=traces, options=options, formatting=formatting)

## Reprocessing

After adding a new metric, you can either process only that metric, as above, or reprocess the whole test with `test.process()`.

If reprocessing everything takes too long, you can process only the newly added metrics with `test.process(only_new = True)`.
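One way to picture an `only_new` style option is a flag on each metric that records whether it has already been computed. The sketch below is a guess at the general idea, not scdata's implementation; `TrackedMetric` and `process_metrics` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable

import pandas as pd


@dataclass
class TrackedMetric:
    """Hypothetical metric record that remembers whether it has been computed."""
    name: str
    function: Callable[[pd.DataFrame], pd.Series]
    processed: bool = False


def process_metrics(df, metrics, only_new=False):
    """Compute metrics in place; return the names that were actually computed."""
    computed = []
    for metric in metrics:
        if only_new and metric.processed:
            continue  # skip work already done
        df[metric.name] = metric.function(df)
        metric.processed = True
        computed.append(metric.name)
    return computed


# Made-up readings, not real device data
data = pd.DataFrame({"NOISE_A": [40.0, 45.0, 60.0]})
metrics = [TrackedMetric("NOISE_A_X2", lambda d: 2 * d["NOISE_A"])]
first_run = process_metrics(data, metrics)

# Add a second metric, then process only what is new
metrics.append(TrackedMetric("NOISE_A_PLUS1", lambda d: d["NOISE_A"] + 1))
second_run = process_metrics(data, metrics, only_new=True)
print(first_run, second_run)
```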

In [None]:
help(Test.process)

In [None]:
help(scdata.device.process.timeseries.clean_ts)

In this example, we discard readings outside the 35 to 50 dBA limits and then apply a rolling average to the remaining data:

In [None]:
metric = Metric(name='NOISE_A_CL',
                description='Clean Data calculation',
                function='clean_ts',
                kwargs= {'name': 'NOISE_A', 'limits': [35, 50], 'window_size': 3}
               )
test.get_device(16871).add_metric(metric)

In [None]:
test.process(only_new = True)

In [None]:
test.get_device(16871).data.loc[:,['NOISE_A', 'NOISE_A_CL']]
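The clean-then-smooth idea can be sketched in plain pandas. This assumes that `limits` is the band of values to keep, with everything outside it replaced by NaN before averaging; confirm this against the `help(...clean_ts)` output above. `clean_sketch` is a hypothetical helper and the data is made up.

```python
import pandas as pd


def clean_sketch(series, limits, window_size):
    """Hypothetical helper: NaN out-of-limits values, then trailing rolling mean."""
    lower, upper = limits
    kept = series.where((series >= lower) & (series <= upper))
    # min_periods=1 lets the mean use whatever valid values remain in the window
    return kept.rolling(window=window_size, min_periods=1).mean()


# Made-up readings: 70.0 and 30.0 fall outside the limits
noise = pd.Series([40.0, 42.0, 70.0, 44.0, 30.0, 46.0], name="NOISE_A")
cleaned = clean_sketch(noise, limits=[35, 50], window_size=3)
print(pd.DataFrame({"NOISE_A": noise, "NOISE_A_CL": cleaned}))
```

The choice of `min_periods=1` is a design decision made for this sketch so that isolated gaps do not propagate as NaN through the whole window; the library may handle gaps differently.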

In [None]:
traces = {1: {'devices': 16871,
              'channel': 'NOISE_A',
              'subplot': 1},
          2: {'devices': 16871,
              'channel': 'NOISE_A_CL',
              'subplot': 1},          
         }

options = {'frequency': '1Min'}
test.ts_uplot(traces=traces, options=options)