[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aerosense-ai/notebooks/blob/dev%2Fpre-process/pre_process.ipynb)

In [None]:
from google.colab import auth
auth.authenticate_user()

# Connection statistics
Inspect the connection statistics (RSSI, transmit power and allocated heap memory) reported by node 2.

In [None]:
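%%bigquery --project aerosense-twined df
-- Run a BigQuery query and save the output in the variable `df`.
-- The cell magic must be the first line of the cell, so these notes are SQL comments.
-- This will not run with Aerosense Tools installed!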
SELECT datetime,
sensor_value[ORDINAL(1)] as filtered_rssi,
sensor_value[ORDINAL(2)] as raw_rssi,
sensor_value[ORDINAL(3)] as tx_power,
sensor_value[ORDINAL(4)] as allocated_heap_memory
FROM `aerosense-twined.greta.sensor_data` 
WHERE sensor_type_reference="connection_statistics" and
node_id="2" and
datetime between "2022-07-21 10:30:00.0" and "2022-07-22 00:00:00.0"
ORDER BY datetime ASC
LIMIT 10000

In [None]:
import datetime as dt

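# Process the data.
# NOTE: `RawSignal` is not defined in this notebook. It is assumed to be
# provided by the Aerosense Tools package and must be imported first.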
df = df.set_index('datetime')
df.index.name = 'UTC Time'
connection_statistics = RawSignal(df, "connection-statistics")
connection_statistics.pad_gaps(dt.timedelta(seconds=5))
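
The `pad_gaps` method used above is not defined in this notebook. As a rough illustration of what such padding could look like (an assumption, not the Aerosense Tools implementation), the sketch below inserts an all-NaN row after every sample that is followed by a gap longer than the threshold, so that a line plot breaks at the gap instead of drawing a straight line across it. The helper name `pad_gaps_sketch` and the example values are hypothetical.

```python
import datetime as dt

import pandas as pd


def pad_gaps_sketch(frame: pd.DataFrame, threshold: dt.timedelta) -> pd.DataFrame:
    """Insert one all-NaN row into every gap longer than `threshold`.

    Hypothetical helper for illustration only. Assumes a DatetimeIndex
    with unique timestamps.
    """
    frame = frame.sort_index()
    # Time difference between each sample and the following one.
    gaps = frame.index.to_series().diff().shift(-1)
    gap_starts = frame.index[(gaps > threshold).to_numpy()]
    # The new timestamp lies inside the gap because the gap exceeds `threshold`.
    padded_index = frame.index.union(gap_starts + threshold)
    # Reindexing creates the new rows and fills them with NaN.
    return frame.reindex(padded_index)


# Made-up samples with a 58 second gap after 10:30:02.
index = pd.to_datetime(
    [
        "2022-07-21 10:30:00",
        "2022-07-21 10:30:01",
        "2022-07-21 10:30:02",
        "2022-07-21 10:31:00",
        "2022-07-21 10:31:01",
    ]
)
example = pd.DataFrame({"raw_rssi": [-60.0, -61.0, -59.0, -70.0, -71.0]}, index=index)
padded = pad_gaps_sketch(example, dt.timedelta(seconds=5))
print(padded)
```

Placing the NaN row at a fixed offset after the last sample keeps the original samples untouched; only the plotted line is interrupted.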


In [None]:
import plotly.express as px
import plotly.io as pio

# Define the plot settings for each sensor value
sensor_list = [
    {
        "sensor_type":"tx_power",
        "title":"Transmitting Power",
        "unit":"[W]"
    },
    {
        "sensor_type": "allocated_heap_memory",
        "title": "Allocated Heap Memory",
        "unit": "[Bytes]"
    },
    {
        "sensor_type": "raw_rssi",
        "title": "Received signal strength indication: Raw",
        "unit": "[dBm]"
    },
    {
        "sensor_type": "filtered_rssi",
        "title": "Received signal strength indication: Filtered",
        "unit": "[dBm]"
    }
]

# Plot each sensor value with Plotly
pio.renderers.default = "colab"
for sensor in sensor_list:
    fig = px.line(
        connection_statistics.dataframe[sensor["sensor_type"]],
        title=sensor["title"]
    )
    fig.update_layout(
        showlegend=False,
        yaxis_title=sensor["unit"]
    )
    fig.show()


Question for PBL: are we still actively controlling the transmit power even when no data is being sent?

# Data availability

Work in progress: checking for which periods data is available.


This query returns, for each day of the chosen month, the total number of samples per sensor type reference:

In [None]:
%%bigquery --project aerosense-twined df
SELECT EXTRACT(DAY FROM datetime) as day_of_month, sensor_type_reference, 
COUNT(*) AS number_of_samples
FROM `aerosense-twined.greta.sensor_data` 
WHERE 
node_id="2" and
EXTRACT(YEAR FROM datetime) = 2022 and
EXTRACT(MONTH FROM datetime) = 7
GROUP BY day_of_month, sensor_type_reference
ORDER BY day_of_month ASC

In [None]:
df

Unnamed: 0,day_of_month,sensor_type_reference,number_of_samples
0,13,connection_statistics,200481
1,14,magnetometer,80
2,14,accelerometer,920
3,14,barometer_thermometer,1024
4,14,connection_statistics,402615
5,14,barometer,1024
6,14,gyroscope,920
7,15,connection_statistics,20784
8,21,gyroscope,71040
9,21,barometer_thermometer,77441
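
The long-format counts above are easier to scan as a day by sensor matrix. The following is a minimal sketch, assuming the query result has the three columns shown; the small `counts` frame reuses a few rows from the output above in place of a live BigQuery result, and the variable names are illustrative.

```python
import pandas as pd

# A few rows copied from the query output above (stand-in for the full `df`).
counts = pd.DataFrame(
    {
        "day_of_month": [13, 14, 14, 15, 21],
        "sensor_type_reference": [
            "connection_statistics",
            "magnetometer",
            "connection_statistics",
            "connection_statistics",
            "gyroscope",
        ],
        "number_of_samples": [200481, 80, 402615, 20784, 71040],
    }
)

# One row per day of July (31 days), one column per sensor type, 0 where no data.
all_days = pd.Index(range(1, 32), name="day_of_month")
availability = (
    counts.pivot(
        index="day_of_month",
        columns="sensor_type_reference",
        values="number_of_samples",
    )
    .reindex(all_days)
    .fillna(0)
    .astype(int)
)

# Days on which no sensor type delivered any sample.
days_without_data = availability.index[availability.sum(axis=1) == 0].tolist()
print(availability.loc[13:15])
print(days_without_data)
```

Reindexing over all days of the month makes days with no rows in the query result show up explicitly as zeros, which a plain `GROUP BY` result omits.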


In [None]:
%%bigquery --project aerosense-twined bqdf
SELECT EXTRACT(HOUR FROM datetime) AS hour_of_day, sensor_type_reference, 
COUNT(*) AS number_of_samples
FROM `aerosense-twined.greta.sensor_data` 
WHERE 
node_id="2" AND
datetime BETWEEN "2022-07-22 00:00:00.0" AND "2022-07-23 00:00:00.0"

GROUP BY hour_of_day, sensor_type_reference
ORDER BY hour_of_day ASC

In [None]:
bqdf

Unnamed: 0,hour_of_day,sensor_type_reference,number_of_samples
0,8,connection_statistics,1302
1,8,barometer_thermometer,1258
2,8,barometer,1258
3,9,barometer,76204
4,9,connection_statistics,30330
...,...,...,...
83,23,accelerometer,28320
84,23,barometer_thermometer,30000
85,23,magnetometer,3520
86,23,barometer,30000
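
To list the hours in which a given sensor type delivered nothing, the hourly counts can be completed over every hour and sensor combination. This is a sketch under the assumption that the result has the columns shown above; the `hourly` frame reuses a few rows from the output in place of the live result, and all names are illustrative.

```python
import pandas as pd

# A few rows copied from the hourly output above (stand-in for the full result).
hourly = pd.DataFrame(
    {
        "hour_of_day": [8, 8, 8, 9, 9, 23, 23],
        "sensor_type_reference": [
            "connection_statistics",
            "barometer_thermometer",
            "barometer",
            "barometer",
            "connection_statistics",
            "accelerometer",
            "barometer",
        ],
        "number_of_samples": [1302, 1258, 1258, 76204, 30330, 28320, 30000],
    }
)

# Every (hour, sensor type) combination for one day.
full_index = pd.MultiIndex.from_product(
    [range(24), sorted(hourly["sensor_type_reference"].unique())],
    names=["hour_of_day", "sensor_type_reference"],
)

# Combinations absent from the query result get a count of 0.
complete = hourly.set_index(["hour_of_day", "sensor_type_reference"])[
    "number_of_samples"
].reindex(full_index, fill_value=0)

# For each sensor type, the list of hours without any sample.
missing = complete[complete == 0].reset_index()
missing_hours = missing.groupby("sensor_type_reference")["hour_of_day"].apply(list)
print(missing_hours)
```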
