# Timestream demo

# Timestream

This notebook explores the Timestream API using a sample database that stores sensor data from Ruuvi sensors. If you do not own a Ruuvi sensor, you can simulate the data using the generator.

## Preparation

Using the boto3 SDK requires some preparation.
Because Timestream is still in preview, its service models are not included in the standard boto3. You can add them with the following commands:

```bash
aws configure add-model --service-name timestream-query --service-model file://./timestream-query/2018-11-01/service-2.json
aws configure add-model --service-name timestream-write --service-model file://./timestream-write/2018-11-01/service-2.json
```

You also have to enable endpoint discovery in your `~/.aws/config` file by adding the following line to the profile you want to use (e.g. `default`):

```
[default]
endpoint_discovery_enabled=true
```
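
A quick way to confirm that the setting is present for a given profile is to parse the file contents with Python's `configparser`. This is only a sketch: the helper name `endpoint_discovery_enabled` and the sample text are hypothetical, and it relies on the convention that non-default profiles in `~/.aws/config` use sections named `[profile NAME]`.

```python
import configparser

def endpoint_discovery_enabled(config_text, profile='default'):
    """Return True if the given profile enables endpoint discovery.

    `config_text` is the content of an AWS config file. Hypothetical helper,
    not part of boto3 or the AWS CLI.
    """
    # Disable interpolation so values containing '%' do not raise errors
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # Only the default profile uses a bare section name
    section = profile if profile == 'default' else f'profile {profile}'
    return parser.getboolean(section, 'endpoint_discovery_enabled', fallback=False)

# Sample file content, made up for illustration
sample = """[default]
endpoint_discovery_enabled=true

[profile iot]
region = us-east-1
"""
print(endpoint_discovery_enabled(sample))
print(endpoint_discovery_enabled(sample, 'iot'))
```

To check the real file, pass in the text read from `~/.aws/config` (for example with `pathlib.Path.home() / '.aws' / 'config'`).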

In [1]:
import boto3
boto3.setup_default_session(profile_name='iot',region_name='us-east-1')

## Databases

Timestream databases and tables can be created using the write client.

In [2]:
tsc = boto3.client('timestream-write')

In [3]:
tsc.list_databases()

{'Databases': [{'Arn': 'arn:aws:timestream:us-east-1:699391019698:database/ruuvi',
   'DatabaseName': 'ruuvi',
   'TableCount': 1,
   'KmsKeyId': 'arn:aws:kms:us-east-1:699391019698:key/85b0ac66-732f-4a71-b74a-e33073d64f86',
   'CreationTime': datetime.datetime(2020, 5, 18, 15, 51, 11, tzinfo=tzlocal())},
  {'Arn': 'arn:aws:timestream:us-east-1:699391019698:database/sampleDB',
   'DatabaseName': 'sampleDB',
   'TableCount': 1,
   'KmsKeyId': 'arn:aws:kms:us-east-1:699391019698:key/85b0ac66-732f-4a71-b74a-e33073d64f86',
   'CreationTime': datetime.datetime(2020, 5, 4, 21, 21, 14, tzinfo=tzlocal())},
  {'Arn': 'arn:aws:timestream:us-east-1:699391019698:database/sampleDBdevops',
   'DatabaseName': 'sampleDBdevops',
   'TableCount': 1,
   'KmsKeyId': 'arn:aws:kms:us-east-1:699391019698:key/85b0ac66-732f-4a71-b74a-e33073d64f86',
   'CreationTime': datetime.datetime(2020, 5, 27, 16, 29, 27, tzinfo=tzlocal())}],
 'ResponseMetadata': {'RequestId': 'OPT2MSDEHZS2YZDPE5INCFB6EA',
  'HTTPStatusCod

In [4]:
DB_NAME='demo_db'

In [9]:
tsc.create_database(DatabaseName=DB_NAME)

{'Database': {'Arn': 'arn:aws:timestream:us-east-1:699391019698:database/demo_db',
  'DatabaseName': 'demo_db',
  'TableCount': 0,
  'KmsKeyId': 'arn:aws:kms:us-east-1:699391019698:key/85b0ac66-732f-4a71-b74a-e33073d64f86',
  'CreationTime': datetime.datetime(2020, 6, 1, 14, 44, 12, tzinfo=tzlocal())},
 'ResponseMetadata': {'RequestId': '2WLLPQPWYFG4ERQ5GYLNJDDCGQ',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '2WLLPQPWYFG4ERQ5GYLNJDDCGQ',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '239',
   'date': 'Mon, 01 Jun 2020 12:44:11 GMT'},
  'RetryAttempts': 0}}

In [7]:
TABLE_NAME='delta_stream'
HOT_TIER_HOURS=24
COLD_TIER_TTL=90

In [12]:
tsc.create_table(DatabaseName=DB_NAME, TableName=TABLE_NAME, RetentionProperties= {
    'MemoryStoreRetentionPeriodInHours': HOT_TIER_HOURS,
    'MagneticStoreRetentionPeriodInDays': COLD_TIER_TTL
})

{'Table': {'Arn': 'arn:aws:timestream:us-east-1:699391019698:database/demo_db/table/delta_stream',
  'TableName': 'delta_stream',
  'DatabaseName': 'demo_db',
  'TableStatus': 'ACTIVE',
  'RetentionProperties': {'MemoryStoreRetentionPeriodInHours': 24,
   'MagneticStoreRetentionPeriodInDays': 90},
  'CreationTime': datetime.datetime(2020, 6, 1, 14, 44, 42, tzinfo=tzlocal()),
  'LastUpdatedTime': datetime.datetime(2020, 6, 1, 14, 44, 42, tzinfo=tzlocal())},
 'ResponseMetadata': {'RequestId': 'H6E57MUOBBXAVDU24A2SV6WIOA',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'H6E57MUOBBXAVDU24A2SV6WIOA',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '336',
   'date': 'Mon, 01 Jun 2020 12:44:42 GMT'},
  'RetryAttempts': 0}}

In [13]:
tsc.describe_table(DatabaseName=DB_NAME, TableName=TABLE_NAME)

{'Table': {'Arn': 'arn:aws:timestream:us-east-1:699391019698:database/demo_db/table/delta_stream',
  'TableName': 'delta_stream',
  'DatabaseName': 'demo_db',
  'TableStatus': 'ACTIVE',
  'RetentionProperties': {'MemoryStoreRetentionPeriodInHours': 24,
   'MagneticStoreRetentionPeriodInDays': 90},
  'CreationTime': datetime.datetime(2020, 6, 1, 14, 44, 42, tzinfo=tzlocal()),
  'LastUpdatedTime': datetime.datetime(2020, 6, 1, 14, 44, 42, tzinfo=tzlocal())},
 'ResponseMetadata': {'RequestId': 'SF425DF2HRVLTUOK3AVRBKATFI',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'SF425DF2HRVLTUOK3AVRBKATFI',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '336',
   'date': 'Mon, 01 Jun 2020 12:44:43 GMT'},
  'RetryAttempts': 0}}

## Write data

The write API accepts either individual records that each carry their own dimensions, or multiple records that share the same dimensions, supplied once through `CommonAttributes`.
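
The two request shapes can be sketched as plain dictionaries. The helper names below are hypothetical. The key names `Timestamp` and `TimestampUnit` follow the preview model used in this notebook (the generally available API names them `Time` and `TimeUnit`), and it is an assumption that individual records accept the same keys as `CommonAttributes`.

```python
import time

def now_ms():
    """Current time as a string of epoch milliseconds."""
    return str(int(time.time() * 1000))

def shared_dimension_request(dimensions, measures):
    """Several records sharing dimensions, type, and timestamp via CommonAttributes."""
    return {
        'CommonAttributes': {
            'Dimensions': dimensions,
            'MeasureValueType': 'DOUBLE',
            'Timestamp': now_ms(),
            'TimestampUnit': 'MILLISECONDS',
        },
        'Records': [
            {'MeasureName': name, 'MeasureValue': str(value)}
            for name, value in measures.items()
        ],
    }

def per_record_request(readings):
    """Records that each carry their own dimensions.

    `readings` is a list of (dimensions, measure_name, value) tuples.
    Assumption: a record accepts the same keys as CommonAttributes.
    """
    timestamp = now_ms()
    return {
        'Records': [
            {
                'Dimensions': dimensions,
                'MeasureName': name,
                'MeasureValue': str(value),
                'MeasureValueType': 'DOUBLE',
                'Timestamp': timestamp,
                'TimestampUnit': 'MILLISECONDS',
            }
            for dimensions, name, value in readings
        ],
    }

room1 = [{'Name': 'location', 'Value': 'room1'}]
room2 = [{'Name': 'location', 'Value': 'room2'}]
shared = shared_dimension_request(room1, {'temperature': 21.87, 'presence': 0})
distinct = per_record_request([(room1, 'temperature', 21.87), (room2, 'temperature', 19.5)])
```

Either dictionary could then be passed on as keyword arguments, for example `tsc.write_records(DatabaseName=DB_NAME, TableName=TABLE_NAME, **shared)`.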

In [14]:
sensor1_dimensions = [
    { 'Name': 'location', 'Value': 'room1'},
    { 'Name': 'site', 'Value': 'home'}
]
sensor2_dimensions = [
    { 'Name': 'location', 'Value': 'room2'},
    { 'Name': 'site', 'Value': 'home'}
]
sensor3_dimensions = [
    { 'Name': 'location', 'Value': 'room3'},
    { 'Name': 'site', 'Value': 'home'}
]


In our case the data emitted by the sensors has the following format:

```json
{
  "presence": 0,
  "luminosity": 1,
  "temperature": 21.87
}
```

Each value corresponds to a measure. For the input above, we can create the records as follows:

In [15]:
from time import time

data = {
  "presence": 0,
  "luminosity": 1,
  "temperature": 21.87
}

def data_to_records(data):
    records = []
    for k,v in data.items():
        if k in ['presence', 'luminosity', 'temperature']:
            records.append({
                'MeasureName': k,
                'MeasureValue': str(v)
            })
    return records

print(data_to_records(data))

[{'MeasureName': 'presence', 'MeasureValue': '0'}, {'MeasureName': 'luminosity', 'MeasureValue': '1'}, {'MeasureName': 'temperature', 'MeasureValue': '21.87'}]
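
The function above leaves the measure type to be set once for all records. As a variation, each record can declare its own type. The sketch below assumes the API version in use accepts `MeasureValueType` on individual records, as the generally available API does. The mapping `MEASURE_TYPES` and the function name are hypothetical. The type is taken from an explicit mapping rather than inferred from the Python value, so that a measure such as `luminosity` keeps the same type whether a reading happens to be `1` or `1.7`.

```python
# Hypothetical mapping from measure name to Timestream measure value type
MEASURE_TYPES = {
    'presence': 'BIGINT',
    'luminosity': 'DOUBLE',
    'temperature': 'DOUBLE',
}

def data_to_typed_records(data, measure_types=MEASURE_TYPES):
    """Build one record per known measure, each declaring its own value type."""
    records = []
    for name, value in data.items():
        value_type = measure_types.get(name)
        if value_type is None:
            continue  # skip keys that are not declared as measures
        # Normalise the textual form so it matches the declared type
        if value_type == 'BIGINT':
            text = str(int(value))
        elif value_type == 'DOUBLE':
            text = str(float(value))
        else:
            text = str(value)
        records.append({
            'MeasureName': name,
            'MeasureValue': text,
            'MeasureValueType': value_type,
        })
    return records

typed = data_to_typed_records({'presence': 0, 'luminosity': 1, 'temperature': 21.87})
print(typed)
```

Note that a table which already holds `presence` values written as `DOUBLE` may reject later writes of the same measure with a different type, so the mapping should be fixed before the first write.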


Finally, we write the records:

In [16]:
tsc.write_records(DatabaseName=DB_NAME,
                 TableName=TABLE_NAME,
                 CommonAttributes= {
                     'Dimensions': sensor1_dimensions,
                     'MeasureValueType': 'DOUBLE',
                     'Timestamp': str(int(time()*1000)),
                     'TimestampUnit': 'MILLISECONDS'
                 },
                 Records=data_to_records(data))

{'ResponseMetadata': {'RequestId': 'A3QSEUHCO5C5RYUMPJCRNAGLO4',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'A3QSEUHCO5C5RYUMPJCRNAGLO4',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '0',
   'date': 'Mon, 01 Jun 2020 12:48:24 GMT'},
  'RetryAttempts': 1}}

In [65]:
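import random
import time  # rebinds `time` from the function imported earlier to the module

def get_presence():
    # Presence flag: 0 or 1, chosen at random
    return random.randint(0, 1)

def get_luminosity():
    # Luminosity: random float in [0, 10)
    return random.random() * 10

def get_temperature():
    # Temperature: random float in [-5, 25)
    return random.random() * 30 - 5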

In [None]:
for i in range(40):
    data = {
        "presence": get_presence(),
        "luminosity": get_luminosity(),
        "temperature": get_temperature()
    }
    print(data, end='')
    try:
        tsc.write_records(DatabaseName=DB_NAME,
                          TableName=TABLE_NAME,
                          CommonAttributes={
                              'Dimensions': sensor1_dimensions,
                              'MeasureValueType': 'DOUBLE',
                              'Timestamp': str(int(time.time() * 1000)),
                              'TimestampUnit': 'MILLISECONDS'
                          },
                          Records=data_to_records(data))
        print('...done')
    except Exception as ex:
        # Print the error so a failed write can be diagnosed
        print('...failed:', ex)

    # Wait 5 to 15 seconds before emitting the next sample
    time.sleep(random.randint(5, 15))

{'presence': 1, 'luminosity': 1.7855089150810899, 'temperature': 0.43525477868120976}...done
{'presence': 0, 'luminosity': 8.464481402347895, 'temperature': -1.5278091478763045}...done
{'presence': 1, 'luminosity': 4.986582513405988, 'temperature': 9.647631558221049}

## Queries

Queries are written in an SQL-like language with some extensions specific to time series. To run queries, we have to instantiate a query client.

In [7]:
tsq = boto3.client('timestream-query')

In [66]:
query='''WITH interp_ts AS (
    SELECT location, INTERPOLATE_LINEAR(
        CREATE_TIME_SERIES(time, measure_value::double),
            SEQUENCE(ago(5m), now(), 10s)) AS temp
        FROM ruuvi.sensors
        WHERE measure_name='temperature' AND time >= ago(5m)
        GROUP BY location
)
SELECT location, avg(t.temp_unnest) FROM interp_ts
CROSS JOIN UNNEST(temp) AS t (time, temp_unnest)
GROUP BY location
'''

In [67]:
tsq.query(QueryString=query)

ValidationException: An error occurred (ValidationException) when calling the Query operation: Cannot interpolate outside of timeseries defined time range.

In [8]:

QUERY_MULTI = """select bin(time, {bin}) as binned_time, avg(case when measure_name='temperature' then measure_value::double else null end) as avg_temp,
avg(case when measure_name='humidity' then measure_value::double else null end) as avg_humidity,
avg(case when measure_name='pressure' then measure_value::double else null end) as avg_pressure
from ruuvi.sensors
where time > ago({time}) and location = '{location}'
group by bin(time, {bin})
order by bin(time, {bin})"""

In [9]:
query = QUERY_MULTI.format(time='3h', bin='10m', location='e7428453ecb1')
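
Because `str.format` inserts the parameters into the SQL text unchecked, it can help to validate them first. The following is a minimal sketch, not part of the Timestream SDK: the helper names and the shortened `TEMPLATE` are hypothetical, and the accepted interval units (`ms`, `s`, `m`, `h`, `d`) and the location pattern are assumptions to adjust to your data.

```python
import re

# Assumed shapes: intervals such as '10m' or '3h', locations made of
# letters, digits, underscores and hyphens
INTERVAL_RE = re.compile(r'\d+(ms|s|m|h|d)')
LOCATION_RE = re.compile(r'[A-Za-z0-9_-]+')

def checked(value, pattern, what):
    """Return `value` unchanged if it fully matches `pattern`, else raise."""
    if not pattern.fullmatch(value):
        raise ValueError(f'invalid {what}: {value!r}')
    return value

def build_query(template, *, time, bin_size, location):
    """Fill a template that uses the {time}, {bin} and {location} placeholders."""
    return template.format(
        time=checked(time, INTERVAL_RE, 'time interval'),
        bin=checked(bin_size, INTERVAL_RE, 'bin size'),
        location=checked(location, LOCATION_RE, 'location'),
    )

# Shortened, hypothetical template using the same placeholders as QUERY_MULTI
TEMPLATE = ("select bin(time, {bin}) as binned_time from ruuvi.sensors "
            "where time > ago({time}) and location = '{location}'")

safe_query = build_query(TEMPLATE, time='3h', bin_size='10m', location='e7428453ecb1')
print(safe_query)
```

A value such as `"x' or '1'='1"` passed as `location` raises a `ValueError` instead of ending up in the query text.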

In [10]:
query

"select bin(time, 10m) as binned_time, avg(case when measure_name='temperature' then measure_value::double else null end) as avg_temp,\navg(case when measure_name='humidity' then measure_value::double else null end) as avg_humidity,\navg(case when measure_name='pressure' then measure_value::double else null end) as avg_pressure\nfrom ruuvi.sensors\nwhere time > ago(3h) and location = 'e7428453ecb1'\ngroup by bin(time, 10m)\norder by bin(time, 10m)"

In [None]:
tsq.query(QueryString=query)
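
The raw response is easier to work with once it is flattened into one dictionary per row. The sketch below assumes the response carries `ColumnInfo`, `Rows` and an optional `NextToken`, as in the generally available Query API; the preview model used in this notebook may differ. The helper names are hypothetical, and only scalar and null cells are unpacked.

```python
def cell_value(cell):
    """Unpack a single cell; non-scalar cells are returned unchanged."""
    if 'ScalarValue' in cell:
        return cell['ScalarValue']
    if cell.get('NullValue'):
        return None
    return cell  # time series, arrays and rows are left as returned

def rows_to_dicts(response):
    """Turn one query response into a list of {column name: value} dicts."""
    names = [column['Name'] for column in response['ColumnInfo']]
    return [
        dict(zip(names, (cell_value(cell) for cell in row['Data'])))
        for row in response['Rows']
    ]

def query_all(client, query_string):
    """Run a query and follow NextToken until every page has been read."""
    rows = []
    kwargs = {'QueryString': query_string}
    while True:
        response = client.query(**kwargs)
        rows.extend(rows_to_dicts(response))
        token = response.get('NextToken')
        if not token:
            return rows
        kwargs['NextToken'] = token
```

With the client created earlier, this would be called as `query_all(tsq, query)`. Scalar values arrive as strings, so numeric columns still need a conversion such as `float(row['avg_temp'])`.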