## Cache Datastream

You will now gather data from two energy counters. The data stream is minimal: each message contains only the measurement device and the value, separated by a comma.

This looks as follows: C331,6020

Since the measurement itself carries no time, you need to rely on the time the message was submitted to the MQTT broker. This timestamp is available as message.timestamp in epoch format.

### 1

    Within the callback function, combine message time and payload, separated by a comma (,), and store it as data.
    Append data to the caching list cache.


    Make sure to append to the cache list, and not replace it with each iteration.
    timestamp and payload are attributes of message.
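
The record-building step described above can be tried without a broker. The following is a minimal sketch, assuming a stand-in message object built with types.SimpleNamespace; the helper name build_record and the sample timestamp are hypothetical, and the payload is assumed to arrive as bytes, as it does in paho-mqtt.

```python
from types import SimpleNamespace

cache = []

def build_record(message):
    """Join the message timestamp and the decoded payload with a comma."""
    return f"{message.timestamp},{message.payload.decode()}"

# Hypothetical stand-in for an MQTT message (not a real paho object)
fake_message = SimpleNamespace(timestamp=1540535443.5, payload=b"C331,6020")

# Append to the existing list instead of rebinding it
cache.append(build_record(fake_message))
print(cache)
```

Because cache.append() mutates the existing list, records from earlier messages are kept; an assignment such as cache = [data] inside the callback would discard them.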


In [7]:
MQTT_HOST = "test.mosquitto.org"
MAX_CACHE = 15
# Import the MQTT library and Path (used to write the cache to disk)
from pathlib import Path
import paho.mqtt.subscribe as subscribe

In [8]:
cache = []

def on_message(client, userdata, message):
    # Combine timestamp and payload (the payload arrives as bytes, so decode it)
    data = f"{message.timestamp},{message.payload.decode()}"
    # Append data to cache
    cache.append(data)
    # Check cache length
    if len(cache) > MAX_CACHE:
        with Path("energy.txt").open("a") as f:
            # Save to file, one record per line (writelines adds no newlines)
            f.writelines(f"{line}\n" for line in cache)
        # reset cache
        cache.clear()

# Connect function to mqtt datastream
subscribe.callback(on_message, topics="datacamp/energy", hostname=MQTT_HOST)

KeyboardInterrupt: 
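
The flush logic in the callback can also be exercised offline. This sketch assumes a hypothetical helper add_record and a temporary file in place of energy.txt; it feeds 20 made-up records through the same pattern of appending and then flushing once the cache exceeds MAX_CACHE.

```python
import tempfile
from pathlib import Path

MAX_CACHE = 15
cache = []

def add_record(record, path):
    """Cache a record; flush to disk once the cache exceeds MAX_CACHE."""
    cache.append(record)
    if len(cache) > MAX_CACHE:
        with path.open("a") as f:
            # writelines() adds no separators, so supply the newlines
            f.writelines(f"{line}\n" for line in cache)
        cache.clear()

# Temporary file standing in for energy.txt (assumption for this demo)
out_path = Path(tempfile.mkdtemp()) / "energy.txt"

# 20 made-up records: timestamp, device, value
for i in range(20):
    add_record(f"{1540535443 + i},C331,{6020 + i}", out_path)

written = out_path.read_text().splitlines()
print(len(written), len(cache))
```

Records still in the cache when the program stops are never written, so a real collector may want a final flush on shutdown.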

## Date and Time

You will now convert the timestamp column of the previously gathered data into a datetime object.

The gathered timestamps are in epoch time, and pandas provides a simple method, pd.to_datetime(), to convert them.

The data is loaded as df.

### 1 Convert the column "ts" to datetime, without additional arguments to the conversion call.

In [12]:
import pandas as pd
df = pd.read_csv("date-time.csv")

In [16]:
# Convert the timestamp
df["ts"] = pd.to_datetime(df["ts"])

# Print datatypes and first observations
print(df.dtypes)
print(df.head())


Unnamed: 0             int64
ts            datetime64[ns]
device                object
val                  float64
dtype: object
   Unnamed: 0                            ts device            val
0           0 1970-01-01 00:25:40.535443083  area1  347069.305500
1           1 1970-01-01 00:25:40.535460858  area1  347069.381205
2           2 1970-01-01 00:25:40.535470254  area2  673204.095708
3           3 1970-01-01 00:25:40.535470474  area1  347069.415853
4           4 1970-01-01 00:25:40.535479547  area2  673204.199130


### 2 Notice that the previous solution placed all timestamps in 1970. Convert the column correctly now by specifying the unit.

In [19]:
import pandas as pd
df = pd.read_csv("date-time.csv")

In [20]:
# Convert the timestamp
df["ts"] = pd.to_datetime(df["ts"], unit="ms")

# Print datatypes and first observations
print(df.dtypes)
print(df.head())


Unnamed: 0             int64
ts            datetime64[ns]
device                object
val                  float64
dtype: object
   Unnamed: 0                      ts device            val
0           0 2018-10-26 06:30:43.083  area1  347069.305500
1           1 2018-10-26 06:31:00.858  area1  347069.381205
2           2 2018-10-26 06:31:10.254  area2  673204.095708
3           3 2018-10-26 06:31:10.474  area1  347069.415853
4           4 2018-10-26 06:31:19.547  area2  673204.199130
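
The difference between the two results comes down to the unit. pd.to_datetime() treats integers as nanoseconds since the epoch unless a unit is given, whereas these values are milliseconds. As a cross-check using only the standard library (the raw value 1540535443083 is inferred from the first row of the outputs above, not read from the file):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
raw = 1540535443083  # inferred from the first row of the outputs above

# Read as nanoseconds (the pandas default): about 25 minutes after the epoch.
# timedelta has microsecond resolution, so the last three digits are dropped.
as_ns = EPOCH + timedelta(microseconds=raw // 1000)

# Read as milliseconds: the intended 2018 timestamp.
as_ms = EPOCH + timedelta(milliseconds=raw)

print(as_ns.isoformat())
print(as_ms.isoformat())
```

The same integer lands in 1970 or 2018 depending only on the assumed unit, which is why unit="ms" is needed in the conversion call.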
