# Example of Data Analysis with DCD Hub Data

First, we import the Python SDK.

In [1]:
from dcd.entities.thing import Thing

We provide the Thing ID and access token, loaded from a `.env` file or the environment (replace with yours).

In [2]:
from dotenv import load_dotenv
import os
load_dotenv()
THING_ID = os.environ['THING_ID']
THING_TOKEN = os.environ['THING_TOKEN']

We instantiate a Thing with its credentials, then fetch its details.

In [3]:
my_thing = Thing(thing_id=THING_ID, token=THING_TOKEN)
my_thing.read()

INFO:dcd:things:my-test-thing-27aa:Initialising MQTT connection for Thing 'dcd:things:my-test-thing-27aa'
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dwd.tudelft.nl:443
DEBUG:urllib3.connectionpool:https://dwd.tudelft.nl:443 "GET /api/things/dcd:things:my-test-thing-27aa HTTP/1.1" 200 1290


What does a Thing look like?

In [4]:
my_thing.to_json()

{'id': 'dcd:things:my-test-thing-27aa',
 'name': 'My Test Thing',
 'description': 'Just a Thing to test!',
 'type': 'Test',
 'properties': [{'id': 'my-random-property-6820',
   'name': 'My Random Property',
   'description': '',
   'type': 'THREE_DIMENSIONS',
   'dimensions': [{'name': 'Value1', 'description': '', 'unit': ''},
    {'name': 'Value2', 'description': '', 'unit': ''},
    {'name': 'Value3', 'description': '', 'unit': ''},
    {'name': 'Value1', 'description': '', 'unit': ''},
    {'name': 'Value2', 'description': '', 'unit': ''},
    {'name': 'Value3', 'description': '', 'unit': ''}]}]}

Which property do we want to explore and over which time frame?

In [8]:
from datetime import datetime

# Time frame to explore
START_DATE = "2019-10-08 21:17:00"
END_DATE = "2019-11-08 21:25:00"

# Convert both dates to Unix timestamps in milliseconds.
# The parsed datetimes are naive, so they are interpreted in the local time zone.
DATE_FORMAT = '%Y-%m-%d %H:%M:%S'
from_ts = datetime.timestamp(datetime.strptime(START_DATE, DATE_FORMAT)) * 1000
to_ts = datetime.timestamp(datetime.strptime(END_DATE, DATE_FORMAT)) * 1000
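
Because the conversion above depends on the machine's local time zone, the same date string can map to different timestamps on different machines. Here is a minimal sketch that makes the time zone explicit and returns integer milliseconds. The helper name `to_millis` and the choice of UTC are assumptions for illustration, not part of the SDK.

```python
from datetime import datetime, timezone

DATE_FORMAT = '%Y-%m-%d %H:%M:%S'


def to_millis(date_string, tz=timezone.utc):
    """Parse a date string and return a Unix timestamp in milliseconds.

    Hypothetical helper: the date is interpreted in `tz` (UTC by default)
    instead of the machine's local time zone.
    """
    naive = datetime.strptime(date_string, DATE_FORMAT)
    aware = naive.replace(tzinfo=tz)
    return int(aware.timestamp() * 1000)


# 21:17 in Central European Summer Time is 19:17 UTC
from_ts = to_millis("2019-10-08 19:17:00")
print(from_ts)
```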

Let's find this property and read the data.

In [9]:
PROPERTY_NAME = "My Random Property"

my_property = my_thing.find_property_by_name(PROPERTY_NAME)
my_property.read(from_ts, to_ts)

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dwd.tudelft.nl:443
DEBUG:urllib3.connectionpool:https://dwd.tudelft.nl:443 "GET /api/things/dcd:things:my-test-thing-27aa/properties/my-random-property-6820?from=1570562220000.0&to=1573244700000.0 HTTP/1.1" 200 2


KeyError: 'property'

How many data points did we get?

In [None]:
print(len(my_property.values))

Display values

In [12]:
my_property.values

[]

# From CSV

In [16]:
from numpy import genfromtxt
import pandas as pd

# Column 0 holds the timestamp in milliseconds; the remaining columns hold x, y and z
data = genfromtxt('data.csv', delimiter=',')
data_frame = pd.DataFrame(data[:,1:], index = pd.DatetimeIndex(pd.to_datetime(data[:,0], unit='ms')), columns = ['x', 'y', 'z'])
data_frame
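
To try the CSV route without a file on disk, here is a small self-contained sketch. The two rows of sample data are made up for illustration; the layout (millisecond timestamp followed by three values) mirrors the cell above.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Made-up sample: timestamp in milliseconds, then x, y, z
csv_text = (
    "1570562220000,0.1,0.2,0.3\n"
    "1570562221000,0.4,0.5,0.6\n"
)

data = np.genfromtxt(StringIO(csv_text), delimiter=',')

# Cast the time column to integers before converting it to datetimes
index = pd.to_datetime(data[:, 0].astype('int64'), unit='ms')
frame = pd.DataFrame(data[:, 1:], index=index, columns=['x', 'y', 'z'])
print(frame)
```

The timestamps are converted as UTC, so the first row is indexed at 2019-10-08 19:17:00.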

# Plot some charts with Matplotlib
In this example we plot a histogram showing the distribution of values for each dimension.

In [17]:
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import numpy as np
data = np.array(my_property.values)

ModuleNotFoundError: No module named 'matplotlib'

In [18]:
figure(num=None, figsize=(15, 5))
t = data_frame.index
plt.plot(t, data_frame.x, t, data_frame.y, t, data_frame.z)

NameError: name 'figure' is not defined

In [19]:
plt.hist(data[:,1:])
plt.show()

NameError: name 'plt' is not defined

# Generate statistics with NumPy and Pandas

In [None]:
import numpy as np
from scipy.stats import kurtosis, skew

In [None]:
np.min(data[:,1:4], axis=0)

In [None]:
skew(data[:,1:4])

You can select a column (slice) of data, or a subset of data. In the example below we select the first 10 rows
(indices 0 to 9) and every column from column 1 onward (i.e. skipping the first column, which holds the time).

In [None]:
data[:10,1:]
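
The slicing rules are easier to see on a tiny array. This sketch uses made-up values; the first column plays the role of the time column.

```python
import numpy as np

# Made-up data: time in column 0, then three value columns
data = np.array([
    [1000, 0.1, 0.2, 0.3],
    [2000, 0.4, 0.5, 0.6],
    [3000, 0.7, 0.8, 0.9],
    [4000, 1.0, 1.1, 1.2],
    [5000, 1.3, 1.4, 1.5],
])

first_rows = data[:2, 1:]    # rows 0 and 1, without the time column
middle_rows = data[2:4, 1:]  # rows 2 and 3 (the end index is excluded)
one_column = data[:, 1]      # every row of the first value column

print(first_rows)
print(middle_rows)
print(one_column)
```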

Pandas provides summary statistics out of the box. Remember to convert your array into a DataFrame first.

In [None]:
data_frame = pd.DataFrame(data[:,1:], index = pd.DatetimeIndex(pd.to_datetime(data[:,0], unit='ms')))
data_frame.describe()

In [None]:
data_frame.rolling(10).std()

# Rolling / Sliding Window
To apply statistics on a sliding (or rolling) window, we can use the rolling() function of a data frame. In the examples below, we first compute the standard deviation over a 2-second time window, then the skewness over a window of 100 data points.

In [None]:
rolling2s = data_frame.rolling('2s').std()
plt.plot(rolling2s)
plt.show()

In [None]:
rolling100_data_points = data_frame.rolling(100).skew()
plt.plot(rolling100_data_points)
plt.show()
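
A time-based window such as '2s' and a count-based window such as 100 behave differently when samples are not evenly spaced. Here is a minimal sketch with made-up, irregularly spaced data; it uses sum() rather than std() or skew() so the numbers are easy to check by hand.

```python
import pandas as pd

# Made-up samples at 0 s, 1 s, 5 s and 6 s
index = pd.to_datetime([0, 1000, 5000, 6000], unit='ms')
series = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Time-based: each window covers the 2 seconds up to and including the sample
time_based = series.rolling('2s').sum()

# Count-based: each window covers the last 2 samples, whatever their spacing
count_based = series.rolling(2).sum()

print(time_based)
print(count_based)
```

At the 5 s sample, the time-based window contains only that sample because the previous one is 4 seconds older, while the count-based window still adds the previous sample. The first count-based value is NaN because the window is not yet full.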

# Zero Crossing

In [None]:
# np.where returns a tuple of index arrays; [0] picks the indices where the sign changes
plt.hist(np.where(np.diff(np.sign(data[:,1])))[0])
plt.show()
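
The one-liner above chains three NumPy calls. This sketch breaks them apart on a short made-up signal: np.sign maps each value to -1, 0 or 1, np.diff is non-zero wherever the sign changes between neighbours, and np.where returns the positions of those changes.

```python
import numpy as np

# Made-up signal that crosses zero three times
signal = np.array([1.0, 2.0, -1.0, -3.0, 2.0, 4.0, -1.0])

signs = np.sign(signal)               # 1 for positive, -1 for negative
changes = np.diff(signs)              # non-zero where neighbours differ in sign
crossings = np.where(changes != 0)[0]

# Each index i means the signal crosses zero between sample i and sample i + 1
print(crossings)
print(len(crossings))
```

Note that a sample that is exactly zero has sign 0, so it produces two consecutive changes and is counted twice with this approach.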

https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions