# Analyzing IoT Data with Jupyter Notebooks

This notebook assumes that you have access to an AWS IoT Analytics dataset that already contains data. The corresponding configuration was shown in the lecture.

To read the IoT data from the dataset, the following steps are necessary:

1. Installation of necessary tools
    - AWS Command Line Interface (CLI)
    - AWS Python SDK
2. Configuration of access to the dataset
    - Create AWS Access Key
    - Configure AWS CLI 

## Installing the necessary tools

Download the latest version of the [AWS CLI](https://aws.amazon.com/cli/) for your operating system
and install it.

To install the [AWS Python SDK](https://aws.amazon.com/sdk-for-python/), use `pip`. The following command
installs the latest version of the AWS Python SDK:

```shell
pip install boto3
```

If you work with virtual environments, make sure that the AWS Python SDK
is installed in the active environment as well (e.g. via the corresponding function in your IDE).
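
A quick way to see whether the package is visible to the interpreter you are actually running is to ask Python itself. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for this example.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# sys.executable shows which interpreter (and therefore which environment) is in use.
print("Interpreter:", sys.executable)
print("Missing packages:", missing_packages(["boto3"]))
```

If `boto3` shows up as missing, run `pip install boto3` inside the environment that belongs to the interpreter printed above.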

## Configuration of access to the dataset 

First, create an access key. The AWS Python SDK uses it to authenticate when accessing AWS resources.
To create an access key, perform the following steps:

1. Open the [AWS Console](https://console.aws.amazon.com/). 
2. Navigate to the IAM service.
3. Open your user and navigate to the "Security Credentials" tab.
4. Click on "Create access key".
5. Once the access key is created, save the "Access Key ID" and the "Secret Access Key". The secret access key is only shown at this point and cannot be retrieved later.

Next, the AWS Python SDK needs to be configured to use the access key. This is done with the AWS CLI.
To perform the configuration, execute the following command:

```shell 
aws configure
```

The configuration program asks for the access key ID and the secret access key. Furthermore, you
need to specify the AWS region in which your resources are located (e.g. eu-central-1). It also asks
for a default output format, which can be left empty.
 
After that, the configuration is complete, and you can read IoT data from the dataset and analyze it using Python 🐍.
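
`aws configure` stores its answers in two INI files in the `.aws` folder of your home directory: `credentials` (access key ID and secret access key) and `config` (region and output format), both under the `default` profile. The sketch below checks that these entries exist without printing the secret itself. The function name `read_default_profile` is made up for this example, and it only looks at the `default` profile.

```python
import configparser
from pathlib import Path


def read_default_profile(aws_dir=None):
    """Summarize what `aws configure` stored for the default profile."""
    aws_dir = Path(aws_dir) if aws_dir is not None else Path.home() / ".aws"

    credentials = configparser.ConfigParser()
    credentials.read(aws_dir / "credentials")  # a missing file is silently skipped
    config = configparser.ConfigParser()
    config.read(aws_dir / "config")

    # Only report whether the keys are present; never print the secret.
    has_key = credentials.has_option(
        "default", "aws_access_key_id"
    ) and credentials.has_option("default", "aws_secret_access_key")
    region = config.get("default", "region", fallback=None)
    return {"has_access_key": has_key, "region": region}


print(read_default_profile())
```

If `has_access_key` is `False` or `region` is `None`, running `aws configure` again is usually the simplest fix.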

## Example: Displaying IoT data with matplotlib

First, we use the Jupyter magic command `%matplotlib inline` to display the graphs inline
in the notebook.

In [None]:
%matplotlib inline

Next, the `boto3` library is imported and an `iotanalytics` client is created. This client can
then be used to retrieve the content of the AWS IoT Analytics dataset. Replace `myiotanalytics_dataset` with the name of your own dataset.

In [None]:
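import boto3

# Create a client for AWS IoT Analytics (uses the credentials stored by `aws configure`).
client = boto3.client("iotanalytics")
# Request the current content of the dataset; adjust the name to match your dataset.
data = client.get_dataset_content(datasetName="myiotanalytics_dataset")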
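
The response is a plain dictionary. Besides the `entries` list, it carries a `status` whose `state` is `CREATING`, `SUCCEEDED`, or `FAILED`. The following sketch shows one way to check the state before using the data. The helper `first_data_uri` and the sample response are made up for illustration, and the URL is a placeholder rather than a real dataset location.

```python
def first_data_uri(response):
    """Return the dataURI of the first entry of a get_dataset_content response."""
    state = response.get("status", {}).get("state")
    if state != "SUCCEEDED":
        raise RuntimeError(f"Dataset content is not ready (state: {state})")
    entries = response.get("entries", [])
    if not entries:
        raise RuntimeError("Dataset content has no entries")
    return entries[0]["dataURI"]


# Hand-written stand-in for a real response; the URL is a placeholder.
sample_response = {
    "status": {"state": "SUCCEEDED"},
    "entries": [{"entryName": "example", "dataURI": "https://example.com/data.csv"}],
}
print(first_data_uri(sample_response))
```

With a real response, `first_data_uri(data)` would return the same value as `data["entries"][0]["dataURI"]`, but it fails with a readable message if the dataset content has not been generated yet.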

The response does not contain the data itself but a URI that points to a CSV file containing the data. This CSV file can be processed, for example, using `pandas`. It might be necessary to install the pandas library first (`pip install pandas`).

In [None]:
import pandas

df = pandas.read_csv(data["entries"][0]["dataURI"])
df
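
By default, `read_csv` loads a timestamp column as plain text. If the timestamps are stored in a format pandas can parse, they can be converted to real datetime values while reading, which gives a proper time axis when plotting. The sketch below uses made-up sample rows and assumes ISO 8601 timestamps as well as the column names `timestamp`, `temperature`, and `humidity`. Your dataset may use a different format or different names.

```python
import io

import pandas

# Made-up sample rows standing in for the downloaded CSV file.
sample_csv = io.StringIO(
    "timestamp,temperature,humidity\n"
    "2024-01-01T10:05:00,21.7,39.8\n"
    "2024-01-01T10:00:00,21.5,40.1\n"
)

# parse_dates converts the listed columns to datetime values while reading.
sample_df = pandas.read_csv(sample_csv, parse_dates=["timestamp"])
# Sort by time so that a line plot connects the points in chronological order.
sample_df = sample_df.sort_values("timestamp").reset_index(drop=True)
print(sample_df.dtypes)
```

For the real dataset, the same `parse_dates` argument can be passed to the `read_csv` call that reads the data URI.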

Finally, the data of the dataset can be plotted using `matplotlib`.

In [None]:
import matplotlib.pyplot as plt

ax = plt.gca()

df.plot(x="timestamp", y="humidity", kind="line", figsize=(20, 10), ax=ax)
df.plot(x="timestamp", y="temperature", kind="line", ax=ax)

ax.set_xlabel("Date")