Set up an MQTT broker and send messages

Use the diabetes.csv dataset to do the following:
1. Select the following 4 attributes (3 features + 1 class label):
• Glucose, BloodPressure, Insulin, Outcome
2. Normalize Glucose, BloodPressure and Insulin to [0, 1] using MinMax.
3. Store the new data (3 normalized features + 1 class label) in another dataset S.
4. Modify the MQTT example to do the following:
• The publisher publishes records in S continuously. When it reaches the end of S, it continues to send from the
beginning again.
• The subscriber continuously receives the data. For each latest record r received, apply the 3NN classification to the
last 5 records before r, and compare the classification result with the Outcome label in r.
• Repeat this 1000 times, and report the number of correct classifications.

In [1]:
#Import Libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

In [3]:
#Load data
df = pd.read_csv("Datasets/diabetes.csv")

In [4]:
#Quick EDA
df.info()
df.isnull().sum()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Pregnancies               768 non-null    int64  
 1   Glucose                   768 non-null    int64  
 2   BloodPressure             768 non-null    int64  
 3   SkinThickness             768 non-null    int64  
 4   Insulin                   768 non-null    int64  
 5   BMI                       768 non-null    float64
 6   DiabetesPedigreeFunction  768 non-null    float64
 7   Age                       768 non-null    int64  
 8   Outcome                   768 non-null    int64  
dtypes: float64(2), int64(7)
memory usage: 54.1 KB


Pregnancies                 0
Glucose                     0
BloodPressure               0
SkinThickness               0
Insulin                     0
BMI                         0
DiabetesPedigreeFunction    0
Age                         0
Outcome                     0
dtype: int64

1. Select the following 4 attributes (3 features + 1 class label):
• Glucose, BloodPressure, Insulin, Outcome

In [5]:
#Store the four selected attributes in X
X = df[["Glucose", "BloodPressure", "Insulin","Outcome"]]

2. Normalize Glucose, BloodPressure and Insulin to [0, 1] using MinMax.

In [9]:
#Scale the data to [0, 1] with MinMax (Outcome is already 0/1, so it is unchanged); X now plays the role of S
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
X = pd.DataFrame(scaler.fit_transform(X), columns=X.columns) 
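MinMax scaling maps each value x of a column to (x - min) / (max - min), so the column minimum becomes 0 and the maximum becomes 1. The following is a minimal pure-Python sketch of that formula; the helper name `min_max_scale` and the sample glucose readings are made up for illustration and are not taken from diabetes.csv.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; this sketch maps every value to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical glucose readings, not real rows from the dataset
glucose = [80, 110, 140, 200]
scaled = min_max_scale(glucose)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

Applying the same mapping to a 0/1 column such as Outcome leaves it unchanged, since its minimum is already 0 and its maximum is already 1.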

4. Modify the MQTT example to do the following:
* The publisher publishes records in S continuously. When it reaches the end of S, it continues to send from the
beginning again.
* The subscriber continuously receives the data. For each latest record r received, apply the 3NN classification to the
last 5 records before r, and compare the classification result with the Outcome label in r.
* Repeat this 1000 times, and report the number of correct classifications.
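The subscriber-side logic described above can be sketched without any MQTT code: keep the last 5 records in a sliding window, classify each new record by majority vote among its 3 nearest neighbours in that window, and count the matches. This is one possible reading of the task, assuming Euclidean distance and that a record is only classified once 5 earlier records are available; the names `knn_predict` and `count_correct` and the sample records are hypothetical.

```python
from collections import Counter, deque
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k records in `train` nearest to `query`.

    `train` is an iterable of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def count_correct(stream, window_size=5, k=3, trials=1000):
    """Classify each record against the `window_size` records before it.

    Returns (correct, tested); stops after `trials` classifications
    or when the stream runs out, whichever comes first.
    """
    window = deque(maxlen=window_size)  # oldest record drops out automatically
    correct = tested = 0
    for features, label in stream:
        if len(window) == window_size:
            if knn_predict(window, features, k) == label:
                correct += 1
            tested += 1
            if tested == trials:
                break
        window.append((features, label))
    return correct, tested


# Made-up normalized records: ((Glucose, BloodPressure, Insulin), Outcome)
records = [
    ((0.10, 0.10, 0.10), 0),
    ((0.20, 0.20, 0.20), 0),
    ((0.90, 0.90, 0.90), 1),
    ((0.80, 0.80, 0.80), 1),
    ((0.15, 0.15, 0.15), 0),
    ((0.85, 0.85, 0.85), 1),  # neighbours vote 1, 1, 0 -> predicted 1 (correct)
    ((0.12, 0.12, 0.12), 1),  # neighbours vote 0, 0, 1 -> predicted 0 (wrong)
]
print(count_correct(records))  # (1, 2)
```

With k = 3 and a binary label the vote can never tie, so no tie-breaking rule is needed. Reaching 1000 classifications requires at least 1005 received records, because the first 5 only fill the window.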

In [15]:
import time
import paho.mqtt.client as mqtt
import json

#Create a client
mqtt_client = mqtt.Client()

#Connect to the public broker on port 1883 with a keepalive of 60 seconds
mqtt_client.connect("mqtt.eclipseprojects.io", 1883, 60)

#Send the following data repeatedly. Once we reach the end, we start from the beginning again
period = len(X)

if __name__  == "__main__":
    print("Publishing....")

    index = 0
    while index <= 1000:
        #Get the current data reading to send out; the modulo wraps back to the start of X
        record = X.loc[index % period]

        #Publish the data reading to the topic "diabetes/records"
        mqtt_client.publish("diabetes/records", record.to_json(orient="index"))

        #We send the next reading after 0.2 seconds
        time.sleep(0.2)
        index = index + 1

Publishing....
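The receiving side is not shown above, so here is a hedged sketch of how a subscriber might decode what this publisher sends. It assumes each payload is the JSON object produced by `record.to_json(orient="index")`, with the keys Glucose, BloodPressure, Insulin and Outcome. The names `parse_record`, `received` and `run_subscriber` are hypothetical, and `mqtt.Client()` is called as in the publisher cell; paho-mqtt 2.x may require a `CallbackAPIVersion` argument there.

```python
import json

TOPIC = "diabetes/records"  # same topic the publisher uses
received = []               # (features, label) tuples, newest last


def parse_record(payload):
    """Decode one JSON payload into ((glucose, blood_pressure, insulin), outcome)."""
    data = json.loads(payload)
    features = (data["Glucose"], data["BloodPressure"], data["Insulin"])
    return features, int(data["Outcome"])


def on_message(client, userdata, msg):
    # paho passes the raw message bytes in msg.payload
    received.append(parse_record(msg.payload))


def on_connect(client, userdata, *args):
    # *args absorbs the trailing arguments, which differ between paho-mqtt versions
    client.subscribe(TOPIC)


def run_subscriber(host="mqtt.eclipseprojects.io", port=1883):
    # Imported here so the parsing code above can be used without paho installed
    import paho.mqtt.client as mqtt

    client = mqtt.Client()  # assumption: same constructor call as the publisher
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect(host, port, 60)
    client.loop_forever()   # blocks and dispatches incoming messages to on_message
```

Calling `run_subscriber()` in a separate process or notebook before starting the publisher would fill `received`; the 3NN step can then be applied to the last 5 entries each time a new one arrives.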
