## Simulating real-time data using Apache Kafka producers

In this section we implement multiple Apache Kafka producers to simulate a real-time data stream. The stream is processed by an Apache Spark Streaming client, and the results are then inserted into MongoDB.

Description: A Python program that loads all the data from climate_streaming.csv and feeds one randomly chosen record to the stream every 5 seconds. Two additional fields, sender_id and created_at, are appended to each record.
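To make the record format concrete, here is a minimal sketch of the per-message step that runs without a broker. The two rows are hypothetical stand-ins for climate_streaming.csv (their values are borrowed from the sample output of the producer), and `build_record` and `serialize` are illustrative helper names that do not appear in the producer itself.

```python
import datetime as dt
import random
from json import dumps, loads

# Hypothetical stand-ins for two rows of climate_streaming.csv
rows = [
    {"latitude": -37.382, "longitude": 149.341, "air_temperature_celcius": 18},
    {"latitude": -37.452, "longitude": 148.115, "air_temperature_celcius": 8},
]


def build_record(row, sender_id):
    """Return a copy of row with created_at and sender_id added."""
    record = dict(row)
    record["created_at"] = dt.datetime.now().strftime("%X")
    record["sender_id"] = sender_id
    return record


def serialize(record):
    """Encode a record as JSON text, then as ASCII bytes."""
    return dumps(record).encode("ascii")


record = build_record(random.choice(rows), "producer_1")
payload = serialize(record)
print(payload)                         # the bytes that would be sent to Kafka
print(loads(payload.decode("ascii")))  # what a consumer gets back after decoding
```

`serialize` mirrors the `value_serializer` passed to `KafkaProducer` in the cell that follows, so a consumer only needs to decode the bytes and parse the JSON to recover the dictionary.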

In [None]:
# importing required libraries
from time import sleep
from json import dumps
from kafka import KafkaProducer
import random
import datetime as dt
import pandas as pd


# reading in data
df = pd.read_csv("climate_streaming.csv") # load the data set climate_streaming.csv into a DataFrame
df = df.astype("object") # cast every column to object dtype so values are plain Python objects that json can serialize


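def publish_message(producer_instance, topic_name, data):
    """Publish one record (a dict) to the given Kafka topic."""
    try:
        # send() is asynchronous: it queues the record and returns immediately
        producer_instance.send(topic_name, value=data)
        print('Message published successfully. Data: ' + str(data))
    except Exception as ex:
        print('Exception in publishing message.')
        print(str(ex))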
        
def connect_kafka_producer():
    """Create a KafkaProducer for the local broker; return None on failure."""
    _producer = None
    try:
        _producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                                  # serialize each dict to JSON text, then to bytes
                                  value_serializer=lambda x: dumps(x).encode('ascii'),
                                  api_version=(0, 10))
    except Exception as ex:
        print('Exception while connecting Kafka.')
        print(str(ex))
    return _producer
    
if __name__ == '__main__':

    topic = 'hotspot' # Kafka topic the records are published to;
                      # all three producers publish to the same topic 'hotspot'
    print('Publishing records..')
    producer = connect_kafka_producer()

    while True:
        # Picking a random row of the data set:
        r = random.randint(0, len(df) - 1)
        # Appending sender_id and created_at
        # (pd.concat replaces Series.append, which was removed in pandas 2.0):
        extra = pd.Series({"created_at": dt.datetime.now().strftime("%X"), "sender_id": "producer_1"})
        data = pd.concat([df.loc[r], extra])
        data = data.to_dict()
        publish_message(producer, topic, data)
        # To feed the data to the stream every five seconds:
        sleep(5)

Publishing records..
Message published successfully. Data: {'windspeed_knots': 7.2, 'relative_humidity': 53.6, 'max_wind_speed': 15.0, 'longitude': 149.341, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 18, 'created_at': '19:48:15', 'latitude': -37.382}
Message published successfully. Data: {'windspeed_knots': 7.3, 'relative_humidity': 43.8, 'max_wind_speed': 14.0, 'longitude': 148.115, 'sender_id': 'producer_1', 'precipitation ': ' 0.08G', 'air_temperature_celcius': 8, 'created_at': '19:48:20', 'latitude': -37.452}
Message published successfully. Data: {'windspeed_knots': 10.5, 'relative_humidity': 49.2, 'max_wind_speed': 20.0, 'longitude': 145.614, 'sender_id': 'producer_1', 'precipitation ': ' 0.12G', 'air_temperature_celcius': 14, 'created_at': '19:48:25', 'latitude': -35.889}
Message published successfully. Data: {'windspeed_knots': 9.0, 'relative_humidity': 51.0, 'max_wind_speed': 13.0, 'longitude': 144.0898, 'sender_id': 'producer_1', 'precipi

Message published successfully. Data: {'windspeed_knots': 9.3, 'relative_humidity': 48.1, 'max_wind_speed': 12.0, 'longitude': 149.325, 'sender_id': 'producer_1', 'precipitation ': ' 0.00G', 'air_temperature_celcius': 16, 'created_at': '19:50:51', 'latitude': -37.6}
Message published successfully. Data: {'windspeed_knots': 5.9, 'relative_humidity': 50.9, 'max_wind_speed': 13.0, 'longitude': 144.7505, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 14, 'created_at': '19:50:56', 'latitude': -36.3769}
Message published successfully. Data: {'windspeed_knots': 5.9, 'relative_humidity': 43.2, 'max_wind_speed': 14.0, 'longitude': 144.233, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 11, 'created_at': '19:51:01', 'latitude': -36.0856}
Message published successfully. Data: {'windspeed_knots': 5.5, 'relative_humidity': 43.2, 'max_wind_speed': 8.0, 'longitude': 148.042, 'sender_id': 'producer_1', 'precipitation ': ' 0.00G', 'a

Message published successfully. Data: {'windspeed_knots': 16.8, 'relative_humidity': 38.7, 'max_wind_speed': 22.9, 'longitude': 142.5679, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 17, 'created_at': '19:53:26', 'latitude': -35.2881}
Message published successfully. Data: {'windspeed_knots': 8.7, 'relative_humidity': 51.5, 'max_wind_speed': 15.0, 'longitude': 146.8907, 'sender_id': 'producer_1', 'precipitation ': ' 0.02G', 'air_temperature_celcius': 16, 'created_at': '19:53:31', 'latitude': -36.6859}
Message published successfully. Data: {'windspeed_knots': 7.9, 'relative_humidity': 55.5, 'max_wind_speed': 15.0, 'longitude': 141.505, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 24, 'created_at': '19:53:36', 'latitude': -36.2111}
Message published successfully. Data: {'windspeed_knots': 8.9, 'relative_humidity': 42.9, 'max_wind_speed': 15.9, 'longitude': 148.126, 'sender_id': 'producer_1', 'precipitation ': ' 0.12

Message published successfully. Data: {'windspeed_knots': 14.1, 'relative_humidity': 61.9, 'max_wind_speed': 23.9, 'longitude': 146.28799999999998, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 31, 'created_at': '19:56:01', 'latitude': -37.637}
Message published successfully. Data: {'windspeed_knots': 17.0, 'relative_humidity': 43.7, 'max_wind_speed': 27.0, 'longitude': 148.123, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 14, 'created_at': '19:56:06', 'latitude': -37.434}
Message published successfully. Data: {'windspeed_knots': 5.5, 'relative_humidity': 58.9, 'max_wind_speed': 14.0, 'longitude': 143.77200000000002, 'sender_id': 'producer_1', 'precipitation ': ' 0.08G', 'air_temperature_celcius': 18, 'created_at': '19:56:11', 'latitude': -36.1}
Message published successfully. Data: {'windspeed_knots': 7.1, 'relative_humidity': 47.4, 'max_wind_speed': 14.0, 'longitude': 141.278, 'sender_id': 'producer_1', 'precipi

Message published successfully. Data: {'windspeed_knots': 10.3, 'relative_humidity': 43.6, 'max_wind_speed': 15.0, 'longitude': 142.8935, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 20, 'created_at': '19:58:37', 'latitude': -37.3847}
Message published successfully. Data: {'windspeed_knots': 6.3, 'relative_humidity': 49.6, 'max_wind_speed': 13.0, 'longitude': 142.3405, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 23, 'created_at': '19:58:42', 'latitude': -36.1002}
Message published successfully. Data: {'windspeed_knots': 6.5, 'relative_humidity': 45.8, 'max_wind_speed': 9.9, 'longitude': 143.593, 'sender_id': 'producer_1', 'precipitation ': ' 0.00G', 'air_temperature_celcius': 14, 'created_at': '19:58:47', 'latitude': -37.692}
Message published successfully. Data: {'windspeed_knots': 9.1, 'relative_humidity': 52.8, 'max_wind_speed': 15.0, 'longitude': 144.384, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I'

Message published successfully. Data: {'windspeed_knots': 12.0, 'relative_humidity': 43.5, 'max_wind_speed': 16.9, 'longitude': 148.063, 'sender_id': 'producer_1', 'precipitation ': ' 0.04G', 'air_temperature_celcius': 10, 'created_at': '20:01:12', 'latitude': -37.375}
Message published successfully. Data: {'windspeed_knots': 7.7, 'relative_humidity': 39.3, 'max_wind_speed': 14.0, 'longitude': 148.091, 'sender_id': 'producer_1', 'precipitation ': ' 0.00G', 'air_temperature_celcius': 8, 'created_at': '20:01:17', 'latitude': -37.332}
Message published successfully. Data: {'windspeed_knots': 9.6, 'relative_humidity': 49.1, 'max_wind_speed': 16.9, 'longitude': 142.121, 'sender_id': 'producer_1', 'precipitation ': ' 0.01G', 'air_temperature_celcius': 15, 'created_at': '20:01:22', 'latitude': -34.282}
Message published successfully. Data: {'windspeed_knots': 9.2, 'relative_humidity': 50.7, 'max_wind_speed': 13.0, 'longitude': 142.986, 'sender_id': 'producer_1', 'precipitation ': ' 0.02G', 'a

Message published successfully. Data: {'windspeed_knots': 8.6, 'relative_humidity': 40.0, 'max_wind_speed': 15.0, 'longitude': 148.091, 'sender_id': 'producer_1', 'precipitation ': ' 0.00G', 'air_temperature_celcius': 9, 'created_at': '20:03:47', 'latitude': -37.434}
Message published successfully. Data: {'windspeed_knots': 10.1, 'relative_humidity': 60.6, 'max_wind_speed': 26.0, 'longitude': 143.358, 'sender_id': 'producer_1', 'precipitation ': ' 0.00I', 'air_temperature_celcius': 23, 'created_at': '20:03:52', 'latitude': -37.479}
Message published successfully. Data: {'windspeed_knots': 13.1, 'relative_humidity': 43.5, 'max_wind_speed': 21.0, 'longitude': 148.153, 'sender_id': 'producer_1', 'precipitation ': ' 0.24G', 'air_temperature_celcius': 11, 'created_at': '20:03:57', 'latitude': -37.465}
Message published successfully. Data: {'windspeed_knots': 17.7, 'relative_humidity': 39.3, 'max_wind_speed': 30.9, 'longitude': 142.1873, 'sender_id': 'producer_1', 'precipitation ': ' 0.01G',
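Note that `send()` in kafka-python is asynchronous: it queues the record and returns a future, so the "Message published successfully" line above is printed before the broker has acknowledged anything. If confirmation matters, one option is to block on that future. The following is a minimal sketch under that assumption; `publish_and_confirm` is a hypothetical helper, not part of the producer above, and it accepts any object whose `send()` returns a future with a `get(timeout=...)` method, as `KafkaProducer` does.

```python
def publish_and_confirm(producer, topic, data, timeout=10):
    """Send one record and wait until the broker acknowledges it.

    Returns the record metadata on success and None on failure.
    `producer` is assumed to behave like kafka-python's KafkaProducer.
    """
    try:
        future = producer.send(topic, value=data)
        # get() blocks until the send completes, or raises the delivery error
        metadata = future.get(timeout=timeout)
    except Exception as ex:
        print('Delivery failed: ' + str(ex))
        return None
    print('Delivered to {}[{}] at offset {}'.format(
        metadata.topic, metadata.partition, metadata.offset))
    return metadata


# Possible use inside the producer loop, in place of publish_message:
#     publish_and_confirm(producer, topic, data)
```

Blocking on every record trades throughput for certainty, which is acceptable here because the producer sends only one record every five seconds.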