### Task 1: Processing Data Stream
a. **Event Producer 1**: Write a Python program that loads all the data from
climate_streaming.csv and feeds one randomly selected record (sampled with
replacement) to the stream every 10 seconds. Each record needs additional
information appended, such as producer information to identify the producer and
the created date. Save the file as **Assignment_PartB_Producer1.ipynb**.

In [2]:
# import the necessary libraries
import pymongo
from pymongo import MongoClient
from time import sleep
import json
from kafka import KafkaProducer
import random
from datetime import datetime, timedelta
import pandas as pd
from pprint import pprint

print("Pandas Version: " + pd.__version__)

Pandas Version: 1.4.2


Read the *climate_streaming.csv* file using the pandas library.

In [3]:
# read the climate_streaming.csv using the read_csv function in pandas library
climate_streaming_df = pd.read_csv('climate_streaming.csv')
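
Feeding the stream "randomly (with replacement)" means every iteration draws from the full set of rows, so the same row can be sent more than once. The following is a minimal sketch of that idea using only the standard library. The inline CSV text and the helper name `draw_row` are made up for illustration and merely stand in for the real *climate_streaming.csv* contents.

```python
import csv
import io
import random

# Hypothetical stand-in for climate_streaming.csv (values are illustrative only)
SAMPLE_CSV = """latitude,longitude,air_temperature_celcius
-37.0,148.0,9
-36.0,143.0,12
-37.5,141.0,10
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

def draw_row(rows, rng=random):
    """Return one row chosen uniformly at random.

    Rows are never removed, so repeated calls sample with replacement.
    """
    # randrange(n) yields 0..n-1, so every row, including the last, can be picked
    return rows[rng.randrange(len(rows))]

rng = random.Random(42)  # seeded only to make the sketch repeatable
draws = [draw_row(rows, rng) for _ in range(10)]
print(len(rows), len(draws))
```

Ten draws from three rows necessarily repeat at least one row, which is the behaviour "with replacement" asks for.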

This producer writes records to the Kafka topic `assignment`. Each record has `key = "climate_producer"`, and its `value` is a randomly selected row of climate data from the *climate_streaming.csv* file, serialized as JSON.
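
As a minimal sketch of what such a record looks like on the wire, the snippet below serializes a dictionary to JSON, encodes the key and value as UTF-8 bytes, and decodes them again the way a consumer could. It does not contact Kafka. The helper names `encode_record` and `decode_record` and the sample values are assumptions for illustration.

```python
import json

def encode_record(key, payload):
    """Turn a string key and a dict payload into the byte pair a producer would send."""
    return key.encode("utf-8"), json.dumps(payload).encode("utf-8")

def decode_record(key_bytes, value_bytes):
    """Reverse of encode_record: recover the string key and the dict payload."""
    return key_bytes.decode("utf-8"), json.loads(value_bytes.decode("utf-8"))

sample = {"latitude": -37.434, "producer": "climate_producer"}  # illustrative values
key_bytes, value_bytes = encode_record("climate_producer", sample)
key, payload = decode_record(key_bytes, value_bytes)
print(key, payload)
```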

In [4]:
def publish_message(producer_instance, topic_name, key, value):
    try:
        key_bytes = bytes(key, encoding='utf-8') # encode the key to bytes
        value_bytes = bytes(value, encoding='utf-8') # encode the value to bytes
        producer_instance.send(topic_name, key=key_bytes, value=value_bytes) # send the key and value to the topic
        producer_instance.flush()
        print('Message published successfully. Data: ' + value)
    except Exception as ex:
        print('Exception in publishing message.')
        print(str(ex))

def connect_kafka_producer():
    _producer = None
    try:
        _producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                                  api_version=(0, 10))
    except Exception as ex:
        print('Exception while connecting Kafka.')
        print(str(ex))
    # return after the try/except rather than inside `finally`,
    # so that interrupts such as KeyboardInterrupt are not swallowed
    return _producer
    
if __name__ == '__main__':
    
    topic = 'assignment'
    
    print('Publishing records..')
    producer = connect_kafka_producer() # create a Kafka producer
    
    # Read from the MongoDB database to get the latest climate data date
    client = MongoClient()
    climate_historic_data = client.assignment_db.climate_historic
    latest_date = list(climate_historic_data.find({},{"_id":0, "date":1}).sort("date",-1).limit(1))[0]['date']
    n = len(climate_streaming_df)
    
    while True:
        
        # for every iteration add one day to the previous latest date
        latest_date += timedelta(days=1)
        
        # randomly select one row of climate streaming data;
        # randrange(n) returns an index from 0 to n-1, so the last row can also be chosen
        rand_climate_data = climate_streaming_df.iloc[random.randrange(n)]
        # note: the column name includes a trailing space
        precipitation_data = rand_climate_data["precipitation "].replace(" ", "")
        
        # create a new JSON object with all the random climate data in the correct datatype
        climate_data = {
            "latitude": float(rand_climate_data["latitude"]),
            "longitude": float(rand_climate_data["longitude"]),
            "air_temperature_celcius": int(rand_climate_data["air_temperature_celcius"]),
            "relative_humidity": float(rand_climate_data["relative_humidity"]),
            "windspeed_knots": float(rand_climate_data["windspeed_knots"]),
            "max_wind_speed": float(rand_climate_data["max_wind_speed"]),
            # everything except the last character is the amount; the last character is the flag
            "precipitation": {"amount": float(precipitation_data[:-1]), "flag": precipitation_data[-1]},
            "GHI_w/m2": int(rand_climate_data["GHI_w/m2"]),
            "date": latest_date.strftime("%d/%m/%Y"),
            "producer": "climate_producer"
           }
        publish_message(producer, topic, 'climate_producer', json.dumps(climate_data))
        sleep(10) # sleep for 10 seconds

Publishing records..
Message published successfully. Data: {"latitude": -37.434, "longitude": 148.091, "air_temperature_celcius": 9, "relative_humidity": 40.0, "windspeed_knots": 8.6, "max_wind_speed": 15.0, "precipitation": {"amount": 0.0, "flag": "G"}, "GHI_w/m2": 84, "date": "01/01/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -36.2669, "longitude": 143.1906, "air_temperature_celcius": 12, "relative_humidity": 42.7, "windspeed_knots": 10.0, "max_wind_speed": 15.9, "precipitation": {"amount": 0.01, "flag": "G"}, "GHI_w/m2": 109, "date": "02/01/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.0899, "longitude": 141.0238, "air_temperature_celcius": 9, "relative_humidity": 42.2, "windspeed_knots": 6.4, "max_wind_speed": 9.9, "precipitation": {"amount": 0.01, "flag": "G"}, "GHI_w/m2": 82, "date": "03/01/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.472,

Message published successfully. Data: {"latitude": -37.758, "longitude": 148.721, "air_temperature_celcius": 9, "relative_humidity": 41.2, "windspeed_knots": 9.8, "max_wind_speed": 15.9, "precipitation": {"amount": 0.0, "flag": "I"}, "GHI_w/m2": 83, "date": "28/01/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -36.0966, "longitude": 142.3635, "air_temperature_celcius": 22, "relative_humidity": 57.0, "windspeed_knots": 8.5, "max_wind_speed": 15.0, "precipitation": {"amount": 0.0, "flag": "G"}, "GHI_w/m2": 178, "date": "29/01/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -36.952, "longitude": 144.972, "air_temperature_celcius": 21, "relative_humidity": 57.3, "windspeed_knots": 5.4, "max_wind_speed": 9.9, "precipitation": {"amount": 0.0, "flag": "I"}, "GHI_w/m2": 169, "date": "30/01/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.719, "longitude": 142.154, "

Message published successfully. Data: {"latitude": -36.4325, "longitude": 144.3142, "air_temperature_celcius": 26, "relative_humidity": 61.0, "windspeed_knots": 9.3, "max_wind_speed": 15.0, "precipitation": {"amount": 0.2, "flag": "G"}, "GHI_w/m2": 202, "date": "24/02/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.087, "longitude": 145.37, "air_temperature_celcius": 9, "relative_humidity": 40.1, "windspeed_knots": 7.5, "max_wind_speed": 11.1, "precipitation": {"amount": 0.0, "flag": "I"}, "GHI_w/m2": 84, "date": "25/02/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -36.275, "longitude": 142.785, "air_temperature_celcius": 6, "relative_humidity": 30.8, "windspeed_knots": 5.7, "max_wind_speed": 11.1, "precipitation": {"amount": 0.0, "flag": "I"}, "GHI_w/m2": 60, "date": "26/02/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.453, "longitude": 148.111, "ai

Message published successfully. Data: {"latitude": -36.277, "longitude": 146.165, "air_temperature_celcius": 20, "relative_humidity": 57.0, "windspeed_knots": 8.7, "max_wind_speed": 13.0, "precipitation": {"amount": 0.0, "flag": "I"}, "GHI_w/m2": 161, "date": "23/03/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.406, "longitude": 148.123, "air_temperature_celcius": 12, "relative_humidity": 44.7, "windspeed_knots": 11.4, "max_wind_speed": 18.1, "precipitation": {"amount": 0.0, "flag": "G"}, "GHI_w/m2": 107, "date": "24/03/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.296, "longitude": 144.386, "air_temperature_celcius": 11, "relative_humidity": 40.8, "windspeed_knots": 12.2, "max_wind_speed": 20.0, "precipitation": {"amount": 0.24, "flag": "G"}, "GHI_w/m2": 102, "date": "25/03/2022", "producer": "climate_producer"}
Message published successfully. Data: {"latitude": -37.087, "longitude": 145.36

KeyboardInterrupt: 
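
The output above shows each precipitation reading split into a numeric `amount` and a one-letter `flag`, and the `date` advancing by one day per message. The sketch below isolates those two steps. It assumes the raw precipitation value is a number followed by a single flag character, possibly with stray spaces, as the producer code implies. The function names and the starting date are hypothetical.

```python
from datetime import datetime, timedelta

def parse_precipitation(raw):
    """Split a raw value such as ' 0.01G' into its amount and flag."""
    cleaned = raw.replace(" ", "")
    return {"amount": float(cleaned[:-1]), "flag": cleaned[-1]}

def next_dates(start, count):
    """Return `count` consecutive dates after `start`, formatted as dd/mm/YYYY."""
    return [(start + timedelta(days=i)).strftime("%d/%m/%Y")
            for i in range(1, count + 1)]

print(parse_precipitation(" 0.01G"))
# assumed starting point: the day before the first date in the sample output
print(next_dates(datetime(2021, 12, 31), 3))
```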

#### References
- Date increment adapted from https://www.adamsmith.haus/python/answers/how-to-increment-a-datetime-object-by-one-day-in-python
- Encoding and decoding JSON adapted from https://www.geeksforgeeks.org/encoding-and-decoding-custom-objects-in-python-json/#:~:text=Python%20provides%20a%20built%2Din,and%20start%20using%20its%20functionality.&text=For%20encoding%2C%20we%20use%20json,%2C%20we'll%20use%20json.