### 3. Consuming data using Kafka (15%)

In this task, we will implement an Apache Kafka consumer to consume the data from task 2.8.

Important:

In this task, use a Kafka consumer to consume the streaming data published in task 2.8.

Do not use Spark in this task.

Your program should consume the latest data and display it on a map. Show only the latest data and discard the old data. The map should be refreshed whenever a new batch of data is consumed, and the number of top-up customers in each state should be shown on the map (for example, a choropleth map or a heatmap with a legend).

Hint: you can use libraries such as Plotly or folium to show data on a map. Please also provide instructions on how to install your plotting library.

In [1]:
!pip install plotly



In [2]:
from time import sleep
from kafka3 import KafkaConsumer
import datetime as dt
import matplotlib
import matplotlib.pyplot as plt

In [3]:
# this line is needed for the inline display of graphs in Jupyter Notebook
%matplotlib notebook

In [4]:

topic = "top_up_customer"


def connect_kafka_consumer():
    """Return a KafkaConsumer subscribed to `topic`, or None if the connection fails."""
    _consumer = None
    try:
        _consumer = KafkaConsumer(topic,
                                  consumer_timeout_ms=10000,  # stop iterating if no message arrives for 10 seconds
                                  auto_offset_reset='earliest',  # start from the earliest available message; remove this line to use the default ('latest')
                                  bootstrap_servers=['localhost:9092'],
                                  api_version=(0, 10))
    except Exception as ex:
        print('Exception while connecting to Kafka')
        print(str(ex))
    return _consumer
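The task asks for only the latest data, whereas the consumer above starts from the earliest available offset. The following is a minimal sketch of keeping the consumer settings in one place so the starting offset can be switched. The helper names `build_consumer_config` and `make_consumer` are hypothetical, and the sketch assumes that `kafka3.KafkaConsumer` accepts the same keyword arguments already used in this notebook.

```python
def build_consumer_config(latest_only=True, servers=("localhost:9092",), timeout_ms=10000):
    """Return keyword arguments for KafkaConsumer (hypothetical helper)."""
    return {
        "bootstrap_servers": list(servers),
        # 'latest' skips messages published before the consumer started;
        # 'earliest' replays everything still retained in the topic
        "auto_offset_reset": "latest" if latest_only else "earliest",
        "consumer_timeout_ms": timeout_ms,  # stop iterating after this much silence
        "api_version": (0, 10),
    }


def make_consumer(topic, **overrides):
    """Create a consumer for `topic`; this part needs a running broker."""
    # imported lazily so the config helper can be used without Kafka installed
    from kafka3 import KafkaConsumer
    config = build_consumer_config()
    config.update(overrides)
    return KafkaConsumer(topic, **config)
```

For example, `make_consumer("top_up_customer")` would read only new messages, while `make_consumer("top_up_customer", auto_offset_reset="earliest")` would reproduce the behaviour of the cell above. Because no consumer group is set here, there are no committed offsets, so this setting should decide where reading starts.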

In [5]:
from urllib.request import urlopen
import pandas as pd
import plotly.express as px
import json
# load the state-level GeoJSON boundaries for India
with urlopen('https://raw.githubusercontent.com/Subhash9325/GeoJson-Data-of-Indian-States/master/Indian_States') as response:
    states = json.load(response)



In [6]:
states["features"][0]["properties"]

{'ID_0': 105,
 'ISO': 'IND',
 'NAME_0': 'India',
 'ID_1': 1,
 'NAME_1': 'Andaman and Nicobar',
 'NL_NAME_1': '',
 'VARNAME_1': 'Andaman & Nicobar Islands|Andaman et Nicobar|Iihas de Andama e Nicobar|Inseln Andamanen und Nikobare',
 'TYPE_1': 'Union Territor',
 'ENGTYPE_1': 'Union Territory',
 'filename': '',
 'filename_1': '',
 'filename_2': '',
 'filename_3': '',
 'filename_4': ''}
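The properties above show that each feature carries a numeric `ID_1` and a state name in `NAME_1`. A minimal sketch of turning these into a name-to-ID lookup, and of copying the ID to the top-level `id` key of each feature, is given below. The function name `build_state_lookup` is hypothetical, and the two-feature sample is a cut-down stand-in for the real file (the second entry is invented for illustration).

```python
def build_state_lookup(geojson):
    """Give each feature a top-level 'id' and return a {state name: id} dict."""
    lookup = {}
    for feature in geojson["features"]:
        props = feature["properties"]
        feature["id"] = props["ID_1"]          # key used to match map regions
        lookup[props["NAME_1"]] = props["ID_1"]
    return lookup


# cut-down sample; only the first entry mirrors the output shown above
sample = {
    "features": [
        {"properties": {"ID_1": 1, "NAME_1": "Andaman and Nicobar"}},
        {"properties": {"ID_1": 2, "NAME_1": "Example State"}},  # invented entry
    ]
}

lookup = build_state_lookup(sample)
print(lookup)
```

With the real file, `build_state_lookup(states)` would give one entry per state or union territory in the GeoJSON.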

In [7]:
import ast
from IPython.display import clear_output


def consume_plot(consumer):
    try:
        # map each state name to a unique feature ID that Plotly can match on
        state_mapper = {}
        for feature in states["features"]:
            feature['id'] = feature["properties"]['ID_1']
            state_mapper[feature["properties"]['NAME_1']] = feature['id']
        # print('Waiting for messages')
        for resp in consumer:
            # parse the payload as a Python literal instead of calling eval() on raw message bytes
            response = ast.literal_eval(resp.value.decode('utf-8'))
            # map each state to its count
            state_count_info = {}
            for record in response:
                state_count_info[record['State']] = record['count']
            list_of_states = []
            # loop to make list of obj with state, id, count
            for i,j in state_count_info.items():
                temp_obj = {}
                temp_obj["State"] = i
                temp_obj["id"] = state_mapper[i]
                temp_obj["count"] = j
                list_of_states.append(temp_obj)
            # pandas to create df of the list generated
            df_states = pd.DataFrame(list_of_states)
            # abandon old data: clear the previous map before drawing the new one
            clear_output()
            # plot map, show state and count on hover
            fig = px.choropleth(df_states, geojson=states,
                                locations='id', color='count',
                                color_continuous_scale="Viridis",
                                range_color=(0, 12),
                                hover_name="State",
                                hover_data=['count'])
            fig.update_geos(fitbounds="locations", visible=False)
            fig.show()
    except Exception as ex:
        print(str(ex))
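The function above assumes that every message is a list of records with `State` and `count` keys, and that every state name appears in the GeoJSON; a single unknown name raises a `KeyError` and ends the loop. A minimal sketch of separating the parsing step from the plotting step, so that both assumptions can be checked without a broker, is shown below. The helper names `parse_counts` and `to_rows` are hypothetical, the sample payloads and IDs are made up, and the exact wire format produced in task 2.8 is an assumption.

```python
import ast
import json


def parse_counts(raw):
    """Decode one message value into a {state: count} dict."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    try:
        records = json.loads(text)            # payload written with json.dumps
    except json.JSONDecodeError:
        records = ast.literal_eval(text)      # payload written with str() or repr()
    return {rec["State"]: rec["count"] for rec in records}


def to_rows(counts, state_lookup):
    """Build plot rows and report state names missing from the lookup."""
    rows, unknown = [], []
    for state, count in counts.items():
        if state in state_lookup:
            rows.append({"State": state, "id": state_lookup[state], "count": count})
        else:
            unknown.append(state)
    return rows, unknown


# made-up payloads and IDs, for illustration only
payload = b"[{'State': 'Goa', 'count': 3}, {'State': 'Atlantis', 'count': 1}]"
rows, unknown = to_rows(parse_counts(payload), {"Goa": 10})
print(rows, unknown)
```

The `rows` list can be passed to `pd.DataFrame` as in the cell above, while `unknown` lists the names that would otherwise have stopped the loop.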

In [8]:
if __name__ == '__main__':
    consumer = connect_kafka_consumer()
    consume_plot(consumer)

#### References

For loading JSON

https://plotly.com/python/choropleth-maps/

For plotting the India map

https://www.youtube.com/watch?v=aJmaw3QKMvk

For abandoning old data
