## 3. Consumer

This is the last part for the assignment, it will start by installing the library plotly using `!pip install plotly`. This is done in order to be used for creating a map as below:


In [None]:
#Uncomment this to install the package for making a chloropleth map
#!pip install plotly

Once we have install our map plotting library, we will install both `urllib` and `json` to download geojson from a website that contain a map information for our data streaming (in this case it is a State from India) so we use `urlopen` and assign it as `counties`. Next, we will apply the loop for making a State store inside properties of file `ST_NM` to be uppercase which match our streaming dataframe from straming side as the following:

In [1]:
from urllib.request import urlopen
import json

#Download geojson of India state from website (Malineni,2019)
with urlopen("https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/india_states.geojson") as response:
    counties = json.load(response)

#Apply loop to make the State name in the geojson in an uppercase
for i in range(len(counties["features"])-1):
    counties['features'][i]["properties"]['ST_NM']=counties['features'][i]["properties"]['ST_NM'].upper()

After that we will create a map based on the data we receive from kafka consumer and geojson based on the code below:

In [17]:
from time import sleep
from kafka import KafkaConsumer
import datetime as dt
import matplotlib
import matplotlib.pyplot as plt
from json import loads
import pandas as pd
import numpy as np
import plotly.express as px
import plotly
from IPython.display import clear_output


# this line is needed for the inline display of graphs in Jupyter Notebook
%matplotlib notebook

#Our topic we create from streaming side
topic = 'Top_up_count_state'

#Function derived from week9-11 tutorial consumer files
def connect_kafka_consumer():
    """
    Function setting the consumer by apply our topic create from streaming side
    auto_offset_reset is use to make a plot based on the information that store before 
    kafka consumer send data, while consumer timeout is set to stop when data is come to late 
    we have to re run the code again
    """
    _consumer = None
    try:
         _consumer = KafkaConsumer(topic,
                                   auto_offset_reset='earliest',
                                   consumer_timeout_ms=1000000, # stop iteration if no message after 10 sec # 
                                   bootstrap_servers=['localhost:9092'],
                                   api_version=(0, 10))
    except Exception as ex:
        print('Exception while connecting Kafka')
        print(str(ex))
    finally:
        return _consumer
    

def consume_messages(consumer):
    """
    Function for receiving the data from consumer and turn it into a map plot
    by using a combining of for loop on consumer and decode the message value and 
    change the json format  we receive to dictionary and store it to dataframe
    and using plotly to build the map with count of top-up by state
    """
    try:
        for message in consumer:
            data = message.value.decode('utf-8')
            data = data.replace('[','')
            data = data.replace(']','')
            data = eval(data)
            df = pd.DataFrame([data])
            print(message)
            
            #Plotly plot based on the plotly website and stackoverflow
            fig = px.choropleth(df, geojson=counties, locations='State', color='count',
                                featureidkey='properties.ST_NM',color_continuous_scale="twilight",
                                range_color=(0, 50),scope="asia",labels={'count':'Top-up rate'})
            fig.update_geos(fitbounds="locations")
            fig.show()
            clear_output(wait=True)
            sleep(20) #Show image for 20 seconds
        

    except Exception as ex:
        print(str(ex))
    
# main
if __name__ == '__main__':

    consumer = connect_kafka_consumer()  
    consume_messages(consumer)   

ConsumerRecord(topic='Top_up_count_state', partition=0, offset=7, timestamp=1666062587343, timestamp_type=0, key=b'2022-10-18 03:07:30', value=b'[{"State":"WEST BENGAL","count":29}, {"State":"ORISSA","count":3}]', headers=[], checksum=1950726576, serialized_key_size=19, serialized_value_size=66, serialized_header_size=-1)
Value of 'locations' is not the name of a column in 'data_frame'. Expected one of [0, 1] but received: State


Once we run the code, we need to interupt it if we want to stop it.
The map above that I have provide is appending. In the future, if it is possible I want to make a changable map that change in a live stream form.

## References

- $\textit{Choropleth Maps in Python}$. (nd.). https://plotly.com/python/choropleth-maps/
- Jupyter Notebooks:FIT 5202 Data Processing in Big Data (2022). $\textit{Week 9 consumer notebook}$.
https://lms.monash.edu/course/view.php?id=140960&section=14
- Jupyter Notebooks:FIT 5202 Data Processing in Big Data (2022). $\textit{Week 10 consumer notebook }$. https://lms.monash.edu/course/view.php?id=140960&section=16
- Jupyter Notebooks:FIT 5202 Data Processing in Big Data (2022). $\textit{Week 11 consumer notebook}$.
https://lms.monash.edu/course/view.php?id=140960&section=17
- Malineni N.K. (2020, March 29). $\textit{Is there any way to draw INDIA Map in plotly?}$. https://stackoverflow.com/questions/60910962/is-there-any-way-to-draw-india-map-in-plotly