## Report Generation
This notebook is a continuation of the earlier notebooks (i.e. [Article Classifier](./News%20Classifier/FYP%20Article%20Classification%20Codes.ipynb) and [3D Plot Visualization](./3D%20Plotting%20and%20Visualization/3D%20Plot%20Visualization.ipynb)).  

It takes the resulting CSV from the 3D Plot Visualization notebook, reads one row at a time, and generates one report per row.

### Import Necessary Libraries
spaCy was used for Named Entity Recognition (NER) when reporting the content summary.  
Requests was used to make HTTP requests to Nominatim to retrieve longitude and latitude data for the countries identified.  
Termcolor was used to style the generated report, helping our stakeholders gain quick insights.  
Ipyleaflet was used to generate the interactive map included in the report.

In [1]:
import pandas as pd
import numpy as np
import re
import copy

import spacy
from spacy import displacy

# !pip3 install requests
import requests

# !pip install termcolor
from termcolor import colored

# !pip install ipyleaflet
# !jupyter nbextension enable --py --sys-prefix ipyleaflet
from ipyleaflet import Map, Marker, AwesomeIcon, CircleMarker, LayerGroup, LegendControl

#### Creating a function to identify the stage of supply chain that is affected by the trigger event
The current list of keywords for each stage of the supply chain is based on Professor Paul Teng's framework, which outlines the food supply chain. The list is non-exhaustive and should be expanded, with advice from food supply chain experts, to improve the robustness of our model.

In [2]:
def identify_supply_chain(artcl):
    supply_chain = {
        "Input Supplier" : ["seed", "fertiliser", "crop protection", "animal health", "farm machinery", "irrigation"],
        "Grower/Producer" : ["farmer", "grower", "animal raiser"],
        "Processor" : ["packaging products provider", "cannery", "dressing plant", "mill", "slaughter house", "packing plant"],
        "Distributor" : ["wholesaler", "importer", "exporter", "logistics provider"],
        "Retailer" : ["supermarket", "restaurant", "convenience store"]
    }
    
    stage_of_supply_chain = []
    
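    for stage, keywords in supply_chain.items():
        for keyword in keywords:
            # raw strings so that \b is a regex word boundary, not a backspace character
            search_term = r"\b" + re.escape(keyword) + r"\b"
            if re.search(search_term, artcl) is not None:
                stage_of_supply_chain.append(stage)
    
    if len(stage_of_supply_chain) == 0:
        return "N/A"
    else: 
        # join the sorted unique stages into one string so the caller can concatenate it
        return ", ".join(np.unique(np.array(stage_of_supply_chain)))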
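
The keyword match relies on regular-expression word boundaries. The sketch below is a minimal illustration of that idea, not the notebook's own code: the helper name `match_stages` and the shortened `STAGES` table are assumptions introduced only for this example.

```python
import re

# Shortened, illustrative keyword table (assumption: a subset of the list above)
STAGES = {
    "Grower/Producer": ["farmer", "grower"],
    "Distributor": ["wholesaler", "importer", "exporter"],
}

def match_stages(text):
    """Return the sorted stage names whose keywords occur as whole words in text."""
    found = set()
    for stage, keywords in STAGES.items():
        for keyword in keywords:
            # r"\b" is a regex word boundary; "\b" without the r prefix is a backspace character
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                found.add(stage)
                break  # one matching keyword is enough for this stage
    return sorted(found)

print(match_stages("A grain exporter warned that the farmer could not ship."))
print(match_stages("Exporters were affected."))
```

The second call finds nothing, because a whole-word, case-sensitive pattern does not match plural or capitalised forms such as "Exporters". This is one reason the keyword list may need expanding.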

#### Function to display the interactive map:
1. To obtain all of the countries mentioned in the content summary
2. For all of these countries, get their longitude and latitude data
3. Instantiate the map object, with markers corresponding to each of the countries

In [3]:
def displayMap(doc): # takes in the NER processed text!
    # get all countries mentioned
    countries = {}
    for word in doc.ents:
        if word.label_ == "GPE" and word.text not in countries:
            countries[word.text] = []
#     print(countries)
    
    copy_of_countries = copy.deepcopy(countries)
    # get all longitude and latitude of countries mentioned
    for country,center in countries.items():
        try: 
            url = f"http://nominatim.openstreetmap.org/search?country={country}&format=json&polygon=0"
            response = requests.get(url).json()[0]
            copy_of_countries[country] = [response['lat'], response['lon']]
        except IndexError:
            copy_of_countries.pop(country)

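    # finding the center of all geocoded countries - for display sake!
    sumLat = 0
    sumLon = 0
    for geocode in copy_of_countries.values():
        sumLat += float(geocode[0])    
        sumLon += float(geocode[1])
    # average over the countries that were actually geocoded; fall back to [0, 0] if none were
    n_geocoded = len(copy_of_countries)
    center = [sumLat/n_geocoded, sumLon/n_geocoded] if n_geocoded else [0, 0]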
    
    # instantiating an instance of the map
    currMap=Map(center=center, zoom=0)
    icon1 = AwesomeIcon(
        name='exclamation',
        marker_color='red',
        icon_color='white',
        spin=False
    )
    
    layers = []
    for country, data in copy_of_countries.items():
        marker = Marker(icon=icon1, location=data, title=country)
        layers.append(marker)
    
    layer_group = LayerGroup(layers=tuple(layers))
    currMap.add_layer(layer_group)

    return currMap
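
Steps 2 and 3 above can be sketched offline, without calling Nominatim. This is a hedged illustration only: `build_search_url` and `mean_center` are hypothetical helper names, and the use of `https` and of `urllib.parse.urlencode` (so that names containing spaces are encoded safely) are assumptions rather than the notebook's approach.

```python
from urllib.parse import urlencode

def build_search_url(country):
    """Build a Nominatim country search URL with the country name URL-encoded."""
    query = urlencode({"country": country, "format": "json", "polygon": 0})
    return "https://nominatim.openstreetmap.org/search?" + query

def mean_center(geocodes):
    """Average [lat, lon] pairs (strings or floats); fall back to [0.0, 0.0] when empty."""
    if not geocodes:
        return [0.0, 0.0]
    lats = [float(lat) for lat, _ in geocodes]
    lons = [float(lon) for _, lon in geocodes]
    return [sum(lats) / len(lats), sum(lons) / len(lons)]

print(build_search_url("United States"))
print(mean_center([["10.0", "20.0"], ["30.0", "40.0"]]))
```

Averaging only over the pairs that were successfully geocoded keeps the map center consistent with the markers that are actually drawn.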

#### Function to generate report: 
Ideally, this function takes one row of the resulting dataframe and retrieves the relevant information to generate a report, which includes the interactive map described above. 

Relevant information includes: Date, Title, Source (URL), Content Summary, Trigger Event, Stages of Supply Chain that were affected, Risk Level

In [4]:
def report_writer(row):

    # extract the fields needed for the report into local variables
    date = row['date']
    title = row["news title"]
    url = row["Article Link"]
    trigger_event = row["trigger event"]
    
    nlp = spacy.load('en_core_web_sm')
    current_text = row['content summary']
    doc = nlp(current_text)
    
    scores = [row["New Severity Score"], row["New Human Control Score"], row["New Likelihood Score"]]
    average_score = sum(scores)/len(scores)
    if average_score < 2:
        RAG = ("Low", "green")
    elif average_score >= 2 and average_score <= 3:
        RAG = ("Moderate", "yellow")
    elif average_score > 3:
        RAG = ("High", "red")
    
    
    print (f"***************************START OF REPORT********************************")
    print ("\n")

    print (f"This is an automatically generated report for:")
    print (f"\n{colored('News Title: ', attrs=['bold']) + title}'\n{colored('News Source (URL): ', attrs=['bold']) + url}.")
    print ("\n")
    
    print(f"{colored('Trigger Event Identified: ', attrs=['bold']) + trigger_event}")    
    
    affected_stages = identify_supply_chain(current_text)
    print(f"{colored('Stage(s) of supply chain likely to be affected: ', attrs=['bold']) + affected_stages}")
    
    # color to be returned from the cube - yellow, red, green
    risk = colored(RAG[0], RAG[1], attrs=['reverse','bold'])
    print(f"{colored('Risk Level Identified: ', attrs=['bold']) + risk}")
    print("\n")
    
    print(f"{colored('Content Summary with following related entities highlighted in this article that should be reviewed:', attrs=['bold'])}")
    displacy.render(doc, style='ent', jupyter=True)
    print ("\n")
    
    print (f"***************************END OF REPORT********************************")    
    
    return displayMap(doc)
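
The risk level is derived from the average of the three scores. A minimal standalone sketch of that mapping, assuming the same thresholds as the function above (the helper name `rag_level` is hypothetical):

```python
def rag_level(scores):
    """Map the mean of the scores to a (label, colour) pair."""
    average = sum(scores) / len(scores)
    if average < 2:
        return ("Low", "green")
    if average <= 3:
        return ("Moderate", "yellow")
    return ("High", "red")

# Illustrative score triples: severity, human control, likelihood
print(rag_level([1, 2, 2]))
print(rag_level([2, 3, 4]))
print(rag_level([4, 4, 3]))
```

Returning from each branch means every average maps to exactly one level, so the label can never be left undefined.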



In [5]:
df = pd.read_csv("mid_working_article.csv", index_col=[0])
row = df.loc[0]

report_writer(row)

***************************START OF REPORT********************************


This is an automatically generated report for:

[1mNews Title: [0mUkraine says grain on ships in blocked Black Sea ports may deteriorate'
[1mNews Source (URL): [0mhttps://www.channelnewsasia.com/world/ukraine-invasion-russia-ships-grain-black-sea-ports-2628556.


[1mTrigger Event Identified: [0mMilitary Conflicts
[1mStage(s) of supply chain likely to be affected: [0mN/A
[1mRisk Level Identified: [0m[1m[7m[31mHigh[0m


[1mContent Summary with following related entities highlighted in this article that should be reviewed:[0m




***************************END OF REPORT********************************


Map(center=[24.7435984, 15.63591605], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_titl…

### Additional Notes

Stakeholders who are unsure what an entity label (e.g. GPE, LOC) means can run <b>spacy.explain(\<label\>)</b> to get spaCy's official definition. For instance (see below), <b>spacy.explain("GPE")</b> returns "Countries, cities, states", meaning that entities highlighted by spaCy and labeled 'GPE' are countries, cities or states.

In [6]:
spacy.explain("GPE")

'Countries, cities, states'

#### Limitations
Currently, the interactive map only considers countries, not cities or states. This is because city and state names are not always unique, and our report generator cannot yet disambiguate them.