### Tasks 

1. Alert on buses that have become delayed in the last 15 minutes
2. Visualize each bus's current location
3. Count, per operator, the buses that have been actively moving in the last 15 minutes
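
Task 1 asks for buses that have *become* delayed, which implies comparing each vehicle's latest status with its previous one. The following is a minimal sketch of that idea on plain dictionaries shaped like the flattened events built later in this notebook. The function name `newly_delayed`, the `'DELAYED'` status string, and the sample records are assumptions for illustration; the notebook itself only shows `'ON_TIME'` and `None` as status values.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # window taken from the task description
DELAYED = "DELAYED"             # assumed status value; not confirmed by the notebook


def newly_delayed(records, now):
    """Return VehicleRefs whose status switched to DELAYED within WINDOW of `now`."""
    last_status = {}  # VehicleRef -> most recent status seen so far
    alerts = {}       # VehicleRef -> time of the most recent switch to DELAYED
    # Process events in time order so "previous status" is well defined.
    ordered = sorted(records, key=lambda r: datetime.fromisoformat(r["RecordedAtTime"]))
    for rec in ordered:
        ref, status = rec["VehicleRef"], rec["ArrivalStatus"]
        seen_at = datetime.fromisoformat(rec["RecordedAtTime"])
        if status == DELAYED and last_status.get(ref) != DELAYED:
            # A vehicle first seen as DELAYED also counts as a switch here.
            alerts[ref] = seen_at
        elif status != DELAYED:
            alerts.pop(ref, None)  # recovered, so no longer an active alert
        last_status[ref] = status
    return sorted(ref for ref, t in alerts.items() if now - WINDOW <= t <= now)


# Hypothetical sample events: A becomes delayed at 11:05, B has been delayed since 10:30.
records = [
    {"VehicleRef": "A", "ArrivalStatus": "ON_TIME", "RecordedAtTime": "2022-08-05T11:00:00+02:00"},
    {"VehicleRef": "A", "ArrivalStatus": "DELAYED", "RecordedAtTime": "2022-08-05T11:05:00+02:00"},
    {"VehicleRef": "B", "ArrivalStatus": "DELAYED", "RecordedAtTime": "2022-08-05T10:30:00+02:00"},
    {"VehicleRef": "B", "ArrivalStatus": "DELAYED", "RecordedAtTime": "2022-08-05T11:06:00+02:00"},
]
now = datetime.fromisoformat("2022-08-05T11:10:00+02:00")
print(newly_delayed(records, now))  # ['A']
```

Only vehicle A is reported: B was already delayed before the window opened, so it has not "become" delayed in the last 15 minutes.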

### Needed data
1. Vehicle ID: MonitoredVehicleJourney -> VehicleRef
2. Vehicle location: MonitoredVehicleJourney -> VehicleLocation
3. Event time: RecordedAtTime
4. Operator: MonitoredVehicleJourney -> OperatorRef
5. Moving or not: MonitoredVehicleJourney -> MonitoredCall -> VehicleAtStop (true or false)
6. Delayed status: MonitoredVehicleJourney -> OnwardCalls -> OnwardCall -> 0 -> ArrivalStatus
7. Vehicle type: MonitoredVehicleJourney -> VehicleMode
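
Each arrow above is one step into the nested JSON. As a rough illustration, the paths can be written down as data and resolved with a small helper. `dig` and `FIELD_PATHS` are hypothetical names, and the `sample` record is reconstructed from the final output shown at the end of this notebook rather than copied from a real API response.

```python
def dig(record, path):
    """Follow `path` (dict keys and list indexes) into nested data; None if a step is missing."""
    current = record
    for step in path:
        if isinstance(current, dict):
            current = current.get(step)
        elif isinstance(current, list) and isinstance(step, int) and 0 <= step < len(current):
            current = current[step]
        else:
            return None
        if current is None:
            return None
    return current


# One path per field in the list above; wrappers such as {'value': ...} follow the notebook's code.
FIELD_PATHS = {
    "RecordedAtTime": ["RecordedAtTime"],
    "VehicleRef": ["MonitoredVehicleJourney", "VehicleRef", "value"],
    "OperatorRef": ["MonitoredVehicleJourney", "OperatorRef", "value"],
    "VehicleMode": ["MonitoredVehicleJourney", "VehicleMode", 0],
    "VehicleLatitude": ["MonitoredVehicleJourney", "VehicleLocation", "Latitude"],
    "VehicleLongitude": ["MonitoredVehicleJourney", "VehicleLocation", "Longitude"],
    "VehicleAtStop": ["MonitoredVehicleJourney", "MonitoredCall", "VehicleAtStop"],
    "ArrivalStatus": ["MonitoredVehicleJourney", "OnwardCalls", "OnwardCall", 0, "ArrivalStatus"],
}

# Illustrative record (assumed shape), with no OnwardCalls entry.
sample = {
    "RecordedAtTime": "2022-08-05T11:01:32+02:00",
    "MonitoredVehicleJourney": {
        "VehicleMode": ["BUS"],
        "OperatorRef": {"value": "SKY:Operator:45"},
        "VehicleRef": {"value": "3350453154"},
        "VehicleLocation": {"Latitude": 60.4055900964886, "Longitude": 5.32819362357259},
        "MonitoredCall": {"VehicleAtStop": False},
    },
}

flat = {name: dig(sample, path) for name, path in FIELD_PATHS.items()}
print(flat)
```

Missing branches, such as the absent `OnwardCalls` here, simply resolve to `None` instead of raising a `KeyError`.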

### Project plan
1. Load data from the API (at XXX-second intervals)
2. Extract the relevant fields into a dictionary
3. Push the dictionary to a Kafka topic
4. Consume the Kafka topic with a Spark job
5. Extract vehicles with a delayed arrival status and print an alert
6. Extract vehicle location data and feed it into a map visualization tool
7. Extract vehicles with an actively moving status and group them by operator
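
The first step of the plan, loading from the API at a fixed interval, could look roughly like the loop below. This is a sketch under assumptions: `poll` is a hypothetical helper, the fetch and handle functions are injected so the example runs without network access, and the interval value is a placeholder because the plan does not fix one.

```python
import time


def poll(fetch, handle, interval_seconds, max_polls=None, sleep=time.sleep):
    """Call fetch() every interval_seconds and pass each result to handle()."""
    polls = 0
    while max_polls is None or polls < max_polls:
        started = time.monotonic()
        try:
            handle(fetch())
        except Exception as exc:  # keep polling even if one request fails
            print(f"poll failed: {exc}")
        polls += 1
        if max_polls is not None and polls >= max_polls:
            break
        # Sleep only for the rest of the interval so slow calls do not add drift.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval_seconds - elapsed))
    return polls


# Usage with stand-ins; in this notebook, fetch would wrap the API call
# and handle would flatten the records and push them to Kafka.
received = []
poll(fetch=lambda: {"Siri": {}}, handle=received.append, interval_seconds=0.01, max_polls=3)
print(len(received))  # 3
```

Leaving `max_polls` as `None` makes the loop run until interrupted, which is closer to how a long-running loader would be used.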




In [71]:
!pip install xmltodict
!pip install confluent_kafka





In [73]:
import itertools
import requests
from xml.etree import ElementTree
import xmltodict
from datetime import datetime
from confluent_kafka import Producer
import sched, time
import json
from collections import defaultdict

In [74]:
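# Request headers used for every call.
headers = {
    'content-type': 'application/json'
}


def apiCall(url):
    # Time out on stalled connections and raise on HTTP errors
    # instead of trying to parse an error page as JSON.
    r = requests.get(url, headers=headers, timeout=30)
    r.raise_for_status()
    return r.json()

# Vehicle monitoring (vm) endpoint.
url = "https://api.entur.io/realtime/v1/rest/vm?maxSize=100"
apiResult = apiCall(url)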

In [75]:
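def acked(err, msg):
    # Delivery callback, served from poll() or flush() once per message.
    if err is not None:
        print("Failed to deliver message: %s: %s" % (str(msg), str(err)))
    else:
        print("Message produced: %s" % (str(msg)))

# Create the producer once and reuse it; building a new one per message is wasteful.
conf = {'bootstrap.servers': "localhost:19092,localhost:29092"}
producer = Producer(conf)

def produce_events(apiResult):
    # apiResult is the JSON string for one flattened vehicle event.
    producer.produce(topic="UsefullAPIResults", value=str(apiResult), callback=acked)
    producer.poll(1)
    producer.flush()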
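
The producer above sends each event without a key. If per-vehicle ordering matters downstream, one option is to key messages by `VehicleRef`, since Kafka's default partitioning sends equal keys to the same partition while the partition count is unchanged. The sketch below only prepares the key and value bytes; `encode_event` is a hypothetical helper and the choice of key is an assumption, not something the notebook does.

```python
import json


def encode_event(event):
    """Return (key, value) as bytes for one flattened vehicle event."""
    ref = event.get("VehicleRef")
    key = str(ref).encode("utf-8") if ref is not None else None
    # sort_keys gives a deterministic payload; separators drop extra spaces.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return key, payload.encode("utf-8")


event_key, event_value = encode_event({"VehicleRef": "3350453154", "VehicleAtStop": False})
print(event_key)    # b'3350453154'
print(event_value)  # b'{"VehicleAtStop":false,"VehicleRef":"3350453154"}'
```

The pair can then be passed to `producer.produce(topic=..., key=event_key, value=event_value, callback=acked)`.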

In [76]:
instances = apiResult['Siri']['ServiceDelivery']['VehicleMonitoringDelivery'][0]['VehicleActivity']

In [77]:
def value(d, k):
    # Return d[k], or None when the key is missing.
    return d.get(k)

In [78]:
def actively_moving(df):
    # One row per (operator, vehicle, at-stop flag).
    df_moving = df[['OperatorRef', 'VehicleRef', 'VehicleAtStop']].drop_duplicates().reset_index(drop=True)
    # Keep only vehicles that are not at a stop.
    df_moving = df_moving[df_moving['VehicleAtStop'] == False][['OperatorRef', 'VehicleRef']]
    # Count moving vehicles per operator.
    df_moving = df_moving.groupby('OperatorRef').count()

    return df_moving

def positions(df):
    # Position history per vehicle, oldest first.
    df_positions = df[['VehicleRef','RecordedAtTime', 'VehicleLocation']].sort_values(by=['VehicleRef', 'RecordedAtTime']).reset_index(drop=True)

    return df_positions

def delayed(df):
    # Columns to include in the alert; add more here as needed.
    df_delayed = df[['VehicleRef', 'ArrivalStatus']]

    # Keep rows with a known status other than ON_TIME;
    # rows without a status are not treated as delayed.
    df_delayed = df_delayed[df_delayed['ArrivalStatus'].notna() & (df_delayed['ArrivalStatus'] != 'ON_TIME')]

    return df_delayed
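
The helpers above expect a DataFrame and do not yet apply the 15-minute window from the task list. As a rough sketch of the windowed count on plain dictionaries, using only the standard library: `moving_per_operator` is a hypothetical name, "moving" is approximated by `VehicleAtStop` being `False` as elsewhere in this notebook, and the sample records other than the operator ID are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def moving_per_operator(records, now, window=timedelta(minutes=15)):
    """Count distinct vehicles per operator reported as not at a stop within `window`."""
    moving = defaultdict(set)  # OperatorRef -> set of VehicleRefs
    for rec in records:
        seen_at = datetime.fromisoformat(rec["RecordedAtTime"])
        in_window = now - window <= seen_at <= now
        if in_window and rec["VehicleAtStop"] is False:
            moving[rec["OperatorRef"]].add(rec["VehicleRef"])
    return {operator: len(refs) for operator, refs in moving.items()}


def _rec(operator, vehicle, at_stop, hhmm):
    # Small builder for the hypothetical sample events below.
    return {"OperatorRef": operator, "VehicleRef": vehicle, "VehicleAtStop": at_stop,
            "RecordedAtTime": f"2022-08-05T{hhmm}:00+02:00"}


events = [
    _rec("SKY:Operator:45", "V1", False, "11:01"),
    _rec("SKY:Operator:45", "V1", False, "11:03"),  # same vehicle, counted once
    _rec("SKY:Operator:45", "V2", False, "11:05"),
    _rec("OP:2", "V3", True, "11:05"),               # at a stop
    _rec("OP:2", "V4", False, "10:30"),              # outside the window
]
now = datetime.fromisoformat("2022-08-05T11:10:00+02:00")
print(moving_per_operator(events, now))  # {'SKY:Operator:45': 2}
```

Using a set per operator avoids counting the same vehicle twice when it reports several times inside the window.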



In [79]:
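def check_ArrivalStatus(val):
    # val is the OnwardCalls object (or None).
    # Return the ArrivalStatus of the first onward call, or None if there is none.
    if val is None:
        return None
    calls = value(val, 'OnwardCall')
    if not calls:
        return None
    return value(calls[0], 'ArrivalStatus')

def check_VehicleAtStop(val):
    # val is the MonitoredCall object (or None).
    if val is not None:
        val = value(val, 'VehicleAtStop')

    return val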


for instance in instances:
    journey = instance['MonitoredVehicleJourney']

    output = {"RecordedAtTime": value(instance, 'RecordedAtTime'),
              "VehicleMode": value(journey, 'VehicleMode'),
              "OperatorRef": value(journey, 'OperatorRef'),
              "VehicleLocation": value(journey, 'VehicleLocation'),
              "VehicleRef": value(journey, 'VehicleRef'),
              "ArrivalStatus": value(journey, 'OnwardCalls'),
              "VehicleAtStop": value(journey, 'MonitoredCall')
    }

    # Reduce the nested OnwardCalls and MonitoredCall objects to single values.
    output['ArrivalStatus'] = check_ArrivalStatus(output['ArrivalStatus'])
    output['VehicleAtStop'] = check_VehicleAtStop(output['VehicleAtStop'])

    # VehicleMode is a list; keep its first entry.
    if output['VehicleMode'] is not None:
        output['VehicleMode'] = output['VehicleMode'][0]

    # VehicleRef and OperatorRef are wrapped as {'value': ...}.
    if output['VehicleRef'] is not None:
        output['VehicleRef'] = output['VehicleRef']['value']

    if output['OperatorRef'] is not None:
        output['OperatorRef'] = output['OperatorRef']['value']

    # Flatten the location into separate latitude and longitude fields.
    if output['VehicleLocation'] is not None:
        output['VehicleLatitude'] = output['VehicleLocation']['Latitude']
        output['VehicleLongitude'] = output['VehicleLocation']['Longitude']
    else:
        output['VehicleLatitude'] = None
        output['VehicleLongitude'] = None

    del output['VehicleLocation']

    # Send the flattened event to Kafka as a JSON string.
    produce_events(json.dumps(output))

Message produced: <cimpl.Message object at 0x07730710>
Message produced: <cimpl.Message object at 0x07730088>
Message produced: <cimpl.Message object at 0x07730088>

In [80]:
output


{'RecordedAtTime': '2022-08-05T11:01:32+02:00',
 'VehicleMode': 'BUS',
 'OperatorRef': 'SKY:Operator:45',
 'VehicleRef': '3350453154',
 'ArrivalStatus': None,
 'VehicleAtStop': False,
 'VehicleLatitude': 60.4055900964886,
 'VehicleLongitude': 5.32819362357259}
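
For the map visualization step, a flattened event like the one above could be converted to GeoJSON, which many mapping tools can read. This is a sketch under the assumption that the chosen tool accepts GeoJSON; the notebook does not name a tool, and `to_geojson_feature` is a hypothetical helper.

```python
import json


def to_geojson_feature(event):
    """Convert one flattened event to a GeoJSON Feature, or None if it has no coordinates."""
    lat, lon = event.get("VehicleLatitude"), event.get("VehicleLongitude")
    if lat is None or lon is None:
        return None
    skip = ("VehicleLatitude", "VehicleLongitude")
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {k: v for k, v in event.items() if k not in skip},
    }


# The event shown in the cell output above.
event = {
    "RecordedAtTime": "2022-08-05T11:01:32+02:00",
    "VehicleMode": "BUS",
    "OperatorRef": "SKY:Operator:45",
    "VehicleRef": "3350453154",
    "ArrivalStatus": None,
    "VehicleAtStop": False,
    "VehicleLatitude": 60.4055900964886,
    "VehicleLongitude": 5.32819362357259,
}
feature = to_geojson_feature(event)
collection = {"type": "FeatureCollection", "features": [feature]}
print(json.dumps(collection, indent=2))
```

Events without a location return `None` and should be filtered out before building the collection.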
