### Tasks 

1. Alert on buses which have become delayed in the last 15 minutes
2. Visualize each bus's current location
3. Count the buses per operator that have been actively moving in the last 15 minutes

### Needed data
1. Vehicle id: MonitoredVehicleJourney -> VehicleRef
2. Vehicle location: MonitoredVehicleJourney -> VehicleLocation
3. Event time: RecordedAtTime
4. Operator: MonitoredVehicleJourney -> OperatorRef
5. Moving or not: MonitoredVehicleJourney -> MonitoredCall -> VehicleAtStop (true or false)
6. Delayed status: MonitoredVehicleJourney -> OnwardCalls -> OnwardCall -> 0 -> ArrivalStatus
7. Vehicle type: MonitoredVehicleJourney -> VehicleMode
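
The arrows in the list above describe paths through nested JSON, and any step on a path can be missing from a given record. As a minimal sketch, a small helper can walk such a path and return `None` when a step is absent. The helper name `dig` and the trimmed `sample` record are hypothetical and only illustrate the lookup; they are not part of the API.

```python
from typing import Any


def dig(node: Any, *path: Any) -> Any:
    """Follow the keys/indexes in `path`; return None as soon as a step is missing."""
    for step in path:
        try:
            node = node[step]
        except (KeyError, IndexError, TypeError):
            return None
    return node


# Hypothetical, heavily trimmed record shaped like the paths listed above.
sample = {
    "RecordedAtTime": "2021-01-01T12:00:00+01:00",
    "MonitoredVehicleJourney": {
        "VehicleRef": {"value": "1234"},
        "OnwardCalls": {"OnwardCall": [{"ArrivalStatus": "delayed"}]},
    },
}

# Delayed status: MonitoredVehicleJourney -> OnwardCalls -> OnwardCall -> 0 -> ArrivalStatus
status = dig(sample, "MonitoredVehicleJourney", "OnwardCalls", "OnwardCall", 0, "ArrivalStatus")

# OperatorRef is absent from the sample, so the lookup yields None instead of raising.
operator = dig(sample, "MonitoredVehicleJourney", "OperatorRef", "value")
```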

#### Project Plan
1. Load data from the API (at XXX-second intervals)
2. Extract the relevant fields into a dictionary
3. Push the dictionary to a Kafka topic
4. Consume the Kafka topic with a Spark job
5. Extract vehicles with a delayed arrival status and print the alert
6. Extract vehicle location data and feed it into a map visualization tool
7. Extract the vehicles that are actively moving and group them by operator




In [1]:
!pip install xmltodict
!pip install confluent_kafka



You should consider upgrading via the 'c:\users\ahsor\appdata\local\programs\python\python37-32\python.exe -m pip install --upgrade pip' command.






In [2]:
!jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10

usage: jupyter [-h] [--version] [--config-dir] [--data-dir] [--runtime-dir]
               [--paths] [--json] [--debug]
               [subcommand]

Jupyter: Interactive Computing

positional arguments:
  subcommand     the subcommand to launch

optional arguments:
  -h, --help     show this help message and exit
  --version      show the versions of core jupyter packages and exit
  --config-dir   show Jupyter config dir
  --data-dir     show Jupyter data dir
  --runtime-dir  show Jupyter runtime dir
  --paths        show all Jupyter paths. Add --json for machine-readable
                 format.
  --json         output paths as machine-readable json
  --debug        output debug information about paths

Available subcommands: kernel kernelspec migrate run troubleshoot

Jupyter command `jupyter-notebook` not found.


In [3]:
import itertools
import requests
from xml.etree import ElementTree
import xmltodict
from datetime import datetime
from confluent_kafka import Producer
import sched, time
import json
from collections import defaultdict

In [4]:
url = "https://api.entur.io/realtime/v1/rest/vm"
run_interval = 60

In [5]:
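# Request headers sent with every API call.
headers = {
    'content-type': 'application/json'
}

# Create the producer once and reuse it; building a new one per message is slow.
conf = {'bootstrap.servers': "localhost:19092,localhost:29092"}
producer = Producer(conf)


def apiCall(url):
    """Fetch the feed and return the decoded JSON body."""
    r = requests.get(url, headers=headers, timeout=30)
    r.raise_for_status()  # surface HTTP errors here rather than inside r.json()
    return r.json()

def acked(err, msg):
    """Delivery callback: report whether the message reached the broker."""
    if err is not None:
        print("Failed to deliver message: %s: %s" % (str(msg), str(err)))
    else:
        print("Message produced: %s" % (str(msg)))

def produce_events(apiResult):
    """Send one JSON string to the Kafka topic and wait until it is delivered."""
    producer.produce(topic="FullAPItest", value=str(apiResult), callback=acked)
    producer.poll(1)   # serve pending delivery callbacks
    producer.flush()   # block until the message has been sent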
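
One possible refinement, not used in this notebook, is to give each message a key. Kafka's default partitioner sends messages with the same key to the same partition, so keying by vehicle would keep each vehicle's events in order for the consumer. The sketch below only builds the key and value bytes; the helper name `encode_event` is hypothetical, and it assumes the flat record carries the vehicle id under `VehicleRef`.

```python
import json
from typing import Optional, Tuple


def encode_event(record: dict) -> Tuple[Optional[bytes], bytes]:
    """Return (key, value) bytes for one record; the key is None without a VehicleRef."""
    vehicle = record.get("VehicleRef")
    key = str(vehicle).encode("utf-8") if vehicle is not None else None
    # Compact, key-sorted JSON so identical records serialize identically.
    value = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return key, value


# Hypothetical record for illustration only.
example = {"VehicleRef": "1234", "OperatorRef": "RUT"}
key, value = encode_event(example)

# With confluent_kafka this pair could then be passed on, for example:
# producer.produce(topic="FullAPItest", key=key, value=value, callback=acked)
```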

In [6]:
def value(d, k):
    """Return d[k], or None when the key is missing."""
    if k in d:
        return d[k]
    return None

def check_ArrivalStatus(val):
    """Return the ArrivalStatus of the first onward call, or None if there is none."""
    if val is None:
        return None
    calls = value(val, 'OnwardCall')
    if not calls:
        return None
    return value(calls[0], 'ArrivalStatus')

def check_VehicleAtStop(val):
    """Return the VehicleAtStop flag of the monitored call, or None if it is missing."""
    if val is None:
        return None
    return value(val, 'VehicleAtStop')

def clean(instance):
    """Flatten one VehicleActivity entry into the fields needed downstream."""
    journey = instance['MonitoredVehicleJourney']

    output = {"RecordedAtTime": value(instance, 'RecordedAtTime'),
              "VehicleMode": value(journey, 'VehicleMode'),
              "OperatorRef": value(journey, 'OperatorRef'),
              "VehicleLocation": value(journey, 'VehicleLocation'),
              "VehicleRef": value(journey, 'VehicleRef'),
              "ArrivalStatus": check_ArrivalStatus(value(journey, 'OnwardCalls')),
              "VehicleAtStop": check_VehicleAtStop(value(journey, 'MonitoredCall'))
    }

    # VehicleMode arrives as a list; keep its first entry.
    if output['VehicleMode'] is not None:
        output['VehicleMode'] = output['VehicleMode'][0]

    # VehicleRef and OperatorRef are wrapped as {'value': ...}; unwrap them.
    if output['VehicleRef'] is not None:
        output['VehicleRef'] = output['VehicleRef']['value']

    if output['OperatorRef'] is not None:
        output['OperatorRef'] = output['OperatorRef']['value']

    # Replace the nested location with two flat coordinate fields.
    location = output.pop('VehicleLocation')
    if location is not None:
        output['VehicleLatitude'] = location['Latitude']
        output['VehicleLongitude'] = location['Longitude']
    else:
        output['VehicleLatitude'] = None
        output['VehicleLongitude'] = None

    return output

In [8]:
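from time import sleep
active = True

while active:
    try:
        apiResult = apiCall(url)
        instances = apiResult['Siri']['ServiceDelivery']['VehicleMonitoringDelivery'][0]['VehicleActivity']

        for instance in instances:
            clean_instance = clean(instance)
            produce_events(json.dumps(clean_instance))

    except Exception as exc:
        # Report the failure, then fall through to the sleep so that a failing
        # API is not called again immediately. KeyboardInterrupt still stops the loop.
        print("Polling failed: %r" % (exc,))

    sleep(run_interval)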

Message produced: <cimpl.Message object at 0x068BCAD8>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x080B5660>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x03565710>
Message produced: <cimpl.Message object at 0x080B56B8>
Message pr
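
The polling loop above sleeps for the full interval after each round, so the real period is the interval plus however long the fetch and publish took. A minimal sketch of a loop that subtracts the elapsed time is shown below. The function `run_polling` and its parameters are hypothetical; `fetch` and `handle` stand in for the API call and the clean-and-produce step, and `sleep` and `clock` are injectable so the loop can be exercised without waiting.

```python
import time


def run_polling(fetch, handle, interval, max_runs=None,
                sleep=time.sleep, clock=time.monotonic):
    """Call fetch() every `interval` seconds and pass each item to handle().

    A failing round is reported and skipped; the loop still waits before retrying.
    Returns the number of rounds that were attempted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        started = clock()
        try:
            for item in fetch():
                handle(item)
        except Exception as exc:
            print("poll failed: %r" % (exc,))
        runs += 1
        # Sleep only for what is left of the interval after this round's work.
        elapsed = clock() - started
        sleep(max(0.0, interval - elapsed))
    return runs


# Dry run with stand-ins: the second round fails, the third recovers.
_rounds = iter([[1, 2], RuntimeError("boom"), [3]])


def _fake_fetch():
    result = next(_rounds)
    if isinstance(result, Exception):
        raise result
    return result


seen, naps = [], []
attempted = run_polling(_fake_fetch, seen.append, 60, max_runs=3,
                        sleep=naps.append, clock=lambda: 0.0)
```

With the notebook's names, `fetch` could be a function that calls `apiCall(url)` and returns the `VehicleActivity` list, and `handle` one that runs `clean` and `produce_events` on each entry.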
