### Tasks 

1. Alert on buses that have become delayed in the last 15 minutes
2. Visualize each bus's current location
3. Count, per operator, the buses that have been actively moving in the last 15 minutes

### Needed data
1. Vehicle id: MonitoredVehicleJourney -> VehicleRef
2. Vehicle location: MonitoredVehicleJourney -> VehicleLocation
3. Event time: RecordedAtTime
4. Operator: MonitoredVehicleJourney -> OperatorRef
5. Moving or not: VehicleAtStop -> {true, false}
6. Delayed status: MonitoredVehicleJourney -> OnwardCalls -> OnwardCall -> 0 -> ArrivalStatus
7. Vehicle type: MonitoredVehicleJourney -> VehicleMode
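
The field paths above can be turned into a small flattening helper. This is a minimal sketch, not the notebook's own code: the function names (`dig`, `unwrap`, `extract_fields`) and the output keys are hypothetical, and the location of `VehicleAtStop` under `MonitoredCall` is an assumption, since the list above does not give its path. Missing fields come back as `None` rather than raising.

```python
from typing import Any, Optional


def dig(obj: Any, *path: Any) -> Optional[Any]:
    """Walk nested dicts/lists along `path`; return None if any step is missing."""
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return None
    return obj


def unwrap(node: Any) -> Any:
    """Return node['value'] for {'value': ...} wrappers, else the node itself."""
    if isinstance(node, dict) and "value" in node:
        return node["value"]
    return node


def extract_fields(activity: dict) -> dict:
    """Flatten one VehicleActivity element into the seven fields listed above."""
    journey = activity.get("MonitoredVehicleJourney", {})
    mode = journey.get("VehicleMode")
    if isinstance(mode, list):
        # The sample response wraps the mode in a list, e.g. ['BUS'].
        mode = mode[0] if mode else None
    return {
        "vehicle_id": unwrap(journey.get("VehicleRef")),
        "longitude": dig(journey, "VehicleLocation", "Longitude"),
        "latitude": dig(journey, "VehicleLocation", "Latitude"),
        "event_time": activity.get("RecordedAtTime"),
        "operator": unwrap(journey.get("OperatorRef")),
        # Assumption: VehicleAtStop sits under MonitoredCall.
        "at_stop": dig(journey, "MonitoredCall", "VehicleAtStop"),
        "arrival_status": dig(journey, "OnwardCalls", "OnwardCall", 0, "ArrivalStatus"),
        "vehicle_mode": mode,
    }
```

In this notebook it would be called as `extract_fields(single_instance)` once `single_instance` has been pulled out of the API response.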

#### Project Plan
1. Load data from the API (at XXX-second intervals).
2. Extract the relevant fields into a dictionary.
3. (Push the dictionary to a Kafka topic.)
4. (Consume the Kafka topic with a Spark job.)
5. Extract vehicles with a delayed arrival status and print an alert.
6. Extract vehicle location data and feed it into a map visualization tool.
7. Extract the vehicles with an actively moving status and group them by operator.
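
The alerting and counting steps can be sketched in plain Python before any Spark job exists. This swaps in ordinary list processing for the planned Spark job and makes several assumptions: records are flat dictionaries with the hypothetical keys `vehicle_id`, `operator`, `event_time`, `at_stop`, `arrival_status` and `vehicle_mode`; timestamps carry a UTC offset in a form `datetime.fromisoformat` can parse; "delayed" is matched case-insensitively; and "actively moving" is taken to mean `at_stop` is `False`. It also simplifies "became delayed" to "was reported as delayed inside the window". The sample records are made up.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=15)


def in_window(record, now):
    """True if the record's event time lies within the 15 minutes before `now`."""
    event_time = datetime.fromisoformat(record["event_time"])
    return timedelta(0) <= now - event_time <= WINDOW


def is_bus(record):
    """True if the record's vehicle mode is BUS (case-insensitive)."""
    return str(record.get("vehicle_mode")).upper() == "BUS"


def delayed_alerts(records, now):
    """Build one alert line per bus record reported as delayed inside the window."""
    return [
        "ALERT: bus %s (operator %s) delayed at %s"
        % (r["vehicle_id"], r["operator"], r["event_time"])
        for r in records
        if is_bus(r) and in_window(r, now)
        and str(r.get("arrival_status")).lower() == "delayed"
    ]


def moving_buses_per_operator(records, now):
    """Count distinct moving buses per operator inside the window."""
    operator_by_vehicle = {}
    for r in records:
        if is_bus(r) and in_window(r, now) and r.get("at_stop") is False:
            # Keyed by vehicle id so repeated reports count once.
            operator_by_vehicle[r["vehicle_id"]] = r["operator"]
    return Counter(operator_by_vehicle.values())


def make_record(vehicle_id, operator, event_time, at_stop, arrival_status):
    """Helper for the made-up sample data below."""
    return {"vehicle_id": vehicle_id, "operator": operator, "event_time": event_time,
            "at_stop": at_stop, "arrival_status": arrival_status, "vehicle_mode": "BUS"}


# Made-up records for illustration only.
now = datetime(2022, 8, 4, 13, 10, tzinfo=timezone(timedelta(hours=2)))
records = [
    make_record("A1", "80", "2022-08-04T13:01:35+02:00", False, "delayed"),
    make_record("A1", "80", "2022-08-04T13:05:00+02:00", False, "onTime"),
    make_record("B2", "80", "2022-08-04T12:40:00+02:00", False, "delayed"),
    make_record("C3", "11", "2022-08-04T13:09:00+02:00", True, "DELAYED"),
    make_record("D4", "11", "2022-08-04T13:08:00+02:00", False, "onTime"),
]

for line in delayed_alerts(records, now):
    print(line)
print(moving_buses_per_operator(records, now))
```

Record `B2` is 30 minutes old and therefore ignored by both functions; `A1` reports twice but counts as one moving bus.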




In [27]:
!pip install xmltodict





In [28]:
!jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10

Jupyter command `jupyter-notebook` not found.


In [29]:
import itertools
import requests
from xml.etree import ElementTree
import xmltodict
from datetime import datetime
from confluent_kafka import Producer


In [38]:

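# Request headers sent with every API call.
headers = {
    'content-type': 'application/json'
}


def apiCall(url):
    """Fetch `url` and return the decoded JSON body."""
    r = requests.get(url, headers=headers, timeout=30)
    # Fail loudly on HTTP errors instead of trying to parse an error page.
    r.raise_for_status()
    return r.json()


# maxSize=2 keeps the response small while exploring.
url = "https://api.entur.io/realtime/v1/rest/vm?maxSize=2"
apiResult = apiCall(url)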
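
The project plan calls for loading data at a fixed interval. One way to do that, sketched here under stated assumptions, is a small generator that wraps any fetch function. The name `poll` and its parameters are hypothetical; the interval is left as an argument because the plan does not fix it.

```python
import time
from typing import Callable, Iterator, Optional


def poll(fetch: Callable[[], dict],
         interval_seconds: float,
         max_polls: Optional[int] = None,
         sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield fetch() results, pausing `interval_seconds` between calls.

    Runs forever when `max_polls` is None; `sleep` is injectable for testing.
    """
    count = 0
    while max_polls is None or count < max_polls:
        yield fetch()
        count += 1
        # Skip the pause after the final poll.
        if max_polls is None or count < max_polls:
            sleep(interval_seconds)
```

With the cell above, a loop such as `for result in poll(lambda: apiCall(url), interval_seconds=some_interval, max_polls=3): ...` would fetch three snapshots, where `some_interval` is whatever polling interval the project settles on.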

In [52]:
single_instance = apiResult['Siri']['ServiceDelivery']['VehicleMonitoringDelivery'][0]['VehicleActivity'][0]
single_instance
#vehicle_activity = apiResult['Siri']['ServiceDelivery']['VehicleMonitoringDelivery']['VehicleActivity']

{'ValidUntilTime': '2022-08-04T13:07:31+02:00',
 'VehicleMonitoringRef': {'value': '811201'},
 'ProgressBetweenStops': {},
 'MonitoredVehicleJourney': {'LineRef': {'value': 'VKT:Line:808003'},
  'DirectionRef': {'value': 'Outbound'},
  'FramedVehicleJourneyRef': {'DataFrameRef': {'value': '2022-08-04'},
   'DatedVehicleJourneyRef': '593-80031242-81-507-20220804-1081'},
  'JourneyPatternRef': {'value': '593-507-9'},
  'JourneyPatternName': {'value': '9'},
  'VehicleMode': ['BUS'],
  'PublishedLineName': [{'value': 'M3'}],
  'DirectionName': [{'value': 'Outbound'}],
  'OperatorRef': {'value': '80'},
  'OriginRef': {'value': 'NSR:StopPlace:20015'},
  'OriginName': [{'value': 'Skjelsvik knutepunkt'}],
  'DestinationRef': {'value': 'NSR:StopPlace:19974'},
  'DestinationName': [{'value': 'Skien stasjon'}],
  'Monitored': True,
  'DataSource': 'VKT',
  'VehicleLocation': {'Longitude': 9.6881833, 'Latitude': 59.0942092},
  'LocationRecordedAtTime': '2022-08-04T13:01:35.878+02:00',
  'Bearing':

In [53]:
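import json


def acked(err, msg):
    """Delivery callback: report whether the message reached the broker."""
    if err is not None:
        print("Failed to deliver message: %s: %s" % (str(msg), str(err)))
    else:
        print("Message produced: %s" % (str(msg)))


def produce_events(apiResult):
    """Publish one API result to the FullAPItest topic."""
    conf = {'bootstrap.servers': "localhost:19092,localhost:29092"}
    producer = Producer(conf)
    # Serialize as JSON; str() on a dict yields a Python repr (single quotes,
    # True/False) that JSON consumers cannot parse.
    producer.produce(topic="FullAPItest", value=json.dumps(apiResult), callback=acked)
    print("polling")
    # poll() serves the delivery callback; flush() waits for outstanding messages.
    producer.poll(1)
    print("flushing")
    producer.flush()
    print("All done")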


In [54]:
produce_events(single_instance)

polling
Message produced: <cimpl.Message object at 0x089853F8>
flushing
All done
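
To check that a message actually landed on the topic, it can be read back with a plain `confluent_kafka` consumer. This is a rough sketch that stands in for the planned Spark job, not a replacement for it. The function names, the `group_id` value and the `idle_polls` cutoff are hypothetical, and a broker must be running for `consume` to return anything. `decode_message` tries JSON first and falls back to `ast.literal_eval` for payloads that were written as a Python dict repr.

```python
import ast
import json


def decode_message(raw: bytes) -> dict:
    """Decode a Kafka message value into a dict (JSON first, dict repr second)."""
    text = raw.decode("utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)


def consume(topic, bootstrap_servers, group_id="bus-demo",
            max_messages=1, idle_polls=10):
    """Read up to `max_messages` from `topic`; stop after `idle_polls` empty polls."""
    # Imported lazily so decode_message can be used without the Kafka client.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "auto.offset.reset": "earliest",  # start from the oldest message
    })
    consumer.subscribe([topic])
    decoded, empty = [], 0
    try:
        while len(decoded) < max_messages and empty < idle_polls:
            msg = consumer.poll(1.0)
            if msg is None:
                empty += 1
                continue
            if msg.error():
                print("Consumer error: %s" % msg.error())
                continue
            if msg.value() is None:
                continue
            empty = 0
            decoded.append(decode_message(msg.value()))
    finally:
        consumer.close()
    return decoded
```

For the producer above, a call like `consume("FullAPItest", "localhost:19092,localhost:29092")` would return the decoded vehicle record as a one-element list, assuming the brokers are reachable.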


In [34]:
apiResult

{'Siri': {'ServiceDelivery': {'ResponseTimestamp': '2022-08-04T13:10:25.913643434+02:00',
   'ProducerRef': {'value': 'ENT'},
   'EstimatedTimetableDelivery': [{'EstimatedJourneyVersionFrame': [{'EstimatedVehicleJourney': [{'LineRef': {'value': 'NOR:Line:3754'},
         'DirectionRef': {'value': '2'},
         'DatedVehicleJourneyRef': {'value': '184:1:3754:2025'},
         'Cancellation': False,
         'VehicleMode': ['BUS'],
         'OperatorRef': {'value': '11'},
         'Monitored': True,
         'DataSource': 'NOR',
         'BlockRef': {'value': '130722'},
         'EstimatedCalls': {'EstimatedCall': [{'StopPointRef': {'value': 'NSR:Quay:83439'},
            'Order': 1,
            'Cancellation': False,
            'PredictionInaccurate': False,
            'AimedDepartureTime': '2022-08-04T15:20:00+02:00',
            'ExpectedDepartureTime': '2022-08-04T15:20:00+02:00',
            'DepartureBoardingActivity': 'BOARDING'},
           {'StopPointRef': {'value': 'NSR:Quay: