# Final algorithm

_Now that we have iterated on all the necessary steps, we can finally build the algorithm._

_The first version of the final algorithm took a departure time as input; we now reverse the algorithm so that it takes a desired arrival time instead._

_With this algorithm, the user enters:_
- _the departure location_
- _the arrival location_
- _the desired arrival time_
- _the desired confidence probability_
- _**optional:** the minimum departure time_

_We then compute the top 3 routes with their respective probabilities._

### 1. Some routes are directly possible
_If some paths satisfy the desired probability, we output them directly._

### 2. There are no routes
_We treat the desired probability as unreachable and report a failure. The route closest to the user's requirement is still returned._
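
_The two cases above can be sketched as follows. This is only an illustration: the helper `select_routes` and the toy shape of a candidate, a `(departure_time, probability)` pair, are assumptions made for the example and are not part of the notebook._

```python
def select_routes(candidates, desired_probability):
    """Split candidate routes by whether they meet the desired probability.

    `candidates` is a list of (departure_time, probability) pairs; this toy
    shape is an assumption made for the example.
    """
    successful = [c for c in candidates if c[1] >= desired_probability]
    if successful:
        # Case 1: at least one route is reliable enough.
        return "success", successful
    if not candidates:
        return "failure", []
    # Case 2: no route is reliable enough, fall back to the most probable one.
    return "closest", [max(candidates, key=lambda c: c[1])]


status, routes = select_routes([("12:05:00", 0.78), ("12:01:00", 0.55)], 0.7)
print(status, routes)
```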

In [1]:
%%configure
{"conf": {
    "spark.app.name": "group100_final"
}}



In [2]:
# Initialization

Starting Spark application


ID,YARN Application ID,Kind,State,Spark UI,Driver log,Current session?
9403,application_1589299642358_3986,pyspark,idle,Link,Link,✔


SparkSession available as 'spark'.


In [3]:
username = 'mjouve'

In [4]:
from pyspark.sql.functions import udf
import pyspark.sql.functions as F
from datetime import time, datetime, timedelta
from collections import defaultdict
import numpy as np

# Loading filtered dataframes obtained previously

In [5]:
stops = spark.read.orc("/user/{}/zurich_stops.orc".format(username))

In [6]:
reachable_pair_grouped = spark.read.orc("/user/{}/reachable_pair_grouped.orc".format(username))

In [7]:
stop_times = spark.read.orc("/user/{}/stop_times_filtered.orc".format(username))

In [8]:
connexions = spark.read.orc("/user/{}/connexions.orc".format(username))

# Helper methods

In [9]:
def compute_footpaths_dict(reachable_pair_df):
    """
    Given a pyspark Dataframe of reachable pairs grouped,
    returns the footpaths dictionary used by our algorithm
    """
    return dict(((row.id_1, row.destinations) for row in reachable_pair_df.collect()))
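
_To make the shape of the footpaths dictionary concrete, here is a small sketch with hand-written rows. The `Row` namedtuple is a hypothetical stand-in for the rows returned by `collect()`, and we assume each destination is a `(stop_id, walking_seconds)` pair, which is how the main algorithm indexes it. The two walks reuse stop ids and durations that appear in the validation output._

```python
from collections import namedtuple

# Hypothetical stand-in for the rows returned by `reachable_pair_df.collect()`.
Row = namedtuple("Row", ["id_1", "destinations"])

rows = [
    Row("8503310", [("8590620", 70.0)]),
    Row("8503006", [("8580449", 72.0)]),
]

# Same construction as in compute_footpaths_dict, written as a dict comprehension.
footpaths_example = {row.id_1: row.destinations for row in rows}

for stop_id, walk_seconds in footpaths_example.get("8503310", []):
    print(stop_id, walk_seconds)
```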

In [10]:
def to_datetime(str_time):
    """
    Given a string representing a time (format 'H:M:s', H: hour, M: minute, s:second), convert it to a datetime object
    """
    hour, minute, second = str_time.split(':')
    
    # convert it to int and remove potential errors by taking a modulo
    hour = int(hour) % 24
    minute = int(minute) % 60
    second = int(second) % 60
    
    # the year, month and day are dummy values here
    return datetime(year=2020, month=1, day=1, hour=hour, minute=minute, second=second)
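
_The effect of the modulo can be checked with a short, standalone restatement of the helper (named `to_datetime_sketch` here so that the example runs on its own). Timetable feeds such as GTFS may contain hours of 24 or more for trips running past midnight; with the modulo, such a time is mapped to the early morning of the same dummy day, which is worth keeping in mind when sorting._

```python
from datetime import datetime


def to_datetime_sketch(str_time):
    """Same idea as `to_datetime` above, restated so the example runs alone."""
    hour, minute, second = (int(part) for part in str_time.split(":"))
    return datetime(2020, 1, 1, hour % 24, minute % 60, second % 60)


print(to_datetime_sketch("12:07:00").time())
print(to_datetime_sketch("25:10:00").time())  # hour 25 wraps around to 01
```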

In [11]:
def sort_connexions(connexions_df, departure = True):
    """
    Given a pyspark DataFrame of connexions, returns an array of sorted connexions in ascending order of departure
    if departure = True, else in descending order of arrival
    """

    connexions_array = [{'departure_location': row.stop_id_1, 
                         'departure_time': to_datetime(row.departure_time_1), 
                         'arrival_location': row.stop_id_2, 
                         'arrival_time': to_datetime(row.arrival_time_2), 
                         'trip_id': row.trip_id} for row in connexions_df.collect()]
    
    if departure:
        sorted_connexions = sorted(connexions_array, key = (lambda tup: tup['departure_time']))
    
    else:
        sorted_connexions = sorted(connexions_array, key = (lambda tup: tup['arrival_time']), reverse = True)
        
    return sorted_connexions
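
_Since this version of the algorithm scans connexions from the latest arrival to the earliest, the ordering matters. The sketch below applies the same sort to three toy connexions (the trip ids `A`, `B`, `C` are invented for the example)._

```python
from datetime import datetime

toy_connexions = [
    {"trip_id": "A", "departure_time": datetime(2020, 1, 1, 12, 5),
     "arrival_time": datetime(2020, 1, 1, 12, 11)},
    {"trip_id": "B", "departure_time": datetime(2020, 1, 1, 12, 23),
     "arrival_time": datetime(2020, 1, 1, 12, 29)},
    {"trip_id": "C", "departure_time": datetime(2020, 1, 1, 12, 7),
     "arrival_time": datetime(2020, 1, 1, 12, 17)},
]

# Descending order of arrival, as with sort_connexions(..., departure=False).
by_arrival_desc = sorted(toy_connexions, key=lambda c: c["arrival_time"], reverse=True)
order = [c["trip_id"] for c in by_arrival_desc]
print(order)
```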

# Compute probability given lambda and a time left

## Recall:

_Our predictive model provides, for each (stop_id, arrival_time) pair, the parameter of an exponential distribution. Given this parameter `lambda` and a `time_left`, the method below computes a probability. The time left is the amount of time (in seconds) the user can wait before taking the next transport._

_The method therefore returns the probability of catching a transport, given that you have `time_left` seconds to catch it and that you arrive on a transport whose delay is modelled by an exponential distribution with parameter `lambdaa`._
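
_For reference, the quantity computed is the cumulative distribution function of an exponential distribution. Writing $D$ for the arrival delay, $\lambda$ for the rate parameter and $t$ for `time_left` in seconds (the symbols $D$ and $t$ are our notation for this note):_

$$
P(D \le t) = 1 - e^{-\lambda t}, \qquad t \ge 0
$$

_The probability is 0 when there is no time left and tends to 1 as the available time grows._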

In [12]:
# for lambdaa we pass the arrival_delay median
def proba_trip(lambdaa, time_left): #time left in seconds
    
    # This is the average lambda for all possible entries, and we will use it as a default value
    default_value = 0.023447352748076224
    
    # corner case
    if lambdaa == -9999:
        lambdaa = default_value
    
    if lambdaa < 0:
        return 1
    else:
        
        return 1 - np.exp(- lambdaa * time_left)
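
_As a quick sanity check of the formula, the sketch below inverts it to find how many seconds of slack are needed to reach a target probability with the default lambda. `required_slack` is a hypothetical helper introduced for this example only, and the standard `math` module is used instead of NumPy so that the snippet runs on its own._

```python
import math

DEFAULT_LAMBDA = 0.023447352748076224  # default value quoted in the notebook


def catch_probability(lam, time_left):
    """Exponential CDF, as in `proba_trip` for a non-negative lambda."""
    return 1 - math.exp(-lam * time_left)


def required_slack(lam, target_probability):
    """Hypothetical helper: seconds of slack needed to reach a target probability."""
    return -math.log(1 - target_probability) / lam


slack = required_slack(DEFAULT_LAMBDA, 0.7)
print(round(slack, 1), "seconds of slack for p = 0.7")
print(catch_probability(DEFAULT_LAMBDA, slack))
```

_With the default lambda, a confidence of 0.7 therefore requires a bit less than one minute of slack on top of the transfer margin._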

# Main algorithms

## Updating latest departure time

In [13]:
def updates_times_dict_given_arrival_top_K_with_proba(times, sorted_connexions, lambda_dict, footpaths, arrival_location, arrival_time, final_departure, K, desired_probability, min_time):
    """
    Given the dictionary of initialized times,
    the sorted connexions (by descending order of arrival time),
    the lambda_dict (to obtain the parameter for our predictive model),
    the footpaths dictionary,
    the arrival location,
    the final_departure (departure location),
    the desired arrival time,
    the number of routes to consider (K),
    the desired confidence probability,
    and a minimum time of departure
    
    this method updates the times dictionary (latest possible departure from each stop)
    """
    
    # initialize arrival location
    times[arrival_location][0] = ( (arrival_time, 1, None, None) )
    
    # no transfer delay for the first connexion taken, then 2 minutes for every later one
    cheminement_delay = timedelta(seconds = 0)


    # Initialize a dictionary of trips taken. For each trip already taken, 
    # we map it to the first departure location and departure time where we could have taken this trip. 
    # Returns None if the key is not assigned to another value thanks to defaultdict.
    trips_taken = defaultdict(lambda: None)

    # Iterate over connexions in sorted order
    for c in sorted_connexions:
    
        # trip_id of the current connexion
        trip_id = c['trip_id']
    
        # departure location of the current connexion
        departure_location = c['departure_location']
    
        # departure time of the current connexion
        departure_time = c['departure_time']
    
        # arrival location of the current connexion
        arrival_location = c['arrival_location']
    
        # arrival time of the current connexion
        arrival_time = c['arrival_time']
    
        # If current trip could have been taken earlier
        if trips_taken[trip_id]:
            
            
            # obtain data about this current trip (where we could have taken it and when)
            trip_data = trips_taken[trip_id]
        
            # obtain departure array
            departure_array = times[departure_location]
            
            # access proba
            if times[trip_data[0]][0][2] == None:
                next_proba = 1
                new_proba = 1
            else:
                next_connex_data = times[trip_data[0]][0][2]                   
                next_proba = times[trip_data[0]][0][1]
                next_departure_time = next_connex_data['departure_time']
                
                    
                if next_connex_data.get('walking', None):
                    next_departure_time = next_departure_time + timedelta(seconds = next_connex_data['walking'])
                    next_departure = next_connex_data['arrival_location']
                else:
                    next_departure = next_connex_data['departure_location']
            
                # obtain lambda for this transition of transport    
                lambdaa = lambda_dict[(trip_data[0], str(trip_data[1].time()))]
                
                walking_time = next_connex_data.get('walking', 0)
                
                # compute probability for taking the next transport
                new_proba = proba_trip(lambdaa, (next_departure_time - trip_data[1] - timedelta(seconds = 120) -  timedelta(seconds = int(walking_time))).seconds)
            
            # if we reached our goal we can store more than one route
            if departure_location == final_departure and len(departure_array) < K:
                
                if new_proba >= desired_probability:
                    
                    if departure_time >= min_time:
                        
                        # update best result
                        departure_array.append((departure_time, next_proba * new_proba, {'departure_location': departure_location,
                                                              'departure_time': departure_time,
                                                              'arrival_location':trip_data[0],
                                                              'arrival_time': trip_data[1],
                                                              'trip_id': trip_id}, new_proba))
                
                        departure_array.sort(key = (lambda tup: tup[0]), reverse = True)
            
            # otherwise update the entry
            elif departure_time >= times[departure_location][-1][0]:
                
                if new_proba >= desired_probability:
                    
                    if departure_time >= min_time:
                        # update departure time as well as connexion data for this departure location
                        times[departure_location][-1] = (departure_time, next_proba * new_proba, {'departure_location': departure_location,
                                                              'departure_time': departure_time,
                                                              'arrival_location':trip_data[0],
                                                              'arrival_time': trip_data[1],
                                                              'trip_id': trip_id}, new_proba)
                
                        departure_array.sort(key = (lambda tup: tup[0]), reverse = True)
            
            # obtain the stops reachable by walking
            reachable_stops_walking = footpaths.get(departure_location, None)
            
            
            if reachable_stops_walking:
                
                # for each possible destination
                for destination in reachable_stops_walking:
                    
                    # obtain the stop_id
                    location = destination[0]
                    
                    # obtain the walk duration from arrival_location (convert it to float)
                    walking_time = float(destination[1])
                    
                    # compute the new departure time if using this path
                    new_departure_time = departure_time - timedelta(seconds = walking_time)
                    
                    # obtain the current departure time
                    curr_departure_time_array = times[location]
                      
                    if location == final_departure and len(curr_departure_time_array) < K:
                        
                        if new_proba >= desired_probability:
                            
                        
                            if new_departure_time >= min_time:
                                curr_departure_time_array.append((new_departure_time, next_proba * new_proba, {'departure_location': location,
                                                              'departure_time': new_departure_time,
                                                              'arrival_location':departure_location,
                                                              'arrival_time': departure_time,
                                                              'trip_id': trip_id,
                                                              'walking': walking_time}, new_proba))
                        
                                curr_departure_time_array.sort(key = (lambda tup: tup[0]), reverse = True)
                    
                    
                    # if it improves the current best departure time, we update our dictionary
                    elif new_departure_time >= curr_departure_time_array[-1][0]:
                        
                        
                        if new_proba >= desired_probability:
                            if new_departure_time >= min_time:
                                curr_departure_time_array[-1] = (new_departure_time, next_proba * new_proba, {'departure_location': location,
                                                              'departure_time': new_departure_time,
                                                              'arrival_location':departure_location,
                                                              'arrival_time': departure_time,
                                                              'trip_id': trip_id,
                                                              'walking': walking_time}, new_proba)
                        
                                curr_departure_time_array.sort(key = (lambda tup: tup[0]), reverse = True)
    

        # if we can take this connexion
        elif times[arrival_location][0][0] >= arrival_time + cheminement_delay:
            
            cheminement_delay = timedelta(seconds = 120)

            # update trips taken with this new trip
            trips_taken[trip_id] = (arrival_location, arrival_time)
        
            departure_location_array = times[departure_location]
            
            # obtain proba
            if times[arrival_location][0][2] == None:
                next_proba = 1
                new_proba = 1
            else:
                next_connex_data = times[arrival_location][0][2]
                    
                next_proba = times[arrival_location][0][1]
                next_departure_time = next_connex_data['departure_time']
                    
                if next_connex_data.get('walking', None):
                    next_departure_time = next_departure_time + timedelta(seconds = next_connex_data['walking'])
                    next_departure = next_connex_data['arrival_location']
                else:
                    next_departure = next_connex_data['departure_location']
                    
                # get lambda for the exponential distribution
                lambdaa = lambda_dict[(arrival_location, str(arrival_time.time()))]
                
                # compute new probability
                walking_time = next_connex_data.get('walking', 0)
                new_proba = proba_trip(lambdaa, (next_departure_time - arrival_time - timedelta(seconds = 120) - timedelta(seconds = int(walking_time))).seconds)
             
            # if it is the goal station we can store more than one route
            if departure_location == final_departure and len(departure_location_array) < K:
                
                    if new_proba >= desired_probability:
                        
                        if departure_time >= min_time:
                            
                            # update result
                            departure_location_array.append((departure_time, next_proba * new_proba, {'departure_location': departure_location,
                                                              'departure_time': departure_time,
                                                              'arrival_location': arrival_location,
                                                              'arrival_time': arrival_time,
                                                              'trip_id': trip_id}, new_proba)  )
                
                            departure_location_array.sort(key=(lambda tup: tup[0]), reverse=True)
                
            # if the departure time is better than the current best
            elif departure_time > times[departure_location][-1][0]:
                
                if new_proba >= desired_probability:
            
                    # update the best time for the departure location
                    if departure_time >= min_time:
                        
                        departure_location_array[-1] = (departure_time, next_proba * new_proba, c, new_proba)  
                
                        departure_location_array.sort(key=(lambda tup: tup[0]), reverse = True)
            
            # obtain the stops reachable by walking
            reachable_stops_walking = footpaths.get(departure_location, None) 
            
            if reachable_stops_walking:
                
                # for each possible destination
                for destination in reachable_stops_walking:
                    
                    # obtain the stop_id
                    location = destination[0]
                    
                    # obtain the walk duration from departure_location (convert it to float)
                    walking_time = float(destination[1])
                    
                    # compute the new departure time if using this path
                    new_departure_time = departure_time - timedelta(seconds = walking_time)
                    
                    # obtain the current departure time
                    curr_departure_time_array = times[location]
                      
                    if location == final_departure and len(curr_departure_time_array) < K:
                        
                        if new_proba >= desired_probability:
                            
                            if new_departure_time >= min_time:
                                
                        
                                curr_departure_time_array.append((new_departure_time, next_proba * new_proba, {'departure_location': location,
                                                              'departure_time': new_departure_time,
                                                              'arrival_location':departure_location,
                                                              'arrival_time': departure_time,
                                                              'trip_id': trip_id,
                                                              'walking': walking_time}, new_proba))
                        
                                curr_departure_time_array.sort(key = (lambda tup: tup[0]), reverse = True)
                    
                    
                    # if it improves the current best departure time, we update our dictionary
                    elif new_departure_time > curr_departure_time_array[-1][0]:
                        
                        if new_proba >= desired_probability:
                            
                            if new_departure_time >= min_time:
                                curr_departure_time_array[-1] = (new_departure_time, next_proba * new_proba, {'departure_location': location,
                                                              'departure_time': new_departure_time,
                                                              'arrival_location':departure_location,
                                                              'arrival_time': departure_time,
                                                              'trip_id': trip_id,
                                                              'walking': walking_time}, new_proba)
                        
                                curr_departure_time_array.sort(key = (lambda tup: tup[0]), reverse = True)
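
_The time passed to `proba_trip` in the method above is the slack of a transfer: the gap between the arrival and the next departure, minus the 2-minute margin and any walking time. The sketch below isolates that computation; `transfer_slack_seconds` is a hypothetical helper, not a function of the notebook. It uses `total_seconds()`, whereas the method above reads the `.seconds` attribute: the two agree for non-negative durations but differ for negative ones, as the last lines show._

```python
from datetime import datetime, timedelta

TRANSFER_BUFFER = timedelta(seconds=120)  # the 2-minute margin used above


def transfer_slack_seconds(arrival_time, next_departure_time, walking_seconds=0):
    """Seconds left to absorb a delay before the next departure."""
    slack = (next_departure_time - arrival_time
             - TRANSFER_BUFFER - timedelta(seconds=int(walking_seconds)))
    return slack.total_seconds()


arrival = datetime(2020, 1, 1, 12, 17)
next_departure = datetime(2020, 1, 1, 12, 23)
print(transfer_slack_seconds(arrival, next_departure, walking_seconds=70))

# A negative timedelta is normalised to days=-1 plus a positive `seconds`
# field, so `.seconds` and `.total_seconds()` disagree for negative values.
negative = timedelta(seconds=-30)
print(negative.seconds, negative.total_seconds())
```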

In [14]:
def find_routes(connexions, stops_array, lambda_dict, footpaths, arrival_stop, arrival_time, departure_stop, desired_probability, min_time = '01:00:00'):
    """
    Given
    the sorted connexions (by descending order of arrival time),
    the array of possible stops,
    the lambda_dict (to obtain the parameter for our predictive model),
    the footpaths dictionary,
    the arrival stop,
    the arrival time,
    the departure stop,
    the desired confidence probability,
    and a minimum time of departure (optional)
    
    this method finds the best routes and prints them.
    Note: K (the number of routes to keep) is read from the notebook's global scope.
    """
    hour, minute, second = arrival_time.split(':')
    hour = int(hour)
    minute = int(minute)
    second = int(second)
    
    arrival_time_datetime = datetime(year=2020, month=1, day=1, hour=hour, minute=minute, second=second)
    
    hour, minute, second = min_time.split(':')
    hour = int(hour)
    minute = int(minute)
    second = int(second)
    
    min_time_datetime = datetime(year=2020, month=1, day=1, hour=hour, minute=minute, second=second)
    
    for i in range(5):
        
        # initialized time
        times = dict(((row.stop_id, 
               [(datetime(year=2019, month=1, day=6, hour=23, minute=59, second = 59), 1, None, None)]) 
              for row in stops_array))
        
        
        updates_times_dict_given_arrival_top_K_with_proba(times,
                                                          connexions,
                                                          lambda_dict,
                                                          footpaths, 
                                                          arrival_stop, 
                                                          arrival_time_datetime, 
                                                          departure_stop, 
                                                          K, 
                                                          desired_probability,
                                                          min_time_datetime)
        
        successful = []
        not_enough = []
        
        for output in times[departure_stop]:
            if output[0] != datetime(year=2019, month=1, day=6, hour=23, minute=59, second = 59):

                if output[1] >= desired_probability:
                    successful.append(output)
                else:
                    not_enough.append(output)
        
        if len(successful) == 0 and len(not_enough) == 0:
            print('Failure to find such a path')
            return
            
        if len(successful) > 0:
            print('Successful - printing routes...\n')
            
            for i, connexion_data in enumerate(successful):
                paths = print_route(times, connexion_data, arrival_stop)
                print('Route {nb}:'.format(nb = i+1))
                for path in paths:
                    print(path)
                print('\n')
            return
            
        else:
            max_prob = -1
            max_route = None
        
            for path in not_enough:
                prob = path[1]
                
                if prob > max_prob:
                    max_prob = prob
                    max_route = path
            
            
            print('Failure to find such a path')
            print('However the route closest to your requirements is:')
            # print_route walks forward until it reaches the arrival stop
            paths = print_route(times, max_route, arrival_stop)
            for path in paths:
                print(path)
            print('\n')
            return     
                        
    print('Failure to find such a path')        
    return
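
_The `times` dictionary built in `find_routes` is easier to follow with a toy instance. Each stop maps to a list of tuples `(departure_time, cumulative_probability, connexion, last_transfer_probability)`, and every stop starts with a sentinel dated in 2019, that is, earlier than any timetable time (which all use the dummy date 2020-01-01). The stop names `A`, `B`, `C` are invented for this sketch._

```python
from datetime import datetime

SENTINEL = datetime(2019, 1, 6, 23, 59, 59)  # "not reached yet" marker used above

# Each stop maps to a list of (departure_time, cumulative_probability,
# connexion, last_transfer_probability) tuples.
toy_times = {stop_id: [(SENTINEL, 1, None, None)] for stop_id in ["A", "B", "C"]}

# The arrival stop is seeded with the desired arrival time.
toy_times["C"][0] = (datetime(2020, 1, 1, 12, 30), 1, None, None)

# A stop counts as reached once its first entry is no longer the sentinel.
reached = [stop for stop, entries in toy_times.items() if entries[0][0] != SENTINEL]
print(reached)
```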

In [15]:
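def print_route(times, last_connexion, arrival_stop):
    """
    Given the times dictionary,
    the entry of the departure stop to start from,
    and the arrival stop,
    returns the list of human-readable legs of the route (one string per connexion or walk)
    """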
    paths = []
    
    current_stop_data = last_connexion
    next_stop = None
    
    probas = []
    
    while next_stop != arrival_stop:
        current_connexion = current_stop_data[2]
        next_stop = current_connexion['arrival_location']
        
        proba = current_stop_data[1]
        probas.append(proba)
        
        current_stop_data = times[next_stop][0]
        
    current_stop_data = last_connexion
    next_stop = None
    
    i = 1
    
    while next_stop != arrival_stop:
        
        current_connexion = current_stop_data[2]
        
        proba = current_stop_data[1]
        
        current_stop = current_connexion['departure_location']
        next_stop = current_connexion['arrival_location']
        trip = current_connexion['trip_id']
        
        walking = current_connexion.get('walking', None)
        
        if walking:
            path = 'Walking during {s}s'.format(s = int(walking)) + ' from {d} to {a}'.format(d = current_stop, a = next_stop)
        
        else:
            path = 'From {d_l} (at {d_t}) to {a_l} (at {a_t}) using trip: {t}. Current probability = {p}'.format(d_l = current_stop,
                                                                                      d_t = current_connexion['departure_time'].time(),
                                                                                      a_l = next_stop,
                                                                                      a_t = current_connexion['arrival_time'].time(),
                                                                                      t = current_connexion['trip_id'],
                                                                                      p = probas[-i])

        current_stop_data = times[next_stop][0]
        
        paths.append(path)
        
        i+=1
        
    return paths
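
_The core of `print_route` is a walk through the `times` dictionary: starting from an entry of the departure stop, we follow each connexion's `arrival_location` until the arrival stop is reached. A stripped-down sketch of that traversal, on invented toy data and without the formatting, could look like this (`reconstruct_route` is a hypothetical name)._

```python
def reconstruct_route(times, first_entry, arrival_stop):
    """Follow `arrival_location` links until the arrival stop is reached."""
    legs = []
    entry = first_entry
    while True:
        connexion = entry[2]
        legs.append((connexion["departure_location"], connexion["arrival_location"]))
        if connexion["arrival_location"] == arrival_stop:
            return legs
        # Continue from the best entry stored for the stop we just reached.
        entry = times[connexion["arrival_location"]][0]


toy_times = {
    "A": [("12:05", 0.9, {"departure_location": "A", "arrival_location": "B"}, 0.9)],
    "B": [("12:15", 1, {"departure_location": "B", "arrival_location": "C"}, 1)],
    "C": [("12:30", 1, None, None)],
}

print(reconstruct_route(toy_times, toy_times["A"][0], "C"))
```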

# Running the algorithm

In [16]:
footpaths = compute_footpaths_dict(reachable_pair_grouped)

In [17]:
sorted_connexions = sort_connexions(connexions, departure = False)

In [18]:
stops_array = stops.select(stops.stop_id).collect()

In [19]:
predictive_data = spark.read.orc("/user/{}/grouped_delay_lambdas.orc".format(username))

In [20]:
distribution_data = predictive_data.select(predictive_data.stop_id,
                                           predictive_data.arrival_time,
                                           predictive_data['lambda'])

_`lambda_dict` is a defaultdict that stores, for each pair (arrival_stop, arrival_time), the lambda computed by our predictive model; if a pair is missing, it returns the default lambda (the average over all entries)._

In [21]:
# This is the average lambda for all possible entries, and we will use it as a default value
default_value = 0.023447352748076224

lambda_dict = defaultdict(lambda: default_value)

for row in distribution_data.collect():
    lambda_dict[(row.stop_id, row.arrival_time)] = row['lambda']
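
_The lookup done by the main algorithm can be illustrated on a toy dictionary. The key is built as `(stop_id, str(arrival_time.time()))`, so this sketch assumes that the stored `arrival_time` values are strings of the form `HH:MM:SS`; the lambda value `0.015` is made up for the example._

```python
from collections import defaultdict
from datetime import datetime

DEFAULT_LAMBDA = 0.023447352748076224

toy_lambda_dict = defaultdict(lambda: DEFAULT_LAMBDA)
toy_lambda_dict[("8503310", "12:17:00")] = 0.015  # made-up value for illustration

arrival_time = datetime(2020, 1, 1, 12, 17, 0)
key = ("8503310", str(arrival_time.time()))
print(key, toy_lambda_dict[key])

# A missing key falls back to the default (and is inserted by the lookup).
print(toy_lambda_dict[("unknown_stop", "00:00:00")])
```

_Note that indexing a defaultdict with a missing key also stores that key, so the dictionary grows as unknown pairs are queried._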

# Validation

## 1. Input given on Slack

The following example, shared on Slack, serves as our reference:
From Zürich HB (8503000) to Zürich, Auzelg (8591049), arrival by 12:30:00:
Route 1:
- 20.TA.26-9-A-j19-1.2.H: 8503000:0:41/42 at 12:07:00 ~ 8503310:0:3 at 12:17:00
- Walking: 8503310:0:3 ~ 8590620
- 168.TA.26-12-A-j19-1.2.H: 8590620 at 12:23:00 ~ 8591049 at 12:29:00


Route 2:
- 32.TA.80-159-Y-j19-1.8.H: 8503000:0:5 at 12:05:00 ~ 8503006:0:6 at 12:11:00
- Walking: 8503006:0:6 ~ 8580449
- 1914.TA.26-11-A-j19-1.27.R: 8580449 at 12:15:00 ~ 8591049 at 12:24:00

In [22]:
departure_location = '8503000'
arrival_location = '8591049'
arrival_time = '12:30:00'

desired_probability = 0.7
min_departure_time = '12:04:00'

K = 2

find_routes(sorted_connexions, 
            stops_array, 
            lambda_dict, 
            footpaths, 
            arrival_location, 
            arrival_time, 
            departure_location, 
            desired_probability, 
            min_departure_time)

Successful - printing routes...

Route 1:
From 8503000 (at 12:07:00) to 8503310 (at 12:17:00) using trip: 20.TA.26-9-A-j19-1.2.H. Current probability = 1
Walking during 70s from 8503310 to 8590620
From 8590620 (at 12:23:00) to 8591049 (at 12:29:00) using trip: 168.TA.26-12-A-j19-1.2.H. Current probability = 0.915466308093


Route 2:
From 8503000 (at 12:05:00) to 8503006 (at 12:11:00) using trip: 32.TA.80-159-Y-j19-1.8.H. Current probability = 1
Walking during 72s from 8503006 to 8580449
From 8580449 (at 12:17:00) to 8591128 (at 12:23:00) using trip: 1755.TA.26-781-j19-1.3.R. Current probability = 0.777159023197
From 8591128 (at 12:27:00) to 8591049 (at 12:29:00) using trip: 168.TA.26-12-A-j19-1.2.H. Current probability = 0.773486968064

_If we want an earlier arrival time:_

In [23]:
departure_location = '8503000'
arrival_location = '8591049'
arrival_time = '12:25:00'

desired_probability = 0.7

K = 2

find_routes(sorted_connexions, 
            stops_array, 
            lambda_dict, 
            footpaths, 
            arrival_location, 
            arrival_time, 
            departure_location, 
            desired_probability)

Successful - printing routes...

Route 1:
From 8503000 (at 12:05:00) to 8503006 (at 12:11:00) using trip: 32.TA.80-159-Y-j19-1.8.H. Current probability = 1
Walking during 72s from 8503006 to 8580449
From 8580449 (at 12:15:00) to 8591049 (at 12:24:00) using trip: 1914.TA.26-11-A-j19-1.27.R. Current probability = 0.783457744058


Route 2:
From 8503000 (at 12:01:00) to 8503006 (at 12:08:00) using trip: 250.TA.26-6-A-j19-1.48.H. Current probability = 1
Walking during 72s from 8503006 to 8580449
From 8580449 (at 12:15:00) to 8591049 (at 12:24:00) using trip: 1914.TA.26-11-A-j19-1.27.R. Current probability = 0.998399487999

_Even though the probability of success is higher for the second route, route 1 is ranked first because it lets the user leave later while still exceeding the desired probability of success._
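
_This ranking rule (filter on the desired probability first, then prefer the latest departure) can be sketched on the two routes printed above. The `(departure time, probability)` pairs are a simplified representation chosen for the example._

```python
from datetime import time

# (departure time, probability) pairs taken from the two routes printed above.
routes = [
    (time(12, 1), 0.998399487999),
    (time(12, 5), 0.783457744058),
]
desired_probability = 0.7

# Keep the routes that are reliable enough, then rank by latest departure.
eligible = [r for r in routes if r[1] >= desired_probability]
ranked = sorted(eligible, key=lambda r: r[0], reverse=True)
print([r[0].isoformat() for r in ranked])
```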

## 2. Data from SBB (online)

_We now compare the routes proposed by our algorithm with today's real-time data from SBB on a few random examples:_

- _Departure stations: `8530644` (Meilen Autoquai) and `8591297` (Zürich, Orionstrasse)_
- _Arrival stations: `8590679` (Kilchberg ZH, Spital) and `8591082` (Zürich, Billoweg)_

#### _Example 1:_ 

![title](sbb_1.png)

In [24]:
departure_location = '8530644'
arrival_location = '8590679'
arrival_time = '11:25:00'

desired_probability = 0.6

K = 1

find_routes(sorted_connexions, 
            stops_array, 
            lambda_dict, 
            footpaths, 
            arrival_location, 
            arrival_time, 
            departure_location, 
            desired_probability)

Successful - printing routes...

Route 1:
From 8530644 (at 10:25:00) to 8530643 (at 10:35:00) using trip: 63.TA.90-174-Y-j19-1.1.H. Current probability = 1
Walking during 451s from 8530643 to 8580724
Walking during 424s from 8580724 to 8503204
From 8503204 (at 11:00:00) to 8503200 (at 11:10:00) using trip: 523.TA.26-8-A-j19-1.347.H. Current probability = 0.607780134281
Walking during 10s from 8503200 to 8590673
From 8590673 (at 11:13:00) to 8590679 (at 11:23:00) using trip: 35.TA.26-162-j19-1.2.H. Current probability = 0.60763082996

#### _Example 2:_ 

![title](sbb_2.png)

In [25]:
departure_location = '8591297'
arrival_location = '8591082'
arrival_time = '20:05:00'

desired_probability = 0.8

K = 1

find_routes(sorted_connexions, 
            stops_array, 
            lambda_dict, 
            footpaths, 
            arrival_location, 
            arrival_time, 
            departure_location, 
            desired_probability)

Successful - printing routes...

Route 1:
Walking during 176s from 8591297 to 8591049
From 8591049 (at 19:27:00) to 8580449 (at 19:36:00) using trip: 794.TA.26-11-A-j19-1.3.H. Current probability = 1
Walking during 72s from 8580449 to 8503006
From 8503006 (at 19:41:00) to 8503010 (at 19:53:00) using trip: 424.TA.26-2-j19-1.163.H. Current probability = 1
Walking during 69s from 8503010 to 8591058
From 8591058 (at 19:58:00) to 8591082 (at 20:01:00) using trip: 2643.TA.26-7-B-j19-1.17.H. Current probability = 0.892762262232