# Proof of Concept. Classical approach and checks

This notebook gives a preview of what the optimization algorithm will be able to do once we have access to a more powerful quantum solver (see the main POC first for more information). To that end, we rely here on simulated annealing, which performs well for small values of N (number of nodes) and p (number of stops per line), to show the complete map of a fully connected public transport network.
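
To make the idea of simulated annealing on a QUBO concrete, here is a minimal sketch using only NumPy. It is an illustration under stated assumptions, not the solver that `solve_qubo_with_Dwave` calls: the function name, the geometric cooling schedule and all default parameters are hypothetical choices made for this example.

```python
import numpy as np

def simulated_annealing_qubo(Q, num_sweeps=500, t_start=2.0, t_end=0.01, seed=0):
    """Approximately minimise x^T Q x over binary x using single-bit flips."""
    rng = np.random.default_rng(seed)
    n = Q.shape[0]
    S = Q + Q.T  # symmetrised couplings, so S[i, i] == 2 * Q[i, i]
    x = rng.integers(0, 2, size=n)
    energy = float(x @ Q @ x)
    best_x, best_energy = x.copy(), energy
    # Assumed schedule: temperature decreases geometrically from t_start to t_end
    for temperature in np.geomspace(t_start, t_end, num_sweeps):
        for i in rng.permutation(n):
            step = 1 - 2 * x[i]  # +1 when the bit turns on, -1 when it turns off
            # Exact energy change caused by flipping bit i
            delta = step * (S[i] @ x - S[i, i] * x[i] + Q[i, i])
            # Metropolis rule: always accept improvements, sometimes accept worse moves
            if delta <= 0 or rng.random() < np.exp(-delta / temperature):
                x[i] += step
                energy += delta
                if energy < best_energy:
                    best_x, best_energy = x.copy(), energy
    return best_x, best_energy

# Toy 2-bit problem: each bit is rewarded on its own, both together are penalised
Q = np.array([[-1.0, 2.0],
              [0.0, -1.0]])
best_x, best_energy = simulated_annealing_qubo(Q)
print(best_x, best_energy)
```

The acceptance rule lets the search escape local minima at high temperature and settle into a low-energy bitstring as the temperature drops, which is why the approach works well only while the number of bits stays small.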

## Requirements and Setup

In order to execute this notebook, make sure you have already installed the necessary requirements described in the README.md.
The following are all the necessary imports to run the entire notebook, from start to end.

In [None]:
import os 
import glob
import pandas as pd 
import numpy as np
from main.tree.linkageTree import linkageCut
from main.tsp.TSP_Formulation_Methods import ( 
    create_QUBO_matrix,
    solve_qubo_with_Dwave,
    compute_general_lambdas,
)
from main.tree.utils import ( 
    view_linkage_on_map, 
    draw_centers_on_map,
    map_draw_line,
    convert_bitstring_to_matrix,
    assemble_line,
    check_stops_usage
)
from main.pipe import give_line
from data.utils import fetch_amenities_from

## Load initial data

As in the main POC, we must first fetch data from a specific city.
In this case, we fetch the data for Granada, Spain, which consists of local amenities. We then run our hierarchical clustering algorithm (Ward distance) to build a multi-level view of the city. In this proof of concept we work with 2 levels: level 0 creates nclusters clusters (red), and level 1 further divides each of them into nclusters clusters (blue).
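
The two-level split can be sketched with SciPy's Ward linkage. This is only an illustration of the idea, not the implementation behind `linkageCut`: the helper `two_level_labels` and the synthetic points are hypothetical, and the real class may cut the tree differently.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def two_level_labels(points, nclusters):
    """Return (level0, level1) labels, each in 1..nclusters, using Ward linkage."""
    level0 = fcluster(linkage(points, method="ward"), t=nclusters, criterion="maxclust")
    level1 = np.zeros_like(level0)
    for label in np.unique(level0):
        mask = level0 == label
        # Re-cluster only the points that belong to this level-0 cluster
        sub_linkage = linkage(points[mask], method="ward")
        level1[mask] = fcluster(sub_linkage, t=nclusters, criterion="maxclust")
    return level0, level1

# Synthetic stand-in for amenity coordinates (not real Granada data)
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [0.0, 5.0], [5.0, 0.0], [5.0, 5.0]])
points = np.vstack([c + 0.3 * rng.standard_normal((40, 2)) for c in blob_centers])

level0, level1 = two_level_labels(points, nclusters=4)
print(np.unique(level0, return_counts=True))
```

Each point ends up with a pair of labels (level 0, level 1), which is what allows the routing problem to be solved first between regions and then inside each region.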

In [None]:
# Load previously stored overpy lat/lon datafile for different amenities
filepath = os.path.join('data', 'amenities-granada.csv')
if os.path.exists(filepath):
    amenities_data = pd.read_csv(filepath)
else:
    # If there is no previous data:
    query_file = os.path.join('data', 'overpy-granada-query.txt')
    query = None
    with open(query_file) as file:
        query = file.read()
    amenities_data = fetch_amenities_from(query=query) # Defaults to Granada
    amenities_data.to_csv(filepath)
    
# Create a hierarchical clustering of amenities
hierarchical_cluster = linkageCut(amenities_data)
# Set the number of clusters per level (max 9 in this POC) and the number of levels
nclusters = 4
levels = 2
labels = hierarchical_cluster.top_down_view_recur(nclusters=nclusters, levels=levels)
# Visualize for debugging purposes.
view_linkage_on_map(linkage_matrix=hierarchical_cluster)

## Create the first bus line at all levels

We will follow the workflow from the main POC.

We call Level 0 solutions those defined on the larger regions (i.e., clusters) produced by the clustering. They set the general flow of the bus route and dictate how we will connect the real bus stops once we solve the 'zoomed-in' problem.
The division proposed by the hierarchical algorithm is, for the city of Granada, close to a district-level organization.

For our example we impose two random nodes as the start and end points of the first bus line. This mirrors a typical use case: say we want to connect a marginalized area with another area where public services and/or green spaces are available. Additionally, we fix the number (p) of districts (level-0 clusters) that the route traverses.
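
A path problem with fixed endpoints can be written as a QUBO along the following lines. This is a sketch under assumptions and not the formulation inside `create_QUBO_matrix`: it uses a single penalty weight `lam` (the notebook computes its own weights with `compute_general_lambdas`), 0-based node indices, a path with p edges and p + 1 stops, and a brute-force search that is only feasible for very small problems. All function names and the toy coordinates are hypothetical.

```python
import numpy as np

def build_path_qubo(dist, p, start, end, lam):
    """QUBO for a path with p edges (p + 1 stops) going from `start` to `end`.

    Bit t * N + i equals 1 when node i occupies position t of the path.
    """
    N = dist.shape[0]

    def idx(t, i):
        return t * N + i

    Q = np.zeros((N * (p + 1), N * (p + 1)))
    for t in range(p + 1):
        for i in range(N):
            Q[idx(t, i), idx(t, i)] -= lam                 # exactly one node per position (linear part)
            for j in range(i + 1, N):
                Q[idx(t, i), idx(t, j)] += 2 * lam         # exactly one node per position (pair part)
            for t2 in range(t + 1, p + 1):
                Q[idx(t, i), idx(t2, i)] += lam            # node i must not appear twice
            if t < p:
                for j in range(N):
                    Q[idx(t, i), idx(t + 1, j)] += dist[i, j]  # travel cost between consecutive stops
    for i in range(N):
        if i != start:
            Q[idx(0, i), idx(0, i)] += lam                 # only `start` may open the path
        if i != end:
            Q[idx(p, i), idx(p, i)] += lam                 # only `end` may close the path
    return Q

def brute_force_minimum(Q):
    """Enumerate every bitstring; only viable for a handful of bits."""
    n = Q.shape[0]
    bits = (np.arange(2 ** n)[:, None] >> np.arange(n)) & 1
    energies = np.einsum("bi,ij,bj->b", bits, Q, bits)
    return bits[np.argmin(energies)]

# Toy coordinates standing in for four level-0 centers
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 3.0]])
dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
dist /= dist.max()  # same normalisation as in the notebook

p, start, end = 2, 0, 3
Q = build_path_qubo(dist, p, start, end, lam=p + 1)
route = brute_force_minimum(Q).reshape(p + 1, len(points)).argmax(axis=1)
print("route:", route)
```

With distances normalised to at most 1, any valid path costs at most p, so a penalty larger than p guarantees that violating a constraint is never worthwhile.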

Level 1 solutions are solutions for a given line inside a specific region of the city. If a bus line passes through a specific region (our example has 4), the solver should return the route the bus follows inside that region with the specified number of stops (p). After solving each region, we assemble the whole route as a single line connecting the regions.
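
To decide which regions need a level-1 problem, the level-0 solution has to be read as a set of region-to-region connections. The sketch below shows one way to do that. It assumes a position-major one-hot bitstring (bit `t * N + i` marks region i at position t), which may differ from the layout used by `convert_bitstring_to_matrix`; both helper names are hypothetical.

```python
import numpy as np

def bitstring_to_adjacency(bits, N, p):
    """Turn a position-major one-hot bitstring into a directed adjacency matrix."""
    order = np.asarray(bits).reshape(p + 1, N).argmax(axis=1)
    adjacency = np.zeros((N, N), dtype=int)
    adjacency[order[:-1], order[1:]] = 1  # edge from each stop to the next one
    return adjacency

def neighbours(adjacency, node):
    """Regions (0-based) directly before or after `node` on the line."""
    incoming = adjacency[:, node].nonzero()[0]
    outgoing = adjacency[node, :].nonzero()[0]
    return np.concatenate([incoming, outgoing])

# Hypothetical level-0 solution: region 1 -> region 3 -> region 0 (region 2 unused)
N, p = 4, 2
bits = np.zeros(N * (p + 1), dtype=int)
for position, region in enumerate([1, 3, 0]):
    bits[position * N + region] = 1

adjacency = bitstring_to_adjacency(bits, N, p)
for region in range(N):
    print(region, neighbours(adjacency, region))
```

A region with no neighbours is not crossed by the line and needs no level-1 problem; a region with one neighbour is an endpoint of the line, and a region with two neighbours is traversed.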


In [None]:
# Fetch the centers of the first level
centers = hierarchical_cluster.give_centers_level(0)

np.random.seed(42)

# Fetch the distance from the centers of the first level
distances = hierarchical_cluster.dist_matrix_level(0, return_labels=False)

# Set initial global parameters
N = distances.shape[0]
p = 3
node_options = set(np.arange(nclusters) + 1)
startNode_0 = np.random.choice(list(node_options))
endNode_0 = np.random.choice(list(node_options - set([startNode_0])))
print("Selected random nodes:", startNode_0, endNode_0)

# Clamp the parameters so they remain valid for an N-node problem
p = min(p, N - 1)
startNode_0 = min(startNode_0, N)
endNode_0 = min(endNode_0, N)

reduced_distances = distances/np.max(distances)
maxDistance = np.max(reduced_distances)
lambdas = compute_general_lambdas(reduced_distances, max_N=3)

# Solve level 0
Q_matrix_initial,_ = create_QUBO_matrix(reduced_distances, p, startNode_0 - 1, endNode_0 - 1, lambdas)   
level0_sols, _ = solve_qubo_with_Dwave(Q_matrix_initial, num_reads=1000)
adjacency = convert_bitstring_to_matrix(level0_sols, N=N, p=p)

# Solve level 1
level1_sols = {} # Dict that will hold the bitstring, connected level-0 clusters and corresponding start-end nodes 
nchecks = 1024
all_indices = set(np.arange(nclusters) + 1)  # 1-based level-1 node labels, same range as node_options
for i in range(1, nclusters+1):
    
    connections = np.concatenate([adjacency[:,i-1].nonzero()[0], adjacency[i-1, :].nonzero()[0]], axis=0) + 1
    print("----- Solving level-1:", i, "------\n")
    print("connections", connections)
    if len(connections) > 0: #Selected
        # Fetch the level-1 distance matrix of cluster i and its nodes closest to the connected clusters
        distances, closest, _ = hierarchical_cluster.dist_matrix_label_down(
            i,
            connections=connections,
        )
        
        startNode_ = None
        if len(closest) >= 1:
            startNode_ = closest[0]
            choices = all_indices - set([startNode_])
        if len(closest) == 2:
            endNode_ = closest[1]
        else:
            endNode_ = np.random.choice(list(choices)) # POC criterion, better heuristic should be chosen
        print("Start-end", startNode_, endNode_)
        reduced_distances = distances / np.max(distances)
        Q_matrix_initial,_ = create_QUBO_matrix(reduced_distances, p, startNode_ - 1, endNode_ - 1, lambdas)
        


        sol_, _ = solve_qubo_with_Dwave(Q_matrix_initial, num_reads=1000)
        level1_sols[i] = [sol_, closest]
        
    else:
        level1_sols[i] = (np.zeros((nclusters*(p + 1))), [])
        print('The line does not cross this level-0 cluster')

assembled_line_1 = assemble_line(level0_sols,level1_sols, nclusters, p)
centers_level1 = hierarchical_cluster.give_centers_level(1)
map = map_draw_line(centers_level1[:,::-1], assembled_line_1, color='blue', zoom_start=14)
map

## Create the other bus lines taking into account how the previous ones were created

Unlike the first bus line, for which we selected two random start and end stops (at level 0), here we try to cover the entire network with few lines, so we choose the start and end nodes based on the lines already created.

In [None]:
new_lines = 1

assembled_lines = []
assembled_lines.append(assembled_line_1)
map = map_draw_line(centers_level1[:,::-1], assembled_line_1, color='blue', zoom_start=14, map=map)
colors = ['red', 'green', 'pink', 'orange', 'purple', 'yellow', 'black']

list_of_start_and_end_nodes = [[4, 1]]  # Informed by the previous lines; must have the same length as new_lines

for l in range(new_lines):
    new_startNode_0 = list_of_start_and_end_nodes[l][0]
    new_endNode_0 = list_of_start_and_end_nodes[l][1]
    new_line = give_line(amenities_data, nclusters, p, new_startNode_0, new_endNode_0,'qutip', classical=True)

    new_assembled = assemble_line(new_line[0], new_line[1], nclusters, p)
    # Keep the new line so the stop-usage check accounts for it
    assembled_lines.append(new_assembled)
    centers_level1 = hierarchical_cluster.give_centers_level(1)

    map = map_draw_line(centers_level1[:,::-1], new_assembled, color=colors[l], zoom_start=14, map=map)

map

In [None]:
# We now check if the stops are being used

total_assembled = np.zeros_like(assembled_lines[0])  # the accumulator must start at zero
for line in assembled_lines:
    total_assembled += line

check_stops_usage(total_assembled)

We have seen that we can cover the whole network by creating new lines. If the condition is not fulfilled for a given number of lines L, one can start over by adjusting the start and end nodes of the level 0 solutions. Heuristic approaches that improve the choice of those stops while preserving sociodemographic decision making can be applied.
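
The coverage condition can be illustrated with a simple visit count per stop. This sketch assumes each line is given as a sequence of 0-based stop indices, which is not necessarily the representation that `assemble_line` returns or that `check_stops_usage` expects; the helper `stop_usage` and the example lines are hypothetical.

```python
import numpy as np

def stop_usage(lines, n_stops):
    """Count how many times each stop is visited across all lines."""
    counts = np.zeros(n_stops, dtype=int)
    for line in lines:
        counts += np.bincount(line, minlength=n_stops)
    return counts

# Two hypothetical lines over a network of 7 stops
lines = [[0, 1, 2, 5], [5, 4, 3, 0]]
counts = stop_usage(lines, n_stops=7)
unused = np.flatnonzero(counts == 0)
print("visits per stop:", counts)
print("unused stops:", unused)
```

Any stop listed as unused is a natural candidate for the start or end node of the next line.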

Also, in a further iteration of the project, a different number of stops for level 0 and level 1 can easily be implemented, allowing for bus lines that travel through fewer regions but visit more stops in each region.

Further metrics can be implemented here to improve the proposed solutions, such as the total distance covered by all the lines.
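
As an example of such a metric, the total length of the network could be computed with the haversine formula. The sketch below assumes stops are stored as (latitude, longitude) pairs in degrees and lines as sequences of stop indices; both conventions, the function names and the sample coordinates are hypothetical and not taken from the project code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def total_network_length_km(stops, lines):
    """Sum the lengths of all consecutive stop pairs over all lines."""
    return sum(
        haversine_km(stops[a], stops[b])
        for line in lines
        for a, b in zip(line, line[1:])
    )

# Hypothetical stops near Granada and two hypothetical lines
stops = [(37.18, -3.60), (37.19, -3.61), (37.17, -3.59)]
lines = [[0, 1], [1, 2, 0]]
print(round(total_network_length_km(stops, lines), 3), "km")
```

Comparing this value across candidate sets of lines gives a simple way to prefer solutions that cover the same stops with less travel.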