# Advanced Function

This notebook extends the basic function found in "function", so that the output can be further restricted by (1) a maximum threshold for the combined number of people living at all points of a cluster and (2) a maximum threshold for the distance that must be travelled to visit each point of a cluster exactly once.


## Disclaimer

This is a work in progress. The functions currently run if none of the thresholds is exceeded and print an error message otherwise. They do not yet include a loop that adjusts the parameter k (number of groups) until all thresholds are met. For now, this needs to be done manually.

Import modules

In [None]:
import numpy as np
import requests
from sklearn.cluster import KMeans

Define the input data, extended by a fictional number of people per point. The people values have a mean of 2 (20 people across 10 points), which is an estimate chosen for the demo to work.

In [None]:
input_data = [
    {"latitude": 52.5163, "longitude": 13.3777, "people": 1},
    {"latitude": 52.5200, "longitude": 13.4049, "people": 3},
    {"latitude": 52.5244, "longitude": 13.4050, "people": 2},
    {"latitude": 52.5186, "longitude": 13.3759, "people": 2},
    {"latitude": 52.5138, "longitude": 13.3928, "people": 1},
    {"latitude": 52.5206, "longitude": 13.2951, "people": 3},
    {"latitude": 52.5219, "longitude": 13.4132, "people": 2},
    {"latitude": 52.5034, "longitude": 13.4375, "people": 1},
    {"latitude": 52.5043, "longitude": 13.3325, "people": 2},
    {"latitude": 52.5336, "longitude": 13.3818, "people": 3}
]
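
As a quick sanity check on the stated mean, the people values can be summed directly. The snippet below is only a sketch: it copies the ten `people` values from the cell above into a standalone list (the name `people_values` is introduced here for illustration) so it runs on its own.

```python
# People values copied from input_data above, in the same order
people_values = [1, 3, 2, 2, 1, 3, 2, 1, 2, 3]

# Total and mean number of people per point
total_people = sum(people_values)
mean_people = total_people / len(people_values)

print(total_people, mean_people)
```

With a total of 20 people, a `max_people_per_cluster` of 20 can never be exceeded, which is why the demo call further down runs without error messages.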

## Define Advanced Function:

In [None]:
# Define a function that takes the input data coordinates, a number of groups, and a maximum number of people per group
def cluster_voters(input_data, k, max_people_per_cluster):
    # Set origins and destinations for the Distance Matrix API request
    origins = "|".join([f"{point['latitude']},{point['longitude']}" for point in input_data])
    destinations = "|".join([f"{point['latitude']},{point['longitude']}" for point in input_data])

    # Set Distance Matrix API url
    url_distance = "https://maps.googleapis.com/maps/api/distancematrix/json"

    # Set parameters and response, call the API and save the result
    params = {
        "origins": origins,
        "destinations": destinations,
        "key": "INSERT_API_KEY"
    }
    response = requests.get(url_distance, params=params)
    api_result = response.json()

    # Create a distance matrix from the API result
    distance_matrix = []
    for row in api_result['rows']:
        row_distances = [element['distance']['value'] for element in row['elements']]
        distance_matrix.append(row_distances)

    distance_matrix = np.array(distance_matrix)

    # Use the KMeans algorithm to cluster the data
    kmeans = KMeans(n_clusters=k, random_state=0).fit(distance_matrix)
    
    cluster_labels = kmeans.labels_

    # Create a list of clusters, where each cluster is a list of points
    clusters = [[] for _ in range(k)]
    for i, point in enumerate(input_data):
        clusters[cluster_labels[i]].append(point)

    # Filter out clusters that exceed the maximum amount of people, return the filtered clusters
    filtered_clusters = []
    for cluster in clusters:
        # Access each point within a cluster and sum its people value
        people_count = sum(point['people'] for point in cluster)
        # Print error message if sum people value is above the maximum amount of people per cluster
        # Otherwise, append the cluster to the filtered clusters list
        if people_count > max_people_per_cluster:
            print(f"Cluster with {people_count} people exceeds the maximum of {max_people_per_cluster}")
        else:
            filtered_clusters.append((cluster, kmeans.predict(distance_matrix)))
    return filtered_clusters
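
The step that turns the API result into a distance matrix can be tried without an API key. The following is a minimal sketch, assuming the response has the `rows` / `elements` / `distance` / `value` layout that the function above reads. The helper name `parse_distance_matrix` and the numbers in `mock_result` are made up for illustration. The per-element `status` check is an addition that the function above does not perform.

```python
def parse_distance_matrix(api_result):
    """Build a nested list of distances (in meters) from a Distance Matrix style result."""
    matrix = []
    for row in api_result["rows"]:
        row_distances = []
        for element in row["elements"]:
            # An element without a usable route carries no 'distance' entry,
            # so fail early with a readable message instead of a KeyError
            if element.get("status") != "OK":
                raise ValueError(f"Element status is {element.get('status')}")
            row_distances.append(element["distance"]["value"])
        matrix.append(row_distances)
    return matrix


# Hand-written stand-in for a 2x2 API result; the distances are invented
mock_result = {
    "status": "OK",
    "rows": [
        {"elements": [{"status": "OK", "distance": {"value": 0}},
                      {"status": "OK", "distance": {"value": 1200}}]},
        {"elements": [{"status": "OK", "distance": {"value": 1350}},
                      {"status": "OK", "distance": {"value": 0}}]},
    ],
}

mock_matrix = parse_distance_matrix(mock_result)
print(mock_matrix)
```

The resulting nested list can be passed to `np.array` in the same way as `distance_matrix` in the function above.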


Run the function, set arguments:

In [None]:
# This only runs and returns the clusters if no group exceeds the maximum number of people
cluster_test = cluster_voters(input_data, 2, 20)
print(cluster_test)
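
The manual adjustment of k mentioned in the disclaimer could eventually be automated. The following is only a sketch of one possible loop, not part of the current functions: it increases k until no cluster exceeds the people threshold. The names `find_smallest_k` and `chunk_clusters` are hypothetical, and `chunk_clusters` is a simple stand-in for the clustering step (it splits the list into contiguous chunks) so that the example runs without an API key.

```python
def find_smallest_k(points, cluster_fn, max_people_per_cluster, k_start=1):
    """Increase k until every cluster stays within the people threshold."""
    for k in range(k_start, len(points) + 1):
        clusters = cluster_fn(points, k)
        totals = [sum(p["people"] for p in c) for c in clusters]
        if all(t <= max_people_per_cluster for t in totals):
            return k, clusters
    # Reached only if a single point already exceeds the threshold
    raise ValueError("No value of k satisfies the people threshold")


def chunk_clusters(points, k):
    """Hypothetical stand-in for clustering: split points into contiguous chunks."""
    size = -(-len(points) // k)  # ceiling division
    return [points[i:i + size] for i in range(0, len(points), size)]


# Demo points that only carry the people values from input_data
demo_points = [{"people": n} for n in [1, 3, 2, 2, 1, 3, 2, 1, 2, 3]]
best_k, best_clusters = find_smallest_k(demo_points, chunk_clusters, max_people_per_cluster=8)
best_totals = [sum(p["people"] for p in c) for c in best_clusters]
print(best_k, best_totals)
```

To use this with the real clustering, `cluster_fn` would need to return plain lists of points for a given k; `cluster_voters` as written returns tuples and drops clusters above the threshold, so it would need a small wrapper first.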

**Adding the Maximum Distance Travelled Threshold**

In [None]:
# Takes the clusters obtained above, the API key and the maximum distance threshold (in kilometers)
def check_distance(clusters, api_key, max_distance_threshold):
    # Initiate an empty list to store the filtered clusters
    filtered_clusters = []
    # Access the clusters obtained in the function above
    for cluster, _ in clusters:
        # For the Google Maps Directions API, we need to specify the origin, destination and waypoints
        # The first and last points serve as origin and destination; only the points in between are waypoints
        origin = f"{cluster[0]['latitude']},{cluster[0]['longitude']}"
        destination = f"{cluster[-1]['latitude']},{cluster[-1]['longitude']}"
        waypoints = "|".join([f"{point['latitude']},{point['longitude']}" for point in cluster[1:-1]])
        # Set the url for this API, define the parameters and call the API, save the result
        url_route = "https://maps.googleapis.com/maps/api/directions/json"
        params_route = {
            "origin": origin,
            "destination": destination,
            "key": api_key
        }
        # Only pass waypoints if the cluster has intermediate points
        if waypoints:
            params_route["waypoints"] = waypoints
        response = requests.get(url_route, params=params_route)
        data = response.json()
        
        # Initiate the total distance as 0, and iterate through the routes and legs to sum the distances. In the API result, a "leg" is the section of the route between two consecutive stops.
        # To get the full distance of the route, we need to sum the distances of all legs.
        total_distance = 0
        for route in data['routes']:
            for leg in route['legs']:
                total_distance += leg['distance']['value']
        
        # Print an error message if the total distance exceeds the maximum distance threshold, otherwise append the cluster to the filtered clusters list
        # The threshold is given in kilometers, while the API returns distances in meters
        if total_distance > max_distance_threshold * 1000:
            print(f'Cluster with {total_distance} meters exceeds the maximum of {max_distance_threshold * 1000} meters')
        else:
            filtered_clusters.append(cluster)
    return filtered_clusters
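
The threshold logic can also be tried offline. The sketch below is an assumption-laden substitute for the API call, not the method used above: it estimates the length of a route through a cluster from straight-line (haversine) distances between consecutive points, which will generally be shorter than the road distances the Directions API returns. The names `haversine_m`, `path_length_m` and `within_distance` are introduced here for illustration.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in meters between two points with latitude/longitude keys."""
    earth_radius_m = 6371000.0
    lat1 = math.radians(a["latitude"])
    lat2 = math.radians(b["latitude"])
    dlat = lat2 - lat1
    dlon = math.radians(b["longitude"] - a["longitude"])
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(h))


def path_length_m(cluster):
    """Sum the straight-line distances between consecutive points, in list order."""
    return sum(haversine_m(p, q) for p, q in zip(cluster, cluster[1:]))


def within_distance(cluster, max_distance_km):
    """Same comparison as check_distance: threshold in kilometers, path length in meters."""
    return path_length_m(cluster) <= max_distance_km * 1000


# Two nearby points taken from input_data
demo_cluster = [
    {"latitude": 52.5163, "longitude": 13.3777, "people": 1},
    {"latitude": 52.5186, "longitude": 13.3759, "people": 2},
]
demo_length = path_length_m(demo_cluster)
print(round(demo_length), within_distance(demo_cluster, 20))
```

Like `check_distance`, this visits the points in the order they appear in the cluster and does not search for the shortest possible order.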


In [None]:
# Run the function with the clusters obtained above, the API key and the maximum distance threshold
distance_test = check_distance(cluster_test, "INSERT_API_KEY", 20)

print(distance_test)