# How to compute and compare distances with different static allocators

This example shows how to use the Python API to:

1. Create a fleet of electric vehicles, a charging-post infrastructure, and a
   constraint
1. Allocate the fleet to the infrastructure with the greedy and the random allocators
1. Plot a comparison of the distance profiles

## Setting up a problem with random inputs

We create a random fleet and random charging posts using the functions evosim provides
for that purpose. To simplify the problem, we use only the first four socket types.


In [None]:
from functools import partial

import numpy as np
import pandas as pd
import seaborn

import evosim

rng = np.random.default_rng(2)
sockets = list(evosim.charging_posts.Sockets)[:4]
charging_posts = evosim.charging_posts.random_charging_posts(
    20000, capacity=3, socket_types=sockets, seed=rng,
)
fleet = evosim.fleet.random_fleet(40000, socket_types=sockets, seed=rng)
matcher = evosim.matchers.factory(["socket_compatibility", "charger_compatibility"])

## Computing distance profiles

The distances can be computed with the function provided by evosim,
:py:func:`evosim.objectives.haversine_distance`. To simplify things, we create a
function that takes a table describing the allocated fleet and a table describing the
infrastructure, and returns the distance from each electric vehicle to its allocated
post:

In [None]:
def distances(fleet: pd.DataFrame, charging_posts: pd.DataFrame) -> pd.Series:
    from evosim.objectives import haversine_distance

    vehicles = fleet.loc[
        fleet.allocation.notna(), ["dest_lat", "dest_long", "allocation"]
    ].rename(columns=dict(dest_lat="latitude", dest_long="longitude"))
    # Look up each vehicle's post, then re-index so rows align with the vehicles.
    posts = charging_posts.loc[
        vehicles.allocation, ["latitude", "longitude"]
    ].set_index(vehicles.index)
    return haversine_distance(vehicles, posts)


# In the function above, we simply avoid computing distances for unallocated vehicles.
# Later, if we need it, we can recover the number of unallocated vehicles by comparing
# the number of distances computed here and the size of the input fleet.
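
To make the steps above concrete without depending on evosim, here is a minimal sketch
of the same pattern on a toy table. The names `toy_haversine`, `toy_fleet`, and
`toy_posts`, as well as the Earth radius of 6371 km, are assumptions made for
illustration; :py:func:`evosim.objectives.haversine_distance` may use a different
radius or different units.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from evosim


def toy_haversine(a: pd.DataFrame, b: pd.DataFrame) -> pd.Series:
    """Great-circle distance in km between matching rows of a and b (in degrees)."""
    lat1, lon1 = np.radians(a.latitude), np.radians(a.longitude)
    lat2, lon2 = np.radians(b.latitude), np.radians(b.longitude)
    h = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))


# Hypothetical inputs: two posts and three vehicles, one of them unallocated.
toy_posts = pd.DataFrame(
    dict(latitude=[0.0, 0.0], longitude=[0.0, 90.0]), index=[10, 11]
)
toy_fleet = pd.DataFrame(
    dict(
        dest_lat=[0.0, 0.0, 0.0],
        dest_long=[0.0, 0.0, 0.0],
        allocation=[10, np.nan, 11],
    )
)

# Keep allocated vehicles only and rename columns to match the posts table.
vehicles = toy_fleet.loc[toy_fleet.allocation.notna()].rename(
    columns=dict(dest_lat="latitude", dest_long="longitude")
)
# Select each vehicle's post and re-index so both tables share the same index.
posts = toy_posts.loc[vehicles.allocation.astype(int)].set_index(vehicles.index)

toy_distances = toy_haversine(vehicles, posts)
n_unallocated = len(toy_fleet) - len(toy_distances)
print(toy_distances)
print("unallocated:", n_unallocated)
```

The number of unallocated vehicles is recovered exactly as described above: the size
of the input fleet minus the number of distances computed.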

The performance of the algorithms depends on the inputs: for instance, the
allocation will yield a better result if all vehicles and posts are compatible. So to
compare the algorithms fairly, we should sample several different sets of vehicles and
posts. Since we only care about the distribution of distances for each algorithm, we
can simply concatenate the results of multiple runs.


In [None]:
def algo_distances(vehicles, posts, matcher, algorithm):
    """Composes distances and algorithm functions."""
    allocated = algorithm(vehicles, posts, matcher)
    return distances(allocated, posts).to_numpy()


nfleet = 500
nrepeats = 10
algorithms = dict(
    random=partial(evosim.allocators.random_allocator, seed=rng),
    greedy10=partial(evosim.allocators.greedy_allocator, nearest_neighbors=10),
    greedy40=partial(evosim.allocators.greedy_allocator, nearest_neighbors=40),
    greedy80=partial(evosim.allocators.greedy_allocator, nearest_neighbors=80),
)
data = {
    algo: np.concatenate(
        [
            algo_distances(
                fleet.sample(nfleet),
                charging_posts.sample((nfleet * 4) // 5),
                matcher,
                method,
            )
            for _ in range(nrepeats)
        ]
    )
    for algo, method in algorithms.items()
}

Here we ran the greedy allocator with three different values of `nearest_neighbors`:
10, 40, and 80. Considering more neighbors should result in fewer unallocated
vehicles, but should be more computationally intensive. If some vehicles cannot be
allocated, the greedy allocator may issue a warning.
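
The effect of the neighbor count can be illustrated with a toy greedy allocator. This
is only a sketch and not evosim's implementation: the function `toy_greedy`, the use
of planar Euclidean distances, and the uniform capacity per post are all assumptions
made for illustration.

```python
import numpy as np


def toy_greedy(vehicles, posts, capacity, k):
    """Assign each vehicle to the closest of its k nearest posts with a free slot.

    Returns an integer array of post indices, with -1 for unallocated vehicles.
    """
    remaining = np.full(len(posts), capacity)
    allocation = np.full(len(vehicles), -1)
    for i, vehicle in enumerate(vehicles):
        distance = np.linalg.norm(posts - vehicle, axis=1)
        # Only the k nearest posts are candidates for this vehicle.
        for j in np.argsort(distance)[:k]:
            if remaining[j] > 0:
                allocation[i] = j
                remaining[j] -= 1
                break
    return allocation


rng = np.random.default_rng(0)
vehicles = rng.uniform(size=(200, 2))
posts = rng.uniform(size=(100, 2))
for k in (1, 5, 20, len(posts)):
    allocation = toy_greedy(vehicles, posts, capacity=2, k=k)
    print(f"k={k}: {int((allocation < 0).sum())} unallocated")
```

In this sketch, a vehicle stays unallocated when all of its `k` candidate posts are
full. When `k` equals the number of posts and the total capacity covers the fleet,
every vehicle finds a slot, at the cost of scanning more candidates per vehicle.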

In [None]:
pd.DataFrame(
    dict(
        algorithm=list(data.keys()),
        allocated=[len(v) for v in data.values()],
        unallocated=[nrepeats * nfleet - len(v) for v in data.values()],
    )
)

## Comparison of the distance distributions

Now we plot the distribution of allocated distances for each algorithm under
consideration.

In [None]:
as_frame = pd.DataFrame(
    dict(
        algorithm=np.concatenate(
            [[algo] * len(values) for algo, values in data.items()]
        ),
        distance=np.concatenate(list(data.values())),
    )
)
seaborn.set_style("dark")
seaborn.violinplot(
    x="algorithm", y="distance", scale="count", inner="quartile", data=as_frame
)
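
A numerical summary can complement the violin plot. The sketch below shows one way to
obtain per-algorithm quartiles with pandas. The table `toy_frame` and its values are
made up for illustration and are not results from evosim; in the notebook, the same
call could be applied to `as_frame`.

```python
import pandas as pd

# Hypothetical distances, laid out like `as_frame` (one row per allocated vehicle).
toy_frame = pd.DataFrame(
    dict(
        algorithm=["random"] * 4 + ["greedy"] * 4,
        distance=[10.0, 20.0, 30.0, 40.0, 1.0, 2.0, 3.0, 4.0],
    )
)

# One row per algorithm, with count, mean, std, min, quartiles, and max.
summary = toy_frame.groupby("algorithm")["distance"].describe()
print(summary)
```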