
Parse large instances with 15K or 30K nodes #117

Closed
N-Wouda opened this issue Mar 16, 2024 · 4 comments · Fixed by #125
Assignees
Labels
enhancement New feature or request

Comments

@N-Wouda
Member

N-Wouda commented Mar 16, 2024

The large instances of Arnold, Gendreau and Sörensen (2017) in CVRPLIB currently hang on the edge weight calculations. These instances are very large, ranging between 6-30K nodes. Let's see if we can support such large instances as well.
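For scale: a dense float64 distance matrix for 30K nodes is already several gigabytes, and the current broadcast-based computation materializes an even larger `(n, n, 2)` difference array first. A rough back-of-envelope check (the variable names are illustrative):

```python
n = 30_000

# dense float64 (8 bytes per entry) distance matrix
matrix_bytes = n * n * 8
print(matrix_bytes / 1e9)  # 7.2 (GB)

# the broadcasted (n, n, 2) difference array the current
# implementation allocates before squaring and summing
intermediate_bytes = n * n * 2 * 8
print(intermediate_bytes / 1e9)  # 14.4 (GB)
```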

@N-Wouda N-Wouda added the enhancement New feature or request label Mar 16, 2024
@leonlan
Member

leonlan commented Jun 20, 2024

I performed a quick benchmark of the current distance computation against four alternatives:

Script
import numpy as np
import timeit
from tabulate import tabulate

import numba


def current(coords: np.ndarray) -> np.ndarray:
    diff = coords[:, np.newaxis, :] - coords
    square_diff = diff**2
    square_dist = np.sum(square_diff, axis=-1)
    return np.sqrt(square_dist)


def faster_current(pts: np.ndarray) -> np.ndarray:
    sq_sum = np.sum(pts**2, axis=1)
    sq_diff = np.add.outer(sq_sum, sq_sum) - 2 * np.dot(pts, pts.T)
    np.fill_diagonal(sq_diff, 0)  # avoids minor numerical issues
    return np.sqrt(sq_diff)


def numpy_linalg(X: np.ndarray) -> np.ndarray:
    return np.linalg.norm(X[:, np.newaxis] - X, axis=2)


@numba.jit(nopython=True, fastmath=True)
def variant_numba(a: np.ndarray) -> np.ndarray:
    n = a.shape[0]
    out = np.zeros((n, n))

    for i in range(n):
        for j in range(i):
            out[i][j] = euclidean(a[i], a[j])

    return out + out.T


@numba.jit(nopython=True, fastmath=True)
def euclidean(u, v):
    n = len(u)
    dist = 0
    for i in range(n):
        dist += abs(u[i] - v[i]) ** 2
    return dist ** (1 / 2)


def variant_scipy(pts: np.ndarray) -> np.ndarray:
    from scipy.spatial import distance_matrix

    return distance_matrix(pts, pts)


if __name__ == "__main__":
    np.random.seed(1)
    headers = ["# of coords", "Function", "Time (s)"]
    NUMBER = 3

    for num_coords in [5000, 10000, 15000]:
        coords = np.random.randn(num_coords, 2)

        funcs = [
            current,
            faster_current,
            numpy_linalg,
            variant_numba,
            variant_scipy,
        ]

        results = []
        for func in funcs:
            elapsed = timeit.timeit(
                f"{func.__name__}(coords)", globals=globals(), number=NUMBER
            )
            avg = round(elapsed / NUMBER, 2)
            results.append((num_coords, func.__name__, avg))

        print(tabulate(results, headers=headers))

    # current and numpy_linalg are too slow
    for num_coords in [20000, 25000, 30000]:
        coords = np.random.randn(num_coords, 2)

        funcs = [
            faster_current,
            variant_numba,
            variant_scipy,
        ]

        results = []
        for func in funcs:
            elapsed = timeit.timeit(
                f"{func.__name__}(coords)", globals=globals(), number=NUMBER
            )
            avg = round(elapsed / NUMBER, 2)
            results.append((num_coords, func.__name__, avg))

        print(tabulate(results, headers=headers))
Results:

  # of coords  Function          Time (s)
-------------  --------------  ----------
         5000  current               0.4
         5000  faster_current        0.14
         5000  numpy_linalg          0.41
         5000  variant_numba         0.21
         5000  variant_scipy         0.33

  # of coords  Function          Time (s)
-------------  --------------  ----------
        10000  current               1.58
        10000  faster_current        0.44
        10000  numpy_linalg          1.58
        10000  variant_numba         0.36
        10000  variant_scipy         1.19

  # of coords  Function          Time (s)
-------------  --------------  ----------
        15000  current               4.29
        15000  faster_current        0.89
        15000  numpy_linalg          4.94
        15000  variant_numba         0.76
        15000  variant_scipy         2.62

  # of coords  Function          Time (s)
-------------  --------------  ----------
        20000  faster_current        1.54
        20000  variant_numba         1.39
        20000  variant_scipy         4.61

  # of coords  Function          Time (s)
-------------  --------------  ----------
        25000  faster_current        2.37
        25000  variant_numba         2.16
        25000  variant_scipy         7.2

  # of coords  Function          Time (s)
-------------  --------------  ----------
        30000  faster_current       66.12
        30000  variant_numba         6.8
        30000  variant_scipy        10.86

For instances up to 25K nodes, the faster_current function is pretty fast. At 30K it becomes much slower than the numba and scipy variants. Could this be due to memory pressure? I'm not sure. In any case, most users don't solve 30K-node instances, and the roughly 60s runtime there is acceptable.
I also don't want vrplib to rely on heavy dependencies like numba or scipy, so sticking to the numpy version is preferred.

I'll implement the faster_current version soon.
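For reference, the algebraic identity faster_current exploits is ||a − b||² = ||a||² + ||b||² − 2·a·b. A minimal numeric check against the direct broadcast computation (names here are illustrative, not from the library):

```python
import numpy as np

# Floating-point cancellation can make the expanded form slightly
# negative for (near-)identical points, which is why faster_current
# zeroes the diagonal before taking the square root.
rng = np.random.default_rng(0)
pts = rng.standard_normal((4, 2))

# direct broadcast computation, as in `current`
direct = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))

# expanded form, as in `faster_current`
sq = (pts**2).sum(axis=1)
expanded = np.add.outer(sq, sq) - 2 * pts @ pts.T
np.fill_diagonal(expanded, 0.0)
fast = np.sqrt(expanded)

print(np.allclose(direct, fast))  # True
```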


Another option is to write some C++ code.

See this for setting up Poetry with a custom build.

@leonlan leonlan self-assigned this Jun 20, 2024
@N-Wouda
Member Author

N-Wouda commented Oct 13, 2024

The scipy variant seems to scale very well and is faster than what we have now. Looking at the implementation here, the relevant bits are just a few lines of code. We could implement our calculation similarly?
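A scipy-style row-blocked computation could look something like this — a rough sketch of the idea, not scipy's actual code; the function name and block size are made up:

```python
import numpy as np


def blocked_distance_matrix(coords: np.ndarray, block: int = 1000) -> np.ndarray:
    """Pairwise Euclidean distances computed in row blocks, so the
    temporary (block, n, 2) difference array stays small; similar in
    spirit to scipy.spatial.distance_matrix."""
    n = coords.shape[0]
    out = np.empty((n, n))
    for start in range(0, n, block):
        stop = min(start + block, n)
        diff = coords[start:stop, np.newaxis, :] - coords[np.newaxis, :, :]
        out[start:stop] = np.sqrt((diff**2).sum(axis=-1))
    return out
```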

@N-Wouda
Member Author

N-Wouda commented Oct 13, 2024

On my machine (Windows with an x86/i7 CPU from last year), the results look something like this:

  # of coords  Function          Time (s)
-------------  --------------  ----------
         5000  current               0.62
         5000  faster_current        0.28
         5000  numpy_linalg          0.63
         5000  variant_scipy         0.63

        10000  current               2.45
        10000  faster_current        1.14
        10000  numpy_linalg          2.48
        10000  variant_scipy         2.4

        15000  current               5.68
        15000  faster_current        2.71
        15000  numpy_linalg          6.16
        15000  variant_scipy         5.94

        20000  current               10.3
        20000  faster_current        5.71
        20000  variant_scipy        12.82

        25000  faster_current       10.2
        25000  variant_scipy        26.54

        30000  faster_current       27.49
        30000  variant_scipy        41.49

Here, faster_current is the clear winner, but scipy doesn't seem to do so well. Perhaps we should just implement faster_current and be done with it.

@leonlan
Member

leonlan commented Oct 14, 2024

Yeah agreed!
