
Parse large instances with 15K or 30K nodes #117

Closed
N-Wouda opened this issue Mar 16, 2024 · 4 comments · Fixed by #125
Assignees
Labels
enhancement New feature or request

Comments

@N-Wouda
Member

N-Wouda commented Mar 16, 2024

The large instances of Arnold, Gendreau and Sörensen (2017) in CVRPLIB currently hang on the edge weight calculations. These instances are very large, ranging between 6-30K nodes. Let's see if we can support such large instances as well.
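For scale: a dense float64 distance matrix for 30K nodes is already several gigabytes, and the current broadcast-based computation materializes an even larger `(n, n, 2)` difference array first. A rough back-of-envelope check (the variable names are illustrative):

```python
n = 30_000

# dense float64 (8 bytes per entry) distance matrix
matrix_bytes = n * n * 8
print(matrix_bytes / 1e9)  # 7.2 (GB)

# the broadcasted (n, n, 2) difference array the current
# implementation allocates before squaring and summing
intermediate_bytes = n * n * 2 * 8
print(intermediate_bytes / 1e9)  # 14.4 (GB)
```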

@N-Wouda N-Wouda added the enhancement New feature or request label Mar 16, 2024
@leonlan
Member

leonlan commented Jun 20, 2024

I performed a quick benchmark of the current distance computation against four alternatives:

Script
import numpy as np
import timeit
from tabulate import tabulate

import numba


def current(coords: np.ndarray) -> np.ndarray:
    diff = coords[:, np.newaxis, :] - coords
    square_diff = diff**2
    square_dist = np.sum(square_diff, axis=-1)
    return np.sqrt(square_dist)


def faster_current(pts: np.ndarray) -> np.ndarray:
    sq_sum = np.sum(pts**2, axis=1)
    sq_diff = np.add.outer(sq_sum, sq_sum) - 2 * np.dot(pts, pts.T)
    np.fill_diagonal(sq_diff, 0)  # avoids minor numerical issues
    return np.sqrt(sq_diff)


def numpy_linalg(X: np.ndarray) -> np.ndarray:
    return np.linalg.norm(X[:, np.newaxis] - X, axis=2)


@numba.jit(nopython=True, fastmath=True)
def variant_numba(a: np.ndarray) -> np.ndarray:
    n = a.shape[0]
    out = np.zeros((n, n))

    for i in range(n):
        for j in range(i):
            out[i][j] = euclidean(a[i], a[j])

    return out + out.T


@numba.jit(nopython=True, fastmath=True)
def euclidean(u, v):
    n = len(u)
    dist = 0
    for i in range(n):
        dist += abs(u[i] - v[i]) ** 2
    return dist ** (1 / 2)


def variant_scipy(pts: np.ndarray) -> np.ndarray:
    from scipy.spatial import distance_matrix

    return distance_matrix(pts, pts)


if __name__ == "__main__":
    np.random.seed(1)
    headers = ["# of coords", "Function", "Time (s)"]
    NUMBER = 3

    for num_coords in [5000, 10000, 15000]:
        coords = np.random.randn(num_coords, 2)

        funcs = [
            current,
            faster_current,
            numpy_linalg,
            variant_numba,
            variant_scipy,
        ]

        results = []
        for func in funcs:
            elapsed = timeit.timeit(
                f"{func.__name__}(coords)", globals=globals(), number=NUMBER
            )
            avg = round(elapsed / NUMBER, 2)
            results.append((num_coords, func.__name__, avg))

        print(tabulate(results, headers=headers))

    # current and numpy_linalg are too slow
    for num_coords in [20000, 25000, 30000]:
        coords = np.random.randn(num_coords, 2)

        funcs = [
            faster_current,
            variant_numba,
            variant_scipy,
        ]

        results = []
        for func in funcs:
            elapsed = timeit.timeit(
                f"{func.__name__}(coords)", globals=globals(), number=NUMBER
            )
            avg = round(elapsed / NUMBER, 2)
            results.append((num_coords, func.__name__, avg))

        print(tabulate(results, headers=headers))
Results:

  # of coords  Function          Time (s)
-------------  --------------  ----------
         5000  current               0.4
         5000  faster_current        0.14
         5000  numpy_linalg          0.41
         5000  variant_numba         0.21
         5000  variant_scipy         0.33

  # of coords  Function          Time (s)
-------------  --------------  ----------
        10000  current               1.58
        10000  faster_current        0.44
        10000  numpy_linalg          1.58
        10000  variant_numba         0.36
        10000  variant_scipy         1.19

  # of coords  Function          Time (s)
-------------  --------------  ----------
        15000  current               4.29
        15000  faster_current        0.89
        15000  numpy_linalg          4.94
        15000  variant_numba         0.76
        15000  variant_scipy         2.62

  # of coords  Function          Time (s)
-------------  --------------  ----------
        20000  faster_current        1.54
        20000  variant_numba         1.39
        20000  variant_scipy         4.61

  # of coords  Function          Time (s)
-------------  --------------  ----------
        25000  faster_current        2.37
        25000  variant_numba         2.16
        25000  variant_scipy         7.2

  # of coords  Function          Time (s)
-------------  --------------  ----------
        30000  faster_current       66.12
        30000  variant_numba         6.8
        30000  variant_scipy        10.86

For instances up to 25K nodes, the faster_current function is pretty fast. At 30K it becomes much slower than the numba and scipy variants. Could this be due to memory pressure? I'm not sure. In any case, most users don't solve 30K-node instances, and the roughly 60s runtime there is acceptable.
I also don't want vrplib to rely on heavy dependencies like numba or scipy, so sticking to the numpy version is preferred.

I'll implement the faster_current version soon.
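For reference, the algebraic identity faster_current exploits is ||a − b||² = ||a||² + ||b||² − 2·a·b. A minimal numeric check against the direct broadcast computation (names here are illustrative, not from the library):

```python
import numpy as np

# Floating-point cancellation can make the expanded form slightly
# negative for (near-)identical points, which is why faster_current
# zeroes the diagonal before taking the square root.
rng = np.random.default_rng(0)
pts = rng.standard_normal((4, 2))

# direct broadcast computation, as in `current`
direct = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))

# expanded form, as in `faster_current`
sq = (pts**2).sum(axis=1)
expanded = np.add.outer(sq, sq) - 2 * pts @ pts.T
np.fill_diagonal(expanded, 0.0)
fast = np.sqrt(expanded)

print(np.allclose(direct, fast))  # True
```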


Another option is to write some C++ code.

See this for setting up Poetry with a custom build.

@leonlan leonlan self-assigned this Jun 20, 2024
@N-Wouda
Member Author

N-Wouda commented Oct 13, 2024

The scipy variant seems to scale very well and is faster than what we have now. Looking at the implementation here, the relevant bits are just a few lines of code. We could implement our calculation similarly?
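A scipy-style row-blocked computation could look something like this — a rough sketch of the idea, not scipy's actual code; the function name and block size are made up:

```python
import numpy as np


def blocked_distance_matrix(coords: np.ndarray, block: int = 1000) -> np.ndarray:
    """Pairwise Euclidean distances computed in row blocks, so the
    temporary (block, n, 2) difference array stays small; similar in
    spirit to scipy.spatial.distance_matrix."""
    n = coords.shape[0]
    out = np.empty((n, n))
    for start in range(0, n, block):
        stop = min(start + block, n)
        diff = coords[start:stop, np.newaxis, :] - coords[np.newaxis, :, :]
        out[start:stop] = np.sqrt((diff**2).sum(axis=-1))
    return out
```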

@N-Wouda
Member Author

N-Wouda commented Oct 13, 2024

On my machine (Windows with an x86/i7 CPU from last year), the results look something like this:

  # of coords  Function          Time (s)
-------------  --------------  ----------
         5000  current               0.62
         5000  faster_current        0.28
         5000  numpy_linalg          0.63
         5000  variant_scipy         0.63

        10000  current               2.45
        10000  faster_current        1.14
        10000  numpy_linalg          2.48
        10000  variant_scipy         2.4

        15000  current               5.68
        15000  faster_current        2.71
        15000  numpy_linalg          6.16
        15000  variant_scipy         5.94

        20000  current               10.3
        20000  faster_current        5.71
        20000  variant_scipy        12.82

        25000  faster_current       10.2
        25000  variant_scipy        26.54

        30000  faster_current       27.49
        30000  variant_scipy        41.49

Here, faster_current is the clear winner, but scipy doesn't seem to do so well. Perhaps we should just implement faster_current and be done with it.

@leonlan
Member

leonlan commented Oct 14, 2024

Yeah agreed!
