Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent overflows when using custom matrix costs #50

Closed
frodrigo opened this issue Oct 31, 2016 · 9 comments
Closed

Prevent overflows when using custom matrix costs #50

frodrigo opened this issue Oct 31, 2016 · 9 comments
Assignees
Labels
Milestone

Comments

@frodrigo
Copy link
Contributor

On some problem vroom fail with:

vroom: heuristics/../algorithms/munkres.h:309: std::unordered_map<short unsigned int, short unsigned int> greedy_symmetric_approx_mwpm(const matrix<T>&) [with T = unsigned int]: Assertion `m.size() % 2 == 0' failed.
@frodrigo
Copy link
Contributor Author

With some matrix vroom fails on assert or run forever probably due to overflow.

NAME: vroom
TYPE: ATSP
DIMENSION: 4
EDGE_WEIGHT_TYPE: EXPLICIT
EDGE_WEIGHT_FORMAT: FULL_MATRIX
EDGE_WEIGHT_SECTION
0 1 2147483647 2147483647
2147483647 0 2147483647 2147483647
2147483647 2147483647 0 2147483647
2147483647 2147483647 2147483647 0
EOF

@frodrigo frodrigo changed the title Assertion `m.size() % 2 == 0' failed Assertion `m.size() % 2 == 0' failed probably due to overflow Oct 31, 2016
@jcoupey
Copy link
Collaborator

jcoupey commented Oct 31, 2016

This is indeed due to the huge costs that can't be added without overflowing. With those kind of failing examples, the solving part is ok again when switching from uint32_t to uint64_t at https://github.com/VROOM-Project/vroom/blob/develop/src/structures/typedefs.h#L19.

The trouble here is that those huge values describe points that can't be reached (from or to), so the problem is ill-formed. For example, using osrm-routed, those are filtered out at https://github.com/VROOM-Project/vroom/blob/develop/src/loaders/routed_loader.h#L131 and an exception is raised.

So we should rely on the user to avoid this situation altogether (remove inaccessible points), but maybe there is a need for a warning of some kind here, or a limitation of the input cost in the matrix that would guarantee everything is fine afterwards ?

@jcoupey jcoupey added this to the v1.2.0 milestone Feb 14, 2017
@jcoupey
Copy link
Collaborator

jcoupey commented Oct 9, 2017

We should decide on a way to prevent overflows causing troubles when huge costs are used in custom input. Trying to reproduce the above with current master using:

{
  "vehicles": [
    {
      "id": 0,
      "start_index": 0,
      "end_index": 0
    }
  ],
  "jobs": [
    {
      "id": 1,
      "location_index": 1
    },
    {
      "id": 2,
      "location_index": 2
    },
    {
      "id": 3,
      "location_index": 3
    }
  ], 
  "matrix": [
    [0, 1, 2147483647, 2147483647],
    [2147483647, 0, 2147483647, 2147483647],
    [2147483647, 2147483647, 0, 2147483647],
    [2147483647, 2147483647, 2147483647, 0]
  ]
}

we don't hit the same assertion but the local search goes on endlessly as overflows make it seem like we find endless improvements. Verbose output loops on:

* Performed 1 "2-opt" steps, gaining 2147483646.
* Performed 1 "relocate" steps, gaining 2147483650.

@jcoupey jcoupey added the bug label Oct 9, 2017
@jcoupey jcoupey changed the title Assertion `m.size() % 2 == 0' failed probably due to overflow Prevent overflows when using custom matrix costs Oct 9, 2017
@PattyDePuh
Copy link
Contributor

What is the prefered solution outcome?

  • To be able to actually solve problems with astronomical distances

  • or simply warn the user about the currently not handleable instance and abbort ?

I guess a static extend of memory space for the variables won't do, since we would always at some point will hit a border, if we want to achieve scenario one.

From the users perspective - at which point is it relevant for them to optimize their solution up to more then the 10th relevant digit?

My suggestion: Parsing the input from the matrix and try count the sum of all entries with a uint64_t variable (or higher) and mark every input as invalid, where the sum might not fit into uint32_t.

@jcoupey
Copy link
Collaborator

jcoupey commented Oct 11, 2017

@PattyDePuh 2nd option: we want to stop whenever huge values are likely to cause an overflow when computing costs. I think using uint32_t for single costs is enough, and switching to uint64_t would just mean moving the overflow problem further away, not solving it (and would have a huge impact on memory for big problems).

As you suggest, we need to come up with a good upper bound on the overall cost when parsing the matrix. Summing all entries sounds like a very loose bound, we can probably find something simple with a smaller gap.

@PattyDePuh
Copy link
Contributor

PattyDePuh commented Oct 11, 2017

I don't think, that there is an "easy" way, if we like to utilize the most out of the uint32_t.
Here two implementation suggestions:

  • The upper bound for every matrix entry could be something like the <uint32_t>max() divided by the number of jobs.

  • Calculating roughly the cost of the worst possible solution and checking, if it's below <uint32_t>max().

The first one is the easiest to implement, but there is still some more room: if one entry is above the dynamic upper bound, then it does not necessarily mean, thats the worst possible route would ever exceed our <uint32_t>max() boundary.

To determine roughly the worst possible route, we could sum up the highest entry of every collumn of the input matrix.

@jcoupey jcoupey mentioned this issue Oct 26, 2017
5 tasks
@jcoupey jcoupey self-assigned this Oct 26, 2017
@jcoupey
Copy link
Collaborator

jcoupey commented Oct 26, 2017

Plan to compute a generic upper bound that should be ok for any type of VRP. While parsing the matrix, we store the max value max_i and max_j on each line and column.

  1. job to job bound
  • summing max_i over all jobs indices i (bound for costs from jobs)
  • summing max_j over all jobs indices j(bound for costs to jobs)
  • use max value of the two above as job arriving/departing costs bound[1]
  1. vehicle start/end bound
  • summing max_i over all vehicle start indices i (bound for start costs[2])
  • summing max_j over all vehicle end indices j (bound for end costs[3])

Our upper bound is [1] + [2] + [3]. We should check for potential overflows during the computation to stop if required. If the result fits in a uint32_t, then go on solving.

@jcoupey
Copy link
Collaborator

jcoupey commented Oct 26, 2017

The above bound is probably a bit "loose" but should be fine though, and it's fast to compute.

The bound value is around 4 or 5 times bigger than the cost of the heuristic solution for small problems on medium areas, and up to around 50 times for bigger problems on wider geographical scale.

As an example, running on a random problem with 5000 points at (french) national scale provides a bound of 149962759, still one order of magnitude under std::numeric_limits<cost_t>::max().

AFAICT this should leave enough room to solve big problems while being confident we don't run into trouble with overflows.

@jcoupey
Copy link
Collaborator

jcoupey commented Oct 27, 2017

For the record, the initial reported assertion can be hit on current master using:

{
  "vehicles": [
    {
      "id": 0,
      "start_index": 0,
      "end_index": 0
    }
  ],
  "jobs": [
    {
      "id": 1,
      "location_index": 1
    },
    {
      "id": 2,
      "location_index": 2
    },
    {
      "id": 3,
      "location_index": 3
    }
  ], 
  "matrix": [
    [0, 1, 2147483647, 2147483647],
    [2147483647, 0, 2147483647, 2147483647],
    [2147483647, 2147483647, 0, 2147483647],
    [1, 1, 2147483647, 0]
  ]
}

which aborts as expected with #68.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants