Failing traveling salesman test #5913

jamesjer · 2022-08-02T22:09:47Z

Current Behavior

I help maintain a NetworkX package for the Fedora Linux distribution. We like to run the tests when we build, to catch problems that might affect users. One traveling salesman test, test_asadpour_tsp at line 707 of networkx/algorithms/approximation/tests/test_traveling_salesman.py in NetworkX 2.8.5, sometimes succeeds and sometimes fails. When it fails, it finds the tour [3, 2, 1, 0, 5, 4, 3], which has the same weight (402) as the expected tour [3, 2, 0, 1, 4, 5, 3]. I'm curious about this comment at the bottom of the test:

    # However, we are using a fixed random number generator so we know what the
    # expected tour is.

Does that mean that something is wrong with the random number generator? If so, can you give me a hint as to what I should look for?

Expected Behavior

The tests should pass.

Steps to Reproduce

The test failure is intermittent. We do the build, then run pytest.

Environment

Python version: 3.11.0~b5
NetworkX version: 2.8.5

The text was updated successfully, but these errors were encountered:

mjschwenne · 2022-08-03T01:48:56Z

This is odd indeed. While I did author the code in question, I will admit that the random number generation pattern used in networkx isn't something that I've investigated. I haven't been able to replicate the error in my environment running Ubuntu 22.04 (python 3.10.4) against the main branch of the repo and haven't heard any reports of that test failing in our pipeline, but since you mention that it's an intermittent problem that doesn't exactly say anything. (Ugh those types of problems are the worst to track down!)

I did check that the seed parameter passed to asadpour_atsp is used in all of the places where random numbers are required. It is worth noting that the since their are already two tours in the expected_tours, the comment above it seems... not exactly correct, but I'm not sure where the existing non-determinism is coming from.

Since the returned answer has the same weight as the correct answer, the quick and dirty option here is to add that tour to the list of expected tours. Perhaps either @dschult or @rossbar would have a better intuition on whether this is a problem that we need to track down or if the fast solution is considered good enough.

dschult · 2022-08-03T03:58:14Z

I think the random number generation in asadpour is fine. But calling traveling_salesman_problem doesn’t present a way to provide the seed to the asadpour function directly. One of the tests shows that it can be called with a seed by overriding the default method with a function that calls asadpour with a seed. But… the test that is causing troubles doesn’t do that. So no seed is set for that test despite the comment saying it is. Perhaps a previous version did call the seed.

I think the easy fix is to change the test to use asadpour with a seed. Longer term maybe we should provide a seed argument to traveling_salesman_problem. But I’m not sure. If we don’t do that we should provide an example in the doc_string of how to call it with a seed. Thoughts?

jamesjer · 2022-08-03T14:28:24Z

Changing the test to provide a seed sounds like the right way to go. Thank you both for the quick responses.

mjschwenne · 2022-08-03T14:58:14Z

Changing it to use a seed would be the best solution, but I believe that it already is? Line 742 in the test file of the 2.8.5 release does wrap the asadpour_atsp function and provides a seed value of 19 for the failing test unless I'm seriously mistaken.

dschult · 2022-08-04T00:10:47Z

Whoops! Sorry @mjschwenne I stand corrected. The seed is set in the test function. Not sure what I was looking at. :}

I have extracted the test function and run it many times with the same results. I have also added a print statement to show the next random number generated after it finishes and I get the same number from (0,1) many many times within a for loop. I'm not sure how best to debug this. @jamesjer can you repeatedly run this code in the environment the test is run in?

import networkx as nx
import numpy as np
edge_list = [
        (0, 1, 100),
        (0, 2, 100),
        (0, 5, 1),
        (1, 2, 100),
        (1, 4, 1),
        (2, 3, 1),
        (3, 4, 100),
        (3, 5, 100),
        (4, 5, 100),
        (1, 0, 100),
        (2, 0, 100),
        (5, 0, 1),
        (2, 1, 100),
        (4, 1, 1),
        (3, 2, 1),
        (4, 3, 100),
        (5, 3, 100),
        (5, 4, 100),
    ]

G = nx.DiGraph()
G.add_weighted_edges_from(edge_list)

def fixed_asadpour(G, weight):
    return nx_app.asadpour_atsp(G, weight, seed=19)
    
for i in range(15):
    nx_app.traveling_salesman_problem(G, weight="weight",method=fixed_asadpour)

rossbar · 2022-11-15T19:14:41Z

Are you still seeing this issue @jamesjer ? Have you tried NX>2.8.5? I can't reproduce it locally either.

thesamesam · 2024-02-20T07:05:41Z

I think we might be hitting this in Gentoo on x86 (https://bugs.gentoo.org/921958).

If i run that script above, I get:

$ python /tmp/foo.py
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]
[3, 2, 1, 0, 5, 4, 3]

But for me, I seem to always get the failure (and the same output from the script).

The test gets a valid solution but an unexpected on on x86. It's a somewhat long-standing failure, so skip for now on x86. Bug: networkx/networkx#5913 Bug: https://bugs.gentoo.org/921958 Signed-off-by: Sam James <sam@gentoo.org>

mjschwenne · 2024-02-21T19:11:29Z

What version of networkx and python are you using?

jamesjer · 2024-02-22T00:09:36Z

I haven't seen this happen for quite awhile, but it did on a recent build of networkx 3.2.1 with python 3.12.1. It does not happen on every build. Running the script above in the build environment produces nothing but [3, 2, 0, 1, 4, 5, 3], every time.

Is there any parallelism in the test suite?

rossbar · 2024-02-22T07:37:29Z

I took another look at the implementation and the first thing that jumped out as a potential point of variability is the use of scipy.optimize.linprog in the held_karp_ascent function. By default, linprog uses the "highs" method, which chooses between one of two sub-methods, either highs-ipm or highs-ds - see the linprog("highs") method docstring for details.

The docstring mentions that when the default "highs" is specified, the submethod is chosen by scipy. I didn't chase down the selection logic, but perhaps it's possible different methods are being selected on different platforms. Another possibility is that the methods give different results on different platforms - they are binary extensions after all, so their building/packaging may have something to do with it.

If someone were going to dive into this further, that's where I'd start looking. In the end though, there may not be a whole lot we can do about it. If we're lucky, perhaps explicitly specifying one of highs-ds or highs-ipm removes the variability. The worst case scenario, such as it is, is that we cannot guarantee exact reproducibility for cases where there are multiple valid solutions. IMV this is not the end of the world. I'd be inclined to just document it and update the test to include all valid solutions to the test case!

thesamesam · 2024-02-22T08:03:16Z

What version of networkx and python are you using?

networkx-3.2.1 w/ python-3.11.8

mjschwenne · 2024-03-07T02:14:03Z

Well, I'm still not sure what is causing this issue. I'm using the NixOS compiled version of both numpy and scipy and I was still unable to reproduce the bug by changing the scipy.linprog method. I tend to agree that Rossbar that this was probably the most likely point of origin for the bug, pretty much everything outside of that call is python only and should be fairly robust to platform specific changes.

Either way, since the new tour returned is equally valid, I've added the reported path to the test and fixed the method to "highs-ipm". That should fix the bug, but since I can't reproduce the problem I'd really appreciate if you could test the update.

dschult added this to the networkx-2.8.6 milestone Aug 3, 2022

jarrodmillman modified the milestones: networkx-2.8.6, networkx-2.8.7 Aug 19, 2022

jarrodmillman modified the milestones: networkx-2.8.7, networkx-3.0 Sep 30, 2022

rossbar removed this from the networkx-3.0 milestone Nov 15, 2022

rossbar added the type: Bug fix label Feb 20, 2024

mjschwenne mentioned this issue Mar 7, 2024

Add new test result to test_asadpour_tsp and change linprog method #7335

Merged

dschult closed this as completed in #7335 Mar 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Failing traveling salesman test #5913

Failing traveling salesman test #5913

jamesjer commented Aug 2, 2022

mjschwenne commented Aug 3, 2022

dschult commented Aug 3, 2022

jamesjer commented Aug 3, 2022

mjschwenne commented Aug 3, 2022

dschult commented Aug 4, 2022

rossbar commented Nov 15, 2022

thesamesam commented Feb 20, 2024

mjschwenne commented Feb 21, 2024

jamesjer commented Feb 22, 2024

rossbar commented Feb 22, 2024

thesamesam commented Feb 22, 2024

mjschwenne commented Mar 7, 2024

Failing traveling salesman test #5913

Failing traveling salesman test #5913

Comments

jamesjer commented Aug 2, 2022

Current Behavior

Expected Behavior

Steps to Reproduce

Environment

mjschwenne commented Aug 3, 2022

dschult commented Aug 3, 2022

jamesjer commented Aug 3, 2022

mjschwenne commented Aug 3, 2022

dschult commented Aug 4, 2022

rossbar commented Nov 15, 2022

thesamesam commented Feb 20, 2024

mjschwenne commented Feb 21, 2024

jamesjer commented Feb 22, 2024

rossbar commented Feb 22, 2024

thesamesam commented Feb 22, 2024

mjschwenne commented Mar 7, 2024