[BUG] (Nondeterministic) incorrect query result on subgraphs #2675

Closed
sighingnow opened this issue May 8, 2023 · 4 comments · Fixed by #2956
Labels: bug, component:gie, good first issue

Comments

@sighingnow
Collaborator

Describe the bug

We recently started to notice nondeterministic, incorrect query results on subgraphs in CI, e.g., https://github.com/alibaba/GraphScope/actions/runs/4908818788/jobs/8766031056#step:5:7451

=================================== FAILURES ===================================
__________________________ test_modern_graph[2-2-ON] ___________________________

parallel_executors = 'ON', num_workers = 2, threads_per_worker = 2

    @pytest.mark.parametrize(
        "parallel_executors",
        ["ON", "OFF"],
    )
    @pytest.mark.parametrize(
        "num_workers,threads_per_worker",
        [
            (1, 1),
            (1, 2),
            (2, 1),
            (2, 2),
        ],
    )
    def test_modern_graph(parallel_executors, num_workers, threads_per_worker):
        import vineyard
    
        def make_nodes_set(nodes):
            return {
                item.get("id", [None])[0]: {k: v for k, v in item.items() if k}
                for item in nodes
            }
    
        def make_edges_set(edges):
            return {item.get("eid", [None])[0]: item for item in edges}
    
        def subgraph_roundtrip(num_workers, threads_per_worker):
            logger.info(
                "testing subgraph with %d workers and %d threads per worker",
                num_workers,
                threads_per_worker,
            )
    
            vquery = "g.V().valueMap()"
            equery = "g.E().valueMap()"
            session = graphscope.session(cluster_type="hosts", num_workers=num_workers)
    
            g0 = load_modern_graph(session)
            interactive0 = session.gremlin(g0)
            nodes = interactive0.execute(vquery).all()
            edges = interactive0.execute(equery).all()
    
            logger.info("nodes = %s", nodes)
            logger.info("edges = %s", edges)
            assert len(nodes) == 6
            assert len(edges) == 6
    
            g1 = interactive0.subgraph("g.E()")
            interactive0.close()
            interactive1 = session.gremlin(g1)
            subgraph_nodes = interactive1.execute(vquery).all()
            subgraph_edges = interactive1.execute(equery).all()
            logger.info("subgraph nodes = %s", subgraph_nodes)
            logger.info("subgraph edges = %s", subgraph_edges)
            interactive1.close()
    
            assert make_nodes_set(nodes) == make_nodes_set(subgraph_nodes)
            assert make_edges_set(edges) == make_edges_set(subgraph_edges)
            session.close()
    
        with vineyard.envvars(
            {
                "RUST_LOG": "debug",
                "THREADS_PER_WORKER": "%d" % (threads_per_worker,),
                "PARALLEL_INTERACTIVE_EXECUTOR_ON_VINEYARD": parallel_executors,
            }
        ):
>           subgraph_roundtrip(num_workers, threads_per_worker)

../../../.local/lib/python3.10/site-packages/graphscope/tests/minitest/test_min.py:319: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

num_workers = 2, threads_per_worker = 2

    def subgraph_roundtrip(num_workers, threads_per_worker):
        logger.info(
            "testing subgraph with %d workers and %d threads per worker",
            num_workers,
            threads_per_worker,
        )
    
        vquery = "g.V().valueMap()"
        equery = "g.E().valueMap()"
        session = graphscope.session(cluster_type="hosts", num_workers=num_workers)
    
        g0 = load_modern_graph(session)
        interactive0 = session.gremlin(g0)
        nodes = interactive0.execute(vquery).all()
        edges = interactive0.execute(equery).all()
    
        logger.info("nodes = %s", nodes)
        logger.info("edges = %s", edges)
        assert len(nodes) == 6
        assert len(edges) == 6
    
        g1 = interactive0.subgraph("g.E()")
        interactive0.close()
        interactive1 = session.gremlin(g1)
        subgraph_nodes = interactive1.execute(vquery).all()
        subgraph_edges = interactive1.execute(equery).all()
        logger.info("subgraph nodes = %s", subgraph_nodes)
        logger.info("subgraph edges = %s", subgraph_edges)
        interactive1.close()
    
>       assert make_nodes_set(nodes) == make_nodes_set(subgraph_nodes)
E       AssertionError: assert {1: {'age': [...'josh']}, ...} == {2: {'age': [...': ['peter']}}
E         Omitting 3 identical items, use -vv to show
E         Left contains 3 more items:
E         {1: {'age': [29], 'id': [1], 'name': ['marko']},
E          3: {'id': [3], 'lang': ['java'], 'name': ['lop']},
E          5: {'id': [5], 'lang': ['java'], 'name': ['ripple']}}
E         Full diff:
E           {
E         +  1: {'age': [29],
E         +      'id': [1],
E         +      'name': ['marko']},
E            2: {'age': [27],
E                'id': [2],
E                'name': ['vadas']},
E         +  3: {'id': [3],
E         +      'lang': ['java'],
E         +      'name': ['lop']},
E            4: {'age': [32],
E                'id': [4],
E                'name': ['josh']},
E         +  5: {'id': [5],
E         +      'lang': ['java'],
E         +      'name': ['ripple']},
E            6: {'age': [35],
E                'id': [6],
E                'name': ['peter']},
E           }

../../../.local/lib/python3.10/site-packages/graphscope/tests/minitest/test_min.py:308: AssertionError
=============================== warnings summary ===============================
../../../.local/lib/python3.10/site-packages/graphscope/tests/conftest.py:828
  /Users/runner/.local/lib/python3.10/site-packages/graphscope/tests/conftest.py:828: PytestUnknownMarkWarning: Unknown pytest.mark.timeout - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    item.add_marker(pytest.mark.timeout(600))

.local/lib/python3.10/site-packages/graphscope/tests/minitest/test_min.py: 11 warnings
  /Users/runner/.local/lib/python3.10/site-packages/graphscope/client/session.py:808: DeprecationWarning: currentThread() is deprecated, use current_thread() instead
    if threading.currentThread() is threading.main_thread():

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED ../../../.local/lib/python3.10/site-packages/graphscope/tests/minitest/test_min.py::test_modern_graph[2-2-ON] - AssertionError: assert {1: {'age': [...'josh']}, ...} == {2: {'age': [...': ['peter']}}
  Omitting 3 identical items, use -vv to show
  Left contains 3 more items:
  {1: {'age': [29], 'id': [1], 'name': ['marko']},
   3: {'id': [3], 'lang': ['java'], 'name': ['lop']},
   5: {'id': [5], 'lang': ['java'], 'name': ['ripple']}}
  Full diff:
    {
  +  1: {'age': [29],
  +      'id': [1],
  +      'name': ['marko']},
     2: {'age': [27],
         'id': [2],
         'name': ['vadas']},
  +  3: {'id': [3],
  +      'lang': ['java'],
  +      'name': ['lop']},
     4: {'age': [32],
         'id': [4],
         'name': ['josh']},
  +  5: {'id': [5],
  +      'lang': ['java'],
  +      'name': ['ripple']},
     6: {'age': [35],
         'id': [6],
         'name': ['peter']},
    }
@sighingnow added the bug and component:gie labels on May 8, 2023
@longbinlai
Collaborator

@BingqingLyu Can you have a look?

@sighingnow
Collaborator Author

@BingqingLyu
Collaborator

I tried several times to reproduce the bug but could not (commit 7a5ed13):

image

@sighingnow
Collaborator Author

Fixed by #2956

sighingnow added a commit that referenced this issue Jun 30, 2023
#2956)

## What do these changes do?

In some cases, the server list is in a different order in different
processes, which yields an unexpected partition dispatch plan.

Fixes #2675

Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
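
To illustrate the failure mode described in the commit message, here is a minimal sketch of how an order-dependent dispatch rule can disagree across processes. It is not the code changed in #2956: the round-robin rule, the function names `dispatch_plan` and `stable_dispatch_plan`, and the host strings are all hypothetical, and sorting the list is only one plausible way to make every process agree.

```python
def dispatch_plan(servers, num_partitions):
    """Assign partition p to servers[p % len(servers)].

    Hypothetical round-robin rule: the result depends on the order of `servers`.
    """
    return {p: servers[p % len(servers)] for p in range(num_partitions)}


def stable_dispatch_plan(servers, num_partitions):
    """Same rule, but over a sorted copy so every process sees one canonical order."""
    return dispatch_plan(sorted(servers), num_partitions)


# Two processes hold the same servers, but in a different order (made-up addresses).
view_a = ["host-0:1234", "host-1:1234"]
view_b = ["host-1:1234", "host-0:1234"]

plan_a = dispatch_plan(view_a, 2)
plan_b = dispatch_plan(view_b, 2)

# The order-dependent plans disagree about which server owns each partition.
print(plan_a == plan_b)

# After canonicalizing the order, both processes compute the same plan.
print(stable_dispatch_plan(view_a, 2) == stable_dispatch_plan(view_b, 2))
```

When two processes disagree like this, a query can be routed to a server that does not hold the expected partition. That would be consistent with the subgraph query above returning only some of the vertices, though the exact mechanism in GraphScope is not spelled out in this thread.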
3 participants