Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shortest path does not respect first ... sometimes #3663

Closed
campoy opened this issue Jul 12, 2019 · 5 comments
Closed

Shortest path does not respect first ... sometimes #3663

campoy opened this issue Jul 12, 2019 · 5 comments
Assignees
Labels
area/querylang/algos Related to graph algorithms, such as k-shortest path. kind/bug Something is broken. priority/P2 Somehow important but would not block a release. status/accepted We accept to investigate/work on it.
Milestone

Comments

@campoy
Copy link
Contributor

campoy commented Jul 12, 2019

If you suspect this could be a bug, follow the template.

  • What version of Dgraph are you using?

play.dgraph.io

  • Have you tried reproducing the issue with latest release?

Yes, it still fails

  • What is the hardware spec (RAM, OS)?

MacOS but also play.dgraph.io, so probably irrelevant

  • Steps to reproduce the issue (command/config used to run Dgraph).

Run a shortest path query and use first or offset to get a specific part of the path.

For instance, in order to find movies directed by Michael Bay where Megan Fox appears,
we can run this query on play.dgraph.io:

{
  path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
    <director.film>
    <actor.film>
    <starring>
    <performance.actor>
  }
  
  names(func: uid(path)) {
    uid
    name@.
  }
}

This wils return one of the movies matching the description:

{
  "extensions": {...},
  "data": {
    "names": [
      {
        "uid": "0x5526c6",
        "name@.": "Michael Bay"
      },
      {
        "uid": "0x40b2f3",
        "name@.": "Transformers: Revenge of the Fallen"
      },
      {
        "uid": "0x4d6d44"
      },
      {
        "uid": "0x652d3f",
        "name@.": "Megan Fox"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x4d6d44"
              }
            ],
            "uid": "0x40b2f3"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}

Now, if I want to get only the name of the movie, I know that this is the second element of the path, so first:1, offset: 1 should work, but it doesn't.

Depending on the combination you might get different results.

Only offset

If I set offset: 1 I expect to see the name of the movie first (we skipped the director) followed by an anonymous node (the performance) and finally Megan Fox.

{
  path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
    <director.film>
    <actor.film>
    <starring>
    <performance.actor>
  }
  
  names(func: uid(path),  offset: 1) {
    uid
    name@.
  }
}

Randomly, I get either the name of the movie and the performance but no Megan Fox:

[...]
  "data": {
    "names": [
      {
        "uid": "0x7111ab",
        "name@.": "Bad Boys II"
      },
      {
        "uid": "0x733703"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x733703"
              }
            ],
            "uid": "0x7111ab"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}

But also, sometimes, I'll get just Megan Fox

[...]
  "data": {
    "names": [
      {
        "uid": "0x652d3f",
        "name@.": "Megan Fox"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x466767"
              }
            ],
            "uid": "0x1053f0"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}

Only first

If I set the request to use first: 1 the query works correctly and only returns Michael Bay.

But ... if I set it to two, randomness comes back and sometimes I'll get the right result:

{
  path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
    <director.film>
    <actor.film>
    <starring>
    <performance.actor>
  }
  
  names(func: uid(path),  first: 2) {
    uid
    name@.
  }
}
[...]
  "data": {
    "names": [
      {
        "uid": "0x5526c6",
        "name@.": "Michael Bay"
      },
      {
        "uid": "0x7111ab",
        "name@.": "Bad Boys II"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x733703"
              }
            ],
            "uid": "0x7111ab"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}

And sometimes I'll get Michael Bay and Megan Fox, skipping both the movie and the performance!

[...]
  "data": {
    "names": [
      {
        "uid": "0x5526c6",
        "name@.": "Michael Bay"
      },
      {
        "uid": "0x652d3f",
        "name@.": "Megan Fox"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x466767"
              }
            ],
            "uid": "0x1053f0"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}

Lastly, just for fun let's use first and offset together!

The query I wanted to perform to get the second element was first: 1, offset: 1.

{
  path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
    <director.film>
    <actor.film>
    <starring>
    <performance.actor>
  }
  
  names(func: uid(path), offset: 1, first: 1) {
    uid
    name@.
  }
}

This will again sometimes work:

[...]
  "data": {
    "names": [
      {
        "uid": "0x7111ab",
        "name@.": "Bad Boys II"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x733703"
              }
            ],
            "uid": "0x7111ab"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}

But sometimes it won't:

[...]
  "data": {
    "names": [
      {
        "uid": "0x652d3f",
        "name@.": "Megan Fox"
      }
    ],
    "_path_": [
      {
        "director.film": [
          {
            "starring": [
              {
                "performance.actor": [
                  {
                    "uid": "0x652d3f"
                  }
                ],
                "uid": "0x4d6d44"
              }
            ],
            "uid": "0x40b2f3"
          }
        ],
        "uid": "0x5526c6"
      }
    ]
  }
}
@campoy campoy added kind/bug Something is broken. priority/P1 Serious issue that requires eventual attention (can wait a bit) status/accepted We accept to investigate/work on it. area/querylang/algos Related to graph algorithms, such as k-shortest path. labels Jul 12, 2019
@campoy campoy added this to the Dgraph v1.2 milestone Jul 12, 2019
@pawanrawal pawanrawal self-assigned this Jul 31, 2019
@pawanrawal
Copy link
Contributor

pawanrawal commented Jul 31, 2019

I had a deeper look into this and the problem here is that we don't preserve the order of UIDs in the uidMatrix when populating UIDs from a variable that was the result of the shortest path query. We sort the list of UIDs before doing pagination. In general all uid variables are supposed to contain sorted lists whereas in the case of the shortest path, they are not guaranteed to and the order matters. We'll have to add a way to find out the cases in which we should preserve the order.

This is the step where the order of elements in the uidMatrix is changed

sg.updateUidMatrix()

@campoy
Copy link
Contributor Author

campoy commented Aug 5, 2019

No need to work on this right now, but it's good to keep track of what we need to improve regarding k-shortest paths.

@manishrjain
Copy link
Contributor

Seems tricky. We'd need to make variables switch from Go maps to some sorted map implementation, which would be slower.

@campoy
Copy link
Contributor Author

campoy commented Sep 17, 2019

I do understand this is not easy to fix, but something needs to be done.

Maybe we should simply disable the usage of first and other pagination on variables coming from shortest path?
Or maybe the variables coming from shortest path should simply be different?

@campoy campoy added priority/P2 Somehow important but would not block a release. and removed priority/P1 Serious issue that requires eventual attention (can wait a bit) labels Oct 15, 2019
@minhaj-shakeel
Copy link
Contributor

Github issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.

drawing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/querylang/algos Related to graph algorithms, such as k-shortest path. kind/bug Something is broken. priority/P2 Somehow important but would not block a release. status/accepted We accept to investigate/work on it.
Development

No branches or pull requests

4 participants