Something happens with indexes or reverse references spontaneously being messed up or deleted #5160

@nodeworks

What version of Dgraph are you using?

Dgraph version : v1.2.0
Dgraph SHA-256 : 62e8eccb534b1ff072d4a516ee64f56b7601e3aeccb0b2f9156b0a8e4a1f8a12
Commit SHA-1 : 24b4b74
Commit timestamp : 2020-01-27 15:53:31 -0800
Branch : HEAD
Go version : go1.13.5

Have you tried reproducing the issue with the latest release?

No

What is the hardware spec (RAM, OS)?

macOS using Docker:

  • macOS version: 10.15.3
  • Docker version:
    -- Engine: 19.03.8
    -- Compose: 1.25.4
    -- Docker Desktop: 2.2.0.5 (43884) Stable

Steps to reproduce the issue (command/config used to run Dgraph).

I am using Docker compose. The alpha instance is run with this command:

dgraph alpha --my=server:7080 --normalize_node_limit=10000000 --lru_mb=10240 --zero=zero:5080 --whitelist=172.21.0.1:172.30.1.1,127.0.0.1:127.0.2.1 --export=/exports --bindall=true --jaeger.collector=http://crimson_api_jaeger:14268 -p ./out/0/p

The zero instance is run with this command:

dgraph zero --my=zero:5080

I've attached my postings, schema, and RDFs. I did a bulk import from a backup about a week ago, and it had been working fine until today. The bulk import command I'm using is:

docker exec -it crimson_api_zero dgraph bulk -f /exports/current/g01.rdf.gz -s /exports/current/g01.schema.gz --reduce_shards=1 --zero=localhost:5080

One clear way to reproduce the problem is to run this query (workflow_id is an entity type):

{
  q(func: type(workflow_id), orderdesc: workflow_id.id, first: 10000) {
    uid
    workflow_id.id
  }
}

It returns this:

{
  "data": {
    "q": [
      {
        "uid": "0x958f2b",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f33",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f36",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f3a",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f62",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f68",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f6b",
        "workflow_id.id": 0
      },
      {
        "uid": "0x958f6f",
        "workflow_id.id": 0
      },
      {
        "uid": "0x95b0cd",
        "workflow_id.id": 0
      },
      {
        "uid": "0x95b0cf",
        "workflow_id.id": 0
      }
    ]
  }
}

Then run this query, which drops the ordering and the limit:

{
  q(func: type(workflow_id)) {
    uid
    workflow_id.id
  }
}

It returns the data correctly:

{
  "data": {
    "q": [
      {
        "uid": "0x958c75",
        "workflow_id.id": 750
      },
      {
        "uid": "0x958c76",
        "workflow_id.id": 454
      },
      {
        "uid": "0x958c77",
        "workflow_id.id": 565
      },
      {
        "uid": "0x958c78",
        "workflow_id.id": 608
      },
      {
        "uid": "0x958c79",
        "workflow_id.id": 203
      },
      {
        "uid": "0x958c7a",
        "workflow_id.id": 330
      },
      {
        "uid": "0x958c7b",
        "workflow_id.id": 494
      },
      {
        "uid": "0x958c7c",
        "workflow_id.id": 601
      },
      {
        "uid": "0x958c7d",
        "workflow_id.id": 85
      },
.......

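The two results are inconsistent: the query sorted with orderdesc returns nodes whose workflow_id.id is 0 first, even though the unsorted query shows nodes with values such as 750. The workflow_id.id field has an "int" index. The sorted query was working correctly up until yesterday.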
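The inconsistency can also be checked mechanically. The following is a minimal Python sketch, not part of the original report: the helper names `field_values` and `order_is_consistent` are hypothetical, and the two sample responses are short excerpts of the JSON shown above. It assumes that a descending sort should place the largest workflow_id.id first, so the first sorted value must be at least as large as any value returned by the unsorted query.

```python
import json

FIELD = "workflow_id.id"


def field_values(response, block="q", field=FIELD):
    """Collect the values of `field` from one query block of a JSON response."""
    return [node[field] for node in response["data"][block] if field in node]


def order_is_consistent(sorted_resp, unsorted_resp):
    """Return True if the sorted response is a plausible descending order.

    Two conditions are checked: the sorted values never increase, and the
    first sorted value is not smaller than the largest unsorted value.
    """
    sorted_vals = field_values(sorted_resp)
    unsorted_vals = field_values(unsorted_resp)
    if not sorted_vals or not unsorted_vals:
        return True  # nothing to compare
    descending = all(a >= b for a, b in zip(sorted_vals, sorted_vals[1:]))
    return descending and sorted_vals[0] >= max(unsorted_vals)


# Excerpt of the response to the query that uses orderdesc.
sorted_resp = json.loads("""
{"data": {"q": [
  {"uid": "0x958f2b", "workflow_id.id": 0},
  {"uid": "0x958f33", "workflow_id.id": 0},
  {"uid": "0x958f36", "workflow_id.id": 0}
]}}
""")

# Excerpt of the response to the query without ordering.
unsorted_resp = json.loads("""
{"data": {"q": [
  {"uid": "0x958c75", "workflow_id.id": 750},
  {"uid": "0x958c76", "workflow_id.id": 454},
  {"uid": "0x958c77", "workflow_id.id": 565}
]}}
""")

print(order_is_consistent(sorted_resp, unsorted_resp))
```

For these excerpts the check fails, because the sorted query starts at 0 while the unsorted query contains 750. The same function could be run against the full responses of both queries to confirm when the behaviour is fixed.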

Expected behaviour and actual result.

See above.

Files:
Message me for the files, as they are sensitive and private.

Labels: area/indexes, kind/bug, status/needs-attention
