Unable to rebuild edge index #4166

kikimo · 2022-04-15T10:10:38Z

Describe the bug (required)

The edge index cannot be rebuilt.

Steps to reproduce:

1. Start a cluster with 5 storage nodes, 5 replicas, and 1 partition, then create a space with one edge type (no edge index at first).
2. Keep inserting edges and trigger a network partition.
3. Create an edge index and rebuild it; the rebuild fails.
4. Stop inserting edges, recover from the network partition, and rebuild the edge index again; it always fails.

What we found in the log:

585604 E20220418 11:28:30.456131    40 AddEdgesProcessor.cpp:361] Error! ret = E_LEADER_LEASE_FAILED, spaceId 1
585605 I20220418 11:28:30.456140    40 RaftPart.cpp:218] ===> OOPs, atomOp failed!!!, code = E_RAFT_ATOMIC_OP_FAILED
585606 E20220418 11:28:30.456156    40 AddEdgesProcessor.cpp:361] Error! ret = E_LEADER_LEASE_FAILED, spaceId 1
585607 I20220418 11:28:30.456161    40 RaftPart.cpp:218] ===> OOPs, atomOp failed!!!, code = E_RAFT_ATOMIC_OP_FAILED
585608 E20220418 11:28:30.456176    40 AddEdgesProcessor.cpp:361] Error! ret = E_LEADER_LEASE_FAILED, spaceId 1
585609 I20220418 11:28:30.456183    40 RaftPart.cpp:218] ===> OOPs, atomOp failed!!!, code = E_RAFT_ATOMIC_OP_FAILED
585610 E20220418 11:28:30.456197    40 AddEdgesProcessor.cpp:361] Error! ret = E_LEADER_LEASE_FAILED, spaceId 1
585611 I20220418 11:28:30.456204    40 RaftPart.cpp:218] ===> OOPs, atomOp failed!!!, code = E_RAFT_ATOMIC_OP_FAILED
585612 I20220418 11:28:34.883504    36 AdminTask.cpp:21] createAdminTask (79, 0)
585613 I20220418 11:28:34.883563    36 RebuildIndexTask.cpp:28] Rebuild index task is rate limited to 4194304 for each subtask by default
585614 I20220418 11:28:34.883694    36 AdminTaskManager.cpp:158] enqueue task(79, 0)
585615 I20220418 11:28:34.883728   131 AdminTaskManager.cpp:239] dequeue task(79, 0)
585616 I20220418 11:28:34.883819   131 AdminTaskManager.cpp:282] run task(79, 0), 1 subtasks in 1 thread
585617 I20220418 11:28:34.884032   131 AdminTaskManager.cpp:227] waiting for incoming task
585618 I20220418 11:28:34.884073   799 RebuildIndexTask.cpp:213] Modify the index failed
585619 I20220418 11:28:34.884099   799 RebuildIndexTask.cpp:97] Start building index
585620 I20220418 11:28:34.884121   799 RebuildEdgeIndexTask.cpp:58] Processing Part 1 Failed
585621 I20220418 11:28:34.884126   799 RebuildIndexTask.cpp:100] Building index failed
585622 I20220418 11:28:34.884130   799 AdminTaskManager.cpp:318] subtask of task(79, 0) finished, unfinished task 0
585623 I20220418 11:28:34.884135   799 AdminTask.h:129] task(79, 0) finished, rc=[E_REBUILD_INDEX_FAILED]
585624 I20220418 11:28:34.884284   132 AdminTaskManager.cpp:92] reportTaskFinish(), job=79, task=0, rc=E_REBUILD_INDEX_FAILED
585625 I20220418 11:28:34.888643   132 AdminTaskManager.cpp:134] reportTaskFinish(), job=79, task=0, rc=SUCCEEDED
585626 I20220418 11:28:50.798808    38 AdminTask.cpp:21] createAdminTask (80, 0)
585627 I20220418 11:28:50.798851    38 RebuildIndexTask.cpp:28] Rebuild index task is rate limited to 4194304 for each subtask by default
585628 I20220418 11:28:50.798928    38 AdminTaskManager.cpp:158] enqueue task(80, 0)
585629 I20220418 11:28:50.798934   131 AdminTaskManager.cpp:239] dequeue task(80, 0)
585630 I20220418 11:28:50.798987   131 RebuildIndexTask.cpp:66] This space is building index
585631 I20220418 11:28:50.798995   131 AdminTaskManager.cpp:258] job 80, genSubTask failed, err=E_REBUILD_INDEX_FAILED
585632 I20220418 11:28:50.799010   131 AdminTask.h:129] task(80, 0) finished, rc=[E_REBUILD_INDEX_FAILED]
585633 I20220418 11:28:50.799058   131 AdminTaskManager.cpp:227] waiting for incoming task
585634 I20220418 11:28:50.799137   132 AdminTaskManager.cpp:92] reportTaskFinish(), job=80, task=0, rc=E_REBUILD_INDEX_FAILED
585635 I20220418 11:28:50.800017   132 AdminTaskManager.cpp:134] reportTaskFinish(), job=80, task=0, rc=SUCCEEDED
585636 I20220418 11:28:55.531267    79 MetaClient.cpp:3062] Load leader of "store1":9779 in 0 space

Your Environments (required)

Commit id 5626e64

liuyu85cn · 2022-04-18T04:24:43Z

The index rebuild runs on the storage Raft leader.
It will fail if that leader is network-isolated (step 3).

Also, when the Raft leader is partitioned off, a new leader should be elected.
But it looks like the requests keep being sent to the old leader (step).

However, after waiting a while, we (with kikimo) found that the requests can be sent to the new leader.
Now we are waiting to see whether the rebuild runs successfully.

critical27 · 2022-04-19T03:14:33Z

I could modify some logic here. Previously, when meta calls addTask, it does not handle leader changes. But REBUILD and some other kinds of tasks do need to handle leader changes.

critical27 · 2022-04-19T06:37:56Z

After a little digging: the TaskManager in storage can't tell whether a part is the leader or not. It only adds the task to a queue and returns the response. In other words, it can't tell whether it is the leader until the job is actually executed.

That's why we can only recover the task on the previous (old) leader. For now, if a network partition happens and the leader changes, users can start a new job instead of recovering the old one as a workaround.
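
To illustrate the behaviour described above, here is a minimal Python sketch of an "enqueue first, check leadership only at execution" task queue. It is a toy model, not Nebula's actual code: `StorageNode`, `add_task`, `run_next`, `submit`, the host name `store2`, and job id 81 are hypothetical names chosen for the example, and the real TaskManager may report the failure differently.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical stand-in for one storage host (not a Nebula class)."""
    name: str
    is_leader: bool
    queue: deque = field(default_factory=deque)

    def add_task(self, job_id: int) -> str:
        # Enqueue only. Leadership is NOT checked here, so the caller
        # gets an acknowledgement even from a stale leader.
        self.queue.append(job_id)
        return "ACCEPTED"

    def run_next(self) -> str:
        # Leadership is only discovered when the task actually executes.
        self.queue.popleft()
        return "SUCCEEDED" if self.is_leader else "E_REBUILD_INDEX_FAILED"


def submit(job_id: int, node: StorageNode) -> tuple:
    """Send a job to a node and return (acknowledgement, final result)."""
    ack = node.add_task(job_id)
    return ack, node.run_next()


old_leader = StorageNode("store1", is_leader=True)
new_leader = StorageNode("store2", is_leader=False)

# A network partition moves leadership to the other host.
old_leader.is_leader, new_leader.is_leader = False, True

# Recovering the old job sends it back to the previous leader: accepted, then fails.
recovered = submit(79, old_leader)
# Starting a new job (the workaround) reaches the current leader.
fresh = submit(81, new_leader)

print(recovered)
print(fresh)
```

In this model the acknowledgement carries no information about leadership, which is why recovering the old job keeps failing while a freshly started job can succeed.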

kikimo added the type/bug Type: something is unexpected label Apr 15, 2022

kikimo added this to the v3.1.0 milestone Apr 15, 2022

Sophie-Xie assigned critical27 and liuyu85cn and unassigned critical27 Apr 15, 2022

jamieliu1023 mentioned this issue Apr 16, 2022

Weekly Report 2022-04-15 vesoft-inc/nebula-community#104

Closed

Sophie-Xie assigned critical27 Apr 19, 2022

critical27 closed this as completed Apr 19, 2022

jamieliu1023 mentioned this issue Apr 23, 2022

Weekly Report 2022-04-22 vesoft-inc/nebula-community#105

Closed
