After a range split, the newly created range takes nearly 2 seconds to accept commands. #1384

Closed
@es-chow

When running the following commands:

./cockroach range split a
./cockroach range split b

an error is printed:

split failed: range_command.go:793: range is already split at key "b" 

but the range split actually succeeded.
In the log, the election starts about 2 seconds after the raft group is created (raftTick is 100ms and electionTimeoutTicks is 15, both defined in store.go). Only after those 2 seconds does the newly created range accept commands such as InternalResolveIntent, ConditionalPut, etc., but the AdminSplit retry fails with the aforementioned error.

I0616 21:22:26.578833   10431 raft.go:390] 100000001 became follower at term 5
I0616 21:22:26.578877   10431 raft.go:207] newRaft 100000001 [peers: [100000001,200000002,300000003], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0616 21:22:26.578934   10431 raft.go:620] 100000001 no leader at term 5; dropping proposal
I0616 21:22:28.151670   10431 txn_coord_sender.go:297] txn coordinator: 0.40 txn/sec, 100.00/0.00/0.00 %cmmt/abrt/abnd, 17ms/0/18ms avg/σ/max duration, 0.0/0.0/0.0 avg/σ/max restarts (2 samples)
I0616 21:22:28.549805   10431 raft.go:464] 100000001 is starting a new election at term 5
I0616 21:22:28.549870   10431 raft.go:403] 100000001 became candidate at term 6
I0616 21:22:28.549886   10431 raft.go:447] 100000001 received vote from 100000001 at term 6
I0616 21:22:28.549966   10431 raft.go:440] 100000001 [logterm: 5, index: 10] sent vote request to 200000002 at term 6
I0616 21:22:28.550048   10431 raft.go:440] 100000001 [logterm: 5, index: 10] sent vote request to 300000003 at term 6
I0616 21:22:28.551022   10431 raft.go:447] 100000001 received vote from 200000002 at term 6
I0616 21:22:28.551053   10431 raft.go:605] 100000001 [q:2] has received 2 votes and 0 vote rejections
I0616 21:22:28.551084   10431 raft.go:426] 100000001 became leader at term 6

Is there a way to shorten this time?

  1. Trigger a leader election right after creating the raft group. When a server restarts, raft groups are created lazily in store.ProposeRaftCommand, so this would not cause a spike of elections.
  2. Shorten the election timeout in raft when there is no leader, for example:
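func (r *raft) isElectionTimeout() bool {
    // No known leader (newly created range or election in progress):
    // use a short timeout by comparing the raw elapsed ticks (assumed intent).
    if r.lead == None {
        return r.elapsed > r.rand.Int()%r.electionTimeout
    }
    // Otherwise wait at least electionTimeout ticks, plus a random extra.
    d := r.elapsed - r.electionTimeout
    if d < 0 {
        return false
    }
    return d > r.rand.Int()%r.electionTimeout
}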

Labels: C-bug, C-enhancement
