
pd: validate PD list #1201

Merged: 9 commits into tikv:master from validate-pd-list, Dec 1, 2016
Conversation

@overvenus (Member) commented Oct 21, 2016:

validate_pd_list only accepts PD endpoints that are in the same PD cluster. It treats a removed PD as an invalid PD node. Rather than validating only once at the beginning of raftkv startup, it validates the list every time TiKV tries to connect to a PD.


Let's say there are two PD clusters: one consists of three nodes, and the other is a standalone node. Details:

cluster-id: 1
  pd1 client urls: "127.0.0.1:12379"
  pd2 client urls: "127.0.0.1:22379"
  pd3 client urls: "127.0.0.1:32379"

cluster-id: 2
  pd1 client urls: "127.0.0.1:42379"
The following cases have been tested so far:

("127.0.0.1:12379,127.0.0.1:22379,127.0.0.1:32379".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:22379, 127.0.0.1:32379".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:22379, 127.0.0.1:32379,".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:22379, 127.0.0.1:32379, ,".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:32379".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:32379,".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:32379,127.0.0.1:42379".to_owned(), false), // separate clusters
("127.0.0.1:12379, 127.0.0.1:12379, 127.0.0.1:32379,".to_owned(), false), // duplicate endpoints
("127.0.0.1:12379, 127.0.0.1:12379, 127.0.0.1:32379, 127.0.0.1:42379".to_owned(), false), // invalid endpoints
("127.0.0.1:12379, 127.0.0.1:12379, 127.0.0.1:32379, 127.0.0.1:22379".to_owned(), false), // duplicate endpoints

Explanation:

  • Each tuple stands for a case (a test-driver sketch follows this list)
  • tuple.0 is the command line argument --pd
  • tuple.1 is the expected result of validate_pd_list
    • true -> Ok(_)
    • false -> Err(_)
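A minimal test-driver sketch for the cases above, assuming the `validate_pd_list` signature quoted further down; the splitting mirrors how the --pd flag is parsed elsewhere in the diff:

let cases = vec![
    ("127.0.0.1:12379,127.0.0.1:22379,127.0.0.1:32379".to_owned(), true),
    ("127.0.0.1:12379, 127.0.0.1:32379,127.0.0.1:42379".to_owned(), false),
];
for (pd_arg, expect_ok) in cases {
    // Split the --pd value on ',', trim whitespace, and drop empty pieces.
    let endpoints: Vec<String> = pd_arg
        .split(',')
        .map(|s| s.trim().to_owned())
        .filter(|s| !s.is_empty())
        .collect();
    assert_eq!(validate_pd_list(endpoints).is_ok(), expect_ok);
}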

PTAL @siddontang @BusyJay

Close #1186


/// `validate_pd_list` validates the PD list; it visits each PD's members API one by one.
/// It returns Ok(()) only when all results are equal.
pub fn validate_pd_list(endpoints: Vec<String>) -> Result<(), String> {
Member:

Can it be checked in try_connect of PdClientCore?

Member Author:

Yes, it can. Maybe it should be validated in try_connect too?

@siddontang (Contributor):

@huachaohuang
can we support a GetMembers RPC so that we don't need to use hyper here?

@overvenus overvenus changed the title util, bin: validate PD list before starting raftkv [WIP] util, bin: validate PD list before starting raftkv Oct 22, 2016
@overvenus overvenus changed the title [WIP] util, bin: validate PD list before starting raftkv util, bin: validate PD list before starting raftkv Oct 27, 2016
@overvenus (Member Author):

PTAL @siddontang @huachaohuang @BusyJay

let mut stream = try!(rpc_connect(ep.as_str()));
let mut req = new_request(cluster_id, CommandType::GetPDMembers);
req.set_get_pd_members(GetPDMembersRequest::new());
let (id, mut resp) = try!(send_msg(&mut stream, VALIDATION_MSG_ID, &req));
Member:

What if some nodes are down? For example, if 1, 2, and 3 are still online but 4 and 5 are down, then validate_and_connect may not succeed.

}
}

Err(box_err!("failed to connect to {:?}", hosts))

// Check all fields.
let sample = members_resps.pop().unwrap();
Member:

Is it safe to unwrap here?
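A minimal sketch of one way to avoid the unwrap in question — return an error instead of panicking (the helper name and message are illustrative):

fn pop_sample<T>(mut resps: Vec<T>) -> Result<T, String> {
    // `pop` yields None on an empty vector; map that to an Err.
    resps.pop().ok_or_else(|| "no PD members response collected".to_owned())
}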

@siddontang siddontang changed the title util, bin: validate PD list before starting raftkv [DNM] util, bin: validate PD list before starting raftkv Oct 28, 2016
@overvenus overvenus changed the title [DNM] util, bin: validate PD list before starting raftkv util, bin: validate PD list before starting raftkv Nov 21, 2016
@overvenus (Member Author):

PTAL @siddontang @BusyJay

}

let len = endpoints.len();
let mut endpoints_set = HashSet::with_capacity(len);
Contributor:

You can use sort + dedup to check for duplicated URLs.

Member Author:

I think a HashSet is OK here.
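For comparison, a minimal sketch of both duplicate checks discussed here (function names are illustrative):

use std::collections::HashSet;

// HashSet-based check: `insert` returns false for an already-seen value.
fn has_duplicates(endpoints: &[String]) -> bool {
    let mut seen = HashSet::with_capacity(endpoints.len());
    endpoints.iter().any(|ep| !seen.insert(ep.as_str()))
}

// The suggested sort + dedup alternative: sort a copy, remove adjacent
// duplicates, and compare lengths.
fn has_duplicates_sorted(endpoints: &[String]) -> bool {
    let mut sorted = endpoints.to_vec();
    sorted.sort();
    sorted.dedup();
    sorted.len() != endpoints.len()
}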

use super::metrics::*;

const MAX_PD_SEND_RETRY_COUNT: usize = 100;
const SOCKET_READ_TIMEOUT: u64 = 3;
const SOCKET_WRITE_TIMEOUT: u64 = 3;

const PD_RPC_PREFIX: &'static str = "/pd/rpc";
const MSG_ID_VALIDATE: u64 = 0;
Contributor:

ID_VALIDATE is not a proper name here. You use it not for a msg ID, but for a cluster ID, and it is only used in GetPDMembers.
Maybe NONE_CLUSTER_ID would be better here.

Contributor:

oh, my fault. You use it both for cluster and msg.

Contributor:

So you should use another var for the cluster ID.
Btw, the ID here is not INVALID.

@overvenus (Member Author) Nov 22, 2016:

Oops, my bad, it is really confusing.

}
}

Ok(())
Contributor:

We must check that all PDs have the same cluster ID too.

Contributor:

Btw, I think checking cluster ID is enough.

Member Author:

Checking cluster ID should be enough, but additional checks prevent ID collision.

Contributor:

The collision probability is too small; we would have to start two PD clusters at the same time and with the same random number.

Member Author:

First of all, I agree the probability is small.

// Generate a random cluster ID.
// Note: the global rand source is never seeded, so rand.Uint32() is deterministic.
ts := uint64(time.Now().Unix())
clusterID := (ts << 32) + uint64(rand.Uint32())

Above is how PD generates its cluster ID.

Actually, we will always get the same random number, because PD does not seed the global random source.
So we may get a collision if the PDs are started in the same second!

Play the clusterID demo at https://play.golang.org/p/vCMUxx9q0E

cc @huachaohuang

Contributor:

@huachaohuang

Maybe we can use time + hash(initial-cluster) ?

Member Author:

@siddontang Good idea!

Another approach is:

// Generate a random cluster ID.
// Seeding the global source first makes rand.Uint32() differ between runs.
rand.Seed(time.Now().UnixNano())
ts := uint64(time.Now().Unix())
clusterID := (ts << 32) + uint64(rand.Uint32())

@siddontang (Contributor):

PTAL @huachaohuang

}
}

Ok(())
Contributor:

It seems that if all endpoints fail, this will just return Ok(())?
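A hedged sketch of one possible fix — count successful responses and reject an all-failed run (names are illustrative):

fn check_any_succeeded(successes: usize, total: usize) -> Result<(), String> {
    if successes == 0 {
        // Every endpoint failed, so there is nothing to validate against.
        return Err(format!("all {} PD endpoints failed validation", total));
    }
    Ok(())
}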

.filter(|s| !s.is_empty())
.collect();

try!(validate_endpoints(&endpoints));
Contributor:

We need to add some retries here; see line 231.
If PD and TiKV start at the same time, TiKV may fail because PD is not ready yet.
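A minimal sketch of such a retry loop, assuming a validate_endpoints(&[String]) -> Result<(), String>; the retry count and interval are illustrative:

use std::thread;
use std::time::Duration;

fn validate_with_retry(endpoints: &[String]) -> Result<(), String> {
    const MAX_RETRIES: usize = 100;
    let mut last_err = "no validation attempt made".to_owned();
    for _ in 0..MAX_RETRIES {
        match validate_endpoints(endpoints) {
            Ok(()) => return Ok(()),
            Err(e) => {
                // PD may simply not be ready yet; wait and try again.
                last_err = e;
                thread::sleep(Duration::from_secs(1));
            }
        }
    }
    Err(last_err)
}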

@overvenus overvenus changed the title util, bin: validate PD list before starting raftkv pd: validate PD list Nov 22, 2016
@huachaohuang (Contributor):

LGTM

Ok(id) => {
client.cluster_id = id;
return Ok(client);
cluster_id = id;
Contributor:

break the loop

// Check cluster ID.
let cid = resp.take_header().get_cluster_id();
if let Some(sample) = cluster_id {
    if sample != cid {
Contributor:

Any test to cover a mismatched cluster?

}

// Check all fields.
let mut members = resp.take_get_pd_members().take_members().into_vec();
Contributor:

I am confused about why we need to check this. Is it necessary?

Or you can give some test for it and let me know how it works.
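For illustration, a minimal sketch of what an order-insensitive member-list comparison amounts to (generic over the member type; `Ord` is assumed for brevity):

fn members_match<M: Ord>(mut responses: Vec<Vec<M>>) -> bool {
    // Sort each response's member list so ordering differences don't matter,
    // then require every list to equal the first.
    for members in &mut responses {
        members.sort();
    }
    responses.windows(2).all(|w| w[0] == w[1])
}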

@siddontang (Contributor):

Ping @overvenus

@overvenus (Member Author):

PTAL @siddontang @BusyJay

};
let endpoints = [endpoints_1, endpoints_2];

assert_eq!(RpcClient::validate_endpoints(&endpoints).is_err(), true);
Contributor:

assert!(.is_err())

@siddontang (Contributor) commented Nov 30, 2016:

@huachaohuang I suggest mocking a PD server, so we don't need to start the real PD in CI. But I think we can use the real PD for now.

@siddontang (Contributor):

PTAL @huachaohuang

Rest LGTM

@huachaohuang (Contributor):

CI failed @overvenus
Rest LGTM

@overvenus overvenus merged commit 20671bc into tikv:master Dec 1, 2016
@overvenus overvenus deleted the validate-pd-list branch December 1, 2016 07:19
@overvenus overvenus mentioned this pull request Mar 27, 2017
iosmanthus pushed a commit to iosmanthus/tikv that referenced this pull request Jan 4, 2024