
pd: validate PD list #1201

Merged: 9 commits into tikv:master from validate-pd-list, Dec 1, 2016
Conversation

@overvenus (Member) commented Oct 21, 2016:

validate_pd_list only accepts PD endpoints that are in the same PD cluster. It treats a removed PD as an invalid PD node. Rather than validating only once at the beginning of raftkv startup, it validates the list every time TiKV tries to connect to a PD.


Let's say there are two PD clusters: one consists of three nodes, and the other is a standalone node. Details:

cluster-id: 1
  pd1 client urls: "127.0.0.1:12379"
  pd2 client urls: "127.0.0.1:22379"
  pd3 client urls: "127.0.0.1:32379"

cluster-id: 2
  pd1 client urls: "127.0.0.1:42379"
The following cases have been tested so far:

("127.0.0.1:12379,127.0.0.1:22379,127.0.0.1:32379".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:22379, 127.0.0.1:32379".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:22379, 127.0.0.1:32379,".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:22379, 127.0.0.1:32379, ,".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:32379".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:32379,".to_owned(), true),
("127.0.0.1:12379, 127.0.0.1:32379,127.0.0.1:42379".to_owned(), false), // separate clusters
("127.0.0.1:12379, 127.0.0.1:12379, 127.0.0.1:32379,".to_owned(), false), // duplicate endpoints
("127.0.0.1:12379, 127.0.0.1:12379, 127.0.0.1:32379, 127.0.0.1:42379".to_owned(), false), // invalid endpoints
("127.0.0.1:12379, 127.0.0.1:12379, 127.0.0.1:32379, 127.0.0.1:22379".to_owned(), false), // duplicate endpoints

Explanation:

  • Each tuple stands for a case (a test-driver sketch follows this list)
  • tuple.0 is the command line argument --pd
  • tuple.1 is the expected result of validate_pd_list
    • true -> Ok(_)
    • false -> Err(_)
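A minimal test-driver sketch for the cases above, assuming the `validate_pd_list` signature quoted further down; the splitting mirrors how the --pd flag is parsed elsewhere in the diff:

let cases = vec![
    ("127.0.0.1:12379,127.0.0.1:22379,127.0.0.1:32379".to_owned(), true),
    ("127.0.0.1:12379, 127.0.0.1:32379,127.0.0.1:42379".to_owned(), false),
];
for (pd_arg, expect_ok) in cases {
    // Split the --pd value on ',', trim whitespace, and drop empty pieces.
    let endpoints: Vec<String> = pd_arg
        .split(',')
        .map(|s| s.trim().to_owned())
        .filter(|s| !s.is_empty())
        .collect();
    assert_eq!(validate_pd_list(endpoints).is_ok(), expect_ok);
}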

PTAL @siddontang @BusyJay

Close #1186


/// `validate_pd_list` validates the PD list; it visits each PD's members API one by one.
/// It returns Ok(()) only when all results are equal.
pub fn validate_pd_list(endpoints: Vec<String>) -> Result<(), String> {
Member:

Can it be checked in try_connect of PdClientCore?

Member Author:

Yes, it can. Maybe it should be validated in try_connect too?

@siddontang (Contributor):

@huachaohuang
can we support a GetMembers RPC so that we don't need to use hyper here?

@overvenus overvenus changed the title util, bin: validate PD list before starting raftkv [WIP] util, bin: validate PD list before starting raftkv Oct 22, 2016
@overvenus overvenus changed the title [WIP] util, bin: validate PD list before starting raftkv util, bin: validate PD list before starting raftkv Oct 27, 2016
@overvenus (Member Author):

PTAL @siddontang @huachaohuang @BusyJay

let mut stream = try!(rpc_connect(ep.as_str()));
let mut req = new_request(cluster_id, CommandType::GetPDMembers);
req.set_get_pd_members(GetPDMembersRequest::new());
let (id, mut resp) = try!(send_msg(&mut stream, VALIDATION_MSG_ID, &req));
Member:

What if some nodes are down? For example, if 1, 2, and 3 are still online but 4 and 5 are down, then validate_and_connect may not succeed.

}
}

Err(box_err!("failed to connect to {:?}", hosts))

// Check all fields.
let sample = members_resps.pop().unwrap();
Member:

Is it safe to unwrap here?
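A minimal sketch of one way to avoid the unwrap in question — return an error instead of panicking (the helper name and message are illustrative):

fn pop_sample<T>(mut resps: Vec<T>) -> Result<T, String> {
    // `pop` yields None on an empty vector; map that to an Err.
    resps.pop().ok_or_else(|| "no PD members response collected".to_owned())
}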

@siddontang siddontang changed the title util, bin: validate PD list before starting raftkv [DNM] util, bin: validate PD list before starting raftkv Oct 28, 2016
@overvenus overvenus changed the title [DNM] util, bin: validate PD list before starting raftkv util, bin: validate PD list before starting raftkv Nov 21, 2016
@overvenus (Member Author):

PTAL @siddontang @BusyJay

}

let len = endpoints.len();
let mut endpoints_set = HashSet::with_capacity(len);
Contributor:

You can use sort + dedup to check for duplicated URLs.

Member Author:

I think a HashSet is OK here.
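For comparison, a minimal sketch of both duplicate checks discussed here (function names are illustrative):

use std::collections::HashSet;

// HashSet-based check: `insert` returns false for an already-seen value.
fn has_duplicates(endpoints: &[String]) -> bool {
    let mut seen = HashSet::with_capacity(endpoints.len());
    endpoints.iter().any(|ep| !seen.insert(ep.as_str()))
}

// The suggested sort + dedup alternative: sort a copy, remove adjacent
// duplicates, and compare lengths.
fn has_duplicates_sorted(endpoints: &[String]) -> bool {
    let mut sorted = endpoints.to_vec();
    sorted.sort();
    sorted.dedup();
    sorted.len() != endpoints.len()
}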

use super::metrics::*;

const MAX_PD_SEND_RETRY_COUNT: usize = 100;
const SOCKET_READ_TIMEOUT: u64 = 3;
const SOCKET_WRITE_TIMEOUT: u64 = 3;

const PD_RPC_PREFIX: &'static str = "/pd/rpc";
const MSG_ID_VALIDATE: u64 = 0;
Contributor:

ID_VALIDATE is not a proper name here. You use it not for a msg ID, but for a cluster ID, and it is only used in GetPDMembers.
Maybe NONE_CLUSTER_ID would be better here.

Contributor:

oh, my fault. You use it both for cluster and msg.

Contributor:

So you should use another var for the cluster ID.
Btw, the ID here is not INVALID.

@overvenus (Member Author) Nov 22, 2016:

Oops, my bad, it is really confusing.

}
}

Ok(())
Contributor:

We must check that all PDs have the same cluster ID too.

Contributor:

Btw, I think checking cluster ID is enough.

Member Author:

Checking cluster ID should be enough, but additional checks prevent ID collision.

Contributor:

The collision probability is too small; we would have to start two PD clusters at the same time and with the same random number.

Member Author:

First of all, I agree the probability is small.

// Generate a random cluster ID.
// Note: the global rand source is never seeded, so rand.Uint32() is deterministic.
ts := uint64(time.Now().Unix())
clusterID := (ts << 32) + uint64(rand.Uint32())

Above is how PD generates its cluster ID.

Actually, we will always get the same random number, because PD does not seed the global random source.
So we may get a collision if the PDs are started in the same second!

Play the clusterID demo at https://play.golang.org/p/vCMUxx9q0E

cc @huachaohuang

Contributor:

@huachaohuang

Maybe we can use time + hash(initial-cluster) ?

Member Author:

@siddontang Good idea!

Another approach is:

// Generate a random cluster ID.
// Seeding the global source first makes rand.Uint32() differ between runs.
rand.Seed(time.Now().UnixNano())
ts := uint64(time.Now().Unix())
clusterID := (ts << 32) + uint64(rand.Uint32())

@siddontang (Contributor):

PTAL @huachaohuang

}
}

Ok(())
Contributor:

It seems that if all endpoints fail, this will just return Ok(())?
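A hedged sketch of one possible fix — count successful responses and reject an all-failed run (names are illustrative):

fn check_any_succeeded(successes: usize, total: usize) -> Result<(), String> {
    if successes == 0 {
        // Every endpoint failed, so there is nothing to validate against.
        return Err(format!("all {} PD endpoints failed validation", total));
    }
    Ok(())
}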

.filter(|s| !s.is_empty())
.collect();

try!(validate_endpoints(&endpoints));
Contributor:

We need to add some retries here; see line 231.
If PD and TiKV start at the same time, TiKV may fail because PD is not ready yet.
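A minimal sketch of such a retry loop, assuming a validate_endpoints(&[String]) -> Result<(), String>; the retry count and interval are illustrative:

use std::thread;
use std::time::Duration;

fn validate_with_retry(endpoints: &[String]) -> Result<(), String> {
    const MAX_RETRIES: usize = 100;
    let mut last_err = "no validation attempt made".to_owned();
    for _ in 0..MAX_RETRIES {
        match validate_endpoints(endpoints) {
            Ok(()) => return Ok(()),
            Err(e) => {
                // PD may simply not be ready yet; wait and try again.
                last_err = e;
                thread::sleep(Duration::from_secs(1));
            }
        }
    }
    Err(last_err)
}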

@overvenus overvenus changed the title util, bin: validate PD list before starting raftkv pd: validate PD list Nov 22, 2016
@huachaohuang (Contributor):

LGTM

Ok(id) => {
client.cluster_id = id;
return Ok(client);
cluster_id = id;
Contributor:

break the loop

// Check cluster ID.
let cid = resp.take_header().get_cluster_id();
if let Some(sample) = cluster_id {
    if sample != cid {
Contributor:

Any test to cover a mismatched cluster?

}

// Check all fields.
let mut members = resp.take_get_pd_members().take_members().into_vec();
Contributor:

I am confused about why we need to check this. Is it necessary?

Or you can give some test for it and let me know how it works.
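For illustration, a minimal sketch of what an order-insensitive member-list comparison amounts to (generic over the member type; `Ord` is assumed for brevity):

fn members_match<M: Ord>(mut responses: Vec<Vec<M>>) -> bool {
    // Sort each response's member list so ordering differences don't matter,
    // then require every list to equal the first.
    for members in &mut responses {
        members.sort();
    }
    responses.windows(2).all(|w| w[0] == w[1])
}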

@siddontang (Contributor):

Ping @overvenus

@overvenus (Member Author):

PTAL @siddontang @BusyJay

};
let endpoints = [endpoints_1, endpoints_2];

assert_eq!(RpcClient::validate_endpoints(&endpoints).is_err(), true);
Contributor:

assert!(.is_err())

@siddontang (Contributor) commented Nov 30, 2016:

@huachaohuang I suggest mocking a PD server, so we don't need to start the real PD in CI. But I think we can use the real PD for now.

@siddontang (Contributor):

PTAL @huachaohuang

Rest LGTM

@huachaohuang (Contributor):

CI failed @overvenus
Rest LGTM

@overvenus overvenus merged commit 20671bc into tikv:master Dec 1, 2016
@overvenus overvenus deleted the validate-pd-list branch December 1, 2016 07:19
@overvenus overvenus mentioned this pull request Mar 27, 2017
iosmanthus pushed a commit to iosmanthus/tikv that referenced this pull request Jan 4, 2024