Skip to content

do not watch unwanted tier's data node in HttpServerInventoryView#6698

Closed
kaijianding wants to merge 2 commits intoapache:masterfrom
kaijianding:filter
Closed

do not watch unwanted tier's data node in HttpServerInventoryView#6698
kaijianding wants to merge 2 commits intoapache:masterfrom
kaijianding:filter

Conversation

@kaijianding
Copy link
Contributor

broker should not connect to realtime/historical node if node's tier is not in broker's watching tiers. Currently, broker will connect to every compute node, and do the filter in addSegment(), which is too late and is big waste, we should do the filter early in serverAdded().

would you like to review this? @himanshug

@fjy fjy added this to the 0.13.1 milestone Dec 3, 2018
@leventov leventov changed the title do not watch unwanted tier's compute node in HttpServerInventoryView do not watch unwanted tier's Historical node in HttpServerInventoryView Dec 3, 2018
@leventov leventov changed the title do not watch unwanted tier's Historical node in HttpServerInventoryView do not watch unwanted tier's data node in HttpServerInventoryView Dec 3, 2018
@gianm gianm requested a review from jihoonson December 4, 2018 21:56
@kaijianding kaijianding requested a review from himanshug December 8, 2018 02:00
Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem this PR tries to fix looks legit to me, but please check my comment. Also, would you please add a unit test to HttpServerInventoryViewTest?

synchronized (servers) {
DruidServerHolder holder = servers.get(server.getName());
if (holder == null) {
if (!finalPredicate.apply(Pair.of(server.getMetadata(), null))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes rhs of the predicate argument nullable now, but I believe nullable argument can make things complicated. How about adding a new interface like registerSegmentCallback(Executor exec, SegmentCallback callback, Predicate<DruidServerMetadata> predicate) to FilteredServerInventoryView?

@leventov leventov self-requested a review December 20, 2018 18:56
@jihoonson
Copy link
Contributor

@kaijianding would you please check my last comment?

@kaijianding
Copy link
Contributor Author

@jihoonson I found it is too complicated to implement the new interface you suggested. It need to add a finalServerPredicate like finalPredicate(which is an 'or' by defaultFilter and segmentPredicates, it should add a new 'or' by defaultServerFilter and serverPredicates) in 3 sub classes and many related classes like BrokerServerView, and apply finalServerPredicate before applying finalPredicate when filtering segments . And the default implement for the new interface is an empty function because a zk InventoryView implement doesn't actually need this.

My opinion is that we just do it simple and easy to understand by just let predicate argument nullable.

On the other hand, I think if (input.rhs == null) { return true; } is some kind of reasonable just from code perspective to make sure rhs is not null before input.rhs.getDataSource() is called

@jihoonson
Copy link
Contributor

I don't think the current change is easy to understand because

  1. DataSegment for the pair is nullable only in HttpServerInventoryView. This is because zk-based inventoryView doesn't need it as you said.
  2. It's not defined when DataSegment is null.
  3. As it's supposed from its name, segmentPredicate is intended for filtering segments, but it's being used for filtering servers in this PR.

It's also not good for code maintenance because it's easy to be broken if someone adds a new segmentPredicate which doesn't check null segment.

On the other hand, I think if (input.rhs == null) { return true; } is some kind of reasonable just from code perspective to make sure rhs is not null before input.rhs.getDataSource() is called

I also don't think it's reasonable to return true when rhs is null. This is related to 2). What does null segment mean?

I found it is too complicated to implement the new interface you suggested. It need to add a finalServerPredicate like finalPredicate(which is an 'or' by defaultFilter and segmentPredicates, it should add a new 'or' by defaultServerFilter and serverPredicates) in 3 sub classes and many related classes like BrokerServerView, and apply finalServerPredicate before applying finalPredicate when filtering segments . And the default implement for the new interface is an empty function because a zk InventoryView implement doesn't actually need this.

I'm not sure what're complicated. As you said, the default implementation would be empty and zk-based inventoryView doesn't have to implement it. Currently there's only one class adding these predicates which is BrokerServerView. Adding a new finalServerPredicate and using it look pretty straightforward to me.

@leventov
Copy link
Member

I agree with @jihoonson's concerns. Also

I found it is too complicated to implement the new interface you suggested.

is probably a sign that this part of the system is already too complicated or the original design doesn't meet the current needs, so we need to look how we could alter the design to make everything coherent and understandable again. See item 2 in the "Complexity & Understandability" section in the checklist.

@jon-wei jon-wei removed this from the 0.14.0 milestone Feb 15, 2019
@egor-ryashin
Copy link
Contributor

I wonder shouldn't the tiers be specified in a config accessible to HttpServerInventoryView? Because HttpServerInventoryView should actually know which tiers to watch.

@fjy fjy added this to the 0.15.0 milestone Mar 11, 2019
@himanshug
Copy link
Contributor

@kaijianding apologies for the gigantic delay. I think it is a great optimization opportunity and we should find a solution whether or not it is ideal.

I agree with @egor-ryashin comment about HttpServerInventoryView knowing about tiers/servers to watch . In fact, I think it could have been done that way for zookeeper based impl too because that is also unnecessarily watching the segments for servers it is continuously discarding after the fact. It is just that the cost is not as high in that case.

Simplest would be to get an instance of BrokerSegmentWatcherConfig injected in HttpServerInventoryView and use that. We will need to put some blurb in release notes for users to make sure and not accidentally have those properties set on coordinator node.

PS: If I was doing things from scratch then I would've called it "BaseServerInventoryViewConfig" and named the watched tier property name (and some others) accordingly.

@jihoonson does that sound reasonable to you ?

@jihoonson
Copy link
Contributor

It sounds simpler than mine. But now, I'm wondering that historicals shouldn't announce segments to the broker in the first place if they are not in the tier the broker is watching. @himanshug what do you think?

We will need to put some blurb in release notes for users to make sure and not accidentally have those properties set on coordinator node.

It would be best if coordinator can ignore these properties if they are set. Probably it's possible.

@himanshug
Copy link
Contributor

@jihoonson historicals don't announce to either coordinator/broker. coordinator/broker watch that information. so broker needs to stop getting(or discarding) the stuff it doesn't need.

a bit more detail....
in case of "zookeeper": historicals publish information about segments served in specific zookeeper nodes. coordinator/broker watch those paths.

in case of "http" , coordinator/broker send http request to get what segments are served by historical. historical does not need to announce segments anywhere in this case.
By default historicals are always announcing in zookeeper so that users can seamlessly change between zookeeper or http at broker/coordinator . If someone is using http at all coordinators/brokers, they can change historical config ( druid.announcer.skipSegmentAnnouncementOnZk=false) so that it doesn't announce segments in zookeeper anymore. That property is undocumented currently because of the non-production-ready official status of http stuff.

@jihoonson
Copy link
Contributor

Ah yeah. That sounds correct. Sorry I didn’t remember correctly what this PR is for because it has been quite a long time since I checked last time.

@jihoonson
Copy link
Contributor

@kaijianding would you please check @himanshug's comment? BTW, I'm untagging milestone since this issue is not necessarily a release blocker. Feel free to let me know if you think this should be.

@jihoonson jihoonson removed this from the 0.15.0 milestone May 7, 2019
@github-actions
Copy link

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 29, 2023
@github-actions
Copy link

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Aug 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants