Keep original nodes when adding discoveredHosts in SettingsUtils #256
Conversation
This fixes the case where all queries hit one node, even when the partition specifies a particular node.
@bseibel thanks for the PR. Can you clarify, though, what issue you are trying to address? We deliberately use the discovered nodes instead of the provided ones, since the given ones can be masters or clients and, if valid, will be included in the discovered ones.
Right, but since the list is being replaced by the discovered nodes, each partition's client will always select the same node to connect to, which means all traffic ends up getting routed through a single node in the cluster instead of directly to the node that has the shard.
@bseibel the discovery nodes are used only to query information about the cluster state - once an index's shard information has been retrieved, each task will communicate with the appropriate node/shard regardless of the discovered nodes. Whether the discovered nodes or the specified ones are used, the metadata calls currently go in the same order. We could try to improve this by using some kind of randomizer for the discovery nodes, but then again this is a one-time, per-job, lightweight call.
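The hot-spotting being discussed, and the randomizer idea mentioned above, can be illustrated with a small sketch. This is hypothetical selection logic for illustration only, not the actual elasticsearch-hadoop NetworkClient code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrates why identical node lists cause hot-spotting: if every task
// deterministically picks the head of the same list, all traffic lands on
// one node. Shuffling a copy of the list per task spreads the load.
// Hypothetical sketch, not the real elasticsearch-hadoop implementation.
public class NodeSelectionSketch {

    // Deterministic pick: every task holding the same list gets the same node.
    public static String pickFirst(List<String> nodes) {
        return nodes.get(0);
    }

    // Randomized pick: shuffle a copy, then take the head, so different
    // tasks tend to contact different nodes.
    public static String pickRandom(List<String> nodes, Random rnd) {
        List<String> copy = new ArrayList<>(nodes);
        Collections.shuffle(copy, rnd);
        return copy.get(0);
    }
}
```

With `pickFirst`, every task that receives the same discovered-node list contacts the same host; `pickRandom` is one way a "randomizer" could distribute the initial metadata calls.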
@bseibel By the way, you can double-check this by executing a query on an index spread across multiple nodes and seeing what connections are made - you can find this out by enabling logging (see the reference docs chapter).
@costin You are correct in saying that each task should communicate with the appropriate node/shard regardless of the discovered nodes; I'm trying to say that this is not what is actually happening. :) In EsInputFormat (line 199), the settings are used to create a RestRepository, which is passed to the RestClient constructor. The constructor then passes in the list of hosts from the settings object together with the previously discovered node list, and SettingsUtils.nodes uses the discovered list, if it exists, instead of the configured one. Thus each task's NetworkClient selects the same node, causing all queries to be directed to the same node in the cluster, even though we previously computed (while determining splits) which node each split should optimally query.
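The change the PR title describes, keeping the original nodes rather than replacing them when discovered hosts are added, can be sketched as a simple merge. The class and method names here are illustrative, not the actual SettingsUtils API:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch of the idea in the PR title: instead of replacing the
// configured node list with the discovered one, merge the two, keeping the
// originally configured nodes and de-duplicating any overlap. Names are
// illustrative, not the real elasticsearch-hadoop SettingsUtils API.
public class NodeListSketch {

    // Merge configured and discovered nodes, preserving order
    // (configured nodes first) and dropping duplicates.
    public static List<String> mergeNodes(List<String> configured, List<String> discovered) {
        LinkedHashSet<String> merged = new LinkedHashSet<>(configured);
        merged.addAll(discovered);
        return new ArrayList<>(merged);
    }
}
```

The actual fix that was merged (see the follow-up below) took a different, broader approach, but the merge idea captures the one-liner the PR originally proposed.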
@costin Sorry, I realize after a night of sleep that I've probably omitted a very important piece of useful information: this is happening when running as a Spark job. :)
@bseibel Hi. I've identified the issue (and also improved the Hadoop code-base to be similar to that of native Spark) and pushed a fix plus additional logging to give insight into what is going on. I've already pushed a snapshot to Maven as 2.1.0.BUILD-SNAPSHOT. Please give it a try and let me know whether it works for you.
@costin Thanks! This takes care of the issue. And a much nicer fix than my one-liner ;)
Thanks for the feedback and the report - closing the issue. |