elasticsearch fails to start tribe node #14573
Comments
You're specifying the custom config file location incorrectly. |
Hi @clintongormley, thanks for the quick response. As you can see from the description I provided, I was using the option -Ddefault.path.conf. I tried the same command again with the option --path.conf. This time there was no exception caused by the config access issue, but I also had to specify --path.data and --path.logs because for some reason those settings were ignored in the config I provided. My config also specifies nonstandard ports, and those settings are not used either. Any advice on what could be wrong? Thanks, |
Looks like config is ignored completely. If I specify all options via command line I still get exception like above:
# sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch --path.conf=/etc/tribe-elasticsearch/ --path.logs=/var/log/elasticsearch --path.data=/var/lib/elasticsearch/ --transport.tcp.port=9301 --http.port=9201 --network.host=0.0.0.0 --tribels.cluster.name=logstash-data --tribe.els.discovery.zen.ping.multicast.enabled=false --tribe.els.discovery.zen.ping.unicast.hosts=["10.128.69.48","10.128.75.237"]
log4j:WARN No appenders could be found for logger (bootstrap).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.security.AccessControlException: access denied ("java.io.FilePermission" "/usr/share/elasticsearch/config/elasticsearch.yml" "read")
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
    at java.security.AccessController.checkPermission(AccessController.java:884)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
    at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
    at sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
    at sun.nio.fs.UnixFileSystemProvider.checkAccess(UnixFileSystemProvider.java:290)
    at java.nio.file.Files.exists(Files.java:2385)
    at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:87)
    at org.elasticsearch.node.Node.<init>(Node.java:128)
    at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:145)
    at org.elasticsearch.tribe.TribeService.<init>(TribeService.java:136)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at <<>>
    at org.elasticsearch.node.Node.<init>(Node.java:198)
    at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:145)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:170)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
|
Thanks for persisting. I've managed to replicate this and it is indeed a bug. When the tribe node attempts to instantiate a node for the tribe service, it checks for access to the config directory, but that setting is no longer available to it and so it defaults to checking for path.home. This can be replicated with a simple config file, saved as
Start elasticsearch as:
And it fails with:
|
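The fallback described in the comment above can be pictured with a small Python sketch. It is not Elasticsearch code: the function name `resolve_config_file` and the idea that the inner node receives `path.home` but not `path.conf` are assumptions made for illustration, chosen so that the result matches the path in the reported `AccessControlException`.

```python
import posixpath


def resolve_config_file(settings):
    """Return the config file a node would look for (illustrative model only)."""
    conf_dir = settings.get("path.conf")
    if conf_dir is None:
        # Fallback described in the thread: no path.conf, so derive the
        # config directory from path.home instead.
        conf_dir = posixpath.join(settings["path.home"], "config")
    return posixpath.join(conf_dir, "elasticsearch.yml")


# Settings of the outer tribe node, using the paths from the report.
outer = {
    "path.home": "/usr/share/elasticsearch",
    "path.conf": "/etc/tribe-elasticsearch",
}

# Assumption: the inner node is built without path.conf.
inner = {"path.home": outer["path.home"]}

print(resolve_config_file(outer))
print(resolve_config_file(inner))
```

In this model the outer node resolves its file under the custom config directory, while the inner node ends up under `path.home`, a location the security manager has not been granted read access to.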
@javanna could you take a look at this please? |
I had a look at this. Only selected settings are forwarded to the inner tribe clients from the tribe node. |
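As a rough illustration of "only selected settings are forwarded", here is a Python sketch of how inner client settings might be derived. The function `inner_client_settings` and the choice of `path.home` as the only forwarded key are hypothetical; the thread does not list which settings Elasticsearch actually forwards.

```python
def inner_client_settings(tribe_node_settings, tribe_name, forwarded_keys):
    """Build settings for one inner tribe client (hypothetical model)."""
    prefix = "tribe.%s." % tribe_name
    inner = {}
    # Copy only the explicitly selected settings from the outer tribe node.
    for key in forwarded_keys:
        if key in tribe_node_settings:
            inner[key] = tribe_node_settings[key]
    # Settings under tribe.<name>.* apply to this client, prefix stripped.
    for key, value in tribe_node_settings.items():
        if key.startswith(prefix):
            inner[key[len(prefix):]] = value
    return inner


tribe_node = {
    "path.home": "/usr/share/elasticsearch",
    "path.conf": "/etc/tribe-elasticsearch",
    "transport.tcp.port": 9301,
    "tribe.els.cluster.name": "logstash-data",
}

# Assumption for illustration: only path.home is forwarded.
print(inner_client_settings(tribe_node, "els", ("path.home",)))
```

Under these assumptions the inner client sees `path.home` and its own `cluster.name`, but neither `path.conf` nor the outer node's transport port.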
If I understand the tribe node correctly, it is no different than any other client node (well, creating multiple client nodes internally). So to me, it should be passing along any settings it needs to configure the node (including |
@rjernst it doesn't have to do directly with the transport client, but the inner tribe nodes have a similar requirement when it comes to loading from config file. They should not be reading out of the config file but only inherit some selected settings from their "parent" node (the actual tribe node), and that is why we were previously setting |
I looked deeper, I can confirm this is not just a problem around passing in the right |
+1 the config used
|
There is a workaround for this bug. Assuming your tribe config directory is
Then edit
Then start elasticsearch as:
The tribe node will use |
The workaround above works. The only caveat is that, depending on where the additional empty configuration file is located, we might not have permission to read it. I think it should work if we simply add an empty configuration file under the tribe node config directory and point directly at it, specifying not just its parent directory but the complete path, including the filename:
|
Ran into this last night when attempting to set up a tribe node on 2.0. This will also affect users who attempt to set a custom transport.tcp.port for the tribe node. In this case, setting a custom transport.tcp.port for the tribe node causes a misleading
Settings for the 2 clusters:
and
The problem is that the tribe node will not start up as long as I have the transport.tcp.port: 11111 in place. If I don't set a custom transport port for the tribe node, it starts up fine and can connect with the 2 clusters. The following is the error that shows up when I attempt to set transport.tcp.port for the tribe node. Note that prior to starting the tribe node, I used lsof to confirm that there's no process on the machine using port 11111 (and it doesn't matter what port I set it to, as long as transport.tcp.port is set for the tribe node, it will throw the same bind exception).
Note that I cannot reproduce this on 1.7.2. On 1.7.2, I can set up a custom transport.tcp.port for the tribe node and it will start up fine. |
@ppf2 this happens because the tribe node process will start three nodes, the first one will get the configured port, and the second will try to get the same one as it reads from the same configuration file. The workaround provided by Clint above should work till we fix this properly. |
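The port clash described above can be reproduced outside Elasticsearch with plain sockets. This Python sketch is only an analogy: the helper `try_bind` is made up for the example, and it uses an ephemeral port on 127.0.0.1 rather than the 11111 from the report.

```python
import socket


def try_bind(port):
    """Bind and listen on 127.0.0.1:port; return the socket, or None on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
        return sock
    except OSError:
        # The port is already taken by another socket.
        sock.close()
        return None


# The first "node" gets a free port (0 lets the OS pick one).
first = try_bind(0)
port = first.getsockname()[1]

# The second "node" reads the same configured port and tries to bind it too.
second = try_bind(port)
print("second bind succeeded:", second is not None)

first.close()
```

The second bind fails even though no other process holds the port, which is consistent with lsof showing the port as free before the tribe node starts.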
@javanna I am going to explore having the tribe node have its own subclass of Node which can customize this single behavior (how to get the node's settings). I don't think we should add back this general purpose flag as we need to keep the tons of ways Nodes can be configured to a minimum. |
@rjernst thanks that sounds good to me. |
Confirmed that the workaround works to prevent the BindTransportException error, thx! |
@rjernst Do we have a sense of whether the fix will make it to the upcoming 2.1 release? Or will it likely be after 2.1 (i.e. use the workaround until a later 2.x release)? |
@ppf2 Definitely after 2.1. I would not want to destabilize 2.1 with a refactoring like this. |
@rjernst sounds good, thx! |
This requires some fairly extensive changes, so we will target this for 2.2. In the meantime, we should document the workaround in the 2.1 docs. |
I opened a PR to fix this here: #15300. Note that I was able to do the fix simply enough that I think it will be ok to backport to 2.1.x |
thanks @rjernst |
Thanks @rjernst ! |
I'm late to the party but thought this might be useful for anyone coming across this. I found that the dummy config file isn't needed to work around the issue. Instead of creating a new directory (/etc/tribe-client in the example), path.conf can reference the current configuration directory. Using the above example where the config directory was /etc/tribe arbitrary configtransport.tcp.port: 9301 tribe: |
Is this fixed in 2.1.1? |
With v2.1.1, I still have to specify path.conf and I used the valid path as mentioned above by lb425. In my case, I also had to specify path.plugins for similar reason. Otherwise, I kept getting AccessControlException error. I did not have to specify both path.conf and path.plugins when I was using v1.7.3 |
WRT ES v2.1.1, I have to do the following to get the tribe node talking to two different clusters: cluster A and cluster B # tribe node's configuration (elasticsearch.yml) tribe.t1.cluster.name: repeat the same block but replace "t1" to "t2" for cluster B and fill in proper info related to cluster B but keep the tribe.t2.network.* the same with different tribe.t2.transport.tcp.port value from t1 if specified |
@thn-dev Setting network and path settings for tribe nodes (the t1, t2 here) should not be necessary. Can you share your full elasticsearch.yml for both the tribe node, as well as cluster A and cluster B? |
@rjernst I did not have to set network and path settings when I was using v1.7.3. It was a surprise to me when v2.1.1 kept giving me an AccessControlException error message. Initially it pointed to the "plugins" location; after I set that, it complained about the "config" location. If I did not set the network settings for t1 and t2, it was not able to connect to cluster A and/or B. This part is weird too. Again, I did not have to do this in v1.7.3. All ES instances are installed using the .rpm file, not the .zip file. My settings for the tribe node are above, with additional parameters
Cluster A and B, each has 1 master node, 3 data nodes with the following parameters' settings (I don't have all information with me at the moment)
|
@thn-dev I tried a very minimal configuration with both 2.1.1, and the 2.2 branch. The tribe settings necessary were only |
@rjernst Thank you for looking into it. As I mentioned before, I did not have to do that in v1.7.3. One thing I do know when I upgraded ES from v1.7.3 to 2.1.1, I did "rpm -Uvh elasticsearch-.rpm" instead of removing v1.7.3 completely. Everything that I have described so far is running in CentOS 6.5 or 6.7. I'm in the middle of doing a stress test right now, once I have the opportunity to redo the cluster, I will report back if installing ES 2.1.1 from scratch would make a difference or not. Once again, thank you. |
Still broken in ES 2.2.0. Simple setup. One tribe node, and a cluster of 7 nodes, all running ES 2.2.0. Config for tribe.
Cluster is up and running, reachable. Log from tribe node when ES is started.
|
The problem lies in the publish address of the tribe's client node: it's localhost, which prevents other nodes from connecting back to it, and that is why it fails to join the cluster:
The tribe itself does bind to a non local address |
@TinLe another option is to supply these settings from the command line, are you perhaps doing that? |
The two missing lines are:
In another email exchange with @sherry-ger, I got the correct settings for the tribe. I need to add the following for tribe to work.
The network.publish_host setting pointing to itself is something I would not have guessed.... In any case, I got tribe working in ES 2.2.0 now. |
FYI, tribe node not passing on settings in elasticsearch.yml is breaking plugins. |
Hi folks,
I'm trying to start tribe node using following config:
This config resides in the file /etc/tribe-elasticseach/elasticsearch.yml. I'm starting it using following command:
Elasticsearch fails with following output:
I'm not sure why it tries to access /usr/share/elasticsearch/config/elasticsearch.yml. There is no such file in the elasticsearch deb package. I created this file, but command above still fails with same output. Please advise how this can be resolved.
I'm running elasticsearch 2.0.0 installed from the debian package downloaded from the official site. I'm using ubuntu 14.
Thanks,
Kirill.