Fresh install or upgrade of logging stack to v3.6.0 === Unknown Discovery type [kubernetes] #5497

Closed
rhefner opened this issue Sep 21, 2017 · 21 comments

Comments

@rhefner commented Sep 21, 2017

Description

Attempting to upgrade our logging stack from v1.5.1 to v3.6.0 completes successfully in Ansible, but the ES containers do not deploy successfully.

I thought it might be corruption, so I tried again after wiping the storage for the ES containers, which didn't make a difference. I then tried a fresh install and the same problem occurred. v1.5.1 works fine, but I would like to keep logging aligned with the cluster version.

Not sure if this is the right place for this, or if someone else maintains the v3.6.0 logging images (if the problem lies there). Any help would be appreciated.

Version
  • Your ansible version per ansible --version
ansible 2.3.2.0
  config file = /Users/hef/work/openshift-ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.13 (default, Jul 18 2017, 09:17:00) [GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Steps To Reproduce

Upgrade (with or without existing data) from v1.5.1 to v3.6.0, or do a fresh install of v3.6.0.

  1. In Ansible repo, git checkout release-3.6
  2. git pull --rebase to update
  3. ansible-playbook playbooks/byo/openshift-cluster/openshift-logging.yml

(also tried on master branch with no luck)

Expected Results

Successful install, and/or upgrade of the container images in the logging project to v3.6.0, plus any other changes necessary to bring the cluster up to v3.6.0.

Observed Results

ES containers do not come up (in crash loop) with the following output:

[2017-09-21 19:09:40,650][INFO ][container.run            ] Begin Elasticsearch startup script
[2017-09-21 19:09:40,663][INFO ][container.run            ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-09-21 19:09:40,664][INFO ][container.run            ] Inspecting the maximum RAM available...
[2017-09-21 19:09:40,668][INFO ][container.run            ] ES_HEAP_SIZE: '1024m'
[2017-09-21 19:09:40,669][INFO ][container.run            ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-09-21 19:09:40,672][INFO ][container.run            ] Checking if Elasticsearch is ready on https://localhost:9200
Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
	at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
	at <<<guice>>>
	at org.elasticsearch.node.Node.<init>(Node.java:213)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
Additional Information
@ewolinetz (Contributor) commented Sep 21, 2017

@rhefner Can you provide the output from oc get configmap/logging-elasticsearch -o yaml ?

cc @portante @richm

@rhefner (Author) commented Sep 21, 2017 (comment minimized)

@rhefner (Author) commented Sep 21, 2017

By the way, after it has been crashing for a while, the pods output this:

Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx1024m'
Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'HAS_DATA'
	at org.elasticsearch.common.property.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:128)
	at org.elasticsearch.common.property.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:81)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1179)
	at org.elasticsearch.node.internal.InternalSettingsPreparer.initializeSettings(InternalSettingsPreparer.java:131)
	at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:100)
	at org.elasticsearch.common.cli.CliTool.<init>(CliTool.java:107)
	at org.elasticsearch.common.cli.CliTool.<init>(CliTool.java:100)
	at org.elasticsearch.bootstrap.BootstrapCLIParser.<init>(BootstrapCLIParser.java:48)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:242)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
Checking if Elasticsearch is ready on https://localhost:9200 ..


@ttindell2 commented Sep 21, 2017

Having the same issue here. From what I know, in elasticsearch.yml

discovery.type: kubernetes

should be:

discovery.zen.hosts_provider: kubernetes

I believe this comes from a change in the kubernetes plugin.

After changing this, I got farther, but now have searchguard issues saying it was not initialized.
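
For reference, a rough sketch of one way to apply that edit in place. It assumes the setting lives under the elasticsearch.yml key of the logging-elasticsearch ConfigMap in the logging project, and the DC name suffix is a placeholder; adjust names to your deployment:

oc project logging
# open the config map and, in the elasticsearch.yml data key, replace
#   discovery.type: kubernetes
# with
#   discovery.zen.hosts_provider: kubernetes
oc edit configmap/logging-elasticsearch
# redeploy each ES DeploymentConfig so the change is picked up
oc rollout latest dc/logging-es-data-master-<suffix>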

@rhefner (Author) commented Sep 21, 2017

@ttindell2: Good call, changing the discovery type to zen.hosts_provider got me further as well. Now, ES is just timing out for me:

[2017-09-21 21:56:36,810][INFO ][container.run            ] Begin Elasticsearch startup script
[2017-09-21 21:56:36,826][INFO ][container.run            ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-09-21 21:56:36,827][INFO ][container.run            ] Inspecting the maximum RAM available...
[2017-09-21 21:56:36,831][INFO ][container.run            ] ES_HEAP_SIZE: '1024m'
[2017-09-21 21:56:36,833][INFO ][container.run            ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-09-21 21:56:36,837][INFO ][container.run            ] Checking if Elasticsearch is ready on https://localhost:9200
[2017-09-21 22:02:06,481][ERROR][container.run            ] Timed out waiting for Elasticsearch to be ready
cat: elasticsearch_connect_log.txt: No such file or directory
@ttindell2 commented Sep 21, 2017

@rhefner
A very useful log is on the pod at /elasticsearch/${CLUSTER_NAME}/logs/

There are a few logs in there, but only one will have info. Could you see what that log says?
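
For example, something along these lines should show it (the component=es label and the pod name are assumptions; use whatever oc get pods shows for your logging project, and ${CLUSTER_NAME} is logging-es in a default install):

oc get pods -n logging -l component=es
oc exec -n logging <logging-es-pod-name> -- cat /elasticsearch/logging-es/logs/logging-es.log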

@rhefner (Author) commented Sep 22, 2017

@ttindell2 Seems like some searchguard issues; not sure if it's the same as yours?

sh-4.2$ cat /elasticsearch/logging-es/logs/logging-es.log
[2017-09-22 00:41:40,797][INFO ][node                     ] [logging-es-data-master-bk8ocbgu] version[2.4.4], pid[1], build[fcbb46d/2017-01-03T11:33:16Z]
[2017-09-22 00:41:40,798][INFO ][node                     ] [logging-es-data-master-bk8ocbgu] initializing ...
[2017-09-22 00:41:42,415][INFO ][plugins                  ] [logging-es-data-master-bk8ocbgu] modules [reindex, lang-expression, lang-groovy], plugins [openshift-elasticsearch, cloud-kubernetes], sites []
[2017-09-22 00:41:42,530][INFO ][env                      ] [logging-es-data-master-bk8ocbgu] using [1] data paths, mounts [[/elasticsearch/persistent (/dev/loop0)]], net usable_space [99.9gb], net total_space [99.9gb], spins? [possibly], types [xfs]
[2017-09-22 00:41:42,530][INFO ][env                      ] [logging-es-data-master-bk8ocbgu] heap size [989.8mb], compressed ordinary object pointers [true]
[2017-09-22 00:41:43,545][INFO ][http                     ] [logging-es-data-master-bk8ocbgu] Using [org.elasticsearch.http.netty.NettyHttpServerTransport] as http transport, overridden by [search-guard2]
[2017-09-22 00:41:43,816][INFO ][transport                ] [logging-es-data-master-bk8ocbgu] Using [com.floragunn.searchguard.transport.SearchGuardTransportService] as transport service, overridden by [search-guard2]
[2017-09-22 00:41:43,817][INFO ][transport                ] [logging-es-data-master-bk8ocbgu] Using [com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] as transport, overridden by [search-guard-ssl]
[2017-09-22 00:41:48,511][INFO ][io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader] Trying to load Kibana mapping for io.fabric8.elasticsearch.kibana.mapping.app from plugin: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
[2017-09-22 00:41:48,516][INFO ][io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader] Trying to load Kibana mapping for io.fabric8.elasticsearch.kibana.mapping.ops from plugin: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
[2017-09-22 00:41:48,517][INFO ][io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader] Trying to load Kibana mapping for io.fabric8.elasticsearch.kibana.mapping.empty from plugin: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
[2017-09-22 00:41:48,698][INFO ][node                     ] [logging-es-data-master-bk8ocbgu] initialized
[2017-09-22 00:41:48,698][INFO ][node                     ] [logging-es-data-master-bk8ocbgu] starting ...
[2017-09-22 00:41:48,921][INFO ][discovery                ] [logging-es-data-master-bk8ocbgu] logging-es/ENzlPG2kTy2jfumf_9u80w
[2017-09-22 00:42:18,922][WARN ][discovery                ] [logging-es-data-master-bk8ocbgu] waited for 30s and no initial state was set by the discovery
[2017-09-22 00:42:19,105][INFO ][http                     ] [logging-es-data-master-bk8ocbgu] publish_address {10.1.0.245:9200}, bound_addresses {[::]:9200}
[2017-09-22 00:42:19,105][INFO ][node                     ] [logging-es-data-master-bk8ocbgu] started
[2017-09-22 00:42:19,125][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-bk8ocbgu] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
	at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
	at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
	at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
	at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
	at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
	at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
	at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
	at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
	at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
	... 13 more
[2017-09-22 00:42:31,213][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-bk8ocbgu] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
	at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
	at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
	at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
	at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
	at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
	at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
	at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
	at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
	at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
	... 13 more
[2017-09-22 00:42:48,922][ERROR][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] Failure while checking .searchguard.logging-es-data-master-bk8ocbgu index MasterNotDiscoveredException[null]
MasterNotDiscoveredException[null]
	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
	at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:816)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:43:10,305][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:43:18,927][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:43:19,323][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-bk8ocbgu] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
	at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
	at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
	at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
	at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
	at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
	at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
	at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
	at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
	at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
	... 13 more
[2017-09-22 00:43:51,929][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:44:24,931][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:44:46,504][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:44:57,933][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:45:01,527][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:45:30,934][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:46:03,936][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:46:16,681][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:46:36,937][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:46:52,738][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:09,939][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:47:25,788][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:28,792][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:31,796][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:42,940][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:47:46,822][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:48:04,846][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
	at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
	at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
	at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
	at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:48:15,944][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
@ttindell2 commented Sep 22, 2017

@rhefner
Yep, same exact issue here. I haven't found any solution to it yet. Rolling back to v1.5.1 didn't help either.

@richm (Contributor) commented Sep 22, 2017

@wozniakjan any ideas?

@wozniakjan (Contributor) commented Sep 22, 2017

You have an updated openshift-ansible but an old ES image. If you are getting the ES image from https://hub.docker.com/r/openshift/origin-logging-elasticsearch/tags/, it looks like only latest has been updated. I recommend not changing anything in the ES config map and pulling the latest ES image.

If you want some background about what exactly is happening, or a solution other than updating the ES images, read on. In September, we introduced a new type of master discovery algorithm in the ES images, discovery by label and port, because discovering by service didn't work well with the readiness probe.

The relevant changes are in:

  1. openshift-ansible - #5209
     • turning the readiness probe back on
     • changing the discovery algorithm in the ES configmap
  2. ES image - openshift/origin-aggregated-logging#609
     • new library supporting the new discovery algorithm

If you don't want to update the ES image, then you need to:

  • disable the readiness probe: oc edit dc logging-es-data-master-... for each ES DeploymentConfig and remove the section starting with readinessProbe:
  • revert the master discovery algorithm: oc edit cm logging-elasticsearch and change

cloud:
  kubernetes:
    pod_label: ${POD_LABEL}
    pod_port: 9300
    namespace: ${NAMESPACE}

to

cloud:
  kubernetes:
    service: ${SERVICE_DNS}
    namespace: ${NAMESPACE}
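
For the recommended path (leaving the config map as the installer wrote it and only refreshing the image), a rough sketch might look like the following; the image stream location and the DC name suffix are assumptions, so adjust them to your environment:

# re-import the latest origin-logging-elasticsearch image
oc -n openshift import-image origin-logging-elasticsearch:latest \
  --from=docker.io/openshift/origin-logging-elasticsearch:latest --confirm
# then roll out each ES DeploymentConfig so it picks up the refreshed image
oc -n logging rollout latest dc/logging-es-data-master-<suffix>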
@ttindell2 commented Sep 22, 2017

Reverted elasticsearch.yml back to what it was and pulled the latest ES container. The cluster is now in a yellow state; the cluster is OK now.

@ewolinetz (Contributor) commented Sep 22, 2017

@rhefner does the suggestion above resolve the issue you are seeing as well?

@rhefner (Author) commented Sep 23, 2017

@ewolinetz Yes, it looks like it did, indeed.

@ewolinetz (Contributor) commented Sep 25, 2017

Closing, since pulling from latest resolved the issue.

@ewolinetz closed this Sep 25, 2017

@mhutter (Contributor) commented Oct 26, 2017

@ewolinetz will there be an updated tagged image? I'm not happy about running latest in production :(

@ewolinetz (Contributor) commented Oct 27, 2017

@mhutter Yes, there will be. You will not be expected to run with the latest tag in production.

@wozniakjan (Contributor) commented Oct 31, 2017 (comment minimized)

@tdudgeon commented Nov 1, 2017

Just tried a new origin deployment switching to the v3.6.1 images, and ES is failing to start.
This was done with the ansible installer with these definitions:

openshift_release=v3.6
openshift_hosted_logging_deployer_version=v3.6.1

This is what is seen in the logs of the logging-es-data-master pod:

[2017-11-01 15:10:02,491][INFO ][container.run            ] Begin Elasticsearch startup script
[2017-11-01 15:10:02,498][INFO ][container.run            ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-11-01 15:10:02,499][INFO ][container.run            ] Inspecting the maximum RAM available...
[2017-11-01 15:10:02,503][INFO ][container.run            ] ES_HEAP_SIZE: '4096m'
[2017-11-01 15:10:02,506][INFO ][container.run            ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-11-01 15:10:02,509][INFO ][container.run            ] Checking if Elasticsearch is ready on https://localhost:9200
Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
	at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
	at <<<guice>>>
	at org.elasticsearch.node.Node.<init>(Node.java:213)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
@wozniakjan (Contributor) commented Nov 1, 2017

@tdudgeon thanks! Two details:

  1. openshift_hosted_logging_deployer_version was deprecated in #5176; please try using openshift_logging_image_version=v3.6.1
  2. but unfortunately, our release engineers may have pushed 3.6.0 into 3.6.1 (judging from the same sha256), so the only usable tag remains latest
@wozniakjan (Contributor) commented Nov 9, 2017

We have introduced and released a new tag, v3.6. This tag will be updated regularly, so you will no longer have to wait for release engineers to push a new image. More info here: openshift/origin-aggregated-logging#758

This should be working now:
openshift_logging_image_version=v3.6
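
For example, in the Ansible inventory this would sit with the other logging variables, roughly like the snippet below (the surrounding group and variables are illustrative), followed by a re-run of the logging playbook:

[OSEv3:vars]
openshift_logging_install_logging=true
openshift_logging_image_version=v3.6

ansible-playbook playbooks/byo/openshift-cluster/openshift-logging.yml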

@slaterx commented Jan 23, 2018

Reproducible on v3.7:

[root@master ~]# oc version
oc v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://openshift2.example.com
openshift v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62

The Elasticsearch config map was not showing the latest discovery algorithm (cloud: kubernetes: service: ${SERVICE_DNS}, namespace: ${NAMESPACE}).

After manually updating the config map, the container had the following logs:

sh-4.2$ tail -f /elasticsearch/logging-es/logs/logging-es.log
[2018-01-23 10:28:35,897][INFO ][node                     ] [logging-es-data-master-96f1ifqf] started
[2018-01-23 10:29:05,798][ERROR][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] Failure while checking .searchguard.logging-es-data-master-96f1ifqf index MasterNotDiscoveredException[null]
MasterNotDiscoveredException[null]
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
        at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:816)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2018-01-23 10:29:35,807][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:30:08,808][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:30:41,810][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:31:03,307][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-96f1ifqf] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
        at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
        at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
        at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
        at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
        at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
        at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
        at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
        at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
        at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
        at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
        at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
        ... 13 more
[2018-01-23 10:31:14,812][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:31:47,814][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)