New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Child docker image failed on 4 or more layers #44

Closed
prehor opened this Issue Mar 26, 2017 · 34 comments

Comments

Projects
None yet
10 participants
@prehor

prehor commented Mar 26, 2017

I found out that if my child Docker image has 4 or more file layers then Elasticsearch terminates with the following error:

# docker run elasticsearch-test:5.2.2
[2017-03-26T21:46:54,902][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-26T21:46:54,948][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.Command.main(Command.java:88) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:89) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:82) ~[elasticsearch-5.2.2.jar:5.2.2]
Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:260) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data/nodes/0
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:239) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.NodeEnvironment$NodePath.<init>(NodeEnvironment.java:108) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:227) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more

There is my Dockerfile for elasticsearch-test image:

# Failed on 4 or more layers
FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2

# Failed on 4 or more layers
# FROM elasticsearch:5.2.2-alpine

# Never failed
# FROM elasticsearch:5.2.2

RUN echo test 0 > ftest0
RUN echo test 1 > ftest1
# RUN echo test 2 > ftest2
# RUN echo test 3 > ftest3
# RUN echo test 4 > ftest4
# RUN echo test 5 > ftest5
# RUN echo test 6 > ftest6

COPY test0 dtest0/
COPY test1 dtest1/
# COPY test2 dtest2/
# COPY test3 dtest3/
# COPY test4 dtest4/
# COPY test5 dtest5/
# COPY test6 dtest6/

Elasticsearch failed on any arbitrary combination of 4 or more RUN or COPY layers.

All folders testN contains file with the same name. File contains approximately 15 random characters.

Out of curiosity I have tried images from Docker Hub - Alpine based image exhibits the same behavior but Debian based image seems that it is not affected.

I'm running on Docker for macOS 1.13.1 (15353) and OS X 10.11.6:

# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 158
Server Version: 1.13.1
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
Swarm: active
 NodeID: srwwkecxby41p6pv153ywy6y0
 Is Manager: true
 ClusterID: fdmz6u2uni02t5ozsxnkgdrkg
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 192.168.65.2
 Manager Addresses:
  192.168.65.2:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1
runc version: 9df8b306d01f59d3a8029be411de015b7304dd8f
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.8-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.787 GiB
Name: moby
ID: 3ODG:GSIF:YOIL:LAXH:DB32:YYZZ:PIIX:A5XB:WXR2:QZ3C:EVZA:NZ6A
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 41
 Goroutines: 136
 System Time: 2017-03-26T21:35:01.318890699Z
 EventsListeners: 1
No Proxy: *.local, 169.254/16
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
# uname -a
Darwin Marvin.local 15.6.0 Darwin Kernel Version 15.6.0: Mon Jan  9 23:07:29 PST 2017; root:xnu-3248.60.11.2.1~1/RELEASE_X86_64 x86_64
@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Mar 27, 2017

Contributor

I haven't been able to reproduce this on Linux[1].

Have you considered cleaning up stopped+dangling containers+images? You can use the new docker system df / docker system prune commands as explained here.

Also note that the default user in the image is elasticsearch and your COPY commands will retain whatever uid:gid the local file has on your host OS. To change the ownership you'd to add something like:

USER root
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/dtest0
USER elasticsearch

[1]

$ cat Dockerfile 
FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2

RUN echo test 0 >ftest0
COPY test0 dtest0/

$ docker build . -t myes
Sending build context to Docker daemon 3.072 kB
Step 1/3 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
 ---> 89315294a341
Step 2/3 : RUN echo test 0 >ftest0
 ---> Using cache
 ---> df2f7444499a
Step 3/3 : COPY test0 dtest0/
 ---> d03df0cde01c
Removing intermediate container 9ca833ab10b1
Successfully built d03df0cde01c


$ docker run --rm myes
[2017-03-27T05:59:17,316][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-27T05:59:17,386][INFO ][o.e.e.NodeEnvironment    ] [ILEy7pa] using [1] data paths, mounts [[/ (/dev/mapper/docker-253:1-9175146-d64fce121a2b42163502576e5ee0dd4359485ff4fb33ba434c7618b48925c71c)]], net usable_space [9.7gb], net total_space [9.9gb], spins? [possibly], types [xfs]
[2017-03-27T05:59:17,387][INFO ][o.e.e.NodeEnvironment    ] [ILEy7pa] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-27T05:59:17,387][INFO ][o.e.n.Node               ] node name [ILEy7pa] derived from node ID [ILEy7pahST2f_EFzlATSew]; set [node.name] to override
[2017-03-27T05:59:17,389][INFO ][o.e.n.Node               ] version[5.2.2], pid[1], build[f9d9b74/2017-02-24T17:26:45.835Z], OS[Linux/4.9.10-100.fc24.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [aggs-matrix-stats]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [ingest-common]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-expression]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-groovy]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-mustache]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-painless]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [percolator]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [reindex]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [transport-netty3]
[2017-03-27T05:59:18,412][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [transport-netty4]
[2017-03-27T05:59:18,412][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded plugin [x-pack]
[2017-03-27T05:59:18,801][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2017-03-27T05:59:20,400][INFO ][o.e.n.Node               ] initialized
[2017-03-27T05:59:20,400][INFO ][o.e.n.Node               ] [ILEy7pa] starting ...
[2017-03-27T05:59:20,554][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: 0e:6e:ea:e5:d9:e9:f4:ca
[2017-03-27T05:59:20,595][INFO ][o.e.t.TransportService   ] [ILEy7pa] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2017-03-27T05:59:20,599][INFO ][o.e.b.BootstrapChecks    ] [ILEy7pa] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-27T05:59:23,655][INFO ][o.e.c.s.ClusterService   ] [ILEy7pa] new_master {ILEy7pa}{ILEy7pahST2f_EFzlATSew}{Z_j7Z_UXT3CydFdxNPeb-w}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-27T05:59:23,680][INFO ][o.e.h.HttpServer         ] [ILEy7pa] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2017-03-27T05:59:23,680][INFO ][o.e.n.Node               ] [ILEy7pa] started
[2017-03-27T05:59:23,743][INFO ][o.e.g.GatewayService     ] [ILEy7pa] recovered [0] indices into cluster_state
[2017-03-27T05:59:24,066][INFO ][o.e.l.LicenseService     ] [ILEy7pa] license [c85286fa-da68-4b5c-ae2c-df08757dddea] mode [trial] - valid
[2017-03-27T05:59:30,594][INFO ][o.e.c.m.MetaDataCreateIndexService] [ILEy7pa] [.monitoring-data-2] creating index, cause [auto(bulk api)], templates [.monitoring-data-2], shards [1]/[1], mappings [logstash, _default_, node, cluster_info, kibana]
[2017-03-27T05:59:30,699][INFO ][o.e.c.m.MetaDataCreateIndexService] [ILEy7pa] [.monitoring-es-2-2017.03.27] creating index, cause [auto(bulk api)], templates [.monitoring-es-2], shards [1]/[1], mappings [_default_, node, shards, index_stats, index_recovery, cluster_state, cluster_stats, indices_stats, node_stats]
[2017-03-27T05:59:30,867][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [cluster_stats]
[2017-03-27T05:59:30,916][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [node_stats]
[2017-03-27T05:59:41,003][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [indices_stats]
[2017-03-27T05:59:41,028][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [index_stats]
[2017-03-27T05:59:41,050][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [cluster_stats]
Contributor

dliappis commented Mar 27, 2017

I haven't been able to reproduce this on Linux[1].

Have you considered cleaning up stopped+dangling containers+images? You can use the new docker system df / docker system prune commands as explained here.

Also note that the default user in the image is elasticsearch and your COPY commands will retain whatever uid:gid the local file has on your host OS. To change the ownership you'd to add something like:

USER root
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/dtest0
USER elasticsearch

[1]

$ cat Dockerfile 
FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2

RUN echo test 0 >ftest0
COPY test0 dtest0/

$ docker build . -t myes
Sending build context to Docker daemon 3.072 kB
Step 1/3 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
 ---> 89315294a341
Step 2/3 : RUN echo test 0 >ftest0
 ---> Using cache
 ---> df2f7444499a
Step 3/3 : COPY test0 dtest0/
 ---> d03df0cde01c
Removing intermediate container 9ca833ab10b1
Successfully built d03df0cde01c


$ docker run --rm myes
[2017-03-27T05:59:17,316][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-27T05:59:17,386][INFO ][o.e.e.NodeEnvironment    ] [ILEy7pa] using [1] data paths, mounts [[/ (/dev/mapper/docker-253:1-9175146-d64fce121a2b42163502576e5ee0dd4359485ff4fb33ba434c7618b48925c71c)]], net usable_space [9.7gb], net total_space [9.9gb], spins? [possibly], types [xfs]
[2017-03-27T05:59:17,387][INFO ][o.e.e.NodeEnvironment    ] [ILEy7pa] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-27T05:59:17,387][INFO ][o.e.n.Node               ] node name [ILEy7pa] derived from node ID [ILEy7pahST2f_EFzlATSew]; set [node.name] to override
[2017-03-27T05:59:17,389][INFO ][o.e.n.Node               ] version[5.2.2], pid[1], build[f9d9b74/2017-02-24T17:26:45.835Z], OS[Linux/4.9.10-100.fc24.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [aggs-matrix-stats]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [ingest-common]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-expression]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-groovy]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-mustache]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [lang-painless]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [percolator]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [reindex]
[2017-03-27T05:59:18,411][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [transport-netty3]
[2017-03-27T05:59:18,412][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded module [transport-netty4]
[2017-03-27T05:59:18,412][INFO ][o.e.p.PluginsService     ] [ILEy7pa] loaded plugin [x-pack]
[2017-03-27T05:59:18,801][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2017-03-27T05:59:20,400][INFO ][o.e.n.Node               ] initialized
[2017-03-27T05:59:20,400][INFO ][o.e.n.Node               ] [ILEy7pa] starting ...
[2017-03-27T05:59:20,554][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: 0e:6e:ea:e5:d9:e9:f4:ca
[2017-03-27T05:59:20,595][INFO ][o.e.t.TransportService   ] [ILEy7pa] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2017-03-27T05:59:20,599][INFO ][o.e.b.BootstrapChecks    ] [ILEy7pa] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-27T05:59:23,655][INFO ][o.e.c.s.ClusterService   ] [ILEy7pa] new_master {ILEy7pa}{ILEy7pahST2f_EFzlATSew}{Z_j7Z_UXT3CydFdxNPeb-w}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-27T05:59:23,680][INFO ][o.e.h.HttpServer         ] [ILEy7pa] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2017-03-27T05:59:23,680][INFO ][o.e.n.Node               ] [ILEy7pa] started
[2017-03-27T05:59:23,743][INFO ][o.e.g.GatewayService     ] [ILEy7pa] recovered [0] indices into cluster_state
[2017-03-27T05:59:24,066][INFO ][o.e.l.LicenseService     ] [ILEy7pa] license [c85286fa-da68-4b5c-ae2c-df08757dddea] mode [trial] - valid
[2017-03-27T05:59:30,594][INFO ][o.e.c.m.MetaDataCreateIndexService] [ILEy7pa] [.monitoring-data-2] creating index, cause [auto(bulk api)], templates [.monitoring-data-2], shards [1]/[1], mappings [logstash, _default_, node, cluster_info, kibana]
[2017-03-27T05:59:30,699][INFO ][o.e.c.m.MetaDataCreateIndexService] [ILEy7pa] [.monitoring-es-2-2017.03.27] creating index, cause [auto(bulk api)], templates [.monitoring-es-2], shards [1]/[1], mappings [_default_, node, shards, index_stats, index_recovery, cluster_state, cluster_stats, indices_stats, node_stats]
[2017-03-27T05:59:30,867][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [cluster_stats]
[2017-03-27T05:59:30,916][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [node_stats]
[2017-03-27T05:59:41,003][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [indices_stats]
[2017-03-27T05:59:41,028][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [index_stats]
[2017-03-27T05:59:41,050][INFO ][o.e.c.m.MetaDataMappingService] [ILEy7pa] [.monitoring-es-2-2017.03.27/vKBUK49EQVma5V022VaVAA] update_mapping [cluster_stats]

@dliappis dliappis self-assigned this Mar 27, 2017

@prehor

This comment has been minimized.

Show comment
Hide comment
@prehor

prehor Mar 28, 2017

You used only two file layers in your Dockerfile (one RUN and one COPY) which is running fine in my environment too. My problem started with four file layers in my image (any combination of four RUN or COPY commands in Dockerfile).

Dockerfile which I posted is only proof of concept for testing the problem. You can also use this Dockerfile with files created outside of Elasticsearch installation so permissions doesn't matter:

FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3

I think that problem is connected to first raised exception:

Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.2.2.jar:5.2.2]

which seems to be related to the number of stacked filesystems. Maybe it's related to OpenJDK or musl libc in Alpine Linux, because Alpine based image from Docker Hub is also affected but Debian based image from Docker Hub is not affected.

And yes, my docker environment was pruned and I used freshly downloaded image from Elastic.

prehor commented Mar 28, 2017

You used only two file layers in your Dockerfile (one RUN and one COPY) which is running fine in my environment too. My problem started with four file layers in my image (any combination of four RUN or COPY commands in Dockerfile).

Dockerfile which I posted is only proof of concept for testing the problem. You can also use this Dockerfile with files created outside of Elasticsearch installation so permissions doesn't matter:

FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3

I think that problem is connected to first raised exception:

Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.2.2.jar:5.2.2]

which seems to be related to the number of stacked filesystems. Maybe it's related to OpenJDK or musl libc in Alpine Linux, because Alpine based image from Docker Hub is also affected but Debian based image from Docker Hub is not affected.

And yes, my docker environment was pruned and I used freshly downloaded image from Elastic.

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Mar 30, 2017

Contributor

@prehor I tried with your Dockerfile and still can't reproduce the problem on Linux (with docker 1.13.1):

Do you think you could re-try your test on a Linux machine/VM or another workstation?

$ cat >Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3

$ docker build . -t myes
Sending build context to Docker daemon 3.072 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
 ---> 89315294a341
Step 2/5 : RUN echo test 0 > /tmp/test0
 ---> Running in 7e5b88726a8b
 ---> 9dee8be3c610
Removing intermediate container 7e5b88726a8b
Step 3/5 : RUN echo test 1 > /tmp/test1
 ---> Running in 841b1e4ead9e
 ---> 996bd7b12158
Removing intermediate container 841b1e4ead9e
Step 4/5 : RUN echo test 2 > /tmp/test2
 ---> Running in 57b5294a7dd4
 ---> d4febab17a72
Removing intermediate container 57b5294a7dd4
Step 5/5 : RUN echo test 3 > /tmp/test3
 ---> Running in cc6293f09243
 ---> 71779b7ed65d
Removing intermediate container cc6293f09243
Successfully built 71779b7ed65d

$ docker run --rm myes
[2017-03-30T09:27:42,565][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T09:27:42,640][INFO ][o.e.e.NodeEnvironment    ] [ZQJQ1Ww] using [1] data paths, mounts [[/ (/dev/mapper/docker-253:1-9175146-9542a2626355405b98d7a8df5901398253b2e5e41da3c60b874ef344aee9d179)]], net usable_space [9.7gb], net total_space [9.9gb], spins? [possibly], types [xfs]
[2017-03-30T09:27:42,640][INFO ][o.e.e.NodeEnvironment    ] [ZQJQ1Ww] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-30T09:27:42,641][INFO ][o.e.n.Node               ] node name [ZQJQ1Ww] derived from node ID [ZQJQ1WwrTfWuuBvjV2XhNw]; set [node.name] to override
[2017-03-30T09:27:42,643][INFO ][o.e.n.Node               ] version[5.2.2], pid[1], build[f9d9b74/2017-02-24T17:26:45.835Z], OS[Linux/4.9.10-100.fc24.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [aggs-matrix-stats]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [ingest-common]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-expression]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-groovy]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-mustache]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-painless]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [percolator]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [reindex]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [transport-netty3]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [transport-netty4]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded plugin [x-pack]
[2017-03-30T09:27:44,144][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2017-03-30T09:27:45,894][INFO ][o.e.n.Node               ] initialized
[2017-03-30T09:27:45,894][INFO ][o.e.n.Node               ] [ZQJQ1Ww] starting ...
[2017-03-30T09:27:46,089][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: a8:51:56:50:89:30:2a:6f
[2017-03-30T09:27:46,144][INFO ][o.e.t.TransportService   ] [ZQJQ1Ww] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2017-03-30T09:27:46,149][INFO ][o.e.b.BootstrapChecks    ] [ZQJQ1Ww] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-30T09:27:49,209][INFO ][o.e.c.s.ClusterService   ] [ZQJQ1Ww] new_master {ZQJQ1Ww}{ZQJQ1WwrTfWuuBvjV2XhNw}{-6yhYW44RNW0JHNxDQQEzQ}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-30T09:27:49,236][INFO ][o.e.h.HttpServer         ] [ZQJQ1Ww] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2017-03-30T09:27:49,236][INFO ][o.e.n.Node               ] [ZQJQ1Ww] started
[2017-03-30T09:27:49,312][INFO ][o.e.g.GatewayService     ] [ZQJQ1Ww] recovered [0] indices into cluster_state
[2017-03-30T09:27:49,695][INFO ][o.e.l.LicenseService     ] [ZQJQ1Ww] license [96130472-89ae-421f-9f55-0615758b0607] mode [trial] - valid
[2017-03-30T09:27:56,113][INFO ][o.e.c.m.MetaDataCreateIndexService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30] creating index, cause [auto(bulk api)], templates [.monitoring-es-2], shards [1]/[1], mappings [_default_, node, shards, index_stats, index_recovery, cluster_state, cluster_stats, node_stats, indices_stats]
[2017-03-30T09:27:56,210][INFO ][o.e.c.m.MetaDataCreateIndexService] [ZQJQ1Ww] [.monitoring-data-2] creating index, cause [auto(bulk api)], templates [.monitoring-data-2], shards [1]/[1], mappings [node, _default_, logstash, kibana, cluster_info]
[2017-03-30T09:27:56,401][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [cluster_stats]
[2017-03-30T09:27:56,437][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [node_stats]
[2017-03-30T09:28:06,550][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [indices_stats]
[2017-03-30T09:28:06,588][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [index_stats]
[2017-03-30T09:28:06,619][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [cluster_stats]
Contributor

dliappis commented Mar 30, 2017

@prehor I tried with your Dockerfile and still can't reproduce the problem on Linux (with docker 1.13.1):

Do you think you could re-try your test on a Linux machine/VM or another workstation?

$ cat >Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3

$ docker build . -t myes
Sending build context to Docker daemon 3.072 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
 ---> 89315294a341
Step 2/5 : RUN echo test 0 > /tmp/test0
 ---> Running in 7e5b88726a8b
 ---> 9dee8be3c610
Removing intermediate container 7e5b88726a8b
Step 3/5 : RUN echo test 1 > /tmp/test1
 ---> Running in 841b1e4ead9e
 ---> 996bd7b12158
Removing intermediate container 841b1e4ead9e
Step 4/5 : RUN echo test 2 > /tmp/test2
 ---> Running in 57b5294a7dd4
 ---> d4febab17a72
Removing intermediate container 57b5294a7dd4
Step 5/5 : RUN echo test 3 > /tmp/test3
 ---> Running in cc6293f09243
 ---> 71779b7ed65d
Removing intermediate container cc6293f09243
Successfully built 71779b7ed65d

$ docker run --rm myes
[2017-03-30T09:27:42,565][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T09:27:42,640][INFO ][o.e.e.NodeEnvironment    ] [ZQJQ1Ww] using [1] data paths, mounts [[/ (/dev/mapper/docker-253:1-9175146-9542a2626355405b98d7a8df5901398253b2e5e41da3c60b874ef344aee9d179)]], net usable_space [9.7gb], net total_space [9.9gb], spins? [possibly], types [xfs]
[2017-03-30T09:27:42,640][INFO ][o.e.e.NodeEnvironment    ] [ZQJQ1Ww] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-30T09:27:42,641][INFO ][o.e.n.Node               ] node name [ZQJQ1Ww] derived from node ID [ZQJQ1WwrTfWuuBvjV2XhNw]; set [node.name] to override
[2017-03-30T09:27:42,643][INFO ][o.e.n.Node               ] version[5.2.2], pid[1], build[f9d9b74/2017-02-24T17:26:45.835Z], OS[Linux/4.9.10-100.fc24.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [aggs-matrix-stats]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [ingest-common]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-expression]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-groovy]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-mustache]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [lang-painless]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [percolator]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [reindex]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [transport-netty3]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded module [transport-netty4]
[2017-03-30T09:27:43,687][INFO ][o.e.p.PluginsService     ] [ZQJQ1Ww] loaded plugin [x-pack]
[2017-03-30T09:27:44,144][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2017-03-30T09:27:45,894][INFO ][o.e.n.Node               ] initialized
[2017-03-30T09:27:45,894][INFO ][o.e.n.Node               ] [ZQJQ1Ww] starting ...
[2017-03-30T09:27:46,089][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: a8:51:56:50:89:30:2a:6f
[2017-03-30T09:27:46,144][INFO ][o.e.t.TransportService   ] [ZQJQ1Ww] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2017-03-30T09:27:46,149][INFO ][o.e.b.BootstrapChecks    ] [ZQJQ1Ww] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-30T09:27:49,209][INFO ][o.e.c.s.ClusterService   ] [ZQJQ1Ww] new_master {ZQJQ1Ww}{ZQJQ1WwrTfWuuBvjV2XhNw}{-6yhYW44RNW0JHNxDQQEzQ}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-30T09:27:49,236][INFO ][o.e.h.HttpServer         ] [ZQJQ1Ww] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2017-03-30T09:27:49,236][INFO ][o.e.n.Node               ] [ZQJQ1Ww] started
[2017-03-30T09:27:49,312][INFO ][o.e.g.GatewayService     ] [ZQJQ1Ww] recovered [0] indices into cluster_state
[2017-03-30T09:27:49,695][INFO ][o.e.l.LicenseService     ] [ZQJQ1Ww] license [96130472-89ae-421f-9f55-0615758b0607] mode [trial] - valid
[2017-03-30T09:27:56,113][INFO ][o.e.c.m.MetaDataCreateIndexService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30] creating index, cause [auto(bulk api)], templates [.monitoring-es-2], shards [1]/[1], mappings [_default_, node, shards, index_stats, index_recovery, cluster_state, cluster_stats, node_stats, indices_stats]
[2017-03-30T09:27:56,210][INFO ][o.e.c.m.MetaDataCreateIndexService] [ZQJQ1Ww] [.monitoring-data-2] creating index, cause [auto(bulk api)], templates [.monitoring-data-2], shards [1]/[1], mappings [node, _default_, logstash, kibana, cluster_info]
[2017-03-30T09:27:56,401][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [cluster_stats]
[2017-03-30T09:27:56,437][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [node_stats]
[2017-03-30T09:28:06,550][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [indices_stats]
[2017-03-30T09:28:06,588][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [index_stats]
[2017-03-30T09:28:06,619][INFO ][o.e.c.m.MetaDataMappingService] [ZQJQ1Ww] [.monitoring-es-2-2017.03.30/_0FPahRvSeebGKx9eNqjow] update_mapping [cluster_stats]
@artholl

This comment has been minimized.

Show comment
Hide comment
@artholl

artholl Mar 30, 2017

Hello,

I got the same issue using @dliappis test dockerfile on docker for windows:

FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3
uname -a
MINGW64_NT-10.0 LENOV766DSY 2.5.0(0.295/5/3) 2016-03-31 18:47 x86_64 Msys

Here is my docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1018
Server Version: 17.03.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.12-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 3
Total Memory: 3.837 GiB
Name: moby
ID: QRPC:2AQX:6HMG:MNID:46XM:CWMY:KYAB:ZNT5:LHB4:J6F3:NRUN:GRIG
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 29
 Goroutines: 44
 System Time: 2017-03-30T14:32:16.6494708Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
Live Restore Enabled: false

And here are the log :

[2017-03-30T14:27:52,620][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T14:27:52,682][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.Command.main(Command.java:88) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:89) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:82) ~[elasticsearch-5.2.2.jar:5.2.2]
Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:260) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data/nodes/0
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:239) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.NodeEnvironment$NodePath.<init>(NodeEnvironment.java:108) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:227) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more

artholl commented Mar 30, 2017

Hello,

I got the same issue using @dliappis test dockerfile on docker for windows:

FROM docker.elastic.co/elasticsearch/elasticsearch:5.2.2
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3
uname -a
MINGW64_NT-10.0 LENOV766DSY 2.5.0(0.295/5/3) 2016-03-31 18:47 x86_64 Msys

Here is my docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1018
Server Version: 17.03.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.12-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 3
Total Memory: 3.837 GiB
Name: moby
ID: QRPC:2AQX:6HMG:MNID:46XM:CWMY:KYAB:ZNT5:LHB4:J6F3:NRUN:GRIG
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 29
 Goroutines: 44
 System Time: 2017-03-30T14:32:16.6494708Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
Live Restore Enabled: false

And here are the log :

[2017-03-30T14:27:52,620][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T14:27:52,682][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.cli.Command.main(Command.java:88) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:89) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:82) ~[elasticsearch-5.2.2.jar:5.2.2]
Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:260) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data/nodes/0
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:239) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.NodeEnvironment$NodePath.<init>(NodeEnvironment.java:108) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:227) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:252) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.node.Node.<init>(Node.java:232) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:241) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-5.2.2.jar:5.2.2]
	... 6 more
@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Mar 30, 2017

Contributor

@artholl I see your system has MINGW installed; I would avoid running any docker build from mingw (or cygwin) terminal. Additionally it should be noted that building non Windows images from a Windows host can cause issues, as indicated by the SECURITY WARNING message (see below).

While me and @jarpy both run Windows workstations as well, we run Docker builds on Linux workstations or VMs to avoid such issues.

Despite all that, I was successful building a Docker image on top of the 5.3.0 image on a typical Windows 10 Pro workstation:

> type Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3

> docker build . -t myes
Sending build context to Docker daemon 5.632 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
 ---> ccec59a7dd84
Step 2/5 : RUN echo test 0 > /tmp/test0
 ---> Running in 57f5be76e7d0
 ---> 11bc52d1d31c
Removing intermediate container 57f5be76e7d0
Step 3/5 : RUN echo test 1 > /tmp/test1
 ---> Running in be948c7ac4a0
 ---> f39d0a1be93b
Removing intermediate container be948c7ac4a0
Step 4/5 : RUN echo test 2 > /tmp/test2
 ---> Running in 30b0bd1d9f4d
 ---> 355c4611169a
Removing intermediate container 30b0bd1d9f4d
Step 5/5 : RUN echo test 3 > /tmp/test3
 ---> Running in c543433c71bf
 ---> 9f8f433c39c1
Removing intermediate container c543433c71bf
Successfully built 9f8f433c39c1
SECURITY WARNING: You are building a Docker image from Windows against a non-Windows Docker host. All files and directories added to build context will have '-rwxr-xr-x' permissions. It is recommended to double check and reset permissions for sensitive files and directories.

> docker run --rm myes
[2017-03-30T19:27:21,733][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T19:27:21,926][INFO ][o.e.e.NodeEnvironment    ] [k6Gp-Ae] using [1] data paths, mounts [[/ (none)]], net usable_space [51.7gb], net total_space [55.1gb], spins? [possibly], types [aufs]
[2017-03-30T19:27:21,927][INFO ][o.e.e.NodeEnvironment    ] [k6Gp-Ae] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-30T19:27:21,933][INFO ][o.e.n.Node               ] node name [k6Gp-Ae] derived from node ID [k6Gp-AekTLm2oIUyiGeX6A]; set [node.name] to override
[2017-03-30T19:27:21,934][INFO ][o.e.n.Node               ] version[5.3.0], pid[1], build[3adb13b/2017-03-23T03:31:50.652Z], OS[Linux/4.9.12-moby/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-30T19:27:26,906][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [aggs-matrix-stats]
[2017-03-30T19:27:26,906][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [ingest-common]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-expression]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-groovy]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-mustache]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-painless]
[2017-03-30T19:27:26,908][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [percolator]
[2017-03-30T19:27:26,908][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [reindex]
[2017-03-30T19:27:26,909][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [transport-netty3]
[2017-03-30T19:27:26,909][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [transport-netty4]
[2017-03-30T19:27:26,911][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded plugin [x-pack]
[2017-03-30T19:27:33,270][INFO ][o.e.n.Node               ] initialized
[2017-03-30T19:27:33,270][INFO ][o.e.n.Node               ] [k6Gp-Ae] starting ...
[2017-03-30T19:27:33,548][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: 65:a8:29:f7:84:a8:96:d5
[2017-03-30T19:27:33,664][INFO ][o.e.t.TransportService   ] [k6Gp-Ae] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2017-03-30T19:27:33,675][INFO ][o.e.b.BootstrapChecks    ] [k6Gp-Ae] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-30T19:27:35,005][INFO ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][young][1][5] duration [980ms], collections [1]/[1.5s], total [980ms]/[2s], memory [97.1mb]->[107.5mb]/[1.9gb], all_pools {[young] [58.5mb]->[33.9mb]/[133.1mb]}{[survivor] [16.6mb]->[14.4mb]/[16.6mb]}{[old] [21.9mb]->[59.6mb]/[1.8gb]}
[2017-03-30T19:27:35,008][WARN ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][1] overhead, spent [980ms] collecting in the last [1.5s]
[2017-03-30T19:27:36,770][INFO ][o.e.c.s.ClusterService   ] [k6Gp-Ae] new_master {k6Gp-Ae}{k6Gp-AekTLm2oIUyiGeX6A}{VHO1WsjhTd2_lKsAD4mcuw}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-30T19:27:36,834][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [k6Gp-Ae] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2017-03-30T19:27:36,840][INFO ][o.e.n.Node               ] [k6Gp-Ae] started
[2017-03-30T19:27:36,965][INFO ][o.e.g.GatewayService     ] [k6Gp-Ae] recovered [0] indices into cluster_state
[2017-03-30T19:27:38,025][INFO ][o.e.l.LicenseService     ] [k6Gp-Ae] license [7e24f59a-3464-465a-b3aa-44006000acbb] mode [trial] - valid
[2017-03-30T19:27:43,554][INFO ][o.e.c.m.MetaDataCreateIndexService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30] creating index, cause [auto(bulk api)], templates [.monitoring-es-2], shards [1]/[1], mappings [shards, _default_, node, index_stats, index_recovery, cluster_state, cluster_stats, indices_stats, node_stats]
[2017-03-30T19:27:43,879][INFO ][o.e.c.m.MetaDataCreateIndexService] [k6Gp-Ae] [.monitoring-data-2] creating index, cause [auto(bulk api)], templates [.monitoring-data-2], shards [1]/[1], mappings [logstash, _default_, node, kibana, cluster_info]
[2017-03-30T19:27:44,489][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [cluster_stats]
[2017-03-30T19:27:45,610][WARN ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][young][11][6] duration [1s], collections [1]/[1.5s], total [1s]/[3s], memory [192.2mb]->[110.3mb]/[1.9gb], all_pools {[young] [118.1mb]->[2.4mb]/[133.1mb]}{[survivor] [14.4mb]->[11.5mb]/[16.6mb]}{[old] [59.6mb]->[96.4mb]/[1.8gb]}
[2017-03-30T19:27:45,610][WARN ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][11] overhead, spent [1s] collecting in the last [1.5s]
[2017-03-30T19:27:45,747][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [node_stats]
[2017-03-30T19:27:53,427][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [index_stats]
[2017-03-30T19:27:53,501][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [cluster_stats]
[2017-03-30T19:27:53,557][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [indices_stats]
Contributor

dliappis commented Mar 30, 2017

@artholl I see your system has MINGW installed; I would avoid running any docker build from mingw (or cygwin) terminal. Additionally it should be noted that building non Windows images from a Windows host can cause issues, as indicated by the SECURITY WARNING message (see below).

While me and @jarpy both run Windows workstations as well, we run Docker builds on Linux workstations or VMs to avoid such issues.

Despite all that, I was successful building a Docker image on top of the 5.3.0 image on a typical Windows 10 Pro workstation:

> type Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
RUN echo test 0 > /tmp/test0
RUN echo test 1 > /tmp/test1
RUN echo test 2 > /tmp/test2
RUN echo test 3 > /tmp/test3

> docker build . -t myes
Sending build context to Docker daemon 5.632 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
 ---> ccec59a7dd84
Step 2/5 : RUN echo test 0 > /tmp/test0
 ---> Running in 57f5be76e7d0
 ---> 11bc52d1d31c
Removing intermediate container 57f5be76e7d0
Step 3/5 : RUN echo test 1 > /tmp/test1
 ---> Running in be948c7ac4a0
 ---> f39d0a1be93b
Removing intermediate container be948c7ac4a0
Step 4/5 : RUN echo test 2 > /tmp/test2
 ---> Running in 30b0bd1d9f4d
 ---> 355c4611169a
Removing intermediate container 30b0bd1d9f4d
Step 5/5 : RUN echo test 3 > /tmp/test3
 ---> Running in c543433c71bf
 ---> 9f8f433c39c1
Removing intermediate container c543433c71bf
Successfully built 9f8f433c39c1
SECURITY WARNING: You are building a Docker image from Windows against a non-Windows Docker host. All files and directories added to build context will have '-rwxr-xr-x' permissions. It is recommended to double check and reset permissions for sensitive files and directories.

> docker run --rm myes
[2017-03-30T19:27:21,733][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T19:27:21,926][INFO ][o.e.e.NodeEnvironment    ] [k6Gp-Ae] using [1] data paths, mounts [[/ (none)]], net usable_space [51.7gb], net total_space [55.1gb], spins? [possibly], types [aufs]
[2017-03-30T19:27:21,927][INFO ][o.e.e.NodeEnvironment    ] [k6Gp-Ae] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-30T19:27:21,933][INFO ][o.e.n.Node               ] node name [k6Gp-Ae] derived from node ID [k6Gp-AekTLm2oIUyiGeX6A]; set [node.name] to override
[2017-03-30T19:27:21,934][INFO ][o.e.n.Node               ] version[5.3.0], pid[1], build[3adb13b/2017-03-23T03:31:50.652Z], OS[Linux/4.9.12-moby/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-30T19:27:26,906][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [aggs-matrix-stats]
[2017-03-30T19:27:26,906][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [ingest-common]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-expression]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-groovy]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-mustache]
[2017-03-30T19:27:26,907][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [lang-painless]
[2017-03-30T19:27:26,908][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [percolator]
[2017-03-30T19:27:26,908][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [reindex]
[2017-03-30T19:27:26,909][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [transport-netty3]
[2017-03-30T19:27:26,909][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded module [transport-netty4]
[2017-03-30T19:27:26,911][INFO ][o.e.p.PluginsService     ] [k6Gp-Ae] loaded plugin [x-pack]
[2017-03-30T19:27:33,270][INFO ][o.e.n.Node               ] initialized
[2017-03-30T19:27:33,270][INFO ][o.e.n.Node               ] [k6Gp-Ae] starting ...
[2017-03-30T19:27:33,548][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: 65:a8:29:f7:84:a8:96:d5
[2017-03-30T19:27:33,664][INFO ][o.e.t.TransportService   ] [k6Gp-Ae] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2017-03-30T19:27:33,675][INFO ][o.e.b.BootstrapChecks    ] [k6Gp-Ae] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-30T19:27:35,005][INFO ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][young][1][5] duration [980ms], collections [1]/[1.5s], total [980ms]/[2s], memory [97.1mb]->[107.5mb]/[1.9gb], all_pools {[young] [58.5mb]->[33.9mb]/[133.1mb]}{[survivor] [16.6mb]->[14.4mb]/[16.6mb]}{[old] [21.9mb]->[59.6mb]/[1.8gb]}
[2017-03-30T19:27:35,008][WARN ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][1] overhead, spent [980ms] collecting in the last [1.5s]
[2017-03-30T19:27:36,770][INFO ][o.e.c.s.ClusterService   ] [k6Gp-Ae] new_master {k6Gp-Ae}{k6Gp-AekTLm2oIUyiGeX6A}{VHO1WsjhTd2_lKsAD4mcuw}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-03-30T19:27:36,834][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [k6Gp-Ae] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2017-03-30T19:27:36,840][INFO ][o.e.n.Node               ] [k6Gp-Ae] started
[2017-03-30T19:27:36,965][INFO ][o.e.g.GatewayService     ] [k6Gp-Ae] recovered [0] indices into cluster_state
[2017-03-30T19:27:38,025][INFO ][o.e.l.LicenseService     ] [k6Gp-Ae] license [7e24f59a-3464-465a-b3aa-44006000acbb] mode [trial] - valid
[2017-03-30T19:27:43,554][INFO ][o.e.c.m.MetaDataCreateIndexService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30] creating index, cause [auto(bulk api)], templates [.monitoring-es-2], shards [1]/[1], mappings [shards, _default_, node, index_stats, index_recovery, cluster_state, cluster_stats, indices_stats, node_stats]
[2017-03-30T19:27:43,879][INFO ][o.e.c.m.MetaDataCreateIndexService] [k6Gp-Ae] [.monitoring-data-2] creating index, cause [auto(bulk api)], templates [.monitoring-data-2], shards [1]/[1], mappings [logstash, _default_, node, kibana, cluster_info]
[2017-03-30T19:27:44,489][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [cluster_stats]
[2017-03-30T19:27:45,610][WARN ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][young][11][6] duration [1s], collections [1]/[1.5s], total [1s]/[3s], memory [192.2mb]->[110.3mb]/[1.9gb], all_pools {[young] [118.1mb]->[2.4mb]/[133.1mb]}{[survivor] [14.4mb]->[11.5mb]/[16.6mb]}{[old] [59.6mb]->[96.4mb]/[1.8gb]}
[2017-03-30T19:27:45,610][WARN ][o.e.m.j.JvmGcMonitorService] [k6Gp-Ae] [gc][11] overhead, spent [1s] collecting in the last [1.5s]
[2017-03-30T19:27:45,747][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [node_stats]
[2017-03-30T19:27:53,427][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [index_stats]
[2017-03-30T19:27:53,501][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [cluster_stats]
[2017-03-30T19:27:53,557][INFO ][o.e.c.m.MetaDataMappingService] [k6Gp-Ae] [.monitoring-es-2-2017.03.30/GVuO-x30ST2Iav8pWk4iUw] update_mapping [indices_stats]
@prehor

This comment has been minimized.

Show comment
Hide comment
@prehor

prehor Mar 30, 2017

Today I have upgraded my Docker for mac to 17.03.1-ce-mac5 (16048) and reset my docker host to factory default (I'm sorry but I don't have any other docker host available now). Unfortunately the problem still exists:

> cat Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:%%BASE_IMAGE_TAG%%
RUN echo test 0 > /tmp/ftest0
RUN echo test 1 > /tmp/ftest1
RUN echo test 2 > /tmp/ftest2
RUN echo test 3 > /tmp/ftest3

> docker build -t es-test .
Sending build context to Docker daemon  16.9 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
5.3.0: Pulling from elasticsearch/elasticsearch
3690ec4760f9: Pull complete
f52154c3d3fc: Pull complete
4075cc5db14a: Pull complete
2d52f619a2d6: Pull complete
8bb7b04253b3: Pull complete
1cb5a2a95e15: Pull complete
e2f60cff0f2f: Pull complete
9eb2cf345f74: Pull complete
ff68f5fa0401: Pull complete
90f6e7841041: Pull complete
Digest: sha256:56ac964338bc74f3874d63271433f6555648d55405a89c96f56a18dee48456eb
Status: Downloaded newer image for docker.elastic.co/elasticsearch/elasticsearch:5.3.0
 ---> ccec59a7dd84
Step 2/5 : RUN echo test 0 > /tmp/ftest0
 ---> Running in b784a0ebafba
 ---> 879153c89167
Removing intermediate container b784a0ebafba
Step 3/5 : RUN echo test 1 > /tmp/ftest1
 ---> Running in ccd049d13239
 ---> afb0028249b2
Removing intermediate container ccd049d13239
Step 4/5 : RUN echo test 2 > /tmp/ftest2
 ---> Running in 774a8ea8bd63
 ---> d99b7a9a755e
Removing intermediate container 774a8ea8bd63
Step 5/5 : RUN echo test 3 > /tmp/ftest3
 ---> Running in d00844f41d93
 ---> 500b42458aab
Removing intermediate container d00844f41d93
Successfully built 500b42458aab

> > docker run es-test
[2017-03-30T20:01:51,750][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T20:01:51,829][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:127) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:114) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:58) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.cli.Command.main(Command.java:88) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84) ~[elasticsearch-5.3.0.jar:5.3.0]
Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:260) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:258) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:238) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:360) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) ~[elasticsearch-5.3.0.jar:5.3.0]
	... 6 more
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data/nodes/0
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:239) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:258) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:238) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:360) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) ~[elasticsearch-5.3.0.jar:5.3.0]
	... 6 more
Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.env.NodeEnvironment$NodePath.<init>(NodeEnvironment.java:108) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:227) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:258) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:238) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:360) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) ~[elasticsearch-5.3.0.jar:5.3.0]
	... 6 more

I have tried to squash my docker image and it helps:

> docker build -t es-test --squash .
Sending build context to Docker daemon  16.9 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
 ---> ccec59a7dd84
Step 2/5 : RUN echo test 0 > /tmp/ftest0
 ---> Using cache
 ---> 879153c89167
Step 3/5 : RUN echo test 1 > /tmp/ftest1
 ---> Using cache
 ---> afb0028249b2
Step 4/5 : RUN echo test 2 > /tmp/ftest2
 ---> Using cache
 ---> d99b7a9a755e
Step 5/5 : RUN echo test 3 > /tmp/ftest3
 ---> Using cache
 ---> 500b42458aab
Successfully built 500b42458aab

> docker run es-test
[2017-03-30T20:03:14,585][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T20:03:15,226][INFO ][o.e.e.NodeEnvironment    ] [5oozheL] using [1] data paths, mounts [[/ (overlay)]], net usable_space [58.2gb], net total_space [62.7gb], spins? [possibly], types [overlay]
[2017-03-30T20:03:15,227][INFO ][o.e.e.NodeEnvironment    ] [5oozheL] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-30T20:03:15,228][INFO ][o.e.n.Node               ] node name [5oozheL] derived from node ID [5oozheL1SgK7vIY9uIk17A]; set [node.name] to override
[2017-03-30T20:03:15,229][INFO ][o.e.n.Node               ] version[5.3.0], pid[1], build[3adb13b/2017-03-23T03:31:50.652Z], OS[Linux/4.9.13-moby/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-30T20:03:19,691][INFO ][o.e.p.PluginsService     ] [5oozheL] loaded module [aggs-matrix-stats]
[2017-03-30T20:03:19,692][INFO ][o.e.p.PluginsService     ] [5oozheL] loaded module [ingest-common]
...

prehor commented Mar 30, 2017

Today I have upgraded my Docker for mac to 17.03.1-ce-mac5 (16048) and reset my docker host to factory default (I'm sorry but I don't have any other docker host available now). Unfortunately the problem still exists:

> cat Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:%%BASE_IMAGE_TAG%%
RUN echo test 0 > /tmp/ftest0
RUN echo test 1 > /tmp/ftest1
RUN echo test 2 > /tmp/ftest2
RUN echo test 3 > /tmp/ftest3

> docker build -t es-test .
Sending build context to Docker daemon  16.9 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
5.3.0: Pulling from elasticsearch/elasticsearch
3690ec4760f9: Pull complete
f52154c3d3fc: Pull complete
4075cc5db14a: Pull complete
2d52f619a2d6: Pull complete
8bb7b04253b3: Pull complete
1cb5a2a95e15: Pull complete
e2f60cff0f2f: Pull complete
9eb2cf345f74: Pull complete
ff68f5fa0401: Pull complete
90f6e7841041: Pull complete
Digest: sha256:56ac964338bc74f3874d63271433f6555648d55405a89c96f56a18dee48456eb
Status: Downloaded newer image for docker.elastic.co/elasticsearch/elasticsearch:5.3.0
 ---> ccec59a7dd84
Step 2/5 : RUN echo test 0 > /tmp/ftest0
 ---> Running in b784a0ebafba
 ---> 879153c89167
Removing intermediate container b784a0ebafba
Step 3/5 : RUN echo test 1 > /tmp/ftest1
 ---> Running in ccd049d13239
 ---> afb0028249b2
Removing intermediate container ccd049d13239
Step 4/5 : RUN echo test 2 > /tmp/ftest2
 ---> Running in 774a8ea8bd63
 ---> d99b7a9a755e
Removing intermediate container 774a8ea8bd63
Step 5/5 : RUN echo test 3 > /tmp/ftest3
 ---> Running in d00844f41d93
 ---> 500b42458aab
Removing intermediate container d00844f41d93
Successfully built 500b42458aab

> > docker run es-test
[2017-03-30T20:01:51,750][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T20:01:51,829][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:127) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:114) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:58) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.cli.Command.main(Command.java:88) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84) ~[elasticsearch-5.3.0.jar:5.3.0]
Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/docker-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:260) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:258) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:238) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:360) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) ~[elasticsearch-5.3.0.jar:5.3.0]
	... 6 more
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data/nodes/0
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:239) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:258) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:238) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:360) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) ~[elasticsearch-5.3.0.jar:5.3.0]
	... 6 more
Caused by: java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91) ~[?:?]
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65) ~[?:?]
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51) ~[?:?]
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368) ~[?:?]
	at java.nio.file.Files.getFileStore(Files.java:1461) ~[?:1.8.0_92-internal]
	at org.elasticsearch.env.ESFileStore.getMatchingFileStore(ESFileStore.java:107) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.env.Environment.getFileStore(Environment.java:340) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.env.NodeEnvironment$NodePath.<init>(NodeEnvironment.java:108) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:227) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:258) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.node.Node.<init>(Node.java:238) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap$6.<init>(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:360) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) ~[elasticsearch-5.3.0.jar:5.3.0]
	... 6 more

I have tried to squash my docker image and it helps:

> docker build -t es-test --squash .
Sending build context to Docker daemon  16.9 kB
Step 1/5 : FROM docker.elastic.co/elasticsearch/elasticsearch:5.3.0
 ---> ccec59a7dd84
Step 2/5 : RUN echo test 0 > /tmp/ftest0
 ---> Using cache
 ---> 879153c89167
Step 3/5 : RUN echo test 1 > /tmp/ftest1
 ---> Using cache
 ---> afb0028249b2
Step 4/5 : RUN echo test 2 > /tmp/ftest2
 ---> Using cache
 ---> d99b7a9a755e
Step 5/5 : RUN echo test 3 > /tmp/ftest3
 ---> Using cache
 ---> 500b42458aab
Successfully built 500b42458aab

> docker run es-test
[2017-03-30T20:03:14,585][INFO ][o.e.n.Node               ] [] initializing ...
[2017-03-30T20:03:15,226][INFO ][o.e.e.NodeEnvironment    ] [5oozheL] using [1] data paths, mounts [[/ (overlay)]], net usable_space [58.2gb], net total_space [62.7gb], spins? [possibly], types [overlay]
[2017-03-30T20:03:15,227][INFO ][o.e.e.NodeEnvironment    ] [5oozheL] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-03-30T20:03:15,228][INFO ][o.e.n.Node               ] node name [5oozheL] derived from node ID [5oozheL1SgK7vIY9uIk17A]; set [node.name] to override
[2017-03-30T20:03:15,229][INFO ][o.e.n.Node               ] version[5.3.0], pid[1], build[3adb13b/2017-03-23T03:31:50.652Z], OS[Linux/4.9.13-moby/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2017-03-30T20:03:19,691][INFO ][o.e.p.PluginsService     ] [5oozheL] loaded module [aggs-matrix-stats]
[2017-03-30T20:03:19,692][INFO ][o.e.p.PluginsService     ] [5oozheL] loaded module [ingest-common]
...
@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 30, 2017

Contributor

This reproduces for me on Docker for Windows version 17.03.0-ce-win1, (run from Powershell).

It does not reproduce on Docker for Linux version 17.03.0-ce.

Interesting. I think the Docker VM, as used by OSX and Windows is introducing the problem.

Contributor

jarpy commented Mar 30, 2017

This reproduces for me on Docker for Windows version 17.03.0-ce-win1, (run from Powershell).

It does not reproduce on Docker for Linux version 17.03.0-ce.

Interesting. I think the Docker VM, as used by OSX and Windows is introducing the problem.

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 30, 2017

Contributor

The error is also at run time, not at build time. The same image runs on my Linux host, but not on my "Windows" host. I suspect a limit, that can hopefully be adjusted.

Contributor

jarpy commented Mar 30, 2017

The error is also at run time, not at build time. The same image runs on my Linux host, but not on my "Windows" host. I suspect a limit, that can hopefully be adjusted.

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 31, 2017

Contributor

Ah. My Linux host is using aufs, the Windows/Alpine host is using overlay2. This is almost certainly an overlay problem.

See: https://bugs.openjdk.java.net/browse/JDK-8165852

Contributor

jarpy commented Mar 31, 2017

Ah. My Linux host is using aufs, the Windows/Alpine host is using overlay2. This is almost certainly an overlay problem.

See: https://bugs.openjdk.java.net/browse/JDK-8165852

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 31, 2017

Contributor

The exception is thrown from here, a function which, among things, iterates over the contents of /proc/mounts. Just look at the size of the entry for the overlay filesystem:

bash-4.3$ cat /proc/mounts  | grep overlay
overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/A3AFSC7UMLGP3RSJ3GLYU5QISA:/var/lib/docker/overlay2/l/U5DS5KSEVFLCSMRMQHSP4WKPV2:/var/lib/docker/overlay2/l/JB44VJRVYXDEUFSPQBQZC5KSTD:/var/lib/docker/overlay2/l/6FNQPK5TDM4O64MBPAI4WSH4VT:/var/lib/docker/overlay2/l/B53IOX2SGG24VUNC4DQPW4FIUO:/var/lib/docker/overlay2/l/GSSGTPYU2NTJQADDFGKJSLIX6F:/var/lib/docker/overlay2/l/3F5GHL6JSC3IQEOLJ6UOLR3ALD:/var/lib/docker/overlay2/l/ENNR2KKWTLXFQZRKBROP226564:/var/lib/docker/overlay2/l/NSFVF5QLCE5SUVVKZ3KQ6K7L4Y:/var/lib/docker/overlay2/l/XANLMD5PTSTIUK2LUQ5IOJZ4TY:/var/lib/docker/overlay2/l/CXHNN3LK73XRON5BLFGEPOIJIS:/var/lib/docker/overlay2/l/UZAEVH44OCMYQA3UUPMMWVPE6B:/var/lib/docker/overlay2/l/D2WZ6SYTS5FWUYP2LASIC3OYQ5:/var/lib/docker/overlay2/l/V4T4HTM6GAN3CVBNFP7HGLKZE7:/var/lib/docker/overlay2/l/63NJ5LBFI3WM4AGSJFMCLG7KMK,upperdir=/var/lib/docker/overlay2/46ae7377ee511513605cc359236c3ee8eab96491f69f244f220329ab0b469899/diff,workdir=/var/lib/docker/overlay2/46ae7377ee511513605cc359236c3ee8eab96491f69f244f220329ab0b469899/work 0 0

With four layers in the test Dockerfile, but not with three, it's a little over 1 KiB. Very suspicious.

Sorry to bug you, @jasontedor, but it's been 10 years since I wrote any Java, so I have no idea if I'm on the right track. Do you see anything obvious in findMountEntry() or getMountEntries() that would make them choke on a large line in /proc/mounts? Thanks.

Contributor

jarpy commented Mar 31, 2017

The exception is thrown from here, a function which, among things, iterates over the contents of /proc/mounts. Just look at the size of the entry for the overlay filesystem:

bash-4.3$ cat /proc/mounts  | grep overlay
overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/A3AFSC7UMLGP3RSJ3GLYU5QISA:/var/lib/docker/overlay2/l/U5DS5KSEVFLCSMRMQHSP4WKPV2:/var/lib/docker/overlay2/l/JB44VJRVYXDEUFSPQBQZC5KSTD:/var/lib/docker/overlay2/l/6FNQPK5TDM4O64MBPAI4WSH4VT:/var/lib/docker/overlay2/l/B53IOX2SGG24VUNC4DQPW4FIUO:/var/lib/docker/overlay2/l/GSSGTPYU2NTJQADDFGKJSLIX6F:/var/lib/docker/overlay2/l/3F5GHL6JSC3IQEOLJ6UOLR3ALD:/var/lib/docker/overlay2/l/ENNR2KKWTLXFQZRKBROP226564:/var/lib/docker/overlay2/l/NSFVF5QLCE5SUVVKZ3KQ6K7L4Y:/var/lib/docker/overlay2/l/XANLMD5PTSTIUK2LUQ5IOJZ4TY:/var/lib/docker/overlay2/l/CXHNN3LK73XRON5BLFGEPOIJIS:/var/lib/docker/overlay2/l/UZAEVH44OCMYQA3UUPMMWVPE6B:/var/lib/docker/overlay2/l/D2WZ6SYTS5FWUYP2LASIC3OYQ5:/var/lib/docker/overlay2/l/V4T4HTM6GAN3CVBNFP7HGLKZE7:/var/lib/docker/overlay2/l/63NJ5LBFI3WM4AGSJFMCLG7KMK,upperdir=/var/lib/docker/overlay2/46ae7377ee511513605cc359236c3ee8eab96491f69f244f220329ab0b469899/diff,workdir=/var/lib/docker/overlay2/46ae7377ee511513605cc359236c3ee8eab96491f69f244f220329ab0b469899/work 0 0

With four layers in the test Dockerfile, but not with three, it's a little over 1 KiB. Very suspicious.

Sorry to bug you, @jasontedor, but it's been 10 years since I wrote any Java, so I have no idea if I'm on the right track. Do you see anything obvious in findMountEntry() or getMountEntries() that would make them choke on a large line in /proc/mounts? Thanks.

@prehor

This comment has been minimized.

Show comment
Hide comment
@prehor

prehor Mar 31, 2017

Excellent work @jarpy. Docker for mac use overlay2 storage driver too (docker/for-mac#516).

I have switched storage driver to aufs and the problem is gone (but some other may arise).

prehor commented Mar 31, 2017

Excellent work @jarpy. Docker for mac use overlay2 storage driver too (docker/for-mac#516).

I have switched storage driver to aufs and the problem is gone (but some other may arise).

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 31, 2017

Contributor

Thanks @prehor. We'll be switching to glibc soon, so this should get solved properly!

Contributor

jarpy commented Mar 31, 2017

Thanks @prehor. We'll be switching to glibc soon, so this should get solved properly!

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 31, 2017

Contributor

Actually, my money is on the native call to getmntent(3).

Oh look! What's the chances we don't have this patch in our musl libc?

https://git.musl-libc.org/cgit/musl/commit/src/misc/mntent.c?id=05973dc3bbc1aca9b3c8347de6879ed72147ab3b

Contributor

jarpy commented Mar 31, 2017

Actually, my money is on the native call to getmntent(3).

Oh look! What's the chances we don't have this patch in our musl libc?

https://git.musl-libc.org/cgit/musl/commit/src/misc/mntent.c?id=05973dc3bbc1aca9b3c8347de6879ed72147ab3b

@jarpy jarpy added the bug label Mar 31, 2017

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 31, 2017

Contributor

And I'm a bit late in saying this, but thank you for the excellent reproduction steps. They were extremely useful once we (kinda by accident) found the right environment to run them in.

Contributor

jarpy commented Mar 31, 2017

And I'm a bit late in saying this, but thank you for the excellent reproduction steps. They were extremely useful once we (kinda by accident) found the right environment to run them in.

@jarpy jarpy assigned jarpy and unassigned dliappis Mar 31, 2017

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Mar 31, 2017

Contributor

Awesome investigation @jarpy thanks!

My tests on windows were on 17.03.0-ce and 17.03.1-ce, however the storage driver listed there is aufs.

It looks very likely that it is related to the musl bug that @jarpy mentioned, which would require an update of our Alpine base.

For 5.4.0 we will be moving away from Alpine and use a glibc based base-image so this problem will go away.
For 5.3.x, we will try updating the Alpine base and report back here is the problem goes away.

Thanks @prehor and @artholl for reporting this.

Contributor

dliappis commented Mar 31, 2017

Awesome investigation @jarpy thanks!

My tests on windows were on 17.03.0-ce and 17.03.1-ce, however the storage driver listed there is aufs.

It looks very likely that it is related to the musl bug that @jarpy mentioned, which would require an update of our Alpine base.

For 5.4.0 we will be moving away from Alpine and use a glibc based base-image so this problem will go away.
For 5.3.x, we will try updating the Alpine base and report back here is the problem goes away.

Thanks @prehor and @artholl for reporting this.

@jasontedor

This comment has been minimized.

Show comment
Hide comment
@jasontedor

jasontedor Mar 31, 2017

Member

Sorry to bug you, @jasontedor, but it's been 10 years since I wrote any Java, so I have no idea if I'm on the right track.

No worries, though it looks like you already got to the bottom of this one. Outstanding work! 😄

Member

jasontedor commented Mar 31, 2017

Sorry to bug you, @jasontedor, but it's been 10 years since I wrote any Java, so I have no idea if I'm on the right track.

No worries, though it looks like you already got to the bottom of this one. Outstanding work! 😄

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Mar 31, 2017

Contributor

I built a version of es 5.3.0 that can be pulled from docker.elastic.co/elasticsearch/elasticsearch:5.3.0-alpine4.5 based on alpine:latest (4.5 with the latest jvm 1.8.121) which certainly includes a fix for the getmntent_r musl bug.
Unfortunately the problem persists, for me, trying to run a customized image that has at least 4 new layer, with overlay2, which makes sense as apparently our existing alpine base, 4.4 also includes the fix. All is good with aufs or devicemapper storage drivers.

This will require further investigation. (also cc @danielmitterdorfer )

Contributor

dliappis commented Mar 31, 2017

I built a version of es 5.3.0 that can be pulled from docker.elastic.co/elasticsearch/elasticsearch:5.3.0-alpine4.5 based on alpine:latest (4.5 with the latest jvm 1.8.121) which certainly includes a fix for the getmntent_r musl bug.
Unfortunately the problem persists, for me, trying to run a customized image that has at least 4 new layer, with overlay2, which makes sense as apparently our existing alpine base, 4.4 also includes the fix. All is good with aufs or devicemapper storage drivers.

This will require further investigation. (also cc @danielmitterdorfer )

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Mar 31, 2017

Contributor

alpine:latest seems to be on musl 1.1.15, where the latest is 1.1.16:


$ docker pull alpine && docker run alpine apk info musl
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:58e1a1bb75db1b5a24a462dd5e2915277ea06438c3f105138f97eb53149673c4
Status: Image is up to date for alpine:latest
WARNING: Ignoring APKINDEX.c51f8f92.tar.gz: No such file or directory
WARNING: Ignoring APKINDEX.d09172fd.tar.gz: No such file or directory
musl-1.1.15-r6 description:
the musl c library (libc) implementation

...and the patch isn't in the v1.1.15 branch of musl:

$ git clone git://git.musl-libc.org/musl && cd musl
Cloning into 'musl'...
remote: Counting objects: 26971, done.
remote: Compressing objects: 100% (9651/9651), done.
remote: Total 26971 (delta 19187), reused 23815 (delta 16999)
Receiving objects: 100% (26971/26971), 4.15 MiB | 1.76 MiB/s, done.
Resolving deltas: 100% (19187/19187), done.
Checking connectivity... done.

$ grep SENTINEL src/misc/mntent.c
#define SENTINEL (char *)&internal_buf
        int cnt, n[8], use_internal = (linebuf == SENTINEL);
        return getmntent_r(f, &mnt, SENTINEL, 0);

$ git checkout -b v1.1.15 v1.1.15
Switched to a new branch 'v1.1.15'

$ grep SENTINEL src/misc/mntent.c
$
Contributor

jarpy commented Mar 31, 2017

alpine:latest seems to be on musl 1.1.15, where the latest is 1.1.16:


$ docker pull alpine && docker run alpine apk info musl
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:58e1a1bb75db1b5a24a462dd5e2915277ea06438c3f105138f97eb53149673c4
Status: Image is up to date for alpine:latest
WARNING: Ignoring APKINDEX.c51f8f92.tar.gz: No such file or directory
WARNING: Ignoring APKINDEX.d09172fd.tar.gz: No such file or directory
musl-1.1.15-r6 description:
the musl c library (libc) implementation

...and the patch isn't in the v1.1.15 branch of musl:

$ git clone git://git.musl-libc.org/musl && cd musl
Cloning into 'musl'...
remote: Counting objects: 26971, done.
remote: Compressing objects: 100% (9651/9651), done.
remote: Total 26971 (delta 19187), reused 23815 (delta 16999)
Receiving objects: 100% (26971/26971), 4.15 MiB | 1.76 MiB/s, done.
Resolving deltas: 100% (19187/19187), done.
Checking connectivity... done.

$ grep SENTINEL src/misc/mntent.c
#define SENTINEL (char *)&internal_buf
        int cnt, n[8], use_internal = (linebuf == SENTINEL);
        return getmntent_r(f, &mnt, SENTINEL, 0);

$ git checkout -b v1.1.15 v1.1.15
Switched to a new branch 'v1.1.15'

$ grep SENTINEL src/misc/mntent.c
$
@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 1, 2017

Contributor

Thanks for checking this @jarpy !

Cognizant of the risk that this comment may be treated as an April Fools joke :) (happy April 1st everyone!), I believe I saw a backporting patch that the Alpine folks have done for this already in musl 1.1.14-r12 here :

diff --git a/main/musl/APKBUILD b/main/musl/APKBUILD
index e010239..02c68e4 100644
--- a/main/musl/APKBUILD
+++ b/main/musl/APKBUILD
@@ -2,7 +2,7 @@
 # Maintainer: Timo Teräs <timo.teras@iki.fi>
 pkgname=musl
 pkgver=1.1.14
-pkgrel=11
+pkgrel=12
 pkgdesc="the musl c library (libc) implementation"
 url="http://www.musl-libc.org/"
 arch="all"
@@ -33,6 +33,7 @@ source="http://www.musl-libc.org/releases/musl-$pkgver.tar.gz
 	0019-fix-misordered-syscall-arguments-for-posix_fadvise-o.patch
 	0020-improve-abort-fallback-behavior-when-raising-SIGABRT.patch
 	0021-fix-asctime-day-month-names-not-to-vary-by-locale.patch
+	0001-use-dynamic-buffer-for-getmntent.patch

Just to verify this, I read a bit how to build source packages on Alpine and ended up in the Aports tree. Indeed I could trace the patch already committed on Sep 14 2016:

$ git clone git://dev.alpinelinux.org/aports
Cloning into 'aports'...
remote: Counting objects: 307103, done.
remote: Compressing objects: 100% (91045/91045), done.
remote: Total 307103 (delta 188875), reused 303737 (delta 185805)
Receiving objects: 100% (307103/307103), 113.76 MiB | 4.15 MiB/s, done.
Resolving deltas: 100% (188875/188875), done.
Checking connectivity... done.

$ cd aports/

$ git log -p | grep -A40 6a2b1e8bc87aca9f100a08c15335246a1744b1fd
commit 6a2b1e8bc87aca9f100a08c15335246a1744b1fd
Author: Natanael Copa <ncopa@alpinelinux.org>
Date:   Wed Sep 14 12:38:58 2016 +0200

    main/musl: fix for getmntent
    
    ref #5703

diff --git a/main/musl/0001-use-dynamic-buffer-for-getmntent.patch b/main/musl/0001-use-dynamic-buffer-for-getmntent.patch
new file mode 100644
index 0000000..12feaa8
--- /dev/null
+++ b/main/musl/0001-use-dynamic-buffer-for-getmntent.patch
@@ -0,0 +1,66 @@
+From c0dd2db34a8a5d51699406a90a4f87cb3129d5a1 Mon Sep 17 00:00:00 2001
+From: Natanael Copa <ncopa@alpinelinux.org>
+Date: Thu, 8 Sep 2016 09:06:22 +0200
+Subject: [PATCH] use dynamic buffer for getmntent
+
+overlayfs may have fairly long lines so we use getline to allocate a
+buffer dynamically. The buffer will be allocated on first use, expand as
+needed, but will never be free'ed.
+
+Downstream bug: http://bugs.alpinelinux.org/issues/5703
+
+Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
+---
+ src/misc/mntent.c | 16 ++++++++++++----
+ 1 file changed, 12 insertions(+), 4 deletions(-)
+
+diff --git a/src/misc/mntent.c b/src/misc/mntent.c
+index a16d652..722a030 100644
+--- a/src/misc/mntent.c
++++ b/src/misc/mntent.c
+@@ -3,6 +3,11 @@
+ #include <mntent.h>
+ #include <errno.h>
+ 
++static char *internal_buf = NULL;
++static size_t internal_bufsize = 0;
++
Contributor

dliappis commented Apr 1, 2017

Thanks for checking this @jarpy !

Cognizant of the risk that this comment may be treated as an April Fools joke :) (happy April 1st everyone!), I believe I saw a backporting patch that the Alpine folks have done for this already in musl 1.1.14-r12 here :

diff --git a/main/musl/APKBUILD b/main/musl/APKBUILD
index e010239..02c68e4 100644
--- a/main/musl/APKBUILD
+++ b/main/musl/APKBUILD
@@ -2,7 +2,7 @@
 # Maintainer: Timo Teräs <timo.teras@iki.fi>
 pkgname=musl
 pkgver=1.1.14
-pkgrel=11
+pkgrel=12
 pkgdesc="the musl c library (libc) implementation"
 url="http://www.musl-libc.org/"
 arch="all"
@@ -33,6 +33,7 @@ source="http://www.musl-libc.org/releases/musl-$pkgver.tar.gz
 	0019-fix-misordered-syscall-arguments-for-posix_fadvise-o.patch
 	0020-improve-abort-fallback-behavior-when-raising-SIGABRT.patch
 	0021-fix-asctime-day-month-names-not-to-vary-by-locale.patch
+	0001-use-dynamic-buffer-for-getmntent.patch

Just to verify this, I read a bit how to build source packages on Alpine and ended up in the Aports tree. Indeed I could trace the patch already committed on Sep 14 2016:

$ git clone git://dev.alpinelinux.org/aports
Cloning into 'aports'...
remote: Counting objects: 307103, done.
remote: Compressing objects: 100% (91045/91045), done.
remote: Total 307103 (delta 188875), reused 303737 (delta 185805)
Receiving objects: 100% (307103/307103), 113.76 MiB | 4.15 MiB/s, done.
Resolving deltas: 100% (188875/188875), done.
Checking connectivity... done.

$ cd aports/

$ git log -p | grep -A40 6a2b1e8bc87aca9f100a08c15335246a1744b1fd
commit 6a2b1e8bc87aca9f100a08c15335246a1744b1fd
Author: Natanael Copa <ncopa@alpinelinux.org>
Date:   Wed Sep 14 12:38:58 2016 +0200

    main/musl: fix for getmntent
    
    ref #5703

diff --git a/main/musl/0001-use-dynamic-buffer-for-getmntent.patch b/main/musl/0001-use-dynamic-buffer-for-getmntent.patch
new file mode 100644
index 0000000..12feaa8
--- /dev/null
+++ b/main/musl/0001-use-dynamic-buffer-for-getmntent.patch
@@ -0,0 +1,66 @@
+From c0dd2db34a8a5d51699406a90a4f87cb3129d5a1 Mon Sep 17 00:00:00 2001
+From: Natanael Copa <ncopa@alpinelinux.org>
+Date: Thu, 8 Sep 2016 09:06:22 +0200
+Subject: [PATCH] use dynamic buffer for getmntent
+
+overlayfs may have fairly long lines so we use getline to allocate a
+buffer dynamically. The buffer will be allocated on first use, expand as
+needed, but will never be free'ed.
+
+Downstream bug: http://bugs.alpinelinux.org/issues/5703
+
+Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
+---
+ src/misc/mntent.c | 16 ++++++++++++----
+ 1 file changed, 12 insertions(+), 4 deletions(-)
+
+diff --git a/src/misc/mntent.c b/src/misc/mntent.c
+index a16d652..722a030 100644
+--- a/src/misc/mntent.c
++++ b/src/misc/mntent.c
+@@ -3,6 +3,11 @@
+ #include <mntent.h>
+ #include <errno.h>
+ 
++static char *internal_buf = NULL;
++static size_t internal_bufsize = 0;
++
@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 1, 2017

Contributor

I was looking at this again today and noticed a reference to sun.nio.fs.LinuxFileSystemProvider.getFileStore in the jvm stacktrace further down after findMountEntry.

I think we are seeing a manifestation of the OpenJDK bug jdk-8165852 linked in @jarpy's earlier comment, which is still Open/Unresolved.

The interesting thing is that with the help of the Class below[1] (mentioned in the bug report) I was consistently able to reproduce the trace produced by Elasticsearch regardless of the docker base; this issue happens both on musl or glibc based docker images with the overlay/overlay2 storage driver.

I used the MyMain class[1] to test two docker base images on two Linux machines:

  • Host OS: Fedora 25, Docker 1.13.1, DeviceMapper storage driver

    Docker images:

    • alpine:latest
    • fedora:latest
  • Host OS: Fedora 25, Docker 17.03.1-ce, overlay storage driver

    Docker images:

    • alpine:latest
    • fedora: latest

The results were a PASS regardless of the image used with the devicemapper driver and a consistent FAIL with the overlay driver.

I am not sure what can be done here until jdk-8165852 is resolved in java.nio apart from advising not to run on overlay/overlay2.
Thoughts @jasontedor , @jarpy ? (cc @drewr)

FAIL: alpine:latest, using overlay

$ docker run --rm -ti alpine:latest /bin/sh
/ # ( apk update && apk add openjdk8 ) 2>&1 >/dev/null

/ # export PATH=/usr/lib/jvm/default-jvm/bin:$PATH

/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.5.2
PRETTY_NAME="Alpine Linux v3.5"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"

/ # cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}

/ # javac MyMain.java

/ # touch file.txt

/ # java MyMain /file.txt
Exception in thread "main" java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91)
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65)
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39)
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368)
	at java.nio.file.Files.getFileStore(Files.java:1461)
	at MyMain.main(MyMain.java:8)

FAIL: fedora:latest, overlay

$ docker run --rm -ti fedora:latest /bin/bash
[root@b3a26352e373 /]# cat /etc/os-release
NAME=Fedora
VERSION="25 (Twenty Five)"
...

[root@b3a26352e373 /]# dnf install -y java-1.8.0-openjdk-devel
Last metadata expiration check: 0:00:54 ago on Sat Apr  1 16:44:45 2017.
Dependencies resolved.
================================================================================================================================================================================
 Package                                              Arch                            Version                                            Repository                        Size
================================================================================================================================================================================
Installing:

<snip>

[root@b3a26352e373 /]# cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}

[root@b3a26352e373 /]# javac MyMain.java

[root@b3a26352e373 /]# touch file.txt

[root@b3a26352e373 /]# java MyMain /file.txt
Exception in thread "main" java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91)
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65)
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39)
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368)
	at java.nio.file.Files.getFileStore(Files.java:1461)
	at MyMain.main(MyMain.java:8)

SUCCESS: alpine:latest using devicemapper storage

$ docker run --rm -ti alpine:latest /bin/sh
/ # ( apk update && apk add openjdk8 ) 2>&1 >/dev/null
/ # export PATH=/usr/lib/jvm/default-jvm/bin:$PATH
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.5.2
PRETTY_NAME="Alpine Linux v3.5"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"

/ # cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}

/ # javac MyMain.java

/ # touch file.txt

/ # java MyMain /file.txt
/ (/dev/mapper/docker-253:1-9175146-312c66541cdf0902694a27c5c888eabbde7e892a412de1b4dacc1bfd690b56c8)

SUCCESS: fedora:latest using devicemapper storage

dl@MS-7885 ~ $ docker run --rm -ti fedora:latest /bin/bash
[root@c4666f6c2148 /]# cat /etc/os-release
NAME=Fedora
VERSION="25 (Twenty Five)"
<snip>

[root@c4666f6c2148 /]# dnf install -y java-1.8.0-openjdk-devel
Fedora 25 - x86_64 - Updates                                                                                                                    4.3 MB/s |  21 MB     00:04
<snip>
Complete!

[root@c4666f6c2148 /]# cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}
[root@c4666f6c2148 /]# javac MyMain.java

[root@c4666f6c2148 /]# touch file.txt

[root@c4666f6c2148 /]# java MyMain /file.txt
/ (/dev/mapper/docker-253:1-9175146-f8c88915542aa9232ffeffdd0decd0d003c6568fa90ef45a4be8ecf5cbf23c91)

[1]

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}
Contributor

dliappis commented Apr 1, 2017

I was looking at this again today and noticed a reference to sun.nio.fs.LinuxFileSystemProvider.getFileStore in the jvm stacktrace further down after findMountEntry.

I think we are seeing a manifestation of the OpenJDK bug jdk-8165852 linked in @jarpy's earlier comment, which is still Open/Unresolved.

The interesting thing is that with the help of the Class below[1] (mentioned in the bug report) I was consistently able to reproduce the trace produced by Elasticsearch regardless of the docker base; this issue happens both on musl or glibc based docker images with the overlay/overlay2 storage driver.

I used the MyMain class[1] to test two docker base images on two Linux machines:

  • Host OS: Fedora 25, Docker 1.13.1, DeviceMapper storage driver

    Docker images:

    • alpine:latest
    • fedora:latest
  • Host OS: Fedora 25, Docker 17.03.1-ce, overlay storage driver

    Docker images:

    • alpine:latest
    • fedora: latest

The results were a PASS regardless of the image used with the devicemapper driver and a consistent FAIL with the overlay driver.

I am not sure what can be done here until jdk-8165852 is resolved in java.nio apart from advising not to run on overlay/overlay2.
Thoughts @jasontedor , @jarpy ? (cc @drewr)

FAIL: alpine:latest, using overlay

$ docker run --rm -ti alpine:latest /bin/sh
/ # ( apk update && apk add openjdk8 ) 2>&1 >/dev/null

/ # export PATH=/usr/lib/jvm/default-jvm/bin:$PATH

/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.5.2
PRETTY_NAME="Alpine Linux v3.5"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"

/ # cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}

/ # javac MyMain.java

/ # touch file.txt

/ # java MyMain /file.txt
Exception in thread "main" java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91)
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65)
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39)
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368)
	at java.nio.file.Files.getFileStore(Files.java:1461)
	at MyMain.main(MyMain.java:8)

FAIL: fedora:latest, overlay

$ docker run --rm -ti fedora:latest /bin/bash
[root@b3a26352e373 /]# cat /etc/os-release
NAME=Fedora
VERSION="25 (Twenty Five)"
...

[root@b3a26352e373 /]# dnf install -y java-1.8.0-openjdk-devel
Last metadata expiration check: 0:00:54 ago on Sat Apr  1 16:44:45 2017.
Dependencies resolved.
================================================================================================================================================================================
 Package                                              Arch                            Version                                            Repository                        Size
================================================================================================================================================================================
Installing:

<snip>

[root@b3a26352e373 /]# cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}

[root@b3a26352e373 /]# javac MyMain.java

[root@b3a26352e373 /]# touch file.txt

[root@b3a26352e373 /]# java MyMain /file.txt
Exception in thread "main" java.io.IOException: Mount point not found
	at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:91)
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65)
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51)
	at sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39)
	at sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368)
	at java.nio.file.Files.getFileStore(Files.java:1461)
	at MyMain.main(MyMain.java:8)

SUCCESS: alpine:latest using devicemapper storage

$ docker run --rm -ti alpine:latest /bin/sh
/ # ( apk update && apk add openjdk8 ) 2>&1 >/dev/null
/ # export PATH=/usr/lib/jvm/default-jvm/bin:$PATH
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.5.2
PRETTY_NAME="Alpine Linux v3.5"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"

/ # cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}

/ # javac MyMain.java

/ # touch file.txt

/ # java MyMain /file.txt
/ (/dev/mapper/docker-253:1-9175146-312c66541cdf0902694a27c5c888eabbde7e892a412de1b4dacc1bfd690b56c8)

SUCCESS: fedora:latest using devicemapper storage

dl@MS-7885 ~ $ docker run --rm -ti fedora:latest /bin/bash
[root@c4666f6c2148 /]# cat /etc/os-release
NAME=Fedora
VERSION="25 (Twenty Five)"
<snip>

[root@c4666f6c2148 /]# dnf install -y java-1.8.0-openjdk-devel
Fedora 25 - x86_64 - Updates                                                                                                                    4.3 MB/s |  21 MB     00:04
<snip>
Complete!

[root@c4666f6c2148 /]# cat >MyMain.java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}
[root@c4666f6c2148 /]# javac MyMain.java

[root@c4666f6c2148 /]# touch file.txt

[root@c4666f6c2148 /]# java MyMain /file.txt
/ (/dev/mapper/docker-253:1-9175146-f8c88915542aa9232ffeffdd0decd0d003c6568fa90ef45a4be8ecf5cbf23c91)

[1]

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyMain {

public static void main(String[] args) throws IOException {
System.out.println(Files.getFileStore(Paths.get(args[0])));
}
}
@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 2, 2017

Contributor

Possibly related to: elastic/elasticsearch#14337

Contributor

dliappis commented Apr 2, 2017

Possibly related to: elastic/elasticsearch#14337

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Apr 2, 2017

Contributor

Top work, @dliappis. I'm so sorry about the red herring; I really thought I had it figured out! It didn't occur to me (though it should have) that Alpine would maintain its own tree of patches against musl.

Contributor

jarpy commented Apr 2, 2017

Top work, @dliappis. I'm so sorry about the red herring; I really thought I had it figured out! It didn't occur to me (though it should have) that Alpine would maintain its own tree of patches against musl.

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 3, 2017

Contributor

We are troubleshooting this together with @danielmitterdorfer , no need to also spend cycles on this @jasontedor

Contributor

dliappis commented Apr 3, 2017

We are troubleshooting this together with @danielmitterdorfer , no need to also spend cycles on this @jasontedor

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 3, 2017

Contributor
  1. It's easy to confirm if the musl version included in Alpine contains the old getmntent_r patch mentioned earlier; if overlay / in /proc/mounts exceeds the 256 byte limit, df won't show anything.
    Both our old alpine:3.4.4-r12 and new alpine:3.5.5 bases work fine in that respect.

  2. I built a test image of elasticsearch 5.3.0 based on centos:7 (docker.elastic.co/elasticsearch/elasticsearch:5.3.0-centos) and Elasticsearch starts fine from a customized image with numerous layers added, using overlayfs. Therefore @jarpy 's hunch about the length of the overlay / line in proc/mounts exceeding 1024 is right.

  3. The jvm bug 8165852 that I was able to reproduce regardless of glibc/musl is not related to the ES start failures we are seeing here. If ES where to check for the existence of a /tmp/foo file it would fail due to that too. As per the issue:

    the parent of a path component has a different device ID than that of the current path component and so the loop of the upwards traversal towards the root is broken out of but there is no entry for the current path component in /proc/mounts.

  4. There is a problem when calling getmtnent_r with a 1024B buffer on Alpine. This is the buffer size used by the JVM JNI interface and on Alpine this returns an error[1][2], whereas on glibc it returns a mount entry truncated to 1024 bytes[2].

  5. The error in 4. is propagated up to the getMountEntries method in the JVM, which in turn propagates back to findMountEntry() method that ES uses during bootstrap to inspect the dir /usr/share/elasticsearch/data/nodes/0/indices

  6. And finally 5. of courses raises Mount point not found

Next steps

  1. File a bug for getmntent_r on Alpine/musl. The code in [1] and steps in [2] can be used to reproduce what JNI does.

  2. This bug doesn't affect running docker.elastic.co/elasticsearch/elasticsearch:5.3.0 as is with the overlay/overlay2 storage driver, unless, through customization, the entry for overlay / in /proc/mounts exceeds 1024bytes.

  3. Elasticsearch 5.4.0 will be based on a glibc based Docker image so this won't be an issue.

  4. It is a bit strange that java calls to JNI for getmntent get redirected to getmntent_r using a 1024 buffer instead of just plainly using getmntent. That could be worth of an openjdk report too.

[1]

$ cat getmntent_r_tester.c 
#include <stdio.h>
#include <mntent.h>

void print_mount(const struct mntent *fs);

int main(int argc, char **argv)
{
    FILE *fp;
    struct mntent *fs;
    struct mntent ent;
    char buf[1024];
    int buflen = sizeof(buf);

    fp = setmntent("/proc/mounts", "r");    /* read only */
    if (fp == NULL) {
        printf("Error calling setmntent\n");
        return(1);
    }
    while ((fs = getmntent_r(fp, &ent, (char*)&buf, buflen )) != NULL)
    print_mount(fs);

    endmntent(fp);
}

void print_mount(const struct mntent *fs)
{
    printf("%s %s %s %s %d %d\n",
        fs->mnt_fsname,
        fs->mnt_dir,
        fs->mnt_type,
        fs->mnt_opts,
        fs->mnt_freq,
        fs->mnt_passno);
}

[2] Reproduction steps

a) ensure docker image runs on overlay/overlay2 and /proc/mounts has >1024byte entry for /

[root@63f3de511973 ~]# grep overlay /proc/mounts | wc -c
1069

b) run the following

$ gcc -o getmntent_r_tester getmntent_r_tester.c

$ ./getmntent_r_tester /tmp/foo # no output up to Alpine 3.5.2

$ ./getmntent_r_tester /tmp/foo # works on glibc
$ ./getmnt_r_tester /tmp/foo 
overlay / overlay rw,seclabel,relatime,lowerdir=/var/lib/docker/overlay2/l/LXC3B4627E6FDKGKI4VWHNAPXH:/var/lib/docker/overlay2/l/3UBPKMSJ4T4QF7RI3UW7DOH3GK:/var/lib/docker/overlay2/l/TEF22JJA3JGGYCAJOREYB5CY7S:/var/lib/docker/overlay2/l/P6ZYS7MKMVJGPULKQUOUWFCJJ6:/var/lib/docker/overlay2/l/GCYZ4K6KXOPQMZ7X3NMQAJBC2L:/var/lib/docker/overlay2/l/M4UZQLFSMDR2XRXYV575UVV7YH:/var/lib/docker/overlay2/l/WSU2NX6Q5XRNHZ3SIOYPRY25NU:/var/lib/docker/overlay2/l/F2UGDBPBK7V4KMNESX4EJ6J5I4:/var/lib/docker/overlay2/l/SXMXPXTDPIGRRYKPAUQT2D5BHQ:/var/lib/docker/overlay2/l/UW5HJNVFBO2UC2OLMYATB6YDIF:/var/lib/docker/overlay2/l/NX6BU7PB744YK3CVSV2VFDM2N6:/var/lib/docker/overlay2/l/MTYPKNDUKDBAOTLBBTZOR75D6Z:/var/lib/docker/overlay2/l/TDU3ZWOTETCVFJDJUXOMYK557E:/var/lib/docker/overlay2/l/6QTKWHXJBHTS6ALQUWTZKAEFOX:/var/lib/docker/overlay2/l/TDFCD6VKPUNVCJYFT7A2Z6MBT2,upperdir=/var/lib/docker/overlay2/2db15568aae60cc82a9fd52b15ff4570f2d835a72d1538265e6f8d431c3fd977/diff,workdir=/var/lib/docker/overlay2/2db15568aae60cc82a9fd52b15ff 0 0
Contributor

dliappis commented Apr 3, 2017

  1. It's easy to confirm if the musl version included in Alpine contains the old getmntent_r patch mentioned earlier; if overlay / in /proc/mounts exceeds the 256 byte limit, df won't show anything.
    Both our old alpine:3.4.4-r12 and new alpine:3.5.5 bases work fine in that respect.

  2. I built a test image of elasticsearch 5.3.0 based on centos:7 (docker.elastic.co/elasticsearch/elasticsearch:5.3.0-centos) and Elasticsearch starts fine from a customized image with numerous layers added, using overlayfs. Therefore @jarpy 's hunch about the length of the overlay / line in proc/mounts exceeding 1024 is right.

  3. The jvm bug 8165852 that I was able to reproduce regardless of glibc/musl is not related to the ES start failures we are seeing here. If ES where to check for the existence of a /tmp/foo file it would fail due to that too. As per the issue:

    the parent of a path component has a different device ID than that of the current path component and so the loop of the upwards traversal towards the root is broken out of but there is no entry for the current path component in /proc/mounts.

  4. There is a problem when calling getmtnent_r with a 1024B buffer on Alpine. This is the buffer size used by the JVM JNI interface and on Alpine this returns an error[1][2], whereas on glibc it returns a mount entry truncated to 1024 bytes[2].

  5. The error in 4. is propagated up to the getMountEntries method in the JVM, which in turn propagates back to findMountEntry() method that ES uses during bootstrap to inspect the dir /usr/share/elasticsearch/data/nodes/0/indices

  6. And finally 5. of courses raises Mount point not found

Next steps

  1. File a bug for getmntent_r on Alpine/musl. The code in [1] and steps in [2] can be used to reproduce what JNI does.

  2. This bug doesn't affect running docker.elastic.co/elasticsearch/elasticsearch:5.3.0 as is with the overlay/overlay2 storage driver, unless, through customization, the entry for overlay / in /proc/mounts exceeds 1024bytes.

  3. Elasticsearch 5.4.0 will be based on a glibc based Docker image so this won't be an issue.

  4. It is a bit strange that java calls to JNI for getmntent get redirected to getmntent_r using a 1024 buffer instead of just plainly using getmntent. That could be worth of an openjdk report too.

[1]

$ cat getmntent_r_tester.c 
#include <stdio.h>
#include <mntent.h>

void print_mount(const struct mntent *fs);

int main(int argc, char **argv)
{
    FILE *fp;
    struct mntent *fs;
    struct mntent ent;
    char buf[1024];
    int buflen = sizeof(buf);

    fp = setmntent("/proc/mounts", "r");    /* read only */
    if (fp == NULL) {
        printf("Error calling setmntent\n");
        return(1);
    }
    while ((fs = getmntent_r(fp, &ent, (char*)&buf, buflen )) != NULL)
    print_mount(fs);

    endmntent(fp);
}

void print_mount(const struct mntent *fs)
{
    printf("%s %s %s %s %d %d\n",
        fs->mnt_fsname,
        fs->mnt_dir,
        fs->mnt_type,
        fs->mnt_opts,
        fs->mnt_freq,
        fs->mnt_passno);
}

[2] Reproduction steps

a) ensure docker image runs on overlay/overlay2 and /proc/mounts has >1024byte entry for /

[root@63f3de511973 ~]# grep overlay /proc/mounts | wc -c
1069

b) run the following

$ gcc -o getmntent_r_tester getmntent_r_tester.c

$ ./getmntent_r_tester /tmp/foo # no output up to Alpine 3.5.2

$ ./getmntent_r_tester /tmp/foo # works on glibc
$ ./getmnt_r_tester /tmp/foo 
overlay / overlay rw,seclabel,relatime,lowerdir=/var/lib/docker/overlay2/l/LXC3B4627E6FDKGKI4VWHNAPXH:/var/lib/docker/overlay2/l/3UBPKMSJ4T4QF7RI3UW7DOH3GK:/var/lib/docker/overlay2/l/TEF22JJA3JGGYCAJOREYB5CY7S:/var/lib/docker/overlay2/l/P6ZYS7MKMVJGPULKQUOUWFCJJ6:/var/lib/docker/overlay2/l/GCYZ4K6KXOPQMZ7X3NMQAJBC2L:/var/lib/docker/overlay2/l/M4UZQLFSMDR2XRXYV575UVV7YH:/var/lib/docker/overlay2/l/WSU2NX6Q5XRNHZ3SIOYPRY25NU:/var/lib/docker/overlay2/l/F2UGDBPBK7V4KMNESX4EJ6J5I4:/var/lib/docker/overlay2/l/SXMXPXTDPIGRRYKPAUQT2D5BHQ:/var/lib/docker/overlay2/l/UW5HJNVFBO2UC2OLMYATB6YDIF:/var/lib/docker/overlay2/l/NX6BU7PB744YK3CVSV2VFDM2N6:/var/lib/docker/overlay2/l/MTYPKNDUKDBAOTLBBTZOR75D6Z:/var/lib/docker/overlay2/l/TDU3ZWOTETCVFJDJUXOMYK557E:/var/lib/docker/overlay2/l/6QTKWHXJBHTS6ALQUWTZKAEFOX:/var/lib/docker/overlay2/l/TDFCD6VKPUNVCJYFT7A2Z6MBT2,upperdir=/var/lib/docker/overlay2/2db15568aae60cc82a9fd52b15ff4570f2d835a72d1538265e6f8d431c3fd977/diff,workdir=/var/lib/docker/overlay2/2db15568aae60cc82a9fd52b15ff 0 0
@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 3, 2017

Contributor

Many thanks to @danielmitterdorfer for the help here!

Contributor

dliappis commented Apr 3, 2017

Many thanks to @danielmitterdorfer for the help here!

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Apr 5, 2017

Contributor

I opened this alpinelinux issue

Contributor

dliappis commented Apr 5, 2017

I opened this alpinelinux issue

@ncopa

This comment has been minimized.

Show comment
Hide comment
@ncopa

ncopa May 30, 2017

Nice work!

This is actually a bug in openjdk8, which only allocates 1024 byte buffer for getmntent_r. Gnu libc will truncate it, while musl libc will return error.

This is fixed in openjdk8-8.131.11-r1.

ncopa commented May 30, 2017

Nice work!

This is actually a bug in openjdk8, which only allocates 1024 byte buffer for getmntent_r. Gnu libc will truncate it, while musl libc will return error.

This is fixed in openjdk8-8.131.11-r1.

@vinayak12

This comment has been minimized.

Show comment
Hide comment
@vinayak12

vinayak12 Jun 1, 2017

@ncopa how to take jdk fix (openjdk8-8.131.11-r1)?

vinayak12 commented Jun 1, 2017

@ncopa how to take jdk fix (openjdk8-8.131.11-r1)?

@ndegory

This comment has been minimized.

Show comment
Hide comment
@ndegory

ndegory Jun 5, 2017

@vinayak12 this is the version you get when you install openjdk8-jre on alpine:3.6.

ndegory commented Jun 5, 2017

@vinayak12 this is the version you get when you install openjdk8-jre on alpine:3.6.

@olalonde

This comment has been minimized.

Show comment
Hide comment
@olalonde

olalonde Jul 4, 2017

Is there a workaround for this? Having same issue with docker.elastic.co/elasticsearch/elasticsearch:5.3.2

olalonde commented Jul 4, 2017

Is there a workaround for this? Having same issue with docker.elastic.co/elasticsearch/elasticsearch:5.3.2

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Jul 4, 2017

Contributor

Your best bet is to try Elasticsearch 5.4, on the new centos:7 base.

Contributor

jarpy commented Jul 4, 2017

Your best bet is to try Elasticsearch 5.4, on the new centos:7 base.

@olalonde

This comment has been minimized.

Show comment
Hide comment
@olalonde

olalonde Jul 4, 2017

Ah thanks, I will... I had ended up rewriting my Dockerfile to fit in a single massive RUN command :)

olalonde commented Jul 4, 2017

Ah thanks, I will... I had ended up rewriting my Dockerfile to fit in a single massive RUN command :)

@jarpy

This comment has been minimized.

Show comment
Hide comment
@jarpy

jarpy Jul 5, 2017

Contributor

:) I know that pain. Just look at some of the stuff we have to do to save layers!

Contributor

jarpy commented Jul 5, 2017

:) I know that pain. Just look at some of the stuff we have to do to save layers!

@dliappis

This comment has been minimized.

Show comment
Hide comment
@dliappis

dliappis Jul 5, 2017

Contributor

Yes the best option is to use images >5.4 that are built on top of centos:7.

For the ones that need to stick to 5.3, 5.3.3 has been rebuilt on top of the latest base incorprating the openjdk fixes in Alpine mentioned earlier:

$ docker run --rm docker.elastic.co/elasticsearch/elasticsearch:5.3.3 java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (IcedTea 3.4.0) (Alpine 8.131.11-r2)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
Contributor

dliappis commented Jul 5, 2017

Yes the best option is to use images >5.4 that are built on top of centos:7.

For the ones that need to stick to 5.3, 5.3.3 has been rebuilt on top of the latest base incorprating the openjdk fixes in Alpine mentioned earlier:

$ docker run --rm docker.elastic.co/elasticsearch/elasticsearch:5.3.3 java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (IcedTea 3.4.0) (Alpine 8.131.11-r2)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment