
es.default.path.data conflicts with path.data when path.data is a list #6887

Closed
mattrobenolt opened this issue Jul 16, 2014 · 2 comments · Fixed by #8381
Labels
>bug :Core/Infra/Settings Settings infrastructure and APIs help wanted adoptme

Comments

@mattrobenolt

Version 1.2.2

An Elasticsearch instance is running with the following arguments:

/usr/lib/jvm/java-7-oracle/bin/java -Xms15g -Xmx15g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.pidfile=/var/run/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.2.2.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch

es.default.path.data is coming from the deb package's init.d script.

Inside elasticsearch.yml, if we declare path.data as a list, we end up with 3 data directories according to ES.

path:
  data:
    - /var/lib/elasticsearch/data1
    - /var/lib/elasticsearch/data2

When ES runs, it now thinks there are 3 data directories: the one set as the default through the CLI arguments, plus the 2 new ones added in the config.
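
To make the merge concrete, here's a minimal sketch in plain Java (not the actual Elasticsearch settings code) of how a default path.data string plus numbered keys from a YAML list end up being collected as three separate paths; the map keys mirror what /_nodes reports further down.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PathDataMerge {
    public static void main(String[] args) {
        // Flat settings keys, mirroring what the node reports under /_nodes.
        Map<String, String> settings = new LinkedHashMap<String, String>();

        // From -Des.default.path.data (the deb package's init.d script).
        settings.put("path.data", "/var/lib/elasticsearch");

        // From the YAML list: each element becomes a numbered sibling key
        // (path.data.0, path.data.1) instead of replacing path.data itself.
        settings.put("path.data.0", "/var/lib/elasticsearch/data1");
        settings.put("path.data.1", "/var/lib/elasticsearch/data2");

        // Collecting every key that starts with "path.data" yields three paths,
        // which matches the three fs.data entries shown below.
        List<String> dataPaths = new ArrayList<String>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (e.getKey().equals("path.data") || e.getKey().startsWith("path.data.")) {
                dataPaths.add(e.getValue());
            }
        }
        System.out.println(dataPaths);
        // [/var/lib/elasticsearch, /var/lib/elasticsearch/data1, /var/lib/elasticsearch/data2]
    }
}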

According to /_nodes/stats?all=true, we have this (snipped to relevant parts):

{
    "nodes": {
        "FOSlksI8ToK6G4YdIylYvQ": {
            "fs": {
                "data": [{
                    "path": "/var/lib/elasticsearch/home/nodes/0",
                    "mount": "/",
                    "dev": "/dev/sda6",
                    "total_in_bytes": 982560202752,
                    "free_in_bytes": 860723675136,
                    "available_in_bytes": 810788864000,
                    "disk_reads": 12402994,
                    "disk_writes": 14437097,
                    "disk_io_op": 26840091,
                    "disk_read_size_in_bytes": 364877214720,
                    "disk_write_size_in_bytes": 1342705430528,
                    "disk_io_size_in_bytes": 1707582645248,
                    "disk_queue": "2.4",
                    "disk_service_time": "2.1"
                }, {
                    "path": "/var/lib/elasticsearch/data1/home/nodes/0",
                    "mount": "/var/lib/elasticsearch/data1",
                    "dev": "/dev/sdc1",
                    "total_in_bytes": 787239469056,
                    "free_in_bytes": 787167297536,
                    "available_in_bytes": 747154268160,
                    "disk_reads": 691,
                    "disk_writes": 49627,
                    "disk_io_op": 50318,
                    "disk_read_size_in_bytes": 2851840,
                    "disk_write_size_in_bytes": 12656103424,
                    "disk_io_size_in_bytes": 12658955264,
                    "disk_queue": "2.4",
                    "disk_service_time": "2.1"
                }, {
                    "path": "/var/lib/elasticsearch/data2/home/nodes/0",
                    "mount": "/var/lib/elasticsearch/data2",
                    "dev": "/dev/sdd1",
                    "total_in_bytes": 787239469056,
                    "free_in_bytes": 787142717440,
                    "available_in_bytes": 747129688064,
                    "disk_reads": 9461,
                    "disk_writes": 883718,
                    "disk_io_op": 893179,
                    "disk_read_size_in_bytes": 39285760,
                    "disk_write_size_in_bytes": 174312841216,
                    "disk_io_size_in_bytes": 174352126976,
                    "disk_queue": "2.4",
                    "disk_service_time": "2.1"
                }]
            },
        }
    }
}

And if we look at /_nodes we see what the raw settings are:

{
    "nodes": {
        "FOSlksI8ToK6G4YdIylYvQ": {
            "settings": {
                "path": {
                    "data": "/var/lib/elasticsearch",
                    "work": "/tmp/elasticsearch",
                    "home": "/usr/share/elasticsearch",
                    "conf": "/etc/elasticsearch",
                    "logs": "/var/log/elasticsearch",
                    "data.0": "/var/lib/elasticsearch/data1",
                    "data.1": "/var/lib/elasticsearch/data2"
                }
            }
        }
    }
}

So in this case, the settings show 3 different data keys (data, data.0, and data.1), and ES sees 3 different data directories.

Now, if we instead declare path.data in the YAML as a comma-separated string, we get the correct behavior: it overrides the default.

path:
    data: /var/lib/elasticsearch/data1,/var/lib/elasticsearch/data2
                "data": [{
                    "path": "/var/lib/elasticsearch/data1/home/nodes/0",
                    "mount": "/var/lib/elasticsearch/data1",
                    "dev": "/dev/sdc1",
                    "total_in_bytes": 787239469056,
                    "free_in_bytes": 787167391744,
                    "available_in_bytes": 747154362368,
                    "disk_reads": 687,
                    "disk_writes": 49803,
                    "disk_io_op": 50490,
                    "disk_read_size_in_bytes": 2835456,
                    "disk_write_size_in_bytes": 12657053696,
                    "disk_io_size_in_bytes": 12659889152,
                    "disk_queue": "0",
                    "disk_service_time": "0"
                }, {
                    "path": "/var/lib/elasticsearch/data2/home/nodes/0",
                    "mount": "/var/lib/elasticsearch/data2",
                    "dev": "/dev/sdd1",
                    "total_in_bytes": 787239469056,
                    "free_in_bytes": 787167391744,
                    "available_in_bytes": 747154362368,
                    "disk_reads": 3436,
                    "disk_writes": 684613,
                    "disk_io_op": 688049,
                    "disk_read_size_in_bytes": 14308352,
                    "disk_write_size_in_bytes": 132454891520,
                    "disk_io_size_in_bytes": 132469199872,
                    "disk_queue": "0",
                    "disk_service_time": "0"
                }]
                "path": {
                    "data": "/var/lib/elasticsearch/data1,/var/lib/elasticsearch/data2",
                    "work": "/tmp/elasticsearch",
                    "home": "/usr/share/elasticsearch",
                    "conf": "/etc/elasticsearch",
                    "logs": "/var/log/elasticsearch"
                },

So the behavior we're seeing is that the data list is flattened and then appended to the default, whereas I'd expect the new list to override the default.

I feel that the problem here is that data can be declared both as a string and as a list. It may be more correct if data were always a list internally, and when it's declared as a string it's coerced into a list, even if that list has only one item. But I'm not proposing that as the solution, just a thought.
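
For illustration only, here's a rough sketch of that idea, using a hypothetical normalize() helper (not an existing Elasticsearch API): whatever form path.data is declared in, it becomes a list, and an explicit value replaces the default instead of being appended to it.

import java.util.Arrays;
import java.util.List;

public class PathDataNormalize {
    // Hypothetical helper: accept either form and always return a list, so a
    // single string, a comma-separated string, and a YAML list all look the same.
    static List<String> normalize(Object pathData) {
        if (pathData instanceof String) {
            // "a,b" -> [a, b]; a single path becomes a one-element list.
            return Arrays.asList(((String) pathData).split(","));
        } else if (pathData instanceof List) {
            @SuppressWarnings("unchecked")
            List<String> paths = (List<String>) pathData;
            return paths;
        }
        throw new IllegalArgumentException("path.data must be a string or a list");
    }

    public static void main(String[] args) {
        List<String> fromDefault = normalize("/var/lib/elasticsearch");
        List<String> fromYaml = normalize(Arrays.asList(
                "/var/lib/elasticsearch/data1", "/var/lib/elasticsearch/data2"));

        // With both values normalized to lists, an explicit setting can simply
        // replace the default rather than being appended to it.
        List<String> effective = fromYaml.isEmpty() ? fromDefault : fromYaml;
        System.out.println(effective);
        // [/var/lib/elasticsearch/data1, /var/lib/elasticsearch/data2]
    }
}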

I understand that this may not necessarily be a bug, but it is extremely unexpected behavior and took quite a bit of debugging to track down exactly what was going on.

@clintongormley

I agree about unexpected! Thanks for reporting this.

@mattrobenolt
Author

@clintongormley Let me know if there's anything else I can do to help, but I assume this is easily reproduced. :)
