Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix FsInfo device deduplication #94744

Merged
merged 9 commits into from Mar 27, 2023

Conversation

joegallo
Copy link
Contributor

@joegallo joegallo commented Mar 24, 2023

#12053 changed the deduplication to be based on path (prior to that it had been by dev). I submit that deduplication based on path is not correct, and that what we actually want is deduplication based on mount.

Consider the following GET _nodes/stats/fs snippet for a single node cluster on my box:

      "fs" : {
        "timestamp" : 1679672983377,
        "total" : {
          "total_in_bytes" : 2000481927168,
          "free_in_bytes" : 1129042354176,
          "available_in_bytes" : 1129042354176
        },
        "data" : [
          {
            "path" : "/Users/joegallo/Desktop/elasticsearch-8.8.0-SNAPSHOT/data1",
            "mount" : "/System/Volumes/Data (/dev/disk1s2)",
            "type" : "apfs",
            "total_in_bytes" : 1000240963584,
            "free_in_bytes" : 564521177088,
            "available_in_bytes" : 564521177088
          },
          {
            "path" : "/Users/joegallo/Desktop/elasticsearch-8.8.0-SNAPSHOT/data2",
            "mount" : "/System/Volumes/Data (/dev/disk1s2)",
            "type" : "apfs",
            "total_in_bytes" : 1000240963584,
            "free_in_bytes" : 564521177088,
            "available_in_bytes" : 564521177088
          }
        ]
      }

The paths are different, as they must be, so no de-duplication is done and the total is 2 terabytes. I assure you, though, there's only one drive attached to my laptop, and it's 1TB. 😉

By way of contrast, with this PR in place, we'd see:

      "fs" : {
        "timestamp" : 1679677503196,
        "total" : {
          "total_in_bytes" : 1000240963584,
          "free_in_bytes" : 560331444224,
          "available_in_bytes" : 560331444224
        },
        "data" : [
          {
            "path" : "/Users/joegallo/Desktop/elasticsearch-8.8.0-SNAPSHOT/data1",
            "mount" : "/System/Volumes/Data (/dev/disk1s2)",
            "type" : "apfs",
            "total_in_bytes" : 1000240963584,
            "free_in_bytes" : 560331444224,
            "available_in_bytes" : 560331444224
          },
          {
            "path" : "/Users/joegallo/Desktop/elasticsearch-8.8.0-SNAPSHOT/data2",
            "mount" : "/System/Volumes/Data (/dev/disk1s2)",
            "type" : "apfs",
            "total_in_bytes" : 1000240963584,
            "free_in_bytes" : 560331444224,
            "available_in_bytes" : 560331444224
          }
        ]
      }

And the total.total_in_bytes reflects the reality that I have just 1TB of storage.

Not entirely that different from #32569 which I closed yesterday.

@joegallo joegallo requested a review from nik9000 March 24, 2023 17:06
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Mar 24, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @joegallo, I've created a changelog YAML for you.

@joegallo
Copy link
Contributor Author

joegallo commented Mar 24, 2023

Note: the splash damage on this is pretty small -- at least as far as I understand, it would only apply when using multiple data paths on a single drive.

Related to #24472 only in the absolutely loosest sense -- I was doing some investigation of that ticket and happened across this behavior. That ticket is about mistaken de-duplication at the cluster level when multiple nodes share an ip address but have independent storage. This PR doesn't help there at all.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it but I don't have enough information to approve it. Sorry. I'm a bit terrified. It really seems right.

@joegallo joegallo requested a review from rjernst March 24, 2023 18:18
@joegallo
Copy link
Contributor Author

joegallo commented Mar 24, 2023

Another note to myself for myself, the deduplication in question for this PR only applies within a single node. The .nodes.fs reported on _cluster/stats is changed, because that's the sum of the individual node .fs.total values (note: this is subject to de-duplication by ip, feel free to read #24472 for more) -- see the following code:

// now do the stats that should be deduped by hardware (implemented by ip deduping)
TransportAddress publishAddress = nodeResponse.nodeInfo().getInfo(TransportInfo.class).address().publishAddress();
final InetAddress inetAddress = publishAddress.address().getAddress();
if (seenAddresses.add(inetAddress) == false) {
continue;
}
if (nodeResponse.nodeStats().getFs() != null) {
this.fs.add(nodeResponse.nodeStats().getFs().getTotal());
}

Coming back to my single node cluster, the _cluster/stats numbers for fs are correct now, too, but the change was in the math at the individual node level, there's nothing different in this PR about the math at the cluster level.

joegallo@galactic:~/Code/elastic/elasticsearch $ curl -s $ES_AUTH -XGET "$ES_URL/_cluster/stats?pretty" | jq .nodes.fs
{
  "total_in_bytes": 1000240963584,
  "free_in_bytes": 562491006976,
  "available_in_bytes": 562491006976
}

1TB, ✅.

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your reasoning seems sound. However, mount is Nullable (I assume because of Windows?). So we should fall back to path if mount is null?

That code for spins hasn't existed in years.
They're not especially nullable as compared to, say, path. All of them
can be null in the case of an FsInfo.Path that represents a total
rather than a concrete entry. Any of them can be null in tests.
@joegallo
Copy link
Contributor Author

So we should fall back to path if mount is null?

I don't think so -- Elasticsearch won't start if the paths are duplicated, so that's one thing, but more than that I just don't think it makes sense to de-duplicate on paths for this purpose.

[2023-03-24T17:05:34,301][ERROR][o.e.b.Elasticsearch      ] [galactic.localdomain] fatal exception while booting Elasticsearchjava.lang.IllegalStateException: path [/Users/joegallo/Desktop/elasticsearch-8.8.0-SNAPSHOT/data1] is duplicated by [/Users/joegallo/Desktop/elasticsearch-8.8.0-SNAPSHOT/data1]
        at org.elasticsearch.server@8.8.0-SNAPSHOT/org.elasticsearch.bootstrap.Security.addFilePermissions(Security.java:245)
        at org.elasticsearch.server@8.8.0-SNAPSHOT/org.elasticsearch.bootstrap.Security.createPermissions(Security.java:178)
        at org.elasticsearch.server@8.8.0-SNAPSHOT/org.elasticsearch.bootstrap.Security.configure(Security.java:125)
        at org.elasticsearch.server@8.8.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch.initPhase2(Elasticsearch.java:188)
        at org.elasticsearch.server@8.8.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:66)

However, mount is Nullable (I assume because of Windows?).

Today has been a github archaeology day, which makes it a good day.

#1622 is the original issue for adding fs stats. It was closed via 0a3c941.

In that original implementation, there were two ways of populating an FsInfo.Path (at the time it was called a FsStats.Info):

  • There was JmxFsProbe which populated path, total, free, and available (but not mount) (link).
  • There was SigarFsProbe which populated all the same things, and mount (plus some fields that we don't have anymore like dev, etc). (link)

So in that world, yes, mount was actually legit @Nullable. #12053 dropped all that, though, and replaced it with more-or-less the current FsProbe implementation in which mount is never null in that way.


"in that way" sure is an interesting way of phrasing this, but sure, it can totally be null, just like path and type can be:

FsInfo.Path p = new FsInfo.Path(); // they're all just null, because of the double use of Path as representing a entry or a total-of-N-entries

I added some commits that drop the @Nullable annotations and tweak some comments.

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great explanation! LGTM

@joegallo
Copy link
Contributor Author

yw! Thanks for the review, this PR has been a lot of fun (not all that much code writing, but the investigation was delightful).

@joegallo joegallo merged commit ea54ab3 into elastic:main Mar 27, 2023
@joegallo joegallo deleted the fix-fs-info-device-deduplication branch March 27, 2023 20:19
@joegallo joegallo removed the v8.7.1 label Mar 28, 2023
joegallo added a commit to joegallo/elasticsearch that referenced this pull request Mar 28, 2023
joegallo added a commit to joegallo/elasticsearch that referenced this pull request Mar 28, 2023
elasticsearchmachine pushed a commit that referenced this pull request Mar 28, 2023
* Refactor _cluster/stats .nodes.fs deduplication (#94780)

* Fix FsInfo device deduplication (#94744)

* Fix _cluster/stats .nodes.fs deduplication (#94798)
elasticsearchmachine pushed a commit that referenced this pull request Mar 28, 2023
* Refactor _cluster/stats .nodes.fs deduplication (#94780)

* Fix FsInfo device deduplication (#94744)

* Fix _cluster/stats .nodes.fs deduplication (#94798)
saarikabhasi pushed a commit to saarikabhasi/elasticsearch that referenced this pull request Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Stats Statistics tracking and retrieval APIs Team:Data Management Meta label for data/management team v7.17.10 v8.7.1 v8.8.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants