
allocwatcher: don't destroy local allocdir after migration #18108

Merged: 1 commit into main on Aug 2, 2023

Conversation

@tgross (Member) commented Jul 31, 2023

When ephemeral disks are migrated from an allocation on the same node, allocation logs for the previous allocation are lost.

There are two workflows for the best-effort attempt to migrate the allocation data between the old and new allocations. For previous allocations on other clients (the "remote" workflow), we create a local allocdir and download the data from the previous client into it. That data is then moved into the new allocdir and we delete the allocdir of the previous alloc.

For "local" previous allocations we don't need to create an extra directory for the previous allocation and instead move the files directly from one to the other. But we still delete the old allocdir entirely, which includes all the logs!

There doesn't seem to be any reason to destroy the local previous allocdir, as the usual client garbage collection should destroy it later on when needed. By not deleting it, the previous allocation's logs are still available for the user to read.
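For illustration only, here is a rough sketch of the two workflows and of the change. The names are hypothetical and plain os calls stand in for Nomad's actual allocdir helpers; the real change is in the client's allocwatcher.

package allocmigrate

import (
	"os"
	"path/filepath"
)

// dataDir returns the ephemeral disk path inside an allocdir.
func dataDir(allocDir string) string {
	return filepath.Join(allocDir, "alloc", "data")
}

// migrateRemote sketches the "remote" workflow: data from the previous
// client is downloaded into a locally created directory, moved into the
// new allocdir, and the downloaded copy is then deleted.
func migrateRemote(newAllocDir string, download func(dest string) error) error {
	staging, err := os.MkdirTemp("", "prev-alloc")
	if err != nil {
		return err
	}
	defer os.RemoveAll(staging) // deleting the downloaded copy is safe here
	if err := download(staging); err != nil {
		return err
	}
	if err := os.RemoveAll(dataDir(newAllocDir)); err != nil {
		return err
	}
	return os.Rename(filepath.Join(staging, "data"), dataDir(newAllocDir))
}

// migrateLocal sketches the "local" workflow: files are moved directly
// between the two allocdirs on the same node.
func migrateLocal(prevAllocDir, newAllocDir string) error {
	if err := os.RemoveAll(dataDir(newAllocDir)); err != nil {
		return err
	}
	if err := os.Rename(dataDir(prevAllocDir), dataDir(newAllocDir)); err != nil {
		return err
	}
	// Before this patch the previous allocdir was destroyed at this
	// point, roughly:
	//
	//	os.RemoveAll(prevAllocDir)
	//
	// which also deleted <prevAllocDir>/alloc/logs. Skipping that call
	// leaves the logs readable; the client's regular garbage collection
	// reaps the directory later.
	return nil
}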

Fixes: #18034

@tgross (Member, Author) commented Aug 1, 2023

After confirming the behavior in #18034, I tested the patch in this PR with the following job specification.

jobspec
job "example" {

  group "web" {

    network {
      mode = "bridge"
      port "www" {
        to = 8001
      }
    }

    ephemeral_disk {
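      # migrate: best-effort migration of the ephemeral disk data to
      # the replacement allocation
      # sticky: best-effort placement of the replacement on the same node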
      migrate = true
      sticky  = true
    }

    task "httpd" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-vv", "-f", "-p", "8001", "-h", "/local"]
        ports   = ["www"]
      }

      resources {
        cpu    = 128
        memory = 100
      }

    }
  }
}

Run the job:

$ nomad job run ./example.nomad.hcl
...
    2023-08-01T11:32:50-04:00: Allocation "b1461797" created: node "d9f5ad8c", group "web"
...

Write to its logs (the httpd task logs each request, so a curl is enough to generate log output) and to its ephemeral disk, then verify both writes have been made:

$ curl -so /dev/null "http://10.37.105.80:24429/alloc-b1461797"
$ nomad alloc exec b1461797 /bin/sh -c 'echo $NOMAD_ALLOC_ID >> /alloc/data/test.txt'

$ nomad alloc fs b1461797 /alloc/data/test.txt
b1461797-a661-d640-f8b8-b6e0e4b91d2e

$ nomad alloc logs -stderr b1461797
[::ffff:10.37.105.1]:52584: url:/alloc-b1461797
[::ffff:10.37.105.1]:52584: response:404

Update the job and run again:

$ nomad job run ./example.nomad.hcl
...
    2023-08-01T11:34:47-04:00: Allocation "525dd5d2" created: node "d9f5ad8c", group "web"
...

Verify that the logs for the previous allocation still exist, that the ephemeral disk data has been migrated to the new allocation, and that the data has been moved out of (not copied from) the old allocation's ephemeral disk, as expected.

$ nomad alloc logs -stderr b1461797
[::ffff:10.37.105.1]:52584: url:/alloc-b1461797
[::ffff:10.37.105.1]:52584: response:404

$ nomad alloc fs b1461797 /alloc/data/test.txt
Unexpected response code: 404 (rpc error: stat /var/nomad/data/alloc/b1461797-a661-d640-f8b8-b6e0e4b91d2e/alloc/data/test.txt: no such file or directory)

$ nomad alloc fs 525dd5d2 /alloc/data/test.txt
b1461797-a661-d640-f8b8-b6e0e4b91d2e

@shoenig (Member) left a comment
LGTM!

tgross merged commit 8ad663d into main on Aug 2, 2023 (25 checks passed)
tgross deleted the migrate-preserve-logs branch on Aug 2, 2023
tgross added the backport/1.4.x, backport/1.5.x, and backport/1.6.x labels on Aug 2, 2023
tgross added a commit that referenced this pull request on Aug 2, 2023 (same commit message as the PR description above)
@tgross (Member, Author) commented Aug 2, 2023

The backport assistant failed at random for the 1.4.x release branch, so I've backported that one by hand.

Labels: backport/1.4.x, backport/1.5.x, backport/1.6.x, theme/client, theme/data migration, theme/logging, type/bug

Closes: Allocation Logs Immediately Removed on Allocation Replacement (#18034)