remount volumes on restart #30

kcmannem · 2019-09-03T16:20:55Z

Fixes concourse/concourse#4264

Signed-off-by: Krishna Mannem <kmannem@pivotal.io>

volume/driver/overlay_linux.go

ddadlani · 2019-09-04T15:59:05Z

@xtreme-sameer-vohra and I tried out this PR and it seems to work. This was our process:

Acceptance

1. Run Concourse with one worker using docker-compose.
2. Set a pipeline and watch it run successfully.
3. Run docker restart concourse_worker_1 and re-run the pipeline.

Without this PR, booklit/unit errored out with "task config not found", implying that the booklit volume was empty. We also confirmed on the worker that the volume corresponding to booklit was empty, even though the overlay directory still held all of its contents.

With the PR, the task ran successfully, the volume corresponding to booklit was correctly populated, and cat /proc/mounts showed the corresponding overlay mount on the worker.

Asides

In both the passing and failing cases, we saw check containers failing with "unknown handle" because the ATC was not yet aware that the containers were no longer present on the worker. This resolves itself once the missing_since column in the containers table causes those containers to be GCed, but the error message does not make the cause clear.
To test this case, we needed to disable worker retiring. When running docker restart, the worker transitions to retiring and then back to running. This caused the volumes to be GCed from the DB, but the worker restarted too quickly for the actual volumes to be cleaned up on the worker itself. As a result, the existing volumes on the worker were not reused, and every restart caused volumes to pile up. However, this is specific to using docker restart, as workers do not otherwise transition from retiring back to running.

remount volumes on restart

eaeabd1

Signed-off-by: Krishna Mannem <kmannem@pivotal.io>

kcmannem requested review from xtreme-sameer-vohra and ddadlani September 3, 2019 16:20

kcmannem mentioned this pull request Sep 3, 2019

empty volumes when using overlay on Prod concourse/concourse#4264

Closed

ddadlani self-assigned this Sep 3, 2019

cirocosta reviewed Sep 3, 2019

volume/driver/overlay_linux.go (3 comments, all outdated and resolved)

skipping mount recovery when live dir empty

4f3f483

Signed-off-by: Krishna Mannem <kmannem@pivotal.io>

kcmannem requested a review from cirocosta September 4, 2019 15:05

ddadlani reviewed Sep 4, 2019

volume/driver/overlay_linux.go (2 comments, both resolved; 1 outdated)

kcmannem mentioned this pull request Sep 4, 2019

pass context to baggageclaimclient.StreamIn/Out calls concourse/concourse#4243

Merged

ddadlani merged commit fca4859 into master Sep 4, 2019

ddadlani deleted the fix-mtab branch September 4, 2019 16:24

cirocosta mentioned this pull request Dec 24, 2019

Investigate worker recovery for k8s ungracefully restarting pods out-of-band due to high memory consumption concourse/hush-house#94

Open
