
Better error recovery in device mapper #3470

Merged · 3 commits merged into containerd:master on Aug 5, 2019

Conversation

@mxpv (Member) commented on Jul 30, 2019

This PR attempts to mitigate some of the cases described in #3436.

It introduces a faulty state for devices and devmapper device IDs.

A device that fails to create/activate and also fails to roll back (delete) will be marked as faulty in the metadata store (instead of cleaning up the metadata and ending up in an inconsistent state). The snapshotter will try to recreate the device with a different devID on a subsequent call. This allows us to move on and prevents the snapshotter from getting stuck entirely on the first faulty ID.

A faulty device ID means that the snapshotter could not handle it (neither create nor roll back), and it needs manual handling. It'll be marked in the metadata store and won't be reused by the snapshotter until it is addressed and manually marked as ok.
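
A minimal sketch of that flow, using illustrative helper and interface names rather than the exact PR API:

package sketch

import "context"

// pool is a stand-in for the devmapper PoolDevice plus its metadata store;
// the method names here are illustrative, not the exact PR API.
type pool interface {
	CreateDevice(ctx context.Context, name string) error
	ActivateDevice(ctx context.Context, name string) error
	DeleteDevice(ctx context.Context, name string) error
	MarkFaulty(ctx context.Context, name string) error
}

// createThinDevice sketches the recovery flow described above: if create or
// activate fails, try to roll back; if the rollback also fails, mark the
// device as faulty instead of wiping its metadata, so a later call can retry
// with a fresh device ID.
func createThinDevice(ctx context.Context, p pool, name string) (retErr error) {
	defer func() {
		if retErr == nil {
			return
		}
		if delErr := p.DeleteDevice(ctx, name); delErr != nil {
			// Rollback failed too: keep the record and flag it so this ID
			// isn't reused until an operator intervenes.
			_ = p.MarkFaulty(ctx, name)
		}
	}()

	if err := p.CreateDevice(ctx, name); err != nil {
		return err
	}
	return p.ActivateDevice(ctx, name)
}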

@renzhengeek @jiazhiguang Could you please give it a try in your environment?

mxpv added 2 commits July 30, 2019 15:17
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
@theopenlab-ci (bot) commented on Jul 30, 2019

Build succeeded.

activeErr = p.activateDevice(ctx, info)
if activeErr != nil {
	return activeErr
}


Hi,

What if a crash happens in between these steps and the rollback code doesn't get a chance to run?

This possibility also applies to other places that change device or metadata state, right?

@mxpv (Member, Author) replied:

What if a crash happens in between these steps and the rollback code doesn't get a chance to run?

Each device mapper operation is wrapped with a state transition function (have a look at 1 and 2). If we fail to activate a device and a crash happens, so no defer was run, you'll end up with something like this in your metadata database (meaning that the device didn't finish the transition):

DevName=X ID=9 State=Activating Error="failed to activate"
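
A rough, self-contained sketch of such a transition wrapper, assuming a simplified metadata interface (the real helper in pool_device.go differs in details):

package sketch

import "context"

// metadataStore is a simplified stand-in for the snapshotter's bolt-backed
// metadata store.
type metadataStore interface {
	UpdateState(ctx context.Context, name, state, errMsg string) error
}

// transition persists the "in progress" state, runs the devmapper operation,
// and then persists either the success state or the error, so the database
// always reflects the last attempted transition (as in the record above).
func transition(ctx context.Context, store metadataStore, name, tryingState, successState string, op func() error) error {
	if err := store.UpdateState(ctx, name, tryingState, ""); err != nil {
		return err
	}
	if err := op(); err != nil {
		// Best effort: record the failure next to the unfinished state,
		// e.g. State=Activating Error="failed to activate".
		_ = store.UpdateState(ctx, name, tryingState, err.Error())
		return err
	}
	return store.UpdateState(ctx, name, successState, "")
}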

@mxpv (Member, Author) added:

Essentially each operation on a devmapper device is recorded (https://github.com/containerd/containerd/blob/master/snapshots/devmapper/device_info.go#L34), so you can get a clear picture of the device's state and last operation at any time. For example, CreateThinDevice goes through the following states (a sketch of these constants follows the list):

Unknown <-- preallocated metadata info and devID
Creating <-- createDevice
Created
Activating <-- activateDevice
Activated
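
As a sketch, the state constants might look like this (names approximate; device_info.go defines the authoritative set, including the new Faulty state):

package sketch

// DeviceState enumerates the lifecycle states recorded in the metadata store.
type DeviceState int

const (
	Unknown DeviceState = iota // metadata info and devID preallocated
	Creating                   // createDevice in progress
	Created
	Activating // activateDevice in progress
	Activated
	Faulty // create/activate and rollback both failed; needs manual handling
)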


Hi,

thanks for your detailed explanation ;-) Yes, I know that in the crash case the metadata DB correctly reflects the device state before crashing, such as Creating, Activating, etc.

Actually, my question was: in that case, there is no chance for the rollback code to mark the device as faulty, right?

If so, after the machine comes up and containerd gets started again, we still get the "object already exists" error from AddDevice() - 878a320#diff-4132e94f99d36d21b09c446fb20687faL100.

@mxpv (Member, Author) replied:

If you crash right in between create and rollback, then yes, the device won't be marked as Faulty. In this case, what behavior would you expect to see?


If a crash happens, containerd must restart, so devmapper will be reloaded. Could we take that chance to scan the device list and mark as faulty any device whose state is not a final one (Deactivated or Activated)? Or, I feel it's safe to clean such a device up from the metadata and the thin pool completely.


If you crash right in between create and rollback, then yes, the device won't be marked as Faulty. In this case, what behavior would you expect to see?

Hi,

I just came up with this idea. I'd like to know what you think. Thanks.

diff --git a/snapshots/devmapper/pool_device.go b/snapshots/devmapper/pool_device.go
index 22e78c1..4eaab6e 100644
--- a/snapshots/devmapper/pool_device.go
+++ b/snapshots/devmapper/pool_device.go
@@ -150,6 +150,16 @@ func (p *PoolDevice) CreateThinDevice(ctx context.Context, deviceName string, vi
                        retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, info.Name))
                        return
                }
+
+               // If ErrAlreadyExists is returned, the device name is already in the metadata store,
+               // but the upper layer is not aware of it. This may occur when a crash happens and there
+               // was no chance to mark the device faulty. So mark it here and the upper layer will
+               // retry with the same device name.
+               if metaErr == ErrAlreadyExists {
+                       log.G(ctx).Warnf("conflicting thin device %s, mark it faulty", info.Name)
+                       retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, info.Name))
+                       return
+               }
        }()
 
        // Save initial device metadata and allocate new device ID from store
@@ -241,6 +251,12 @@ func (p *PoolDevice) CreateSnapshotDevice(ctx context.Context, deviceName string
                        retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, snapInfo.Name))
                        return
                }
+
+               if metaErr == ErrAlreadyExists {
+                       log.G(ctx).Warnf("conflicting snapshot %s, mark it faulty", snapInfo.Name)
+                       retErr = multierror.Append(retErr, p.metadata.MarkFaulty(ctx, snapInfo.Name))
+                       return
+               }
        }()
 

@mxpv (Member, Author) replied:

I've addressed this use case in #3489. In the case of a crash, the snapshotter will mark devices as faulty after restart. Can you check if it works for you?


I've addressed this use case in #3489. In the case of a crash, the snapshotter will mark devices as faulty after restart. Can you check if it works for you?

Great, thanks!

@renzhengeek

Hmm, this problem is an independent one; it occurred once and seems like it cannot be recovered.

37373c1

devmapper: deferred remove can break consistency

@mxpv (Member, Author) commented on Jul 31, 2019

this problem is an independent one; it occurred once and seems like it cannot be recovered.

This makes sense. devmapper needs more precise control over when to do unmounts, and that should certainly happen before remove. Though I think we don't need the needUmount check, as it should work for all snapshotters.
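
A minimal sketch of the intended ordering, assuming containerd's mount.UnmountAll helper and a hypothetical removeDevice callback (not the exact code from the follow-up change):

package sketch

import (
	"context"
	"fmt"

	"github.com/containerd/containerd/mount"
	"golang.org/x/sys/unix"
)

// removeSnapshot illustrates the ordering discussed above: unmount the
// snapshot's mount point first, and only then remove the devmapper device, so
// a (deferred) remove can never race with an active mount. removeDevice is a
// hypothetical callback standing in for the snapshotter's device removal.
func removeSnapshot(ctx context.Context, mountPath string, removeDevice func(context.Context) error) error {
	// Unmount before touching devmapper; detach lazily if the path is busy.
	if err := mount.UnmountAll(mountPath, unix.MNT_DETACH); err != nil {
		return fmt.Errorf("failed to unmount %s: %w", mountPath, err)
	}
	return removeDevice(ctx)
}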

Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
@theopenlab-ci (bot) commented on Jul 31, 2019

Build succeeded.

@codecov-io commented on Jul 31, 2019

Codecov Report

Merging #3470 into master will decrease coverage by 0.02%.
The diff coverage is 47.29%.


@@            Coverage Diff             @@
##           master    #3470      +/-   ##
==========================================
- Coverage   44.24%   44.22%   -0.03%     
==========================================
  Files         124      124              
  Lines       13732    13780      +48     
==========================================
+ Hits         6076     6094      +18     
- Misses       6725     6747      +22     
- Partials      931      939       +8
Flag | Coverage Δ
#linux | 47.99% <47.29%> (-0.05%) ⬇️
#windows | 39.86% <ø> (ø) ⬆️

Impacted Files | Coverage Δ
snapshots/devmapper/device_info.go | 0% <0%> (ø) ⬆️
snapshots/devmapper/pool_device.go | 55.22% <37.73%> (-6.18%) ⬇️
snapshots/devmapper/metadata.go | 67.34% <78.94%> (+0.93%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a49df98...3741fd8.

@renzhengeek

Though I think we don't need the needUmount check, as it should work for all snapshotters.

Yes, thanks.

@samuelkarp (Member) commented:

A faulty device ID means that the snapshotter could not handle it (neither create nor roll back), and it needs manual handling. It'll be marked in the metadata store and won't be reused by the snapshotter until it is addressed and manually marked as ok.

Do you plan to add some documentation as part of this PR for what a user can do to remediate the faulty device ID and to mark it as ok once it has been remediated? If not, is that planned for a follow-up PR?

@renzhengeek

A faulty device ID means that the snapshotter could not handle it (neither create nor roll back), and it needs manual handling. It'll be marked in the metadata store and won't be reused by the snapshotter until it is addressed and manually marked as ok.

Do you plan to add some documentation as part of this PR for what a user can do to remediate the faulty device ID and to mark it as ok once it has been remediated? If not, is that planned for a follow-up PR?

Once it's correctly marked as faulty, I think it's safe to just delete those devices, which the upper layer is not aware of and which contain user data.

@mxpv (Member, Author) commented on Aug 2, 2019

Do you plan to add some documentation as part of this PR for what a user can do to remediate the faulty device ID and to mark it as ok once it has been remediated? If not, is that planned for a follow-up PR?

@samuelkarp Yes, I plan to update the README in a follow-up PR.

Could we take the chance to scan the device list and mark the device

@renzhengeek SGTM. At init time we can check for devices that are not in a final state and mark them faulty. However, I'd prefer to play it safe and leave any device deletion to the user in order to prevent unexpected situations. Sometimes it's possible to restore a device instead of cleaning it up.
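
A minimal sketch of that init-time scan, using illustrative stand-in types (the actual change landed separately in #3489; no devices are deleted here, cleanup stays with the operator):

package sketch

import "context"

// deviceRecord and deviceWalker are simplified stand-ins for the snapshotter's
// DeviceInfo and metadata store.
type deviceRecord struct {
	Name  string
	Final bool // true for terminal states such as Activated or Deactivated
}

type deviceWalker interface {
	WalkDevices(ctx context.Context, fn func(deviceRecord) error) error
	MarkFaulty(ctx context.Context, name string) error
}

// markStaleDevicesFaulty walks all recorded devices on snapshotter start and
// marks as faulty every device that a crash left in a transient state
// (Creating, Activating, ...), so its ID is never reused automatically.
func markStaleDevicesFaulty(ctx context.Context, meta deviceWalker) error {
	return meta.WalkDevices(ctx, func(d deviceRecord) error {
		if d.Final {
			return nil
		}
		return meta.MarkFaulty(ctx, d.Name)
	})
}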

@crosbymichael (Member) commented:

LGTM

@estesp (Member) left a comment:

LGTM

@estesp merged commit cb46663 into containerd:master on Aug 5, 2019
@estesp (Member) commented on Aug 5, 2019

Should any of this cleanup/err handling get taken back into release/1.2?

@mxpv (Member, Author) commented on Aug 5, 2019

@estesp No, this is devmapper-specific error handling.
