Want ability to take a crucible dataset out of provisioning pool #3480

Open
askfongjojo opened this issue Jul 3, 2023 · 4 comments

@askfongjojo

Similar to #2483 (the ability to take a sled out of the provisioning pool), there are multiple circumstances (e.g. physical disk failure, crucible-agent failing to come up #3416) that cause a crucible dataset to be non-functional. Regardless of whether the logic for picking a crucible dataset is fully randomized, we need a support tool to mark a crucible dataset as "bad".

The tool (probably an API) will be used by Oxide support initially. Eventually, we'd want a heartbeat mechanism in place to decide when to take a crucible dataset out of the provisioning pool or put it back in.
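
To make the request concrete, here is a minimal sketch of the behavior being asked for, not the actual Omicron implementation. Every name in it (`Dataset`, `provisionable`, `set_provisionable`, `pick_datasets`, `apply_heartbeats`) is hypothetical and chosen for illustration only. It assumes a per-dataset flag that the allocation logic consults, a manual toggle for support, and a later heartbeat-driven policy that sets the same flag.

```python
import random
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical record; field names are illustrative, not a real schema.
    id: str
    sled_id: str
    provisionable: bool = True
    last_heartbeat: float = 0.0  # seconds, on whatever clock the caller uses


def set_provisionable(datasets, dataset_id, provisionable):
    """Support-facing toggle: take one dataset out of (or back into) the pool."""
    for d in datasets:
        if d.id == dataset_id:
            d.provisionable = provisionable
            return d
    raise KeyError(dataset_id)


def pick_datasets(datasets, count, rng=None):
    """Randomly pick `count` distinct datasets, skipping any marked out of the pool."""
    rng = rng or random.Random()
    candidates = [d for d in datasets if d.provisionable]
    if len(candidates) < count:
        raise ValueError("not enough provisionable datasets")
    return rng.sample(candidates, count)


def apply_heartbeats(datasets, now, timeout):
    """Assumed eventual automation: a dataset is provisionable only if it
    reported a heartbeat within the last `timeout` seconds.
    Note that this simple version overwrites any manual mark."""
    for d in datasets:
        d.provisionable = (now - d.last_heartbeat) <= timeout
```

The point of the sketch is that the filter applies no matter how the pick is made: even a fully randomized selection only draws from datasets that are still in the pool.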

@askfongjojo askfongjojo added this to the MVP milestone Jul 3, 2023
@askfongjojo
Author

This API can also be triggered optionally by the "take sled out of provisioning pool" feature, since taking a sled out will mean not provisioning to its disks in most cases.
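
A rough sketch of that optional cascade, under the assumption that sleds and datasets each carry their own flag. The function name `set_sled_provisionable`, the `include_datasets` switch, and the data shapes are all hypothetical; nothing here reflects the real API.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical record; field names are illustrative only.
    id: str
    sled_id: str
    provisionable: bool = True


def set_sled_provisionable(sleds, datasets, sled_id, provisionable,
                           include_datasets=True):
    """Toggle a sled and, optionally, every dataset that lives on it.

    `sleds` maps sled id -> provisionable flag (an assumed representation).
    Returns the ids of datasets whose flag actually changed.
    """
    if sled_id not in sleds:
        raise KeyError(sled_id)
    sleds[sled_id] = provisionable

    changed = []
    if include_datasets:
        for d in datasets:
            if d.sled_id == sled_id and d.provisionable != provisionable:
                d.provisionable = provisionable
                changed.append(d.id)
    return changed
```

Keeping the cascade optional leaves room for the cases where a sled should stop taking new instances while its disks remain usable.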

@askfongjojo
Author

Volume/region background clean-up also needs to skip crucible zones that are already removed - #4331.

@morlandi7 morlandi7 modified the milestones: MVP, 5 Nov 7, 2023
@morlandi7 morlandi7 modified the milestones: 5, 6 Dec 5, 2023
@morlandi7 morlandi7 added the known issue To include in customer documentation and training label Dec 18, 2023
@morlandi7 morlandi7 modified the milestones: 6, 7 Jan 25, 2024
@askfongjojo askfongjojo removed the known issue To include in customer documentation and training label Mar 9, 2024
@morlandi7 morlandi7 modified the milestones: 7, 8 Mar 12, 2024
@davepacheco
Collaborator

@sunshowers I think this issue was resolved as of #5032. Is that right?

@davepacheco
Collaborator

Answering my own question: we discussed this briefly in today's update call. This ticket is different from #5032 in two ways: it covers not just an underlying database facility, but an API and tool that can be used by support (at least) to control this behavior; and it covers dataset-level granularity, whereas #5032 only covers sled-level.

@morlandi7 morlandi7 modified the milestones: 8, 9 May 13, 2024