RFC Deletion Process #57

Merged
merged 40 commits into from Mar 15, 2019

Member

@Bento007 Bento007 commented Jan 11, 2019

Last call Feb. 22, 2019
March 15: Last call for oversight review

@brianraymor
Collaborator

@Bento007 - is this RFC in progress and not ready for community review?

rfcs/text/0000-dss-deletion-daemon.md (outdated)
* This introduces the ability to permanently destroy data from the DSS.
* There is no limit on the amount of data that can be rapidly permanently deleted.
* Permanent deletion does not occur until after the grace period has elapsed. This could
be problematic if files need to be removed sooner.

The wranglers say that, based on other archiving services, the general expectation for consent issues is complete removal within 24-48 hours. They are going to discuss this further at the wrangler F2F on Mon 4th Feb. For redaction not linked to consent, it's okay to take weeks.

Member

Based on the presentations from Dave Bernick, from a FISMA compliance perspective the speed of deletion doesn't need to be so quick. That being said, from a reputational perspective (and considering any physical deletion request will need to navigate a certain amount of bureaucracy before the technical request is made), the final step taking 24 to 48 hrs seems like a good aim.
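
To make the grace-period discussion concrete, here is a minimal sketch of the eligibility check a deletion daemon might run before the final physical step. The function name, the `GRACE_PERIOD` constant, and the 48-hour value are assumptions for illustration (48 hours is just the upper end of the range discussed above); none of this is taken from the RFC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical grace period; the RFC does not fix a value in this thread.
GRACE_PERIOD = timedelta(hours=48)


def is_eligible_for_physical_deletion(requested_at: datetime,
                                      now: Optional[datetime] = None,
                                      grace_period: timedelta = GRACE_PERIOD) -> bool:
    """Return True once the grace period since the deletion request has elapsed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - requested_at >= grace_period
```

A shorter `grace_period` could be passed for consent-related requests if files need to be removed sooner.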

Contributor

@mweiden mweiden left a comment

Meant for my last review to be a "Request changes"


|Bundle.Version|admin|reason|AWS Deletion Markers (key,ID)|GCP Previous Generations (key, previous generation)|
|--------------|-----|------|--------------------|------------------------|
|1234-2134-3145|admin@email.com|consent|(file.obj, 1234), ...| (file.obj, 4321), ...|
Member

Do I understand correctly that the deletion table is an implementation detail of the DSS, internal to DSS and not exposed in the API? I think it's OK and appropriate to list the details of it here, as long as that fact is called out. More generally, throughout this RFC, more clarity is necessary to separate the background information, the API for the deletion actions, and the implementation details of the deletion service infrastructure.
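
For reference, one row of the deletion table quoted above could be modeled roughly as follows. This is only a sketch of an internal record, assuming the five columns shown in the table; the class and field names are hypothetical and nothing here is meant as part of the public API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DeletionRecord:
    """Hypothetical internal record mirroring one row of the deletion table."""
    bundle_version: str   # "Bundle.Version" column
    admin: str            # who requested the deletion
    reason: str           # e.g. "consent"
    # (key, deletion marker ID) pairs for AWS
    aws_deletion_markers: List[Tuple[str, str]] = field(default_factory=list)
    # (key, previous generation) pairs for GCP
    gcp_previous_generations: List[Tuple[str, str]] = field(default_factory=list)


# The example row from the table above.
record = DeletionRecord(
    bundle_version="1234-2134-3145",
    admin="admin@email.com",
    reason="consent",
    aws_deletion_markers=[("file.obj", "1234")],
    gcp_previous_generations=[("file.obj", "4321")],
)
```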

### Drawbacks and Limitations [optional]

* This introduces the ability to permanently destroy data from the DSS.
* There is no limit on the amount of data that can be rapidly permanently deleted.
Member

This can be addressed by a suite of safety checks that could throttle deletions in the deletion daemon based on size delta, object count delta, etc. and send alerts to the DCP operators.
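
A rough sketch of what such a safety check could look like, assuming the daemon processes deletions in batches. The thresholds, the `DeletionBatch` type, and the `delete`/`alert` callbacks are all hypothetical placeholders, not part of the RFC or the DSS codebase.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical per-run limits; real values would be chosen by the DCP operators.
MAX_OBJECT_COUNT_DELTA = 1000
MAX_SIZE_DELTA_BYTES = 10 * 1024 ** 3  # 10 GiB


@dataclass
class DeletionBatch:
    object_count: int  # number of objects the daemon is about to delete
    size_bytes: int    # total size of those objects


def safety_violations(batch: DeletionBatch,
                      max_objects: int = MAX_OBJECT_COUNT_DELTA,
                      max_bytes: int = MAX_SIZE_DELTA_BYTES) -> List[str]:
    """Return human-readable violations; an empty list means the batch is within limits."""
    violations = []
    if batch.object_count > max_objects:
        violations.append(f"object count delta {batch.object_count} exceeds {max_objects}")
    if batch.size_bytes > max_bytes:
        violations.append(f"size delta {batch.size_bytes} exceeds {max_bytes}")
    return violations


def run_batch(batch: DeletionBatch,
              delete: Callable[[DeletionBatch], None],
              alert: Callable[[List[str]], None]) -> bool:
    """Delete the batch only if it passes the checks; otherwise alert and skip it."""
    violations = safety_violations(batch)
    if violations:
        alert(violations)  # e.g. notify the operators instead of deleting
        return False
    delete(batch)
    return True
```

Holding back an oversized batch and alerting, rather than deleting part of it, keeps the decision with an operator; a real daemon might instead split the batch across runs.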

Member

@kislyuk kislyuk left a comment

I don't think we should strive to build a system that relies on data wranglers passing ad hoc information to service operators, and service operators running maintenance scripts, for an operation as core to the data lifecycle as deletion. Such a system is error-prone, non-scalable, and stifles data operations by keeping data management tools out of the hands of data operators.

More generally, I think we should re-examine our architectural assumptions and principles here, and consider how this RFC corresponds to the core DCP principle of federated services with boundaries defined by public, documented HTTP APIs. This principle is key to our agility.

### Tombstones

Tombstones are markers placed in the DSS to indicate that previously existing data has been removed. Tombstones exist for
files and bundles stored in the DSS. Tombstones come in two varieties, versioned and unversioned tombstones. A
Contributor

"Tombstones exist for files and bundles stored in the DSS." This seems to contradict the earlier point about there only being physical deletion for files.

A user must have explicit permission to perform a **logical deletion** of a bundle. For a **physical deletion** of a
bundle, the user must have explicit permission to **physically delete** files and bundles.
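
As an illustration of the permission rule above, a check might look like the following sketch. The permission strings and function name are hypothetical, and the sketch assumes "explicit" means that holding one permission never implies the other.

```python
from enum import Enum
from typing import Set


class DeletionKind(Enum):
    LOGICAL = "logical"
    PHYSICAL = "physical"


# Hypothetical permission names; the RFC text quoted here does not define them.
REQUIRED_PERMISSION = {
    DeletionKind.LOGICAL: "bundle:logical-delete",
    DeletionKind.PHYSICAL: "bundle:physical-delete",
}


def can_delete(user_permissions: Set[str], kind: DeletionKind) -> bool:
    """Allow the deletion only if the user explicitly holds the matching permission."""
    return REQUIRED_PERMISSION[kind] in user_permissions
```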

(!) The deletion of a bundle does not handle the deletion of secondary analysis bundles referencing this bundle via the

@parthshahva you are considering this for redaction design?

@Bento007 Bento007 merged commit 6acb3fb into master Mar 15, 2019
@mweiden mweiden deleted the tsmith-deletion branch March 15, 2019 19:41
Bento007 added a commit that referenced this pull request Mar 15, 2019