Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC0011: Entry invalidation #9

Open
wants to merge 11 commits into
base: master
from

Conversation

Projects
None yet
2 participants
@arnau
Copy link
Collaborator

arnau commented Mar 1, 2018

Context

The append-only nature of Registers doesn't cope well with
mistakes. A situation that has arisen more than once is getting two records
about the same thing with different identifiers because a human mistake.
Another situation is having a change in the history of a record that is wrong.

We need a way to mark entries as invalid so we can keep compatibility with
other copies of the Register (consumers, replicas, etc) and allow computing
the correct(ed) list of records.

Guidance to review

This RFC is still in progress. Right now it suggests two alternative ways to approach the solution with different tradeoffs. This PR should be used as a discussion point to pick the best one.

This RFC is the attempt to solve the root problem of #3

arnau added some commits Feb 12, 2018

Start entry invalidation RFC
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Add entry sketch
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
More sketching
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Add alternative approach
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
List of properties of the solution
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Rename reserved key to chore:revoked-entries
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Add downside for alternative approach
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Clarify alternative proposals
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Add CSV example
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Reformat examples
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>
Tidy up proposal A
Signed-off-by: Arnau Siches <arnau.siches@digital.cabinet-office.gov.uk>

@arnau arnau requested a review from michaelabenyohai Mar 1, 2018

@arnau arnau added rfc wip labels Mar 1, 2018

@MatMoore
Copy link

MatMoore left a comment

Of the two options I prefer the first one, but the more I think about this the more I dislike this feature.

It seems like the advantage of doing this is mostly about correcting the history of the register, and I don't think that nicer historical data is worth focusing on right now, given the complexity it would add.

Maybe this could work as a third option:

  1. Allow custodians to mark an entry as a correction. If there exists a correction entry with an entry number greater than the latest entry in a copy of a register, then the copy should be incrementally updated to at least the entry number of the latest correction
  2. Allow custodians to completely remove a key from a register as a new entry (for example by setting all of its field to null). Removing a key must only be done as a correction.
#### RSF

```
invalidate-entry user 2018-02-12T10:11:12Z 3;234;355

This comment has been minimized.

Copy link
@MatMoore

MatMoore Aug 6, 2018

invalidate-entry doesn't match the naming in the JSON representation below

  • invalidate vs revoke
  • entry vs entries
* Entries are consumed regularly (no breaking changes).
* This approach uses reserved key in the user space. Requires avoiding clashes
with user data.
* Knowledge of the reserved key is required to filter it out when collecting

This comment has been minimized.

Copy link
@MatMoore

MatMoore Aug 6, 2018

Given users are already confused by the API including archived data, I don't think it's reasonable to expect them to filter out special keys as well.

For me this is the difference between being able to quickly put together a script that uses the API and just giving up and finding a client library to do it for me.

* Revoked entries require a mechanism/documentation to explain how to apply
them when computing the list of records.
* Entry proofs are kept intact and invalidations are part of the tree.

This comment has been minimized.

Copy link
@MatMoore

MatMoore Aug 6, 2018

I think we should be careful that this feature doesn't get used excessively by custodians, as it could make data generally unavailable even though it's still part of the log. For example, if a CURIE points to a revoked record, then it's basically a dead link, and we should avoid this happening.

I think we should define the cases where we expect this to be used, and require the custodian to express why a record has been revoked as part of the entry. This would help users understand what's going on and prevent people from jumping to conclusions about government censoring data or whatever.

Valid reasons for revoking stuff:

  • A new key was added to the register by mistake
  • The custodian made an error and the item for an entry has incorrect values
  • The custodian entered information in multiple steps so that items are missing values that alter its meaning (for example, a full name should have been changed but only the first name field was changed)
  • The information contained in an entry was thought to be correct but later turned out to be inaccurate

Reasons why the custodian might want to revoke stuff, but they shouldn't:

  • The record is no longer applicable (use start-date and end-date instead)
  • Personal information needs to be removed for legal reasons (redact information instead)

This comment has been minimized.

Copy link
@arnau

arnau Aug 7, 2018

Author Collaborator

Thanks for the thorough review, I have to re-read what I wrote. It's been a while. In any case there is another case (which is why this is important sooner rather than latter):

An entry introduces a record under a new key but it should've been an update to an existing key.

That means the entry needs to be marked as "ignore this one, mistake", a start/end date is the wrong level of abstraction to flag this. It's not about the data, it's about the metadata.

In any case, I'll revisit this soon and reply to your comments.

@arnau arnau changed the title Entry invalidation RFC0011: Entry invalidation Aug 9, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.