GDPR-and-friends-compliant delete processes #39

Open
saemideluxe opened this issue Mar 23, 2021 · 2 comments

@saemideluxe
Member

saemideluxe commented Mar 23, 2021

This is coming up as a requirement in all database systems, and I think we would benefit strongly from having it as a well-implemented feature. The core implementation should be able to detect, display and remove all data related to an object. In theory we can rely heavily on the cascade deletion feature of the database and only need to delete associated files and entries in the revision tables on top of that. In practice we will need to make this very configurable in order to allow for handling anonymized and business-relevant data.

One idea could be to simply blank and/or replace all fields of all related data and delete associated files.
Other ideas are welcome.

Questions which need to be addressed:

  • Deleting relevant information
    • Approach: Deleting database objects or blanking all fields?
    • How to handle revision entries?
    • How to handle file removal? (This might be trickier than it seems)
    • How are backups handled?
  • Keeping business-relevant/legally required data
    • Needs a configuration system to specify data which needs to be preserved.
    • Possible solutions: Clearing fields, having separate "statistics/history" models
  • Automated removal of records/triggers
    • Necessary at all? Or should all things be deleted manually (maybe with notifications)?
  • Related issues:
    • Should processes for data extraction/personal record be considered? Like when a person wants to know "What information about me do you have stored in your system?"
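
A minimal sketch of the "detect and display" step, which would also cover the "what do you have stored about me?" question. It is plain Python rather than Django code, and `Record`, `collect_related` and `export_report` are hypothetical names used only for illustration. The assumption is that each record knows which other records point at it; a visited set keeps circular references from looping forever.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a database row."""
    model: str
    pk: int
    data: dict
    related: list[Record] = field(default_factory=list)  # rows linked to this one


def collect_related(root: Record) -> list[Record]:
    """Return root plus every record reachable from it, each exactly once."""
    seen: set[tuple[str, int]] = set()
    found: list[Record] = []
    stack = [root]
    while stack:
        record = stack.pop()
        key = (record.model, record.pk)
        if key in seen:
            continue  # already visited, avoids endless loops on cycles
        seen.add(key)
        found.append(record)
        stack.extend(record.related)
    return found


def export_report(root: Record) -> dict[str, list[dict]]:
    """Group the collected data by model name, e.g. for a personal-data request."""
    report: dict[str, list[dict]] = {}
    for record in collect_related(root):
        report.setdefault(record.model, []).append(record.data)
    return report
```

The same traversal could feed a preview ("these objects would be affected") before anything is deleted or blanked.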
@saemideluxe saemideluxe changed the title GDPR- and similar compliant delete processes GDPR-and-friend-compliant delete processes Mar 23, 2021
@saemideluxe saemideluxe changed the title GDPR-and-friend-compliant delete processes GDPR-and-friends-compliant delete processes Mar 23, 2021
@frederikbugglin frederikbugglin added the High Priority High priority issues label Sep 6, 2021
@saemideluxe saemideluxe added Medium Priority Medium priority issues and removed High Priority High priority issues labels Oct 11, 2021
@saemideluxe
Member Author

Suggestion for an implementation:

  • Data removal happens through anonymization and is implemented as "blanking" of all PII attributes. This does not affect referential integrity and allows some data to be kept for later analytics.
  • The history (meaning all revisions) needs to be purged
  • File fields need to be purged. One unsolved issue is what happens if a file field changes its value, meaning a user uploads a first file and later uploads another file to the same field. Django does not delete the original file, as doing so can lead to loss of data in certain circumstances. A solution would be to implement (or integrate an existing app for) managed files, where each uploaded file is saved as a separate, immutable database object (instead of just a database field) and removing such a database object also removes the corresponding file from the filesystem.
  • Backups can basically be ignored as long as they are not retained for long periods of time. Some privacy laws may be contradictory to a degree (within the laws themselves or with other laws) because they require backups and at the same time require the ability to completely remove data. However, legal professionals seem to agree that data retention for legal reasons (including backups) takes precedence over the removal of such data.
  • Administrators can define (through the user interface or a configuration file) which fields of a database object should be blanked. N-N relations can only be deleted. 1-N and 1-1 relations can be either deleted or kept, and blanking definitions can span kept relations.
  • Rules for automated data removal could be established, but that should have its own discussion thread.
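
To make the blanking idea above more concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the `ANONYMIZE_RULES` layout, the model and field names, and the use of dicts in place of real database objects. It blanks the configured fields, drops N-N links, follows kept relations, and purges a hypothetical `revisions` list.

```python
from __future__ import annotations

# Hypothetical configuration: which fields to blank per model, which kept
# relations to descend into, and which N-N relations to drop entirely.
ANONYMIZE_RULES = {
    "Person": {
        "blank": {"first_name": "", "last_name": "", "email": None},
        "follow": {"addresses": "Address"},  # kept 1-N relation -> child model
        "delete": ["groups"],                # N-N relation, can only be removed
    },
    "Address": {
        "blank": {"street": "", "city": ""},
        "follow": {},
        "delete": [],
    },
}


def anonymize(obj: dict, model: str, rules: dict = ANONYMIZE_RULES) -> dict:
    """Blank configured fields in place and recurse into kept relations."""
    rule = rules[model]
    for field_name, blank_value in rule["blank"].items():
        if field_name in obj:
            obj[field_name] = blank_value
    for relation in rule["delete"]:
        obj[relation] = []  # remove the links, the linked objects stay
    for relation, child_model in rule["follow"].items():
        for child in obj.get(relation, []):
            anonymize(child, child_model, rules)
    obj["revisions"] = []  # purge the history of this object
    return obj
```

Fields that are not listed under `blank` (for example a country or a postal code) are left alone, which is what would keep the data usable for later analytics.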

@freebeat
Member

freebeat commented Nov 8, 2021

@saemideluxe : Sounds good.

  • File fields: Could you make revisions for that field, and then purge old files when purging the revisions?
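
One way the revision idea for file fields might look, as a plain-Python sketch without Django. `ManagedFile` and `FileFieldHistory` are hypothetical names, and the assumption is that every upload is written under a fresh name so that earlier files are never overwritten and can be removed together with their revision.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class ManagedFile:
    """One uploaded file; never overwritten, only deleted."""
    path: Path

    def delete(self) -> None:
        # Removing the object also removes the file on disk.
        self.path.unlink(missing_ok=True)


@dataclass
class FileFieldHistory:
    """Hypothetical per-field history: each upload becomes a new revision."""
    storage_dir: Path
    revisions: list[ManagedFile] = field(default_factory=list)

    def upload(self, content: bytes, suffix: str = "") -> ManagedFile:
        # A random name guarantees that no earlier upload is overwritten.
        target = self.storage_dir / f"{uuid.uuid4().hex}{suffix}"
        target.write_bytes(content)
        managed = ManagedFile(target)
        self.revisions.append(managed)
        return managed

    @property
    def current(self) -> ManagedFile | None:
        return self.revisions[-1] if self.revisions else None

    def purge_old(self) -> None:
        """Drop every revision except the current one, files included."""
        for old in self.revisions[:-1]:
            old.delete()
        del self.revisions[:-1]

    def purge_all(self) -> None:
        """Drop all revisions and their files, e.g. on anonymization."""
        for managed in self.revisions:
            managed.delete()
        self.revisions.clear()
```

Purging the history would then call `purge_old()`, while a full removal request would call `purge_all()`.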
