EU-GDPR - Right to Erasure #2447

TheAspens · 2018-04-03T22:01:45Z

Another aspect of the GDPR law is the Right to Erasure. I've created a proposed implementation that might meet the requirements of this provision of the law. This continues the work outlined in #2332 and #2413.

The proposal is documented here: https://boinc.berkeley.edu/trac/wiki/RightToErasure

I would appreciate a review of this and feedback on the implementation. In particular, I would like feedback from people who are doing their own compliance work on whether this is likely to be the minimum set of steps necessary to comply with this provision of the law, or whether some lesser action (like scrubbing user fields such as email address, name, IP address etc.) would be permitted. I would especially appreciate it if @lfield and @brevilo could take a look and provide feedback.

TheAspens · 2018-04-03T22:09:53Z

For example - I would like the opinion of @brevilo and @lfield if this implementation is sufficient: #2445

SETIguy · 2018-04-03T23:47:46Z

One major question... What's the definition of "all of their data"? According to the linked pages, it appears that "personal data" is what is covered. Which fields in the user, host, thread, post and result tables belong to a user? Is a userid "their data" once any link to an email_address or cpid has been removed? I would tend to say no, yet I would also tend to think that a CPID is "their data" as it directly identifies a user across projects.

Similarly, in host I would expect that ip_addr, external_ip_addr and domain_name belong to the users, but nothing else is personal or user-owned information. Most of that information is created by the project for internal use. There may be projects which require access to other information in the host table. host_app_version is also, IMHO, not information that belongs to a user, although it's not much use to the project once a user has left.

Posts and threads, I can see that deleting them all is probably required.

Then there's science data. If host.m_cache belongs to the user, doesn't that also mean any science results returned are the property of the users and need to be deleted as well? After all they link back to result.userid.

I think this proposal goes way too far. To delete all of the personal data for a user...

1. Randomize the personal fields (name, email address, cpid, url, etc.) in user, forum_preferences.
1a. Delete any profile images.
2. Delete all threads and posts for the user.
3. Randomize the IP addresses for the user's hosts.

At that point, all the personal information is gone, unless a project app is sniffing and storing personal information. No deleted tag is necessary, dump the randomized strings.
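
A minimal sketch of the scrubbing steps listed above, using an in-memory SQLite database. The schema is a drastically simplified assumption for illustration (the column names, `SCHEMA`, `random_token` and `scrub_user` are hypothetical, not the real BOINC tables or code), and forum_preferences and profile images are left out.

```python
import secrets
import sqlite3

# Hypothetical, heavily simplified schema; the real BOINC tables have many more columns.
SCHEMA = """
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, email_addr TEXT, cpid TEXT, url TEXT);
CREATE TABLE host (id INTEGER PRIMARY KEY, userid INTEGER, ip_addr TEXT,
                   external_ip_addr TEXT, domain_name TEXT);
CREATE TABLE thread (id INTEGER PRIMARY KEY, owner INTEGER, title TEXT);
CREATE TABLE post (id INTEGER PRIMARY KEY, thread INTEGER, user INTEGER, content TEXT);
"""


def random_token() -> str:
    """Return an unguessable string to overwrite a personal field with."""
    return secrets.token_hex(16)


def scrub_user(db: sqlite3.Connection, userid: int) -> None:
    """Overwrite personal fields and delete forum content for one user."""
    with db:  # one transaction: committed on success, rolled back on error
        # Step 1: randomize the personal fields of the user row.
        db.execute(
            "UPDATE user SET name = ?, email_addr = ?, cpid = ?, url = ? WHERE id = ?",
            (random_token(), random_token(), random_token(), random_token(), userid),
        )
        # Step 2: delete the user's posts and threads. Other users' posts in
        # those threads are not handled here.
        db.execute("DELETE FROM post WHERE user = ?", (userid,))
        db.execute("DELETE FROM thread WHERE owner = ?", (userid,))
        # Step 3: randomize host addresses, one fresh value per host so the
        # hosts cannot be linked to each other through a shared token.
        host_ids = [row[0] for row in
                    db.execute("SELECT id FROM host WHERE userid = ?", (userid,))]
        for host_id in host_ids:
            db.execute(
                "UPDATE host SET ip_addr = ?, external_ip_addr = ?, domain_name = ? "
                "WHERE id = ?",
                (random_token(), random_token(), random_token(), host_id),
            )
```

Whether overwriting fields like this is enough is exactly the open question in this thread; the sketch only shows the mechanics.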

-- Eric Korpela korpela@ssl.berkeley.edu AST:7731^29u18e3

brevilo · 2018-04-04T09:09:11Z

@SETIguy your questions are all relevant and I think I overall agree with your assessments. The problem is, we don't really know for sure. The GDPR isn't fully fleshed out (like most legal texts) and certain questions can only be answered after the first legal/court cases have been settled. This is particularly true since the ePrivacy regulation was meant to become effective in parallel to the GDPR but won't until 2019.

Which fields in the user, host, thread, post and result tables belong to a user?

Any data that relates to an identifiable (directly or pseudonymized) data subject (e.g. via userid reference). Break the relation or anonymize the data subject and you might be good.

Also, keep in mind that these data are affected by the data subject's right to "data portability" as well. You need to be prepared to hand those data over on request (within a month), in a "structured, commonly used and machine-readable format".

Is a userid "their data" once any link to an email_address or cpid has been removed?

If the userid can somehow still be associated with the data subject, it would just be pseudonymized data and the GDPR also applies to those. Think of externally available information like search engine caches or BOINC account managers or stats sites for instance. Those, by the way, are yet another can of worms (sign-up/consent, erasure notifications)...

Most of that information is created by the project for internal use. [...] There may be projects which require access to other information in the host table

Most of this boils down to the question of the lawfulness of the data processing you do. This can be established via different means, the two directly applicable ones in our case should be the "data subject's consent" and "data controller's legitimate interest". The latter can override the former if justified but it's of course much easier to describe your data deletion/retention policy in your privacy policy and include it in what the data subject gives their consent to. Most importantly: whatever you do, do it transparently and document it in your "records of processing activities" (another mandatory GDPR requirement).

Posts and threads, I can see that deleting them all is probably required.

Yes, but that's harder than it sounds. What do you do with threads opened by the data subject to be deleted? What do you do with quotes of the data subject's comments? Again, there could be "legitimate interest" to retain those as the discussion would lose coherence (i.e. "for archiving purposes in the public interest, scientific or historical research purposes"), but that's all not entirely clear yet (according to our data protection officer).

any science results returned are the property of the users and need to be deleted as well? After all they link back to result.userid.

This is not about property but data subject rights pertaining to the data subject's data. As soon as you anonymize the tasks/results, e.g. by NULLing the userid you might be safe. Related to the idea of data property is that the data controller should be allowed to delete any of the data subject's data without prior consent as the data subject doesn't own it.

At that point, all the personal information is gone

That might be true but keep in mind external data (see above) that might still allow someone to derive the original data subject (e.g. via the userid pseudonym).

If in doubt, delete whatever you can. That's probably what we're going to do anyway - cleans/speeds up the DB as a nice side-effect.

HTH

RichardHaselgrove · 2018-04-04T09:34:31Z

Couple of quick points. When a user was recently deleted (user request), several of us noticed that their private messages were deleted from our inboxes. In a recent check of dates/ID numbers at SETI, I was surprised to find that BOINC users appear to write roughly the same number of private messages to each other, as public messages on the message boards. Whatever David did in that case (probably related to #2445, rather than the GDPR) needs to be included in this discussion too.

And what effect will the GDPR have on the "Wayback Machine" internet archiving project? I sometimes refer to that to check on the previous history of a BOINC project.

brevilo · 2018-04-04T09:52:13Z

@RichardHaselgrove we're going to delete private messages.

And what effect will the GDPR have on the "Wayback Machine" internet archiving project?

That's part of the "erasure notifications" (GDPR Art. 17.2) issue as well as the lawful processing for "archiving purposes in the public interest, scientific or historical research purposes" (GDPR Art. 17.3d) I alluded to above. The former might affect projects (not yet clear) but the latter only affects the Internet Archive itself.

brevilo · 2018-04-04T10:00:40Z

@TheAspens I'm in the process of reviewing the proposal. Whatever gets done: please separate frontend (.php) from backend/library (.inc) code such that any of these features can easily be integrated with the Drupal code.

Thanks

brevilo · 2018-04-04T11:13:08Z

@TheAspens My comments on the proposal:

I agree with the overall proposal and its implementation
Data exports:
- I think we need to separate exports required for data portability and exports for downstream consumers
- The former need to be augmented to include all data of the data subject, incl. host details and community content
- The latter only need the deletion request/tag (nice idea by the way!)
"All entries for the user will be deleted from the following tables"
- This should probably describe how. In general it should mean to remove any record in any table that references the userid in question, unless otherwise stated.
"team (when user_id is for the user, how to remove since field is not null)"
- The userid could be set to 0 which already served as a default for that attribute earlier.
While a result/task might not be deleted right away: why not set the userid (and hostid) to 0 for affected records? I don't see how those are "for technical operation".
"thread – do we need to remove thread?"
- See my comments above. This seems to be the toughest part...
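
A minimal sketch of the "set the userid (and hostid) to 0" idea from the comments above, again on SQLite. The two-table schema and the `detach_user` helper are assumptions for illustration; in particular, the name of the team column holding the user reference is a guess.

```python
import sqlite3


def detach_user(db: sqlite3.Connection, userid: int) -> int:
    """Break the link between a user and their results and teams.

    Rows are kept, but userid/hostid are reset to 0 so they no longer
    reference the deleted account. Returns the number of result rows changed.
    """
    with db:  # single transaction
        cur = db.execute(
            "UPDATE result SET userid = 0, hostid = 0 WHERE userid = ?",
            (userid,),
        )
        # team.userid is declared NOT NULL here, so 0 is used instead of NULL.
        db.execute("UPDATE team SET userid = 0 WHERE userid = ?", (userid,))
    return cur.rowcount
```

This keeps the records needed for validation and assimilation while removing the reference to the account. Whether a zeroed reference counts as anonymized under the GDPR is a legal question the sketch does not answer.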

Ageless93 · 2018-04-04T13:31:01Z

I'm wondering though.

Private messages that I received, are as far as I see it mine, no longer owned by the sender. So when the sender wants his account erased, the PMs I got from him have to be left alone, as they're no longer his, but mine. Perhaps if the project has an outbox, that sent PMs in there have to be removed. But most projects just have an inbox and a write PM option.

Compare it to text messages, Whatsapp, snail mail. Once the sender sent it, it's no longer his. When he wants his account deleted at a service provider, they won't delete all the text messages he ever sent from other people's devices. When the person stops Whatsapp, only the local account is deleted, but all sent messages will still be on other people's devices. When you mail a handwritten letter to some other person, it's no longer yours as soon as you drop it in the mailbox.

So why handle private messages differently?

Willy0611 · 2018-04-04T14:23:45Z

Here are my thoughts regarding both BOINCstats and BAM!:

For BOINCstats (the stats section) it's enough to just remove the user/hosts from the XML export. During the next import users/hosts no longer existing in the XML will be deleted from the stats. Other stats sites may work differently.
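
The import behaviour described above amounts to a set difference between what the stats site already holds and what the latest export contains. A minimal sketch, assuming the stats site keeps its records in a dictionary keyed by user id (`prune_missing` and the data layout are hypothetical, not BOINCstats code):

```python
from typing import Dict, List, Set


def prune_missing(local_stats: Dict[int, dict], exported_ids: Set[int]) -> List[int]:
    """Drop every locally stored user that is absent from the latest export.

    Returns the ids that were removed, in ascending order.
    """
    removed = sorted(uid for uid in local_stats if uid not in exported_ids)
    for uid in removed:
        del local_stats[uid]
    return removed
```

For example, with local records for users 1, 2 and 3 and an export that only lists 1 and 3, the call removes user 2 and returns `[2]`.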

BAM! is a little bit more complicated and may also require more to be done on the project side.

When a user deletes his account at a project, should that also delete his BAM! data for that project (please keep in mind that BAM! data is not stats data!)? The project doesn't necessarily know that a BAM! account with data for that project exists. If this data should also be deleted, the project should call a (non-existing) BAM! API to do so.

Then the other way around: When a BAM! user deletes his account, should it also delete all the linked project accounts? I think this should be a choice by the user. If he chooses yes, BAM! must call a project API (RPC) to notify the project to do so. Then the project can do one of two things: A) Trust BAM! and delete the account or B) start the deletion process as outlined here.

And lastly, the big issue: Sometimes I get requests to remove stats data. Most of the time these emails contain a link to one or more pages on BOINCstats with the request to remove them. It's impossible for me to be 100% sure that the person requesting the deletion is the true owner of that data. It can also be someone trying to get some competition out of the way. I refer these people to the project sites to delete/anonymize their account there. This only works when the project is still up and the admins responding. So far I have refused all requests to delete stats data on my side, however, this may lead to some issues with these new rules. I'm not sure how to handle this.

TheAspens · 2018-04-04T14:24:27Z

@brevilo

Data exports:

I think we need to separate exports required for data portability and exports for downstream consumers

The former need to be augmented to include all data of the data subject, incl. host details and community content

The latter only need the deletion request/tag (nice idea by the way!)

I agree that data portability needs to be separate from the data exports. My proposal does not address the data portability requirement and that will need to be addressed in a separate issue.

brevilo · 2018-04-04T14:29:15Z

Private messages that I received, are as far as I see it mine

@Ageless93 I doubt that. I don't think the data subject has ownership of any kind of data by default, let alone of data provided by others. The controller provides a service and unless otherwise stated (by the contractual basis, e.g. the terms of use) can legally remove any such data.

Compare it to text messages, Whatsapp, snail mail. Once the sender sent it, it's no longer his

In that case you might have a physical (or cached) copy but even that doesn't constitute ownership. If all messages were server-based, which they are in BOINC, the service provider (controller) can simply choose to shut down the service immediately, without your consent.

Regarding WhatsApp: have you read the terms of use you agreed to? Would be interesting to know what they say on data ownership.

brevilo · 2018-04-04T14:36:06Z

@Willy0611

When a user deletes his account at a project, should that also delete his BAM! data for that project (please keep in mind that BAM! data is not stats data!)? The project doesn't necessarily know that a BAM! account with data for that project exists. If this data should also be deleted, the project should call a (non-existing) BAM! API to do so.

Projects are required to make sure any upstream/downstream services delete any of the published data as well (GDPR Art. 17.2). In case of BAM! the situation is more complicated, though, as the whole matter of opt-in consent to a given project's terms of use (or privacy policy) would shift to BAM! itself, according to our data protection officer. However, since we'd have to distinguish BAM! accounts from locally created accounts to have actual proof of consent, we might effectively be forced to shut down BAM! support until we have a GDPR-compliant end-to-end solution to this. The same might be true for the stats exports...

So far I have refused all requests to delete stats data on my side, however, this may lead to some issues with these new rules. I'm not sure how to handle this.

You have to get consent for that data processing already, even if it's dealing with pseudonyms only. That means you already have a bigger challenge on your hands, not just for data removal requests. We're all in the same situation.

TheAspens · 2018-04-04T14:48:01Z

@SETIguy - I am not a lawyer, my following statements have no legal weight so take them for 0 value.

As I have tried to understand how to comply with GDPR as it pertains to BOINC* I have come to the following understanding of the intent behind the law.

I believe that GDPR seeks to make information about an individual a fundamental right of that individual and that they get to control where that information is retained. This right supersedes any other agreements that they might have entered into. Specifically, they can grant consent to a site to utilize data that they provide and that the site might collect about them. However, they also have the right to revoke that consent and have the information they provided or that was collected about them removed. They also have the right to review what information a system currently retains about them.

This second bit is what makes this law such a new and fundamentally different thing than what existed before. It means that we have to think of user data and associated data that we collect about them as something that is only loaned to us, but is not ours to keep. Systems will have to keep track of personal information and where it flows so that, if consent is withdrawn, they can ensure that it is removed.

Doesn't that also mean any science results returned are the property of the users and need to be deleted as well?

Since the science results can be separated from any notion of the user (i.e. when the result record is deleted from the database and after the result has been assimilated there is no longer any connection between the result and the user) and because they are part of the legitimate purpose of the system separate from the user, GDPR does not apply to these records. If information about the user is retained (for example the OS it ran on and other such factors that might be needed to determine what happened during the execution of a particular task), then it gets more complicated (I believe you still can, but you need to get into details about the lower levels of the law).

GDPR actually specifies multiple reasons why an organization might have to hold information about a user. The rules for retention of that data vary based on the reason. However, BOINC holds data about a user because the user has granted consent. See Article 6.1.A and Article 7.

Ageless93 · 2018-04-04T15:03:40Z

@brevilo

Regarding WhatsApp: have you read the terms of use you agreed to? Would be interesting to know what they say on data ownership.

https://www.whatsapp.com/legal/
"Your messages are yours, and we can’t read them."
"Your Rights. WhatsApp does not claim ownership of the information that you submit for your WhatsApp account or through our Services."
"If you would like to manage, change, limit, or delete your information, we allow you to do that through the following tools:
Deleting Your WhatsApp Account. You may delete your WhatsApp account at any time (including if you want to revoke your consent to our use of your information) using our in-app delete my account feature. When you delete your WhatsApp account, your undelivered messages are deleted from our servers as well as any of your other information we no longer need to operate and provide our Services. Be mindful that if you only delete our Services from your device without using our in-app delete my account feature, your information may be stored with us for a longer period. Please remember that when you delete your account, it does not affect the information other users have relating to you, such as their copy of the messages you sent them."

drshawnkwang · 2018-04-04T15:05:42Z

@TheAspens - I read through your RightToErasure document as well. Thanks for writing it up.

For the Drupal-BOINC implementation I have already written some code that deletes a user for the Drupal-side of the code. This was pre-GDPR (or before I learned of GDPR).

The user is presented with a 'delete account?' Web page with a description of what will happen when the account is deleted. If confirmed, the account is flagged for deletion. There is no email confirmation. But the account is not deleted until two weeks later (adjustable by the admin). If the user logs in anytime within this two-week period, the delete action is canceled - i.e., the account is un-flagged.

After two weeks, the account is acted upon by a Drupal queue which deletes the Drupal user data, but keeps much of the data in the BOINC project database (tables: user, host, etc.).
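
A minimal sketch of the flag-then-delete flow described above: a request sets a timestamp, any login clears it, and a periodic job picks up accounts whose grace period has elapsed. This is not the Drupal-BOINC code; `Account`, `GRACE_PERIOD` and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional

GRACE_PERIOD = timedelta(days=14)  # "two weeks", adjustable by the admin


@dataclass
class Account:
    userid: int
    delete_requested_at: Optional[datetime] = None  # None means not flagged


def request_deletion(account: Account, now: datetime) -> None:
    """Flag the account; nothing is deleted yet."""
    account.delete_requested_at = now


def on_login(account: Account) -> None:
    """Any login during the grace period cancels the pending deletion."""
    account.delete_requested_at = None


def due_for_deletion(accounts: Iterable[Account], now: datetime,
                     grace: timedelta = GRACE_PERIOD) -> List[Account]:
    """Return the flagged accounts whose grace period has fully elapsed."""
    return [a for a in accounts
            if a.delete_requested_at is not None
            and now - a.delete_requested_at >= grace]
```

A queue worker would call `due_for_deletion` periodically and run the actual erasure for each returned account.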

There is no pressing reason why BOINC would have to implement a similar wait-period before deletion; this is just my $0.02.

RichardHaselgrove · 2018-04-04T15:09:08Z

(i.e. when the result record is deleted from the database and after the result has been assimilated there is no longer any connection between the result and the user)

I don't think that's necessarily true. Einstein (certainly) and I think SETI retain records of who processed which bit of the science - that's held in their master Science databases, long after the transactional processing records are purged from their BOINC databases. Einstein have - very publicly - awarded discovery certificates and named finders in press releases, and as (IIRC) co-authors in published scientific papers. That public recognition of participation will, of course, have been subject to secondary and very specific consent, far beyond any consent granted as part of the process of joining the BOINC project on day 1. But the user ID associated with the computation must have been maintained rigorously intact for the attribution to be possible.

JuhaSointusalo · 2018-04-04T16:23:46Z

FWIW, national data protection authorities have made guidelines about GDPR available through the Article 29 Working Party. link

National authorities may have those translated or additional content available on their websites. link

SETIguy · 2018-04-04T16:56:15Z

Given that the data has been available for download by the general public, upstream/downstream deletion can't be guaranteed for anyone except well behaved upstream/downstream partners with resources. The guy who has been extracting and archiving data for all his team members to create graphs for his web site has been under no obligation to create a means for deleting data from his archive and probably will not do so. Do we need to stop providing public stats dumps? Which gets us back to the definition of "personal data". Are stats personal data to begin with? And then there's gridcoin. A cpid/gridcoin address link beacon can't be deleted from the blockchain. I don't know if a username is stored with that or not. Probably not.

TheAspens · 2018-04-04T22:40:29Z

The explanation I have come to understand and that I am operating under is this: if the consent on the BOINC site clearly states what information is public, then as long as a mechanism exists to communicate the user's intent to have their information removed, and consumers of the public data can monitor it, the BOINC site will be in the clear. However, if the consumer of the public data does not follow the delete instructions, then the consumer of the public data could be at risk of violating GDPR.

I am also operating under the assumption that stats data that is tied to a user name, user id or cross project id is personal data and needs to be cleared as well.

As far as any blockchain tech goes - I have no idea how they will comply since the two are somewhat at odds with each other.

I want to be clear again that GDPR is not clear and that the interpretation I am operating under could be incorrect. We are trying to craft the technical changes that will be minimally impactful to BOINC given the best understanding of what it takes to be compliant. This is why I really want the people who are also trying to comply with the law to review this and articulate their understanding as well, since I am not an authority in this matter.

TheAspens · 2018-04-04T22:41:30Z

(i.e. when the result record is deleted from the database and after the result has been assimilated there is no longer any connection between the result and the user)

I don't think that's necessarily true. Einstein (certainly) and I think SETI retain records of who processed which bit of the science - that's held in their master Science databases, long after the transactional processing records are purged from their BOINC databases.

WCG doesn't do this so I hadn't considered the impact of that.

SETIguy · 2018-04-24T03:33:56Z

What is the appropriate action? I assume that when an account at a project is deleted, the account manager disconnects the user from that project. When the user deletes an account manager account, what is the appropriate action? Do nothing? Delete every account associated with that user at every project? Have a user selectable option?

On Mon, Apr 23, 2018 at 3:31 PM, David Anderson ***@***.***> wrote: I think we'll need to add an am_delete_account RPC.

Willy0611 · 2018-04-24T07:08:02Z

Hi, My opinions:

1. /get_project_config.php should be extended with an extra tag specifying if the project must meet EU-GDPR and an extra tag indicating that the project supports deleting accounts via RPC. BAM! reads that file daily so it will then know if extra actions are required.
2. When a BAM! user signs up for a project it will show any extra options required to meet EU-GDPR for the project (for example, a checkbox to comply with the EU-GDPR). The values of the extra options will be added to the AMS RPC to the project.
3. The legacy problem is knowing whether or not a user created a project account through BAM! (solution under 1.3.1)
4. When a user deletes his account at BAM! an option will be shown to also delete any associated project account or a selection of projects, indicating the EU-GDPR status of the project.
5. When a user deletes his account at a project an option should be shown to delete the associated account at the AMS. Since the project probably doesn't know which AMS (if any) created the account it should send the delete request to all known AMS.
   1. This will *not* delete the BAM! account itself since this was not created by the project, it will only delete the project account under the BAM! account.
   2. Problem: what's the identifier?
   3. API needed at the AMS.
   4. Projects should store which AMS created the account.
      1. Problem: User can switch AMS, so project should probably store the last used AMS.

There's probably more but nothing comes to mind at the moment.

Willy.

brevilo · 2018-04-24T14:13:17Z

When a BAM! user signs up for a project it will show any extra options required to meet EU-GDPR for the project (for example, a checkbox to comply with the EU-GDPR).

According to our DPO the upstream account manager (AM) is required to handle the opt-in/consent problem in that scenario. However, informed consent can only be given to an actual statement/policy so that the AM has to present the project-specific text for that purpose, presumably mimicking the client's terms of use feature.

Projects should store which AMS created the account.

I agree but this needs an augmented RPC.

Anyhow, these account-creation-related issues should be discussed separately.

Other than that I agree that account deletion needs to be taken into account by AMs as well. I recommend focusing on AM -> project account deletion first (e.g. via a new am_delete_account RPC) as that's the more common case I think.

TheAspens · 2018-05-02T13:58:26Z

Here is my 2 cents:

Account managers should monitor the new user_deleted.xml that will be exported in the stats - this will list users deleted on a project in the past 60 days. See https://boinc.berkeley.edu/trac/wiki/RightToErasure#DataExports and https://boinc.berkeley.edu/trac/wiki/RightToErasure#FinalRemoval
Account managers need to be able to invoke the delete operation on a project. As discussed above this would be the new am_delete_account RPC that would align with the other RPCs defined here: https://boinc.berkeley.edu/trac/wiki/WebRpc
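
A minimal sketch of how an account manager might act on the exported deletion list from the first point. The XML layout is an assumption: the element names `user` and `id` are placeholders, not the format defined on the wiki page, and `deleted_user_ids` and `drop_deleted` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set


def deleted_user_ids(xml_text: str) -> Set[int]:
    """Collect the ids of all <user> entries in an exported deletion list."""
    root = ET.fromstring(xml_text)
    ids: Set[int] = set()
    for user in root.iter("user"):
        id_text = user.findtext("id")
        if id_text is not None:
            ids.add(int(id_text.strip()))
    return ids


def drop_deleted(linked_accounts: Dict[int, dict], deleted: Set[int]) -> Dict[int, dict]:
    """Return only the locally stored project accounts that were not deleted."""
    return {uid: data for uid, data in linked_accounts.items() if uid not in deleted}
```

An account manager would run this against each project's export on its regular import cycle and remove whatever it stores for the listed users.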

My biggest concern about the new RPC is that the only authentication used by these RPCs is the authenticator. This will allow anyone who can obtain someone's authenticator to delete that person's account. Any thoughts about how to secure this?

Barring the issues around security of the new RPC, will these two points resolve the most critical questions?

TheAspens · 2018-05-02T14:15:31Z

Ok thinking longer on it. I think that the following might become necessary:

Projects will have a list of trusted Web RPC users (e.g. https://boincstats.com/, https://scienceunited.org/, https://www.gridrepublic.org/)
The Web RPCs (need to decide which) will be updated to include a section that contains:

<signature>
       <signer>https://boincstats.com/</signer>
       <hash>1234afd123asdf1234asdf134asdf.....</hash>
</signature>

The Web RPC users will provide a public key at a standard location like /public.key (e.g. https://boincstats.com/public.key)
The Web RPC user will use their private key to sign the message and send the signature with the request
The RPC will verify that the signer is a trusted signer and will then obtain the public key (either from local cache or from the remote server - but if the signature fails, it needs to refetch the public key to allow the signer to update their key) and then verify that the signature matches the content.
Only after that processing is complete and successful will it perform the actions of the RPC.
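
A minimal sketch of the project-side check in the steps above: reject unknown signers, try the cached key first, and refetch the key once if the check fails so that a rotated key is picked up. The Python standard library has no public-key signature verification, so both the key download and the actual signature check are passed in as callables; `SignatureChecker` and its parameters are hypothetical names, not an existing BOINC API.

```python
from typing import Callable, Dict, Iterable

FetchKey = Callable[[str], bytes]              # signer URL -> public key bytes
Verify = Callable[[bytes, bytes, str], bool]   # (public key, message, signature) -> ok?


class SignatureChecker:
    def __init__(self, trusted_signers: Iterable[str],
                 fetch_key: FetchKey, verify: Verify) -> None:
        self.trusted = set(trusted_signers)
        self.fetch_key = fetch_key
        self.verify = verify
        self.cache: Dict[str, bytes] = {}  # signer URL -> last known public key

    def is_valid(self, signer: str, message: bytes, signature: str) -> bool:
        # Step 1: only signers on the project's trusted list are considered.
        if signer not in self.trusted:
            return False
        # Step 2: try the cached key, if there is one.
        cached = self.cache.get(signer)
        if cached is not None and self.verify(cached, message, signature):
            return True
        # Step 3: cache miss or failed check, so refetch the key once.
        fresh = self.fetch_key(signer)
        if fresh == cached:
            return False  # same key as before, so the signature is simply bad
        self.cache[signer] = fresh
        return self.verify(fresh, message, signature)
```

The RPC handler would call `is_valid` before doing anything else and perform the requested action only when it returns `True`. A real deployment would plug in a verifier from a vetted cryptography library and fetch the key over HTTPS.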

Thoughts on this approach?

Note that I do not have the bandwidth before the May 25th date to implement either the new RPC or this extra security step so if someone else could take this on that would be good.

Willy0611 · 2018-05-02T15:42:10Z

Hi, I agree with all the points. Willy.

SETIguy · 2018-05-02T16:29:16Z

I don't think that the account manager needs to be able to run the project delete without intervention. The account manager should redirect the user to the project delete function. Then the delete would propagate back to the account manager in the next stats export.


-- Eric Korpela korpela@ssl.berkeley.edu AST:7731^29u18e3

sirzooro · 2018-05-04T18:46:15Z

I have read this whole discussion. I wonder whether (or how) you handle the following situation: a hacker learns the password(s) to a user's email and to their account at project X. He then deletes the user's account at project X and removes all emails sent during this process. The user was not crunching at this project at the time, so the BOINC client did not complain, and only later (e.g. after a month) does the user find that the account has somehow disappeared. If the hacker used the "right to erasure", the project admin may also have no idea what happened.
Some systems use security event logs (e.g. syslog servers), which may store entries like "date/time: account 'foo' was deleted from IP 1.2.3.4".
I wonder if BOINC does something like this, and how these security logs are treated by GDPR.

brevilo · 2018-05-07T08:31:51Z

@TheAspens complex, but sound.

@SETIguy while I appreciate the simplicity of your approach, it would certainly defeat the whole purpose of account managers, right? That is, manage multiple downstream project-accounts via a single interface. Your example sounds like there is only one project.

@sirzooro

- If your full account credentials get leaked, you're in trouble anyway.
- Account deletion should ideally employ a double opt-in (confirm via email).
- The above requires that email changes are also protected by double opt-in.
- Logs can be stored temporarily out of the project's "legitimate interest" in providing a secure service.
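As a rough illustration of the double opt-in idea, the sketch below issues a single-use, expiring token that would be emailed to the account's address and must be presented again to confirm the deletion. The function names, the in-memory store and the 24-hour lifetime are all assumptions made for the example, not what any project actually implements.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple

TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # assumed value, not taken from this thread

# token hash -> (user id, expiry timestamp); stands in for a database table
_pending: Dict[str, Tuple[int, float]] = {}


def _digest(token: str) -> str:
    # Store only a hash so a leaked table does not reveal usable tokens.
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def request_deletion(user_id: int, now: Optional[float] = None) -> str:
    """Create a one-time token to be emailed to the account's address."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[_digest(token)] = (user_id, now + TOKEN_LIFETIME_SECONDS)
    return token


def confirm_deletion(user_id: int, token: str, now: Optional[float] = None) -> bool:
    """Return True if the token is valid; the caller then deletes the account."""
    now = time.time() if now is None else now
    entry = _pending.pop(_digest(token), None)  # single use: removed on first try
    if entry is None:
        return False
    owner, expires_at = entry
    return owner == user_id and now <= expires_at
```

The same pattern would have to guard email-address changes, otherwise an attacker could first redirect the confirmation mail.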

TheAspens · 2018-05-08T17:02:40Z

The discussion about account manager integration should continue in issue #2507

TheAspens · 2018-05-08T17:17:14Z

I was testing the handling of results returned but not validated and ran into problems. The logic of the validator and credit is complex, and trying to add proper handling for the case where we are validating a result returned by a host and user that have been deleted adds a lot of edge cases to this code. Since we do not expect this feature to be used often, the number of results in this state will be low, so I am moving forward with the following proposal for how to handle this. It is documented here: https://boinc.berkeley.edu/trac/wiki/RightToErasure#ResultTable but also included below:

The removal of userid and hostid from the result table is challenging as the host and user records are used in computing credit and other stats. In order to keep things as straightforward as possible, the following logic will be implemented at the time the user deletes their account:

Any results that are in server state RESULT_SERVER_STATE_IN_PROGRESS and assigned to the user will be set to server_state RESULT_SERVER_STATE_OVER, outcome RESULT_OUTCOME_CLIENT_DETACHED, validate_state = VALIDATE_STATE_INVALID and the transitioner will be triggered for the result

Any results that are in server state RESULT_SERVER_STATE_OVER and outcome RESULT_OUTCOME_SUCCESS and validate_state VALIDATE_STATE_INIT or VALIDATE_STATE_INCONCLUSIVE and assigned to the user will be set to server_state RESULT_SERVER_STATE_OVER, outcome RESULT_OUTCOME_CLIENT_DETACHED, validate_state = VALIDATE_STATE_INVALID and the transitioner will be triggered for the result

The validator, assimilator and transitioner will be examined to make sure that other states are handled properly
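The two rules above can be sketched as follows. This is only an illustration of the decision logic: the state names are kept as strings because their numeric values are not given in this thread, and the `Result` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Strings mirror the constant names quoted above (numeric values not assumed).
IN_PROGRESS = "RESULT_SERVER_STATE_IN_PROGRESS"
OVER = "RESULT_SERVER_STATE_OVER"
SUCCESS = "RESULT_OUTCOME_SUCCESS"
CLIENT_DETACHED = "RESULT_OUTCOME_CLIENT_DETACHED"
V_INIT = "VALIDATE_STATE_INIT"
V_INCONCLUSIVE = "VALIDATE_STATE_INCONCLUSIVE"
V_INVALID = "VALIDATE_STATE_INVALID"


@dataclass
class Result:  # hypothetical stand-in for a row of the result table
    id: int
    userid: int
    server_state: str
    outcome: str
    validate_state: str


def should_discard(r: Result) -> bool:
    """True for in-progress results and for returned-but-unvalidated ones."""
    if r.server_state == IN_PROGRESS:
        return True
    return (
        r.server_state == OVER
        and r.outcome == SUCCESS
        and r.validate_state in (V_INIT, V_INCONCLUSIVE)
    )


def discard_results_for_user(results: Iterable[Result], userid: int) -> List[int]:
    """Apply both rules; return the ids for which the transitioner must run."""
    touched = []
    for r in results:
        if r.userid == userid and should_discard(r):
            r.server_state = OVER
            r.outcome = CLIENT_DETACHED
            r.validate_state = V_INVALID
            touched.append(r.id)
    return touched
```

Results that are already validated, or that belong to other users, are left untouched.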

Please let me know if anyone has any thoughts on this. I think that the amount of work discarded will be extremely small, and it avoids adding significant complexity to the code.

JuhaSointusalo · 2018-05-08T20:30:15Z

I suppose there is still a race window: a back-end daemon has loaded the result and other records and updated them, and by the time it goes to update the database the records are gone. With the deadline approaching fast I'm not sure if you need to handle this case perfectly for v1.

If you are not going to delete results immediately then scrub stderr. stderr may sometimes contain personal data and in worst case scenarios it may take several months before the result gets removed.

TheAspens · 2018-05-08T22:29:00Z

> If you are not going to delete results immediately then scrub stderr. stderr may sometimes contain personal data and in worst case scenarios it may take several months before the result gets removed.

GDPR allows for the retention of data that has a legitimate purpose. stderr is often needed for various reviews by the project and is removed when that purpose is complete. As a result, I think that it needs to be left in place.

TheAspens · 2018-05-08T22:32:59Z

> I suppose there is still a race window when back-end daemon has loaded result and other records, updated them and when it goes to update the database the records are gone. With the deadline approaching fast I'm not sure if you need to handle this case perfectly for v1.

The PHP code has been implemented without any concept of transactions (everything is done with autocommit for each statement). I would have to take a deep look to see if the C code handles this any differently. Without transactions and locks in place (pessimistic or optimistic) throughout the system, I don't know how I could address this. I'd be open to ideas.

Ageless93 · 2018-05-09T18:04:26Z

A bit of news from Reuters

The pan-EU law comes into effect this month and will cover companies that collect large amounts of customer data including Facebook (FB.O) and Google (GOOGL.O). It won’t be overseen by a single authority but instead by a patchwork of national and regional watchdogs across the 28-nation bloc.

Seventeen of 24 authorities who responded to a Reuters survey said they did not yet have the necessary funding, or would initially lack the powers, to fulfill their GDPR duties.

and

Many watchdogs lack powers because their governments have yet to update their laws to include the Europe-wide rules, a process that could take several months after GDPR takes effect on May 25.

JuhaSointusalo · 2018-05-09T18:42:56Z

> Without transactions and lock in place (pessimistic or optimistic) throughout the system, I don't know how I could address this. I'd be open to ideas.

I don't have any better ideas either.

sirzooro · 2018-05-09T18:57:17Z

Maybe it would suffice to start a transaction in the back-end daemon and use SELECT ... FOR UPDATE there? I did something similar a long time ago on Oracle. As I recall, such a query blocked other SELECTs until the transaction ended.
https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html

Edit: see also https://stackoverflow.com/questions/6066205/when-using-mysqls-for-update-locking-what-is-exactly-locked
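A minimal sketch of that pattern, assuming a DB-API 2.0 connection to MySQL/InnoDB with autocommit turned off and the `%s` parameter style; the function name is hypothetical and only the `result` table and its `server_state` column are taken from the discussion above. Note that on InnoDB a locking read blocks other locking reads, updates and deletes of the same row, while plain SELECTs still read from a snapshot without waiting.

```python
def update_result_locked(conn, result_id, new_server_state):
    """Lock one result row, then update it inside the same transaction.

    Returns False if the row no longer exists (for example because the
    account deletion already removed it), True after a committed update.
    """
    cur = conn.cursor()
    try:
        # Waits here while another transaction holds a lock on this row.
        cur.execute("SELECT id FROM result WHERE id = %s FOR UPDATE", (result_id,))
        if cur.fetchone() is None:
            conn.rollback()
            return False
        cur.execute(
            "UPDATE result SET server_state = %s WHERE id = %s",
            (new_server_state, result_id),
        )
        conn.commit()  # releases the row lock
        return True
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

For this to close the race, the deleting side would also have to run inside a transaction, so that its DELETE waits for the daemon's lock instead of slipping in between the read and the write.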

drshawnkwang · 2018-05-14T21:06:49Z

I added some additional text about the new project config enable_delete_account in the design document.

TheAspens · 2018-05-17T16:59:15Z

#2472 has been merged to master which closes this issue.

brevilo · 2018-06-13T11:35:47Z

@TheAspens Since this has now been merged, I'm wondering about the periodic cleanup scripts that are mandatory for this to cover the process end to end. As far as I can tell these are html/ops/delete_expired_users_and_hosts.php and html/ops/delete_expired_tokens.php. What are your recommendations for integrating them into a given project? Am I missing any?

Thanks

brevilo · 2018-06-13T11:41:48Z

Never mind, just found your "recommendations" and I'm presumably not missing any either 👍

TheAspens · 2018-06-13T14:27:05Z

I've been trying to keep https://boinc.berkeley.edu/trac/wiki/ServerUpdates updated as well

Ageless93 · 2018-07-21T09:30:54Z

I haven't followed all the code changes, but the delete account option went live on the BOINC forums sometime in the past few days. Silently again, no notification. I wonder, though, whether this also works for a user whose account is (temporarily) banished. Can they still use the delete account option, or is it limited to active, usable accounts?

TheAspens · 2018-07-24T23:16:55Z

@Ageless93 - I would open a new issue for that. I don't know what the behavior would be.

TheAspens mentioned this issue Apr 3, 2018

Add a mechanism allowing project admins to "delete" a user #2445

Merged

TheAspens added C: Account Manager System C: Server C: Web C: Web - Project P: Critical PR: Enhancement E: 1 week labels Apr 4, 2018

TheAspens mentioned this issue May 8, 2018

EU-GDPR - Right to Erasure - Account Managers #2507

Closed

TheAspens closed this as completed May 17, 2018

AenBleidd mentioned this issue Sep 4, 2019

GDPR: option of 'User data is anonymized' doesn't work #3271

Closed

AenBleidd removed the PR: Enhancement label Oct 23, 2024
