Skip to content
This repository has been archived by the owner on Jun 23, 2023. It is now read-only.

Risky venues circuit breaker leaks private data #29

Closed
CovidSecure opened this issue Oct 14, 2020 · 8 comments
Closed

Risky venues circuit breaker leaks private data #29

CovidSecure opened this issue Oct 14, 2020 · 8 comments

Comments

@CovidSecure
Copy link

The risky venues circuit breaker functions by submitting the venue IDs of each risky venue that a user has visited to an API endpoint. This partially exposes the user's venue check-in history based on which venue IDs are submitted, and is contrary to the various documentation which states that the user's recorded venue IDs will never leave their device.

This data could be used for gathering statistics and performing analyses without users' knowledge or consent. In conjunction with the user's IP address, the data could also be linked to a specific user. Similar data could also theoretically be gathered for any specific non-risky venue by placing it on the risky venues list and deactivating the circuit breaker for that venue to prevent a notification from being triggered.

This breaks the security model around which the app is built, being that all processing should happen on the user's device and no sensitive (i.e. non-analytic) data should ever be uploaded thereby abating possible concerns that the app may be used for mass data gathering or espionage by making it technically impossible to do so. In the interest of transparency, the risky venues circuit breaker component should be removed or redesigned in a way that does not expose the user's data or, if this is not feasible, a clear notice about the operation of this function should be given in the documentation.

@Cyclic3
Copy link

Cyclic3 commented Oct 15, 2020

I initial met this issue with some scepticism, but after been reading though the code for the last few hours (and by god could it use some comments), I've come to the same conclusion. I'll give some more technical details to those who wish to verify it or explain how we've completely misunderstood some integral part of the code, which is completely possible. Of course I don't believe that this is some secret government scheme to track us, because the UK government has far better places to hide trackers than in open-source code on GitHub. This is probably an honest mistake, either on our part, or the dev's.

<MASSIVE_DISCLAIMER>

I am in no way a Kotlin dev, and this could all be completely wrong!!! Someone who knows all of this better than me should check my links to confirm (or refute!) my understanding. I'll try to keep this updated with any more information I get, either through comments or further code inspection.

</MASSIVE_DISCLAIMER>

The code refers to the 2 hour poll loop as "WorkManager" [code], so I will too.

  1. I didn't look at QR code scanning, but I'm going to assume that's uninteresting.

  2. The WorkManager calls the checker code for the risky venues [code], which downloads the list of ids, locally compares the client's ids with downloaded ones, and if it finds matching ones, triggers the circuit breaker initial tasks [code]

  3. This initial tasks function creates a worker on the WorkManager, which only runs if the network is connected, and supplies it with these ids [code]

  4. This worker jumps through some hoops [code] and iterates through the shared ids. For each id:

    • It passes the id to the mentioned API call [code]. This API call is a POST request to circuit-breaker/venue/request [code], and is not really documented anywhere. That's not unique (all the code is undocumented), but leaves it's purpose rather vague. The optimist in me suggests that this is just a redundant check that the token is valid, but the pessimist suggests that this is a way to alert the remote server about possible venue exposure without the user's consent. Either way, unless I've completely missed the point (which is, again, very possible), it has no place in the app and should removed.

    • If it succeeded, the user notification code is run, and we finish this iteration [code]

    • If it is "pending", which I assume indicates some form of server overload, a poll is added to the WorkManager with some form of request identification token [code] [code] [code]. If the remote server returns a value, the poll loop stops, with a successful value creating a prompt like an initial success would. However (according to my limited understanding), the silent exception catch [code] combined with a failed request would allow this to loop every 2 hours, uniquely identifying the user to the remote server. If this is the case, this definitely needs to be addressed! It is possible that this is supposed to handle network failure, but a catch-all solution like this is too easily exploited by purposefully sending bad HTTP responses.

  5. If an exception is thrown during the handling of the initial approval request, it is unmarked as a risky venue id [code], giving credibility to the idea that this is some form of redundant verification system. However, this behaviour is different with the "try again" philosophy of the poll loop, which is rather odd.

It would be great if someone could explain the intended purpose of this API call (and maybe even some documentation!), and also if my assessments are correct.

@CovidSecure
Copy link
Author

@Cyclic3

Documentation for the risky venue circuit breaker is here and for the circuit breakers in general (there is also an exposure notification circuit breaker, which does not expose private IDs) is here.

An item of documentation that I read previously but cannot locate now indicated that the circuit breaker will monitor the number of notifications that are triggered in response to a specific venue and "trip" if too many notifications are triggered in a short space of time, although the majority of documentation refers to the circuit breaker as being a manual (human-operated) control. It has also been stated that the circuit breaker operates automatically based on manually-set parameters.

The "pending" state likely exists to allow notifications for a particular venue to be delayed, but still triggered at some point in the future. This would allow notifications to be spread out over time instead of being triggered on all devices within a single 2-hour window. I cannot confirm whether your assessment of the exception handling is correct however it would be possible to create an indefinite polling loop by leaving the circuit breaker fixed in the pending state without needing to trigger an exception.

Given the history with the NHS's previous attempt at a contact tracing app, it is not unlikely that this data is being used for gathering statistics (specifically on the number of people who attended and were possibly exposed at a risky venue), or is planned to be in the future.

@CovidSecure
Copy link
Author

@Cyclic3 Also, WorkManager is a part of the Android API that is used for scheduling background tasks.

@nhs-covid19
Copy link
Collaborator

Thanks for your interest in the NHS Covid-19 project. Thanks for pointing this out. A recent internal code review had identified the same issue. The venue id is not necessary for providing the circuit breaker functionality within the application - as you point out, @CovidSecure, the other circuit breaker API does not send the information - and as such will be removed in a forthcoming mobile application release.
I'll close the Issue when the code change is released - but you'll also be able to see the code changes here in this repository.

@CovidSecure
Copy link
Author

@nhs-covid19 Thank you for your response and for your continued effort to protect users' privacy. I am pleased to see that this is being resolved.

@nhs-covid19
Copy link
Collaborator

A bit more information on current usage: The code that processes the data received is at: https://github.com/nhsx/covid19-app-system-public/blob/master/src/aws/lambdas/incremental_distribution/src/main/java/uk/nhs/nhsx/circuitbreakers/RiskyVenueHandler.java#L77 - this does not use the supplied venue id. It should not show up even now in log files for the system due to (intentional settings) described in ukhsa-collaboration/covid19-app-system-public#24 and ukhsa-collaboration/covid19-app-system-public#25

@CovidSecure
Copy link
Author

CovidSecure commented Nov 12, 2020

Any status on when this is likely to be resolved? The latest update from 2 days ago appears to still have this issue, and it has been almost a month since the issue was originally raised.

@nhs-covid19
Copy link
Collaborator

Closing this issue as the circuit breaker call has been resolved for some time now.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants