How "anonymous" is the semi-decentralized BLE contact tracing provided by Apple & Google (GAEN a.k.a. ENS) or DP-3T?
It is as anonymous as the simulated picture below - and this data is technically accessible to any 3rd party who can install a large fleet of BLE-sniffing devices. This is because all beacon signals broadcast by infected individuals are published to essentially all users of the system when an individual voluntarily uploads their positive infection status.
This repository contains a Proof-of-Concept implementation of a BLE-sniffing system that could uncover this data.
The image shows the results of a simulation in which 400 BLE-sniffing devices are deployed in a 20×20 grid over an area of 1500×1500 m². The movement of 300 people around the area has been (crudely) simulated as random walks.
The red circles correspond to beacon signals recorded from infected individuals who have voluntarily uploaded their positive infection status to the local health authorities (5% of the paths in the simulation). Blue circles are signals recorded from other people using the contact tracing service. As illustrated by the lines connecting the red dots, the route traveled by each infected & announced individual can be reconstructed within the range of the sniffer grid.
See `backend/generate_fake_data.js` for details.
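The simulated setup can be approximated in a few lines of Python. This is a minimal sketch, not a port of `backend/generate_fake_data.js`: the step length, the number of steps, the sniffer detection radius, and all function names are assumptions made for illustration. Only the grid size, area, number of people, and infected share come from the description above.

```python
import random

AREA_M = 1500.0        # side of the square area (from the text)
GRID = 20              # 20 x 20 sniffers (from the text)
N_PEOPLE = 300         # simulated walkers (from the text)
INFECTED_SHARE = 0.05  # share of paths marked infected (from the text)
STEP_M = 25.0          # assumption: maximum step per tick along each axis
N_STEPS = 200          # assumption: length of each walk in ticks
RANGE_M = 30.0         # assumption: detection radius of one sniffer


def sniffer_positions():
    """Centers of a GRID x GRID lattice of cells covering the area."""
    pitch = AREA_M / GRID
    return [((i + 0.5) * pitch, (j + 0.5) * pitch)
            for i in range(GRID) for j in range(GRID)]


def random_walk(rng):
    """A crude random walk, clamped to the area."""
    x, y = rng.uniform(0, AREA_M), rng.uniform(0, AREA_M)
    path = []
    for _ in range(N_STEPS):
        x = min(max(x + rng.uniform(-STEP_M, STEP_M), 0.0), AREA_M)
        y = min(max(y + rng.uniform(-STEP_M, STEP_M), 0.0), AREA_M)
        path.append((x, y))
    return path


def nearest_sniffer(x, y):
    """Index of the nearest sniffer (center of the containing cell) and its distance."""
    pitch = AREA_M / GRID
    i = min(int(x // pitch), GRID - 1)
    j = min(int(y // pitch), GRID - 1)
    cx, cy = (i + 0.5) * pitch, (j + 0.5) * pitch
    return i * GRID + j, ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5


def simulate(seed=0):
    """Return sightings as (tick, person, sniffer_index, infected) tuples."""
    rng = random.Random(seed)
    n_infected = int(N_PEOPLE * INFECTED_SHARE)
    sightings = []
    for person in range(N_PEOPLE):
        infected = person < n_infected
        for tick, (x, y) in enumerate(random_walk(rng)):
            idx, dist = nearest_sniffer(x, y)
            if dist <= RANGE_M:
                sightings.append((tick, person, idx, infected))
    return sightings
```

Grouping the infected sightings by person and ordering them by tick yields the connected red trails; sightings of the other walkers stay unlinked because their identifiers cannot be tied together.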
A BLE-sniffing system consists of
- A significant number of BLE-sniffing devices installed at different known physical locations. An example implementation is given for Linux. Mobile phones could also act as BLE sniffers, and an example app is provided for Android. The sniffers passively listen to contact tracing BLE traffic and upload it to the server(s).
- A device running an official local contact tracing app that has been hacked to intercept the confirmed positive diagnosis keys (in Apple/Google terminology) / secret day keys (in DP-3T) the app downloads from the local health authorities' servers. This part is not described in this repository, but it is an easy task and infeasible to prevent effectively.
- The server(s), which receive the BLE contact tracing data, namely the rolling proximity identifiers (RPIs, or EphIds in DP-3T), from the agents and the diagnosis keys from the hacked app.
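How the server combines its two inputs can be sketched as a simple join: every published key is expanded into the identifiers it would have produced, and the sniffer sightings are looked up against that table. The sketch below is illustrative only and is not the code in `backend/`. In particular, `toy_derive_rpi` is a stand-in and is not the GAEN or DP-3T derivation (the real protocols use their own keyed primitives); all function names and the tuple layouts are assumptions.

```python
import hashlib
import hmac
from collections import defaultdict


def toy_derive_rpi(key: bytes, interval: int) -> bytes:
    """Stand-in derivation (assumption): NOT the GAEN or DP-3T algorithm.

    HMAC-SHA256 is used only to keep this sketch within the standard library.
    """
    msg = interval.to_bytes(4, "little")
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def expand_key(key, first_interval, n_intervals=144, derive=toy_derive_rpi):
    """Map each identifier a key would have produced to (key, interval).

    144 ten-minute intervals correspond to one day; treat it as a tunable.
    """
    return {derive(key, first_interval + k): (key, first_interval + k)
            for k in range(n_intervals)}


def reconstruct_routes(sightings, diagnosis_keys, derive=toy_derive_rpi):
    """Join sniffer sightings with published keys.

    sightings: iterable of (timestamp, sniffer_location, identifier_bytes)
    diagnosis_keys: iterable of (key_bytes, first_interval)
    Returns {key: [(timestamp, sniffer_location), ...]} ordered by timestamp.
    """
    lookup = {}
    for key, first_interval in diagnosis_keys:
        lookup.update(expand_key(key, first_interval, derive=derive))
    routes = defaultdict(list)
    for timestamp, location, identifier in sightings:
        hit = lookup.get(identifier)
        if hit is not None:
            routes[hit[0]].append((timestamp, location))
    return {key: sorted(points, key=lambda p: p[0])
            for key, points in routes.items()}
```

Identifiers that never match a published key stay unlinked, which corresponds to the blue circles in the visualization.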
A simple BLE sniffer implementation for Linux. Listens to BLE messages from the GAEN and DP-3T protocols. The agent can verifiably observe DP-3T EphId payloads as broadcast by the official test app. It can also observe traffic sent by the official GAEN app in Finland.
Tested on Debian Stretch
- Make sure the `hcidump` and `hcitool` BLE tools are installed, e.g., on Debian: `sudo apt-get install bluez-hcidump`
- Run as root: `cd linux; sudo ./run.sh`. See `run.sh` for more info. Requires Python 2.7.
An Android application that
- displays the number of nearby GAEN devices on screen
- reads and logs GAEN messages and their RSSIs in the background
- reads and logs the GPS & WiFi location data of the device it is installed on
- runs in an Android Foreground Service until killed by the OS or closed with the back button
- logs to the "external cache directory", which is not visible to other apps on the phone
Installed with `cd android; ./gradlew installSnifferDebug`.
The logs can be pulled from a USB-connected device using ADB:
cd android; mkdir data; cd data
adb pull /storage/emulated/0/Android/data/org.example.coronasniffer/cache/logs
The file `android/parse_logs.py` contains a script for parsing the logs and uploading them to the backend, for example:
cd android
cat data/logs/* | python parse_logs.py --server=http://localhost:3000
A minimalistic app for sending various BLE beacon messages from an Android phone, including spoofed GAEN and DP-3T EphId payloads. The app is intended for more convenient testing without installing official contact tracing apps or their test versions.
The app can be installed through Android Studio or by running `cd android; ./gradlew installSpooferDebug`, assuming you have a working Android development environment installed. Make sure Bluetooth is on in the phone and see Android Logcat for details of what the app is supposed to broadcast.
Note that the spoofed messages broadcast by this app may be caught and recorded by actual contact tracing apps, but this should not cause any disturbance to the real contact tracing service. Those messages will effectively be ignored, as they are never reported infected, similarly to the other "non-infected" traffic those apps see during their normal operation. However, do not modify the app to spam the airwaves with very rapidly changing EphIds/RPIs, which could theoretically cause a Denial-of-Service for nearby users.
A minimalistic server that can receive data from a fleet of sniffers, store it in a database, and visualize the results. HTTPS and authentication are currently not supported, but could be added using a reverse proxy. A single SQLite database cannot cope with a very high volume of data, but could be rather easily sharded.
- Setup NodeJS
cd backend
npm install
mkdir -p data
By default, the data is stored to the SQLite file `backend/data/database.db`, which can be directly examined/debugged with, e.g., the `sqlitebrowser` program.
- To populate the database with simulated data, run `npm run db:simulate`. This will erase any previous contents of the database.
- Run `npm run start`
- Go to http://localhost:3000 to observe the results. They will be shown on top of OpenStreetMap (using Leaflet)
- Clear any previous data: delete `backend/data/database*`
- Start the server: `node index.js` (see the file for `BIND` and `PORT` options).
- Start the Linux agent(s): `cd linux; sudo ./run.sh`. Change `SERVER` in `run.sh` and the spec in `agent.json` as necessary.
- Run the Android test app to spoof Contact Tracing messages with a known temporary exposure key (TEK)
- Open http://localhost:3000 (should show a blue circle)
- Mark the exposure key as infected, i.e., as a diagnosis key: `cd linux; ./resolve.py apple_google_en 6578616d706c65000000000000000000`
- Refresh the browser, the circle should have turned red
DISCLAIMER: This repository is a Proof-of-Concept. Deploying this kind of a system at scale would be a very bad idea for the following reasons:
- It may be illegal. It very probably is illegal under the GDPR/CCPA - unless you are a governmental entity who can argue it's for the greater good. Then different rules apply (also under the GDPR).
- Even if it were legal today, it could be illegal tomorrow. New legislation about the misuse of contact tracing data is being drafted in some countries.
- This implementation is not tested or secure (no HTTPS or authentication and, e.g., the parsing Python scripts can crash on invalid payloads)
- This implementation is AGPL-licensed (on purpose)
The only differences between the GAEN BLE specification and the Google Eddystone base protocol (cf. the diagram here) are the 16-bit service UUID (0xFEAA vs 0xFD6F) and the BLE advertise flags (0x06 vs 0x1A). Devices that can observe Eddystone beacons can probably easily be adapted to read GAEN traffic.
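To make the frame layout concrete, here is a minimal sketch of a parser for a raw advertising payload. It is not the code used by the Linux agent. It assumes the layout from the public GAEN Bluetooth specification: a flags structure, then a service-data structure whose payload is the 16-bit service UUID (little-endian) followed by a 16-byte RPI and 4 bytes of encrypted metadata. The function and constant names are hypothetical.

```python
GAEN_SERVICE_UUID = 0xFD6F       # 16-bit service UUID named in the text
AD_TYPE_FLAGS = 0x01             # standard BLE AD type: Flags
AD_TYPE_SERVICE_DATA_16 = 0x16   # standard BLE AD type: Service Data, 16-bit UUID


def iter_ad_structures(payload: bytes):
    """Yield (ad_type, data) for each length-prefixed AD structure."""
    pos = 0
    while pos < len(payload):
        length = payload[pos]  # counts the type byte plus the data bytes
        if length == 0 or pos + 1 + length > len(payload):
            break  # zero padding or a truncated structure
        yield payload[pos + 1], payload[pos + 2:pos + 1 + length]
        pos += 1 + length


def parse_gaen(payload: bytes, service_uuid: int = GAEN_SERVICE_UUID):
    """Return (flags, rpi, metadata), or None if no matching service data.

    flags is None when no flags structure precedes the service data.
    """
    flags = None
    for ad_type, data in iter_ad_structures(payload):
        if ad_type == AD_TYPE_FLAGS and len(data) == 1:
            flags = data[0]
        elif ad_type == AD_TYPE_SERVICE_DATA_16 and len(data) >= 18:
            if int.from_bytes(data[:2], "little") == service_uuid:
                return flags, data[2:18], data[18:]
    return None
```

Because the service UUID is a parameter, the same loop can be pointed at another beacon format that uses 16-bit service data, which is the sense in which existing beacon scanners are easy to adapt.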
In Google's reference implementation of the GAEN backend, the published key data is exported in a modified protocol buffer format, which can be parsed to JSON with the `linux/import_gaen_export.py` script in this repository.
See also my blog post on Medium for a longer technical background story on BLE contact tracing and indoor positioning.
The data exposed by the BLE contact tracing systems is called pseudonymous: name, email, or other direct personal contact information is not known to the system, but the trails of locations themselves often single out an individual. For example, work and home locations are often easy to identify from the data.
BLE-sniffing devices have already been deployed (see, e.g., the links here) for other purposes. In addition, any smartphone also has the technical ability to act as one (for any third-party application, see this preprint for existing examples), but Google & Apple can block this if this privacy issue is seen as more serious in the future. However, there are millions of devices controlled by other organizations which can potentially be transformed into BLE sniffers with an OTA software/firmware update. Laptops & cars for sure, new WiFi APs maybe, the rest of the Internet-of-Things - security cameras, fridges, etc. - anyone's guess.
As a general principle, the legal consequences of exploiting a weakness, or the requirement that the attacker needs some budget for hardware, are usually not considered good defenses in cybersecurity. I think this issue is serious, and in a click-bait headline, I would call it a "side-channel attack vulnerability".
The designers of the various contact tracing systems seem to be aware that this attack is possible, but it is unclear to me how serious they think it is. Concerns about this have been raised before (e.g., here and here). I have reported the existence of this PoC in the DP-3T issue tracker.
- April 2020: First version. Based on the spec only
- May 2020: Longer readme & simulation. Confirmed to work with DP-3T and reported to the issue tracker. Published a link to this README in various channels.
- August 2020: Confirmed to work with GAEN
- September 2020: Added a fully working Android sniffer implementation, revised README