Address real-world object detection #5
Comments
A few thoughts and considerations from our team:
This is an interesting concern. It's worth noting, again, that this could be built as a polyfill using the camera. In fact, there are already libraries today which can identify the presence of objects (not necessarily their exact position) using the camera.
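As a hypothetical sketch of the "presence, not position" idea above: a polyfill with camera access could flag that *something* entered the scene using simple frame differencing, without localizing it. Real libraries would run an actual vision model on camera frames; the function names and thresholds here are illustrative assumptions, not any existing API.

```javascript
// Illustrative only: detect the *presence* of a change between two grayscale
// frames (flat pixel arrays) via frame differencing. Thresholds are assumed.
function frameDiffRatio(prevFrame, currFrame, threshold = 25) {
  if (prevFrame.length !== currFrame.length) {
    throw new Error("frames must be the same size");
  }
  let changed = 0;
  for (let i = 0; i < currFrame.length; i++) {
    if (Math.abs(currFrame[i] - prevFrame[i]) > threshold) changed++;
  }
  return changed / currFrame.length; // fraction of pixels that changed
}

// Heuristic "an object appeared": a large fraction of pixels changed.
function somethingEntered(prevFrame, currFrame, minRatio = 0.1) {
  return frameDiffRatio(prevFrame, currFrame) > minRatio;
}
```

Note that even this crude approach illustrates the privacy point: the page learns that an object is present without the platform ever exposing recognizable pixels to it.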
We haven't really talked at all about platform-level object detection / tracking (aside from talking about how we should talk about it). This is a big reason why I'm such a "squeaky wheel" about camera permissions. It is clear that if we hand video frames to JavaScript, "all bets are off": video frames (with relative device pose information) can be sent off to the cloud to be analyzed at leisure. We should assume that will happen. So the converse (we should be able to do WebXR without giving out camera frames) seems super-important to me; in fact, it feels to me like "a thing that the web could do that native platforms aren't going to do any time soon". It may be the case that platforms provide "thing" detection and tracking (objects, images, etc.) without giving access to the camera; the real advantage of such capabilities is more likely performance (both CPU/GPU and battery) than privacy or security (e.g., if I can do image tracking, I can look for signs, or even faces, as you suggest). Perhaps some things (like a fixed image/marker set, or something like QR codes) might be reasonably "safe". Overall, I find it hard to imagine that these sorts of features wouldn't come with a "dire warning" (akin to camera access).
QR codes can be a malware vector, so be very careful about accepting arbitrary QR codes. For facial recognition, as I said in another issue, I feel the best option would be to detect things that look like faces in the browser, and require an opt-in before they can be parsed with JS or sent anywhere. This would allow the browser to censor faces before they ever reach the application/server, unless the user has consented.
In this context, which I probably should have been clearer on, they are data, not "urls" to be loaded. They might be URLs, of course (my "app" might pull the appropriate bits off the end of an appropriate URL, and ignore the data in others). The point is that they are recognizable 2D images that can contain some data, and have a simple structure that can be robustly tracked in 3D.
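The "data, not URLs" distinction above can be sketched in code: an app classifies a decoded QR payload as untrusted input and decides what to do with it, rather than navigating to whatever the code contains. The allowlisted origin and function name below are hypothetical assumptions for illustration, not part of any spec.

```javascript
// Illustrative only: treat a decoded QR payload as untrusted data.
// The allowlist is an assumption for this sketch.
const ALLOWED_ORIGINS = new Set(["https://example.com"]);

function classifyQrPayload(payload) {
  let url;
  try {
    url = new URL(payload);
  } catch {
    // Not parseable as a URL: keep it as opaque data for the app to interpret.
    return { kind: "data", value: payload };
  }
  if (url.protocol === "https:" && ALLOWED_ORIGINS.has(url.origin)) {
    return { kind: "trusted-url", value: url.href };
  }
  // Anything else (javascript:, data:, unknown hosts) is never auto-loaded.
  return { kind: "untrusted-url", value: url.href };
}
```

The key design choice is that nothing is dereferenced automatically: a `javascript:` or unexpected-origin payload is surfaced as untrusted rather than executed, which addresses the malware-vector concern raised above.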
I think we need to be careful to distinguish between "facial detection and tracking" (i.e., what ARKit on the iPhone X does) and "recognition". It is easy to imagine safe ways of doing facial detection and tracking to provide useful capabilities (essentially, almost everything you see Apple promote with the X) without enabling recognition, if the app doesn't have access to the camera bits. But your point is interesting too; I've seen security researchers suggest similar things in the past, where we detect things in video frames that we don't want code to have access to, and hide them. A program can tell a face was there, but that it was removed (because, for example, there's a big hole in the data, or it's been replaced by a fixed facial image ... everyone ends up looking like Deadpool on the phone!)
We also need to prevent obscuring important real-world information (e.g., placing an XR object in front of a stop sign, a "danger: cliff" warning, etc.).
Agreed with both of you. We need to prevent obscuring important data, but we also need to prevent data gathering that could deanonymize users and those around them. With all the sensor data being collected, one of my worst nightmares is that XR is used to create an Orwellian surveillance system. We need to implement methods to prevent deanonymization. I feel this requires further R&D to determine whether this data can actually be used for deanonymization, and what the effective attack and prevention methods are.
I've added a section on this topic to the privacy & security explainer to address this issue (#14). Can you please take a look and provide feedback? I'll plan on closing this issue (and merging the PR) at the end of the week.
One comment I had in the document is that many of the mitigations assume the app is running at a permission level that does not give it full video/sensor access. This may be in contrast to the native APIs, which assume full access. This may be obvious, but it should be kept in mind, because some mitigations may limit use cases that would work when full access has been given.
I've just committed the PR #14 to address this issue. Thanks all!
An explainer should outline user privacy and security concerns (particularly threat vectors) when sites have the ability to detect real-world objects, including planar image detection. An explainer should additionally explore approaches to mitigating those concerns.