-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Channel Read Information Flow Violation #1197
Comments
I agree that this is potential leak and I think your understanding of channel_read is accurate. I feel like (2) and (3) are basically equivalent in practice. If L(channel) flowsTo L(caller) and not(L(caller) flowsTo L(channel)) then the queue would grow forever unless there is another node that can delete messages. I cannot think of many cases where this would be a useful setup to have, but there might be some obscure cases. Another option is to have single ownership of the read half and write half of a channel. This would require a node to stop using a channel half if it sends it to another node. Regarding broadcasting: I don't think the current model really supports broadcasting. It is more like work stealing or load balancing approaches, where only one copy of the message is received by whoever reads it first. The other nodes don't receive the same message. To really support broadcasting we would need to do something like only popping a message once all read-halves have read it, or keep a separate queue for each read half and automatically send a copy to each. Option 2 could support broadcasting, but it is not clear how multiple nodes could coordinate their timing (apart from having more channels which sort of defeats the purpose) to ensure that only one node deletes the messages once all other nodes have read it. |
Thanks for giving this a look @conradgrobler More comments on (2) separately from (3): Not sure if (2) is actually needed, but broadcasting seems potentially useful. (3) might be expressive enough to support the current use-cases, and is simplest to implement. |
On more thought, this seems like the cleanest way to broadcast. It can be done if the ABI uses (3). The sender just keeps this list internally. It actually uses one message fewer than option 2 (option 2 does one send then N acks where there are N receivers. The option you described just does N sends). |
We did discuss whether handles should be linear (i.e. always move, never copy) ages ago (prior to having much of an IFC plan), but didn't really come to a conclusion – I guess we may now have more of an opinion on it! @aferr, just to confirm: would linear handles solve the IFC violation?
Yeah, that's how I'd put it too – I don't think the current system supports broadcasting (nor am I aware of there being a need to). |
Yes, I think making the handles linear does solve the information leak. Edited: |
Thanks @aferr , very relevant finding, and great write-up! It seems that linearity of handles (1 sender : 1 receiver) would solve the issue, but I wonder if there is still a violation if a node reads from a channel, then linearly passes the handle over to another node. I guess this is fine because the handle passing would only be allowed if the flow is allowed from the two, but I need to fully convince myself this is safe. Also, I think we should seriously consider option (1). In practice, even currently, for each newly created node, we also usually need to create a corresponding channel; that channel usually has the same label as the node, and also is not shared with other nodes. Effectively its receive end is solidly attached to the receiving node. It may be more natural to just receive a handle to a node when creating it, and then use that to send messages to that node (in a sort of mailbox for that particular node). It may change the pattern for RPC-like invocations though, where we usually pass a response channel to the node that fulfils the RPC, from which the caller reads back any response. |
As a side note, this is a great example of something that I hope a formal abstract model of the Oak Runtime would help catch, even if it's not directly connected to the Rust implementation. In fact, if / when we get to writing down some model, it would be nice to do exactly this exercise, by introducing this behaviour in the read call, and check whether a theorem or property of the model actually fails. |
@tzn. I did start this a bit first in f-star, now a bit in coq instead. It's really rough still. Trying to pick a language and figure out how to do some things, still. |
That's great news, I'm really looking forward to that! Keep us posted (and let us know how we can help). |
(BTW I am not |
Via discussion with @tiziano88 over VC. I think we're leaning towards taking a variant of the linear channels option proposed by @daviddrysdale : Read handles are linear (there is at most one node that has a read handle at a given time) There's no rush to implement this. |
@rbehjati FYI this is the issue we talked about earlier in person. If you'd like to work on it, feel free to assign to yourself, and also we can keep using this issue for any preliminary discussion. |
After digging into the proofs a bit more, I don't think linear channels are enough to make it possible to remove this leak and make noninterference provable. Though in short, some ways to fix this would be:
If read handles are linear, maybe the first choice above is quite reasonable. Probably a common case is to make a channel have the same label as the one node that reads it. This design choice would make it difficult to re-purpose a channel for a different node at a different label, but maybe in that case you should really make a new channel anyway. (If we do have channels with linear read handles and the channels need to have the same label as the nodes that read them, this also seems functionally equivalent to not having channels, unless I am missing something). (I think it also should not matter that the channels are linear as long as the label of the node and the label of the channel the node reads are equal. Non-linear read handles should also be fine). More of an explanation as to why we need these extra requirements: To simplify the discussion, I'll just talk about a world with two labels, H and L, but this applies to the general case. (To go from this case to the general case, you just think about whether or not (label_of(n) flowsto observer) instead of "is the node H or L"). If a node labeled H takes a step, the state before the step s, and the state after the step s' should look the same to L. The problem is that a read is destructive because it pops the channel it reads from. If the read handle is linear, it could still be that the channel has label L and the node has label H, which means the low channel would change as a result of the read channel step. Would it be fine to make the ReadChannel call check label_of(ch) = label_of(reader_node) ? |
Extending the explanation above: (This comment is about a proof strategy that attempts to fix the problem without changing the design, but does not work). Another choice you might consider would be to make the contents of all channels unobservable to all nodes regardless of the label of the channels. This might seem reasonable because after all, the contents of channels are not directly observable, their contents are only observed when nodes read from them. However, if the contents of all channels are secret, then we completely lose information about the messages they store, and we can't prove the single-step theorem for noninterference for a read call in this case either. So, making the channels globally unobservable does not work. |
Thanks @aferr , good catch! FWIW, in Oak we are already assuming that channels are unobservable by nodes, otherwise the read-last-message attack would trivially extend to reading an arbitrary message, even if it is not the last one, even if read handles are linear. In particular, it is not possible for a node to check how many messages (if any) are in a channel, it can only try to read from that channel and find out if there is at least one message, which is why I expected linear read handles to be sufficient to fix this issue. I am not sure I fully understand what the downside in keeping it this way, if anything it seems it should make the proof easier, because there is obviously no way for nodes to use internal channel state to encode any information (which is exactly the issue you are describing, I think). |
@tiziano88 The first comment was the more important one than the second one. The high level point is: Would it be fine to change the ReadChannel API to make the check label_of(ch) = label_of(node) instead of label_of(ch) flowsTo label_of(node) ? And also, whether or not the read handles are linear does not change anything as far as noninterference proofs are concerned. I edited the comment two points up for clarity. (It really was not clear the first time. My mistake :) ) |
I understand your proposal, but my point is that (to me, informally) it seems an unnecessary additional restriction given that we already have a restriction in place that does not allow nodes to see inside channels in the first place, apart from when they actually read from them. But it seems that this restriction is not currently captured / capturable by the model? |
Just to double check, by making channels globally unobservable, I mean that even if the observer is LabelX, and the label of the channel is LabelX, the observer still can't tell whats inside the channel. All channels in the whole system look the same. A very intuitive explanation as to why this does not work is that clearly the contents of channels with label X are observable to nodes with label X because nodes with label X are allowed to read those channels. So we need to say that channels at level X are observable at level X (and above). (Clearly nodes can't check anything about channels that have a higher label than the node.) |
- but the eautos found an inconsistent proof: coq/coq#8215 It turns out with the present definitions, the lemma is not true. The definitions of node_state_low_eq and chan_state_low_eq need to be changed so that None is mapped to the low projection. Doing so would make it impossible for a public observer to tell whether or not a secret node or channel was created or destroyed. This is needed to make the unobservable step lemma provable because otherwise, channel create turns an unoccouped handle into one that is occupied. The case is similar with node create. - The call for ReadChannel needs to be changed so that label_of(ch) = label_of(node) rather than label_of(ch) << label_of(node) because channel reads are destructive. See also: project-oak#1197 (comment) (which is still being discussed)
OK, I think I see your point now, thanks. In that case, I guess the suggestion that a Node needs to have exactly the same label as the channel in order to read from it makes sense to me. Just to confirm: this would also allow us to not need linear read handles any more, right? I also think this may enable additional ways of reading channels, e.g. in principle a node may be able to read how many messages (if any) are contained in a channel, even without actually pulling all of them out, and other similar operations. Is this correct? Whether or not we actually end up implementing this functionality is another story of course. |
Right, I think they're not needed. If we still want them anyway, they would still be fine to include.
Yes, that should be fine as long as the label of the channel flows to the label of the node. I think if checking the number of messages in the channel was implemented as a separate call from a normal read, I think the check could just be L(ch) flowsTo L(node) rather than L(ch) = L(node) because this call doesn't need to change the channel's state. |
- but the eautos found an inconsistent proof: coq/coq#8215 It turns out with the present definitions, the lemma is not true. The definitions of node_state_low_eq and chan_state_low_eq need to be changed so that None is mapped to the low projection. Doing so would make it impossible for a public observer to tell whether or not a secret node or channel was created or destroyed. This is needed to make the unobservable step lemma provable because otherwise, channel create turns an unoccouped handle into one that is occupied. The case is similar with node create. - The call for ReadChannel needs to be changed so that label_of(ch) = label_of(node) rather than label_of(ch) << label_of(node) because channel reads are destructive. See also: project-oak#1197 (comment) (which is still being discussed)
I think there is a subtle information leak in the design of channel_read, but let me know if I am mistaken about how it works. I only noticed this while starting to formalize the runtime. No downgrading is involved. I did not think of this until I started trying to write a formal model of Oak with the goal of trying to get some NI-result.
There is a channel X, with label L.
L flowsTo L1.
L flowsTo L2.
not (L2 flowsTo L1)
not (L1 flowsTo L2)
Node A is labeled L1
Node B is labeled L2
Both A and B have the read handle for X.
If there is a message in X and A does read the message, the message is removed from the channel. When B reads from X, whether or not B sees this message depends on whether or not A has read from X first. This is a leak from a node at L1 to a node at L2, which is not allowed.
The trick here is that reading from channels is actually destructive. Reading changes the state in the channel, so the read call does both a read to access the data and a write by removing the message.
Some potential solutions:
Peak: the calling node receives the message from the channel, but the message is not removed. The check is L(channel) flowsTo L(caller).
Delete: (like a pop, but without seeing the message). The message is removed from the channel, but the caller does not get this message. The check is L(caller) flowsTo L(channel).
It may seem (3) is overly restrictive, but I think in practice it could be used quite similarly to sending messages from one node to another. If just (3) is implemented, we lose the option to broadcast messages to several nodes.
Option (2) does allow messages to be broadcast.
I think option (3) could be implemented in terms of the two calls in option (2). So perhaps doing that is best?
What do you think? What are some other solutions? (and actually most importantly, do I understand how channel_read works correctly? :) )
The text was updated successfully, but these errors were encountered: