Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

undefined behavior in safe code with public Response<'static> field #30

Closed
ExpHP opened this Issue Nov 3, 2018 · 3 comments

Comments

Projects
None yet
2 participants
@ExpHP
Copy link

ExpHP commented Nov 3, 2018

extern crate tokio_imap;
extern crate tokio_codec;
extern crate bytes;

use std::fmt::Debug;
use tokio_codec::Decoder;
use tokio_imap::proto::{ImapCodec};

// Returns Response<'static>
fn make_bomb() -> impl Debug {
    use bytes::{BytesMut, BufMut};
    let mut buf = BytesMut::with_capacity(1_000_000_000);
    buf.put("* OK [UNSEEN 3] Message 3 is first unseen\r\n");

    let mut decoder = ImapCodec::default();
    let mut result = decoder.decode(&mut buf).unwrap().unwrap();

    println!("  ok: {:?}", result.response); // seems legit
    result.response
} // result is freed. uh oh.

fn main() {
    // get a Response<'static> that points to freed memory
    let bomb = make_bomb();
    for _ in 0..10 {
        // make other allocations that zero out memory
        make_bomb();
    }
    println!("uhoh: {:?}", bomb);
}

Release output:

  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
  ok: Data { status: Ok, code: Some(Unseen(3)), information: Some("Message 3 is first unseen") }
uhoh: Data { status: Ok, code: Some(Unseen(3)), information: Some("\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}") }
@djc

This comment has been minimized.

Copy link
Owner

djc commented Nov 4, 2018

Thanks for creating an issue about this. So do you think this way would be safe?

diff --git a/tokio-imap/src/proto.rs b/tokio-imap/src/proto.rs
index 4ef8fc1..45a5ddd 100644
--- a/tokio-imap/src/proto.rs
+++ b/tokio-imap/src/proto.rs
@@ -82,7 +82,7 @@ pub struct ResponseData {
     // is transmuted to `'static` by the `Decoder`, instead, and
     // references returned to callers of `ResponseData` are limited
     // to the lifetime of the `ResponseData` struct.
-    pub response: Response<'static>,
+    response: Response<'static>,
 }

 impl ResponseData {
@ExpHP

This comment has been minimized.

Copy link
Author

ExpHP commented Nov 5, 2018

Thanks for responding!

Short answer: I think so.

Long answer:

So, the things I know of that you need to watch out for when using this pattern are:

  1. All public access to the inner reference must be bound to a lifetime that borrows self.
  2. If the reference field can be used to mutate the raw field, then extreme care must be taken with raw. (e.g. it might not even be safe to, for instance, #[derive(Debug)]).
  3. The code must not access raw in a way that may invalidate response, followed by an access to response which depends on that validity.

Considering these in order:

  1. This is now satisfied with your change.
  2. It is easy to verify from the public API of the module that no mutation can be done through Response.
  3. One can easily see from the code that there is no access to raw after it is saved... except in drop glue. The question then becomes, are we safe in the off-chance that raw is deinitialized before response? It seems unlikely to me that there is any danger since Response seems to be some sort of AST type (so the Drop glue for Response should not read any data from raw), but it is laborious for an outsider like me to check all the child structs contained within; it'd be nice to see a comment explicitly mentioning this consideration.
    (If you can't provide this guarantee, there is std::mem::ManuallyDrop)

Two other off-side comments:

  • I feel that mem::transmute should always be annotated to make it obvious the exact transformation that's happening and to guard against getting more than you bargained for. (this annotation can be written using '_ from 1.26 onwards)
  • The second mem::transmute (in parsed) is unnecessary due to covariance and can be removed. (unless you prefer to keep it for some reason)

---
 tokio-imap/src/proto.rs | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/tokio-imap/src/proto.rs b/tokio-imap/src/proto.rs
index 4ef8fc1..11c01f1 100644
--- a/tokio-imap/src/proto.rs
+++ b/tokio-imap/src/proto.rs
@@ -26,7 +26,7 @@ impl Default for ImapCodec {
     }
 }
 
-impl<'a> Decoder for ImapCodec {
+impl Decoder for ImapCodec {
     type Item = ResponseData;
     type Error = io::Error;
     fn decode(&mut self, buf: &mut BytesMut) -> Result<Option<Self::Item>, io::Error> {
@@ -35,11 +35,13 @@ impl<'a> Decoder for ImapCodec {
         }
         let (response, rsp_len) = match imap_proto::parse_response(buf) {
             Ok((remaining, response)) => {
-                // This SHOULD be acceptable/safe: BytesMut storage memory is
-                // allocated on the heap and should not move. It will not be
-                // freed as long as we keep a reference alive, which we do
-                // by retaining a reference to the split buffer, below.
-                let response = unsafe { mem::transmute(response) };
+                let response = unsafe {
+                    // This SHOULD be acceptable/safe: BytesMut storage memory is
+                    // allocated on the heap and should not move. It will not be
+                    // freed as long as we keep a reference alive, which we do
+                    // by retaining a reference to the split buffer, below.
+                    mem::transmute::<Response<'_>, Response<'static>>(response)
+                };
                 (response, buf.len() - remaining.len())
             }
             Err(nom::Err::Incomplete(Needed::Size(min))) => {
@@ -82,7 +84,10 @@ pub struct ResponseData {
     // is transmuted to `'static` by the `Decoder`, instead, and
     // references returned to callers of `ResponseData` are limited
     // to the lifetime of the `ResponseData` struct.
-    pub response: Response<'static>,
+    //
+    // `raw` is never mutated prior to being dropped, and `Response`
+    // does not try to access the borrowed data in its drop glue.
+    response: Response<'static>,
 }
 
 impl ResponseData {
@@ -93,7 +98,7 @@ impl ResponseData {
         }
     }
     pub fn parsed(&self) -> &Response {
-        unsafe { mem::transmute(&self.response) }
+        &self.response
     }
 }
 
-- 
2.19.0

@djc djc closed this in 804009e Nov 6, 2018

@djc

This comment has been minimized.

Copy link
Owner

djc commented Nov 6, 2018

Thanks for the review! I've pushed a more minimal version and will push out a release shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.