-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get rid of the unsafe block in Parser::new #6
Comments
In the latest crates.io release (0.5.12), if I keep a pointer to the text inside an This seems linked to this issue, right? |
Hi, I’m not sure what you mean. Could you please provide a working example? |
Sorry to say that so bluntly, but based on my understanding (happy to be proven wrong!) what is being done in the I don't think that the current interface with the given lifetimes can be made to work safely. On the one hand you want to bind ownership of the string to the I don't have a ready solution (Sorry! I am just driving by if you will), so at this point I am just raising awareness. I would not use the crate as-is if I were to rely on the |
Thanks, @d-e-s-o, for the feedback! Why do you say the memory will be deallocated in fn read_internal<'l, R>(mut source: R) -> io::Result<Parser<'l>>
where
R: Read,
{
let mut content = String::new();
source.read_to_string(&mut content)?;
Ok(Parser::new(content))
}
pub struct Parser<'l> {
content: Cow<'l, str>,
reader: Reader<'l>,
}
impl<'l> Parser<'l> {
pub fn new<T>(content: T) -> Self
where
T: Into<Cow<'l, str>>,
{
let content = content.into();
let reader = unsafe { ::std::mem::transmute(Reader::new(&*content)) };
Parser { content, reader }
}
} |
By the way, what I meant is really simple. If I keep a borrow |
@P-E-Meunier, do you have a working example of this? |
Honestly, looking at it again...I made changes in an attempt to understand and fix and I believe was commenting on what I believe the code to look like before... it was late. :-| Yes, you are right about the passing of ownership. In any case, the larger unsafety issue remains. Consider this patch: diff --git src/lib.rs src/lib.rs
index 4bdeb8..84be52 100644
--- src/lib.rs
+++ src/lib.rs
@@ -136,6 +136,7 @@ mod tests {
use crate::parser::{Event, Parser};
const TEST_PATH: &'static str = "tests/fixtures/benton.svg";
+ const TEST_PATH2: &'static str = "tests/fixtures/benton2.svg";
#[test]
fn open() {
@@ -167,4 +168,36 @@ mod tests {
assert!(parser.next().is_none());
}
+
+ #[test]
+ fn read2() {
+ let t1: &str = {
+ let mut p1 = crate::read(&mut File::open(self::TEST_PATH).unwrap()).unwrap();
+ let t1 = p1.nth(9).unwrap();
+ match t1 {
+ Event::Text(text) => {
+ println!("{}", text);
+ text
+ },
+ _ => unreachable!(),
+ }
+ };
+
+ let t2: &str = {
+ let mut p2 = crate::read(&mut File::open(self::TEST_PATH2).unwrap()).unwrap();
+
+ let t2 = p2.nth(9).unwrap();
+ match t2 {
+ Event::Text(text) => {
+ println!("{}", text);
+ text
+ },
+ _ => unreachable!(),
+ }
+ };
+
+ println!("{} (should still be: 'Test1')", t1);
+ println!("{} (should still be: 'ZOMG IT CHANGED')", t2);
+ }
+
} This new test will do what @P-E-Meunier describes, I believe. I added a text element to the existing test file and created another file with a different text. Here is the test data: diff --git tests/fixtures/benton.svg tests/fixtures/benton.svg
index 003c9f..2658f9 100644
--- tests/fixtures/benton.svg
+++ tests/fixtures/benton.svg
@@ -15,4 +15,5 @@
<path d="M544,50c-67,0-129,63-129,143c0,107,87,191,199,191c81,0,149-60,149-125c0-42-29-73-62-73c-22,0-44,17-44,40
c0,25,19,43,41,43c11,0,26-5,31-11c16-16,26-3,27,5c1,45-40,93-123,93c-90,0-192-91-192-177c0-81,52-123,99-121c11,2,21,9,5,24
c-7,4-12,20-12,30c0,22,20,40,45,40s43-21,43-43C621,78,588,50,544,50z"/>
+<text x="10" y="10">Test1</text>^M
</svg>
diff --git tests/fixtures/benton2.svg tests/fixtures/benton2.svg
new file mode 100644
index 000000..95cc10
--- /dev/null
+++ tests/fixtures/benton2.svg
@@ -0,0 +1,19 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!-- Generator: Adobe Illustrator 18.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
+ viewBox="0 0 800 800" enable-background="new 0 0 800 800" xml:space="preserve">
+<path d="M249,752c67,0,129-63,129-143c0-107-87-191-199-191c-81,0-149,60-149,125c0,42,29,73,62,73c22,0,44-17,44-40
+ c0-25-19-43-41-43c-11,0-26,5-31,11c-17,16-26,3-27-5c-2-45,40-93,123-93c90,0,192,91,192,177c0,81-52,123-99,121c-11-2-21-9-5>
+ c7-4,12-20,12-30c0-22-20-40-45-40s-43,21-43,43C172,724,205,752,249,752z"/>
+<path d="M544,752c44,0,77-28,77-59c0-22-18-43-43-43s-45,18-45,40c0,10,5,26,12,30c16,15,7,22-5,24c-47,2-99-40-99-121
+ c0-86,102-177,192-177c83,0,124,48,123,93c-1,8-11,21-27,5c-5-6-20-11-31-11c-22,0-41,18-41,43c0,23,22,40,44,40c33,0,62-31,62>
+ c0-65-68-125-149-125c-112,0-199,84-199,191C415,689,477,752,544,752z"/>
+<path d="M249,50c-44,0-77,28-77,59c0,22,18,43,43,43s45-18,45-40c0-10-5-26-12-30c-16-15-6-22,5-24c47-2,99,40,99,121
+ c0,86-102,177-192,177c-83,0-125-48-123-93c1-8,10-21,27-5c5,6,20,11,31,11c22,0,41-18,41-43c0-23-22-40-44-40c-33,0-62,31-62,>
+ c0,65,68,125,149,125c112,0,199-84,199-191C378,113,316,50,249,50z"/>
+<path d="M544,50c-67,0-129,63-129,143c0,107,87,191,199,191c81,0,149-60,149-125c0-42-29-73-62-73c-22,0-44,17-44,40
+ c0,25,19,43,41,43c11,0,26-5,31-11c16-16,26-3,27,5c1,45-40,93-123,93c-90,0-192-91-192-177c0-81,52-123,99-121c11,2,21,9,5,24
+ c-7,4-12,20-12,30c0,22,20,40,45,40s43-21,43-43C621,78,588,50,544,50z"/>
+<text x="10" y="10">ZOMG IT CHANGED</text>
+</svg> Now, that still "works", in the sense that the text did not change as he described. $ cargo test -- read2 --nocapture
Compiling svg v0.8.0 (/home/deso/local/opt/svg)
Finished test [unoptimized + debuginfo] target(s) in 0.89s
Running svg-4dbe958f0f293962
running 1 test
Test1
ZOMG IT CHANGED
Test1 (should still be: 'Test1')
ZOMG IT CHANGED (should still be: 'ZOMG IT CHANGED')
test tests::read2 ... ok However, I believe we are just lucky. The memory is not being reused. So let's switch to diff --git Cargo.toml Cargo.toml
index 7fc3e9..3989b1 100644
--- Cargo.toml
+++ Cargo.toml
@@ -20,3 +20,6 @@ repository = "https://github.com/bodoni/svg"
readme = "README.md"
categories = ["multimedia::images", "parsing", "rendering::data-formats"]
keywords = ["vector-graphics"]
+
+[dependencies]
+jemallocator = "0.3.2"
diff --git src/lib.rs src/lib.rs
index 84be52..e7b576 100644
--- src/lib.rs
+++ src/lib.rs
@@ -61,6 +61,11 @@
//! # }
//! ```
+use jemallocator::Jemalloc;
+
+#[global_allocator]
+static GLOBAL: Jemalloc = Jemalloc;
+
use std::fs::File;
use std::io::{self, Read, Write};
use std::path::Path; $ cargo test -- read2 --nocapture
Finished test [unoptimized + debuginfo] target(s) in 0.01s
Running svg-e9ef7eefbc050772
running 1 test
Test1
ZOMG IT CHANGED
ZOMG (should still be: 'Test1')
ZOMG IT CHANGED (should still be: 'ZOMG IT CHANGED')
test tests::read2 ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 22 filtered out And you can see that |
Interesting! I got the same output as you. The idea with the text node was that it would have the lifetime of the content passed into pub struct Parser<'l> {
content: Cow<'l, str>,
reader: Reader<'l>,
}
pub enum Event<'l> {
Text(&'l str),
} It should not be legal to use It works as expected when passing a reference (“ #[test]
fn read2() {
use std::io::Read;
use crate::parser::Parser;
let t1: &str = {
let mut file = File::open(self::TEST_PATH).unwrap();
let mut content = String::new();
file.read_to_string(&mut content).unwrap();
let mut p1 = Parser::new(&content);
let t1 = p1.nth(9).unwrap();
match t1 {
Event::Text(text) => {
println!("{}", text);
text
},
_ => unreachable!(),
}
};
let t2: &str = {
let mut p2 = crate::read(&mut File::open(self::TEST_PATH2).unwrap()).unwrap();
let t2 = p2.nth(9).unwrap();
match t2 {
Event::Text(text) => {
println!("{}", text);
text
},
_ => unreachable!(),
}
};
println!("{} (should still be: 'Test1')", t1);
println!("{} (should still be: 'ZOMG IT CHANGED')", t2);
} |
That makes sense to me. I believe Which brings me back to the initial point I made, which is that I don't believe that the current interface with the given lifetime annotations can be made work. Just from a conceptual point of view I don't see how it would be possible. But happy to be proven wrong. |
Per discussion in bodoni#6, the lifetimes and use of unsafe in Parser's constructor may not be sound. This removes the offending unsafe code and offers an argument for the unsoundness of the existing approach and the necessity of leaking to soundly satisfy the current api's lifetimes.
Hello, My understanding of this matches @d-e-s-o's - when providing an owned |
I've been looking at this too, trying understand what uses cases might motivate this API. (And trying to improve my understanding of Rust's ownership, borrowing, and lifetimes along the way. So this is more of an exploration than a conclusion.) I wonder if it would be sufficient if In the test cases that exercise the test!("<foo>", "foo"); which, after the macro does its thing, just becomes: let mut parser = Parser::new("foo"); So I tried an experiment: just make that change and see what happens. Here's a commit to demonstrate that. Now run
Ok, that makes sense:
Notice we're doing So the borrow checker complains because we're allocating memory in That's how I understand the point raised by @d-e-s-o:
My intuition seems similar to that: we can't have our cake and eat it too here. Because So what if we instead pass ownership of that memory to That's how I understand @IvanUkhov's point: that the ownership of the memory for Back to @d-e-s-o's point, though, my intuition is that it seems like what we're fundamentally trying to achieve here is a borrow with a lifetime of So it seems that the key question is: is that what the unsafe This is how I understand the concerns about whether
The fact that the lifetime couldn't be inferred in the absence of the Reading the documentation on
Is that what this use of If so, I think the reasoning and the demonstration above show that it's not having that intended effect. Instead, it's creating an undefined state. So, circling back to my initial question: what use cases motivate this particular API? If those were known, then we could consider alternative APIs that could be implemented safely. |
@mlwilkerson, thank you for presenting your thought process! It matches well my own understanding of the situation. Regarding the motivation behind the current API of |
It would be great to get rid of the unsafe code in
Parser::new
.The text was updated successfully, but these errors were encountered: