Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lifetime issue in new unbuffered logic #329

Closed
stevenroose opened this issue Oct 25, 2021 · 3 comments
Closed

Lifetime issue in new unbuffered logic #329

stevenroose opened this issue Oct 25, 2021 · 3 comments

Comments

@stevenroose
Copy link

I have been trying out the new alpha version and I think I stumbled upon a kinda-bug. Somehow something that should be possible is not, I think because of the way lifetimes are indicated in some methods. I'm trying to find a fix, but to reproduce the issue, I have the following snippet.

In the snippet, I'm trying to use a &'x [u8] and trying to deserialize into a struct that holds a Cow<'x, str> with the content of a tag. Somehow this snippet gives a lifetime error. I think the main culprit is the combination of do_unescape that equates the lifetime of the slice

pub fn do_unescape<'a>(
    raw: &'a [u8],
    custom_entities: Option<&HashMap<Vec<u8>, Vec<u8>>>,
) -> Result<Cow<'a, [u8]>, EscapeError> {

together with calling into do_escape as follows which actually overwrites the 'a lifetime with a new local lifetime: do_unescape(&*self.value, custom_entities).map_err(Error::EscapeError)

Here is the snippet:

use std::str;
use std::borrow::Cow;
use std::error::Error;

use quick_xml::Reader;
use quick_xml::events::Event;


pub struct Example<'a> {
	pub content: Cow<'a, str>,
}

/// Convert a UTF8-encoded [Cow<[u8]>] into a [Cow<str>].
fn cow_from_utf8<'a>(cow: Cow<'a, [u8]>) -> Result<Cow<'a, str>, str::Utf8Error> {
	Ok(match cow {
		Cow::Borrowed(v) => Cow::Borrowed(str::from_utf8(v)?),
		Cow::Owned(v) => Cow::Owned(String::from_utf8(v).map_err(|e| e.utf8_error())?),
	})
}

pub fn parse<'a>(xml: &'a impl AsRef<[u8]>) -> Result<Example<'a>, Box<dyn Error>> {
	let mut reader = Reader::from_reader(xml.as_ref());

	match reader.read_event_unbuffered()? {
		Event::Start(x) => match str::from_utf8(x.name())? {
			"Content" => {
				let content = match reader.read_event_unbuffered()? {
					Event::Text(t) => cow_from_utf8(t.unescaped()?)?,
					Event::End(e) if e.name() == x.name() => Cow::Borrowed(""),
					e => return Err(format!("unexpected event '{:?}'", e).into()),
				};
				match reader.read_event_unbuffered()? {
					Event::End(e) if e.name() == x.name() => {}
					e => return Err(format!("unexpected event '{:?}'", e).into()),
				}
				Ok(Example {
					content: content,
				})
			}
			t => Err(format!("unexpected tag '{}'", t).into()),
		}
		e => Err(format!("unexpected event '{:?}'", e).into()),
	}
}
@stevenroose
Copy link
Author

So I created a branch in which I added the lifetimes that should be added to the API in order to be correct. And it breaks borrowck atm, so I'm trying to fix them. Someone with more knowledge of the project might be have an easier time to figure it out, though.

Here are the lifetimes: stevenroose@fb7a3dd

@stevenroose
Copy link
Author

I'm thinking this might be a problem in Rust's Cow implementation or even some trait definitions. Both AsRef, Borrow and Deref return references with a lifetime equal to the lifetime of the borrow of self. So a &'a Cow<'b, [u8]> can seemingly not be made into a &'b [u8], only into &'a [u8].

@stevenroose
Copy link
Author

So I'm starting to also convert the decode functionality, because it suffers from the same problem. Though when doing multiple conversions that are free on &[u8] on a Cow<[u8]> seem to do a lot of subsequent re-allocations in the owned case. I'm thinking if we can somehow use the Bytes crate to do this more efficiently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant