Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement From<String> for Document #10

Closed

Conversation

aleksanb
Copy link
Contributor

@aleksanb aleksanb commented Jul 4, 2016

Currently, one cannot create a Document from a String,
as is the case when reading a response from Hyper into a String.
With From, this is now possible.

Example of new allowed behaviour:

let client = hyper::Client::new();
let mut response = try!(client.get(&request_url).send());
let mut body = String::new();
try!(response.read_to_string(&mut body));

let dom = Document::from(body);

Currently, one cannot create a Document from a String,
as is the case when reading a response from Hyper into a String.
With From<String>, this is now possible.
@utkarshkukreti
Copy link
Owner

Thanks for the PR!

Do we really need a From<String> when we have From<&str> considering it doesn't look like converting a String to Tendril is any more efficient (i.e. doesn't reuse the allocation) than converting a slice to it. In your example, you can just do:

let doc = Document::from(&body);

What do you think?

@utkarshkukreti
Copy link
Owner

utkarshkukreti commented Jul 6, 2016

I think we should instead have a Document::from_read(R) where R: Read that uses tendril::ReadExt::read_to_tendril() and saves 1 allocation completely.

With that, you would be able to write:

let client = hyper::Client::new();
let mut response = try!(client.get(&request_url).send());
let doc = try!(Document::from_read(response));

and reduce the allocations to half.

@aleksanb
Copy link
Contributor Author

aleksanb commented Jul 6, 2016

I like your last suggestion with Document::from_read8R) where R: Read.
Should I or you implement it?

Regarding the first comment, you can't currently do

let mut response = try!(client.get(&request_url).send());
let mut body = String::new();
try!(response.read_to_string(&mut body));
let dom = Document::from(&body);

as you quickly run into lifetime errors.

I'm not quite sure why but the error is:

src/vegvesen_api.rs:58:15: 58:29 error: the trait bound `select::document::Document: std::convert::From<&std::string::String>` is not satisfied [E0277]
src/vegvesen_api.rs:58     let dom = Document::from(&body);
                                     ^~~~~~~~~~~~~~
src/vegvesen_api.rs:58:15: 58:29 help: run `rustc --explain E0277` to see a detailed explanation
src/vegvesen_api.rs:58:15: 58:29 help: the following implementations were found:
src/vegvesen_api.rs:58:15: 58:29 help:   <select::document::Document as std::convert::From<tendril::tendril::Tendril<tendril::fmt::UTF8>>>
src/vegvesen_api.rs:58:15: 58:29 help:   <select::document::Document as std::convert::From<&'a str>>
src/vegvesen_api.rs:58:15: 58:29 note: required by `std::convert::From::from`

The current implementation is

impl<'a> From<&'a str> for Document {
    /// Parses the given `&str` into a `Document`.
    fn from(str: &str) -> Document {
        Document::from(StrTendril::from(str))
    }   
}

I tried adding 'a to the argument of from, but that didn't change anything either.

@utkarshkukreti
Copy link
Owner

Should I or you implement it?

I'll only have time to do it later this week, or if you want to, you could send a PR. A test would also be nice (You could use std::io::Cursor to create a Read from bytes). I would be happy to help in any way I can.

I believe the signature would look like:

fn from_read<R: Read>(r: R) -> std::io::Result<Document>

Regarding the first comment, you can't currently do

Ah, looks like just & is not enough. Please try this instead:

let doc = Document::from(&*body);

@utkarshkukreti
Copy link
Owner

Also, I just saw you're using select 0.2.2 in this project (and it was committed 4 days ago). Is there any specific reason for not using 0.3.0?

@aleksanb
Copy link
Contributor Author

aleksanb commented Jul 6, 2016

I was using 0.2.2 as i couldn't get my string coerced correctly on 3.0, and I found an other project using select and hyper with 0.2.2 which worked.

Could you explain why the &* is needed?
I'll take a look at implementing it with read.

@LeMoussel
Copy link

let doc = Document::from(&*body); It's OK with 0.3.0 version.
But I don't understand why the &* is needed?
eg Struct std::string::String Deref => Strings implement Deref<Target=str>, and so inherit all of str's methods. In addition, this means that you can pass a String to any function which takes a &str by using an ampersand (&)

@utkarshkukreti
Copy link
Owner

@aleksanb @LeMoussel I'm not sure myself why the extra * is needed as with this program:

struct Nothing;

impl<'a> From<&'a str> for Nothing {
    fn from(str: &str) -> Nothing {
        Nothing
    }
}

fn go(str: &str) {}

This compiles:

go(&String::new());

While this doesn't:

Nothing::from(&String::new());
// error: the trait bound `Nothing: std::convert::From<&std::string::String>` is not satisfied

@utkarshkukreti
Copy link
Owner

The chapter about Deref Coercions in the Rust Book doesn't mention this case: https://doc.rust-lang.org/book/deref-coercions.html.

@aleksanb
Copy link
Contributor Author

aleksanb commented Jul 8, 2016

Here's a sample implementation of the from_read version:

pub fn from_read<R: Read>(mut readable: R) -> io::Result<Document> {
    let mut byte_tendril = ByteTendril::new();
    try!(readable.read_to_tendril(&mut byte_tendril));
    let str_tendril: StrTendril = byte_tendril.try_reinterpret().unwrap();
    Ok(Document::from(str_tendril))
}   

The unwrap should be replaced by something, but the error type of the tendril is just the tendril itself, so i'm unsure what we should map the error to.
Any ideas?

@utkarshkukreti
Copy link
Owner

Ah, read_to_tendril cannot read into StrTendril directly. I think we should create and return an std::io::Error with std::io::ErrorKind::InvalidData ourselves if try_reinterpret fails.

InvalidData
Data not valid for the operation were encountered.
Unlike InvalidInput, this typically means that the operation parameters were valid, however the error was caused by malformed input data.
For example, a function that reads a file into a string will error with InvalidData if the file's contents are not valid UTF-8.

https://doc.rust-lang.org/stable/std/io/enum.ErrorKind.html

@utkarshkukreti
Copy link
Owner

This is how std:io handles invalid UTF-8 for functions that return Strings:

        if str::from_utf8(&g.s[g.len..]).is_err() {
            ret.and_then(|_| {
                Err(Error::new(ErrorKind::InvalidData,
                               "stream did not contain valid UTF-8"))
            })
        }

https://github.com/rust-lang/rust/blob/3fa1cdf23cd28b8dc5c8ba1f626c5256713cc991/src/libstd/io/mod.rs#L327-L332

@aleksanb
Copy link
Contributor Author

aleksanb commented Jul 8, 2016

I've put the new pull request up on https://github.com/utkarshkukreti/select.rs/pull/13/files,
so this one can probably be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants