Add Vec-less reading of events and borrowing deserialization #208

andreivasiliu · 2020-04-07T18:10:03Z

Add a new reader.read_event_unbuffered() method that does not require a user-given Vec buffer, and borrows from the input, implemented if the input is a &[u8].

Still needs more polishing, and it needs the buf_position branch to be pushed first, since it is based on those changes. I had to move all of the input-reading methods into a trait, and make them return a reference to the text that was read. Because of that, there's now a new requirement of at most 1 input-reading method being called per read_event(), so I had to rework whitespace skipping, and to move all of the bang element processing into yet another read_until-like function which doesn't return until it has all of the text.
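To make the borrowing idea concrete, here is a minimal sketch (not the code in this PR): when the input is already a `&[u8]`, a read-until helper can return a sub-slice of the input itself instead of copying into a caller-supplied `Vec`. The name `read_until_borrowed` and its signature are hypothetical.

```rust
/// Hypothetical helper: returns the bytes up to (not including) `delim`,
/// borrowed directly from the input, and advances `input` past them.
fn read_until_borrowed<'i>(input: &mut &'i [u8], delim: u8) -> Option<&'i [u8]> {
    // Copy the inner reference out so the result carries the input lifetime `'i`.
    let data: &'i [u8] = *input;
    if data.is_empty() {
        return None;
    }
    match data.iter().position(|&b| b == delim) {
        Some(i) => {
            // Skip the delimiter itself.
            *input = &data[i + 1..];
            Some(&data[..i])
        }
        None => {
            // No delimiter left: hand back everything that remains.
            *input = &data[data.len()..];
            Some(data)
        }
    }
}
```

Because the returned slice is tied to `'i` rather than to a scratch buffer, the caller can keep earlier results alive while reading further.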

Next up is making a deserializer that can use this to remove the DeserializeOwned restriction and allow user structs to borrow from the input when possible, allowing for truly zero-copy parsing and deserialization.

The only user-facing change is the 1 new method, the rest is completely hidden. I'm not fond of the new method's name, so I'd appreciate any help with figuring out a better name for it.

Bikeshedding for the rest of the names would be appreciated too.


andreivasiliu · 2020-04-18T16:54:58Z

Managed to make a prototype of a deserializer that can borrow strings from the input. I had to make a new public trait, and make it a requirement for the input, so this is a major-version-breaking kinda change, but I think most users won't notice it.

Closes #195 once finished. Requires #201 to be closed first.

Needs a lot more polish, I'm especially not liking the naming and the excessive duplication on the read_until-like methods.

tafia

Thanks very much for this PR. This is awesome, I like it very much!!

I have made a few comments, nothing major except perhaps checking whether the encoding feature works for the serde-related part.

I have 2 higher level comments:

1. As per the serde documentation (https://serde.rs/lifetimes.html), we are NOT supposed to use the 'de lifetime in our deserializer. I tried to update the bench-serde.rs file to use &str instead of String, but I got a lot of errors. I am no serde expert, so I may be doing something wrong.
2. Speaking of benchmarks, we have a 14% regression on the regular reader but a 35% improvement on the serde-related one.

name                   old ns/iter  new ns/iter  diff ns/iter   diff %  speedup 
 bench_quick_xml        233,063      264,648            31,585   13.55%   x 0.88 
 bench_serde_quick_xml  1,121,517    728,877          -392,640  -35.01%   x 1.54

I believe part of the regression may come from the checks done twice in the read_bang(_element) functions.

Again thanks a lot for this work, it is massive. I am very sorry I couldn't review it sooner.

tafia · 2020-05-17T07:15:11Z

src/de/mod.rs

-        visitor.visit_string(value)
+        let text = self.next_text()?;
+        let unescaped = text.unescaped()?;
+        let decoded = self.reader.decoder().decode(&unescaped)?;


Not a big deal, but I got it wrong the first time: we actually need to decode before unescaping.
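One way to see why the order matters, as a sketch under stated assumptions: in an encoding that is not ASCII-compatible (UTF-16LE here), the bytes of `&amp;` are interleaved with zero bytes, so a byte-level unescape would not find the entity. The helpers `decode_utf16le`, `unescape` and `decode_then_unescape` are hypothetical stand-ins, not the crate's decoder or unescape functions.

```rust
/// Hypothetical decoder: turns UTF-16LE bytes into a `String`.
fn decode_utf16le(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}

/// Hypothetical unescape covering only three predefined entities.
/// `&amp;` is replaced last so that `&amp;lt;` becomes `&lt;`, not `<`.
fn unescape(text: &str) -> String {
    text.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&amp;", "&")
}

/// Decode first, then unescape the decoded text.
fn decode_then_unescape(bytes: &[u8]) -> Option<String> {
    decode_utf16le(bytes).map(|decoded| unescape(&decoded))
}
```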

tafia · 2020-05-17T07:19:27Z

src/de/mod.rs

-        visitor.visit_string(value)
+        let text = self.next_text()?;
+        let unescaped = text.unescaped()?;
+        let decoded = self.reader.decoder().decode(&unescaped)?;


Also it needs to be adapted to support the encoding feature.

tafia · 2020-05-17T07:20:32Z

src/de/mod.rs

        self.deserialize_string(visitor)
    }

    fn deserialize_str<V: de::Visitor<'de>>(self, visitor: V) -> Result<V::Value, DeError> {


Same comments as deserialize_string

tafia · 2020-05-17T07:23:05Z

src/de/mod.rs

+
+    /// Skips until end element is found. Unlike `next()` it will not allocate
+    /// when it cannot satisfy the lifetime.
+    fn read_to_end(&mut self, name: &[u8]) -> Result<(), DeError>;


Does it need to be in the trait? Can't we implement it automatically?

I keep hitting an NLL limitation (I mentioned it in the issue), I can only ever touch the generic buf at most once in every branch; in the Vec<u8> implementation I need to clear it multiple times, and I can't do that while it's generic. Once Rust fixes that, or I find a way to bypass the issue, I can probably improve this.
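For context, a sketch of what such a limitation can look like, assuming it is the well-known "conditional return of a borrow" case (the thread does not spell out the exact code, so this is an illustration rather than the PR's situation). The function `tag_body` is hypothetical.

```rust
/// Hypothetical example: return the buffer's contents after a leading `<`,
/// or refill the buffer from `fallback` and return that instead.
fn tag_body<'b>(buf: &'b mut Vec<u8>, fallback: &[u8]) -> &'b [u8] {
    // The natural formulation is rejected by the current borrow checker,
    // because the borrow taken in the `if let` condition is treated as live
    // for all of `'b`, even on the path that does not return it:
    //
    //     if let Some(rest) = buf.strip_prefix(b"<") {
    //         return rest;
    //     }
    //     buf.clear(); // error: `*buf` is still considered borrowed
    //
    // Workaround: test first, and only create the returned borrow inside
    // the branch that actually returns it.
    if buf.starts_with(b"<") {
        return &buf[1..];
    }
    buf.clear();
    buf.extend_from_slice(fallback);
    &buf[..]
}
```

The workaround does the check twice (once to test, once to slice), which is the kind of duplication that is hard to avoid until the borrow checker accepts the first form.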

tafia · 2020-05-17T07:26:18Z

src/de/mod.rs

+}
+
+impl<'i, R: BufRead + 'i> BorrowingReader<'i> for IoReader<R> {
+    fn next(&mut self) -> Result<Event<'static>, DeError> {


Shouldn't it be Event<'i> as per the trait definition? Or shouldn't we implement BorrowingReader<'static> for IoReader<R>?

Both work, since 'static can be used anywhere a different lifetime is required. I could change it if you want, but I like making it obvious why it is able to satisfy 'i (i.e., it's static so it doesn't care about it).
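A small sketch of the point about `'static`, with hypothetical names (`TextSource`, `SliceSource`, `OwnedSource` and `collect_all` are not from the PR): an implementation that owns its data can declare a `'static` return type even though the trait method is written in terms of `'i`.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a reader trait parameterised by the input lifetime.
trait TextSource<'i> {
    fn next_text(&mut self) -> Option<Cow<'i, str>>;
}

/// Borrows each space-separated word straight from the input string.
struct SliceSource<'i> {
    rest: &'i str,
}

impl<'i> TextSource<'i> for SliceSource<'i> {
    fn next_text(&mut self) -> Option<Cow<'i, str>> {
        let rest: &'i str = self.rest;
        if rest.is_empty() {
            return None;
        }
        let (head, tail) = rest.split_once(' ').unwrap_or((rest, ""));
        self.rest = tail;
        Some(Cow::Borrowed(head))
    }
}

/// Owns its data, so everything it yields is `'static`.
struct OwnedSource {
    items: Vec<String>,
}

impl<'i> TextSource<'i> for OwnedSource {
    // `'static` is spelled out here; it is accepted where the trait asks for `'i`.
    fn next_text(&mut self) -> Option<Cow<'static, str>> {
        if self.items.is_empty() {
            None
        } else {
            Some(Cow::Owned(self.items.remove(0)))
        }
    }
}

/// Generic consumer that only knows about `'i`.
fn collect_all<'i, S: TextSource<'i>>(mut src: S) -> Vec<Cow<'i, str>> {
    let mut out = Vec::new();
    while let Some(text) = src.next_text() {
        out.push(text);
    }
    out
}
```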

tafia · 2020-05-17T07:40:31Z

src/reader.rs

+        };
+
+        if skip_text {
+            return self.read_event_buffered(buf);


returns can be omitted

tafia · 2020-05-17T08:38:58Z

src/reader.rs

+
+        // Note: Do not update position, so the error points to a sane place
+        // rather than at the EOF.
+        Err(Error::UnexpectedEof("Element".to_string()))


I believe we just return Ok(None) and expect the calling function to handle the associated error (UnexpectedEof, for instance).
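A minimal sketch of that convention, with hypothetical names and a simplified error type (not the crate's `Error`): the low-level read reports "ran out of input" as `Ok(None)` and leaves the position untouched, and the caller decides which error that means.

```rust
/// Simplified stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
enum Error {
    UnexpectedEof(String),
}

/// Low-level read: `Ok(None)` means the delimiter was never found.
/// The input position is not advanced in that case.
fn read_until<'i>(input: &mut &'i [u8], delim: u8) -> Result<Option<&'i [u8]>, Error> {
    let data: &'i [u8] = *input;
    match data.iter().position(|&b| b == delim) {
        Some(i) => {
            *input = &data[i + 1..];
            Ok(Some(&data[..i]))
        }
        None => Ok(None),
    }
}

/// Caller: only here does a missing `>` become an `UnexpectedEof` error.
fn read_element<'i>(input: &mut &'i [u8]) -> Result<&'i [u8], Error> {
    read_until(input, b'>')?.ok_or_else(|| Error::UnexpectedEof("Element".to_string()))
}
```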

tafia · 2020-05-17T09:02:13Z

src/reader.rs

-                }
-            }
-            let len = buf.len();
+    fn read_bang<'a, 'b>(&'a mut self, buf: &'b [u8]) -> Result<Event<'b>> {


You're checking the start of the buffer twice, once in read_bang and once in read_bang_element. We could probably find a way to do it only once: either merge the two functions (though you may need to repeat that for each trait), or have proper read_comment, read_cdata and read_doctype functions.
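One possible shape for "check only once", sketched with hypothetical names (`BangType`, `classify`, `terminator`, and this simplified `read_bang` are illustrations, not the PR's functions): classify the prefix a single time, then reuse the result both to find the end of the construct and to decide which event to build.

```rust
/// Which kind of `<!...>` construct we are looking at.
#[derive(Debug, PartialEq, Clone, Copy)]
enum BangType {
    Comment,
    CData,
    DocType,
}

impl BangType {
    /// `after_bang` is the input immediately following `<!`.
    fn classify(after_bang: &[u8]) -> Option<BangType> {
        if after_bang.starts_with(b"--") {
            Some(BangType::Comment)
        } else if after_bang.starts_with(b"[CDATA[") {
            Some(BangType::CData)
        } else if after_bang.len() >= 7 && after_bang[..7].eq_ignore_ascii_case(b"DOCTYPE") {
            Some(BangType::DocType)
        } else {
            None
        }
    }

    /// Byte sequence that closes this construct. Simplification: a DOCTYPE
    /// with an internal subset containing `>` is not handled.
    fn terminator(self) -> &'static [u8] {
        match self {
            BangType::Comment => &b"-->"[..],
            BangType::CData => &b"]]>"[..],
            BangType::DocType => &b">"[..],
        }
    }
}

/// Classifies once, then returns the kind and the bytes before the terminator
/// (the opening marker such as `--` or `[CDATA[` is still included).
fn read_bang(after_bang: &[u8]) -> Option<(BangType, &[u8])> {
    let kind = BangType::classify(after_bang)?;
    let term = kind.terminator();
    let end = after_bang.windows(term.len()).position(|w| w == term)?;
    Some((kind, &after_bang[..end]))
}
```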

tafia · 2020-05-17T09:55:03Z

> As per serde documentation (https://serde.rs/lifetimes.html) we are NOT supposed to use the 'de lifetime in our deserializer. I've tried to update the bench-serde.rs file using &str instead of String but I got lot of errors. I am no serde expert so I may be doing something wrong.

Actually using Cow<'a, str> worked but it was slower than using regular String ...

andreivasiliu · 2020-05-17T12:44:54Z

Cows, interestingly, are deserialized as owned by default (even when the deserializer gives a borrowed string to the visitor). If the field is annotated with #[serde(borrow)], then it'll borrow when it can, or take owned when decoding/unescaping says it can't.
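The "borrow when it can, owned when it can't" half of this can be sketched with the standard library alone. The function `unescape_cow` below is hypothetical and handles only three entities; it is not the crate's unescape routine and does not involve serde.

```rust
use std::borrow::Cow;

/// Hypothetical unescape: returns the input unchanged (borrowed) when there
/// is nothing to unescape, and allocates a new `String` only otherwise.
fn unescape_cow(raw: &str) -> Cow<'_, str> {
    if !raw.contains('&') {
        // Nothing to rewrite, so the caller can keep borrowing from the input.
        return Cow::Borrowed(raw);
    }
    // `&amp;` is replaced last so that `&amp;lt;` becomes `&lt;`, not `<`.
    Cow::Owned(
        raw.replace("&lt;", "<")
            .replace("&gt;", ">")
            .replace("&amp;", "&"),
    )
}
```

A `Cow<'a, str>` field can hold either result, which is why it fits inputs where only some strings need unescaping.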

I'm currently in the middle of another project so it'll take a ~~while~~ week or two until I get back to this, apologies!

tafia · 2020-05-17T12:57:21Z

No worries! I took far too long to review.

andreivasiliu · 2020-06-03T20:15:10Z

That regression is actually really bugging me, I'd like to know at least where it's from.

Benchmarks just confuse me, they're consistently saying that only text processing got worse... except I'm pretty sure nothing changed in text processing... o.O

  name                                old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
- bench_quick_xml                     232,610      272,305            39,695   17.07%   x 0.85
- bench_quick_xml_escaped             278,140      308,536            30,396   10.93%   x 0.90
+ bench_quick_xml_namespaced          414,775      396,945           -17,830   -4.30%   x 1.04
+ bench_quick_xml_read_cdata_event    82           71                    -11  -13.41%   x 1.15
+ bench_quick_xml_read_comment_event  83           72                    -11  -13.25%   x 1.15
+ bench_quick_xml_read_start_event    111          96                    -15  -13.51%   x 1.16
- bench_quick_xml_read_text_event     54           67                     13   24.07%   x 0.81

I'll need to come up with better benchmarks.

tafia · 2020-06-04T13:36:23Z

I agree benchmarks are extremely unsatisfying. Still, it repeats consistently. Maybe it is linked to that read_bang comment.

andreivasiliu · 2020-06-14T21:12:19Z

Looks like my previous benchmark was mostly right, it's mostly text that is slower; interestingly, if whitespace is stripped (and thus some events elided) via trim_text, it is sometimes even faster than before.

Half of it comes from a reindexing in read_bytes_until, which causes a bounds check now since it doesn't know that the buffer's length can only ever grow:

Ok(Some(&buf[start..]))

I don't know how to fix that yet. Still looking for the other half.
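For readers who have not seen the function, a rough sketch of the pattern being described (a simplified, hypothetical version, not the PR's `read_bytes_until`): the buffer's length is recorded before reading, and the newly appended bytes are returned by slicing from that offset afterwards.

```rust
use std::io::{self, BufRead};

/// Hypothetical simplified version: appends bytes up to `delim` to `buf`
/// and returns only the newly appended part, without the delimiter.
fn read_bytes_until<'b, R: BufRead>(
    reader: &mut R,
    delim: u8,
    buf: &'b mut Vec<u8>,
) -> io::Result<Option<&'b [u8]>> {
    // Remember where the new bytes will begin.
    let start = buf.len();
    let read = reader.read_until(delim, buf)?;
    if read == 0 {
        return Ok(None);
    }
    // `read_until` keeps the delimiter when it finds one; drop it.
    if buf.last() == Some(&delim) {
        buf.pop();
    }
    // The reindexing in question: a slice operation on `buf` starting at `start`.
    Ok(Some(&buf[start..]))
}
```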

Benchmarks:

  name                                          old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
- bench_quick_xml                               248,385      273,660            25,275   10.18%   x 0.91
- bench_quick_xml_escaped                       272,957      306,278            33,321   12.21%   x 0.89
+ bench_quick_xml_escaped_trimmed               257,206      248,625            -8,581   -3.34%   x 1.03
- bench_quick_xml_mostly_cdata                  21,851       22,985              1,134    5.19%   x 0.95
- bench_quick_xml_mostly_empty_tags             453,845      499,480            45,635   10.06%   x 0.91
- bench_quick_xml_mostly_namespaced_tags        62,985       76,694             13,709   21.77%   x 0.82
- bench_quick_xml_mostly_tags                   204,582      239,886            35,304   17.26%   x 0.85
- bench_quick_xml_mostly_tags_and_text          213,720      233,603            19,883    9.30%   x 0.91
- bench_quick_xml_mostly_tags_and_text_trimmed  217,949      219,513             1,564    0.72%   x 0.99
+ bench_quick_xml_mostly_tags_trimmed           163,403      144,813           -18,590  -11.38%   x 1.13
- bench_quick_xml_namespaced                    372,500      428,576            56,076   15.05%   x 0.87
+ bench_quick_xml_namespaced_trimmed            384,055      344,745           -39,310  -10.24%   x 1.11
+ bench_quick_xml_read_cdata_event_trimmed      84           80                     -4   -4.76%   x 1.05
+ bench_quick_xml_read_comment_event_trimmed    95           81                    -14  -14.74%   x 1.17
- bench_quick_xml_read_start_event              126          132                     6    4.76%   x 0.95
+ bench_quick_xml_read_start_event_trimmed      111          107                    -4   -3.60%   x 1.04
- bench_quick_xml_read_text_event               58           68                     10   17.24%   x 0.85
+ bench_quick_xml_trimmed                       238,913      234,001            -4,912   -2.06%   x 1.02

SergioBenitez · 2021-06-02T23:58:52Z

What prevented this from seeing movement? Being able to deserialize borrowed data blocks Rocket from using this library for its xml support. See rwf2/Rocket#1606.

andreivasiliu · 2021-06-03T06:21:48Z

There's a 15%-20% degradation in performance for people that don't use this, if we were to merge this; I spent months trying to figure out something here, and every time I found something that would work, it was blocked by an NLL limitation (this, described here). It got really frustrating, and other projects got more interesting.

Also, it was only blocking a private hobby project of mine, nothing as important as Rocket.

If @tafia is okay to make things slower for non-borrowing users, I can probably pick this up again.

Also, the lack of GATs makes the API kinda weird, since I have to use an output lifetime as a sort of input that the user has to somehow hint at to even be able to use it; so I just made all the traits and machinery private, but it would've been nice to have GATs, and was hoping they'd come out faster. This is not essential, I can just keep that stuff private.
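As an illustration of what GATs would change, here is a sketch that assumes a toolchain with stable generic associated types (Rust 1.65 or later). The trait `Source` and the struct `Buffered` are hypothetical. With a GAT, the lifetime of the returned data is tied to each call instead of being a parameter of the trait that the user has to name.

```rust
use std::io::BufRead;

// Pre-GAT shape, for comparison (the output lifetime is a trait parameter):
//
//     trait Source<'i> { fn next_chunk(&mut self) -> Option<&'i [u8]>; }

/// GAT shape: each call chooses the lifetime of the chunk it hands out.
trait Source {
    type Chunk<'a>
    where
        Self: 'a;

    fn next_chunk<'a>(&'a mut self) -> Option<Self::Chunk<'a>>;
}

/// A source that copies into its own scratch buffer and lends it out.
struct Buffered<R> {
    reader: R,
    buf: Vec<u8>,
}

impl<R: BufRead> Source for Buffered<R> {
    type Chunk<'a> = &'a [u8] where Self: 'a;

    fn next_chunk<'a>(&'a mut self) -> Option<Self::Chunk<'a>> {
        // The previous chunk is no longer borrowed, so the buffer can be reused.
        self.buf.clear();
        match self.reader.read_until(b'>', &mut self.buf) {
            Ok(n) if n > 0 => Some(&self.buf[..]),
            // End of input (I/O errors are also mapped to `None` in this sketch).
            _ => None,
        }
    }
}
```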

SergioBenitez · 2021-06-03T09:41:00Z

I've rebased and fixed a few issues. See #290. Feel free to take the work and continue here or continue there.

tafia · 2021-08-10T15:04:18Z

Closing it as #290 got merged

dralley · 2021-09-02T22:26:26Z

@andreivasiliu GATs are very, very close to being ready now: rust-lang/rust#44265

andreivasiliu · 2021-09-03T05:32:23Z

I know! I'm very excited about it. Although even once released, it'll probably take a while before I rewrite this to use them, to keep the minimum supported Rust version still accessible for users.

andreivasiliu mentioned this pull request Apr 7, 2020

DeserializeOwned prevents deserialization into structs with lifetime bounds #195

Closed

andreivasiliu marked this pull request as draft April 10, 2020 09:00

andreivasiliu force-pushed the unbuffered branch from cf038ac to 0591583 Compare April 18, 2020 16:39

andreivasiliu changed the title ~~Add Vec-less reading of events~~ Add Vec-less reading of events and borrowing deserialization Apr 18, 2020

andreivasiliu added 4 commits April 18, 2020 19:52

Fix tests on Windows

f1e4bf1

Add BufferedInput trait, rework read_until/read_elem_until to return …

cb2c67a

…slices

Add read_event_unbuffered

f24ae91

Add borrowing support to deserializer

3c637bf

andreivasiliu force-pushed the unbuffered branch from 0591583 to 3c637bf Compare April 18, 2020 16:53

tafia requested changes May 17, 2020


andreivasiliu mentioned this pull request Jun 28, 2020

Fix benchmarks on Windows and add trimmed variants #222

Merged

untitaker mentioned this pull request Nov 3, 2020

Is this crate aiming to parse HTML? #238

Closed

tobz1000 mentioned this pull request Nov 21, 2020

Deserialization behavior for Vec #177

Closed

SergioBenitez mentioned this pull request Jun 3, 2021

Add zero-copy deserialization #290

Merged

tafia closed this Aug 10, 2021
