Binary parser change #60

Skepfyr · 2019-07-26T16:48:46Z

I was attempting to see if I could do anything about the remaining unsafe from #53 and got slightly carried away... I thought I may as well put this up to see if you thought it would be worthwhile.
I've flipped the parser inside out so that instead of being callback-based it is an Iterator over the instructions.
It's a really large change to the API and I don't think the benefits are huge, however, it does unify a few things and is a slightly more rusty interface (IMO). It would take a bit more work to get up and running but probably only a day or two.

Skepfyr · 2019-08-09T23:38:23Z

Hey @antiagainst, I would love to hear your opinion on this, if for no other reason than that it doesn't get hopelessly out of date.

antiagainst

Sorry for dropping the ball! I like the changes to remove unsafe (num.rs) but for the decoder and parser interface, I think we need more discussion. What's the motivation for those changes?

Would you mind to separate the changes for the unsafe removal out to get it landed first?

antiagainst · 2019-08-15T03:49:24Z

rspirv/binary/assemble.rs

-    if bytes.len() % 4 == 0 {
-        words.push(0)
-    }
+    let chunks = s.as_bytes().chunks_exact(4);


antiagainst · 2019-08-15T03:49:44Z

rspirv/binary/autogen_error.rs

@@ -24,8 +24,8 @@ use std::{error, fmt};
 /// Decoder Error.
 #[derive(Debug, PartialEq)]
 pub enum Error {
-    StreamExpected(usize),
-    LimitReached(usize),
+    StreamExpected,


Sounds reasonable to me.

antiagainst · 2019-08-15T03:52:27Z

rspirv/binary/parser.rs

-pub struct Parser<'c, 'd> {
-    decoder: decoder::Decoder<'d>,
-    consumer: &'c mut dyn Consumer,
+pub struct Instructions<T>


Hmm, I'm not so sure about this. This is changing the whole parser interface but I'm not seeing how this is to be used? Another nit comment is that Parser is a more well-known term in compilers. Instruction is already heavily overloaded. Let's not make it worse.

antiagainst · 2019-08-15T03:54:38Z

rspirv/binary/decoder.rs

-
-        let b = f32_to_bytes(-12.34);
-        let mut d = Decoder::new(&b);
-        assert_eq!(Ok(-12.34), d.float32());
    }

    #[test]
    fn test_decode_float32() {


So decoder is removed but we still have tests here?

antiagainst · 2019-08-15T03:59:36Z

rspirv/binary/decoder.rs

-        }
-        Ok(words)
-    }
+pub struct Decoder {


I sort of like the existing decoder and parser structure because it allows other kinds of parsers. Whether there are such a need I cannot see it right now. But given it's already implemented, I think we can just keep it for now unless it becomes problematic for some feature we want to use. WDYT?

Skepfyr · 2019-08-15T17:58:24Z

Sorry, I've realised I should have explained this better. This changes the parser from being callback-based to being an Iterator of instructions, the parse_bytes and parse_words functions are the best examples of this change. This gives more control to the consumer because now it's pull-based, with the parser essentially acting as a complicated map function over the input. This is possible as instead of a generic parser infrastructure, it relies on the layout of SPIR-V modules.

This change was brought about by me trying to get rid of the unsafe in parse_words, although it's proably possible to do that without this change (but probably does require a reasonably large change). This just struck me as a slightly more idiomatic way of doing things, and as a bonus deleted some code and removed some types.

This is a very draft-y PR most of the small changes can be ignored, I was only putting it up here for the large scale structure to be considered, and I think most of the unsafe removal came in #53.

I agree this is a large change without large benefits, however seen as so much is changing right now, I feel like now is probably the best time to implement anything like this.

antiagainst · 2019-08-16T04:49:55Z

I see your motivation better now. Thanks for the detailed information! But I feel I'm still not very convinced over this. What additional benefits does this bring except from being more Rust idiomatic? I think that's a second thing compared to designing good layering and API. (Unless the existing impl is totally off w.r.t. Rust paradigm; but that means it probably won't be able to be expressed in Rust ;-P) IMO the change proposed here blurs the layering and API boundary. I think a good API should make problematic usages harder. The previous decoder and parser is very clear regarding that. If you are using decoder, it means no understanding of the grammar, so only low-level decoding functionality and you are on your own for making sure everything works. If you are using parser, then you are only exposed to either full instruction parsing or error; no way to get a single operand of an instruction. Now with the new Instructions, we can do both: decoding an instruction, and then decoding a single id, and then trying to decode another instruction, which will fail. This makes me feel uneasy.

I guess I'm not opposed changing to make it more Rust idiomatic; I'm just worried about the public interface. As long as the API is reasonable, we can change the impl whatever way we want. WDYT? (But I may misunderstand your proposal; feel free to correct me!)

Skepfyr · 2019-08-16T10:46:14Z

I think the public API in this is more structured than it used to be (although I might not have removed some pubs yet). The only public way to parse something is to call parse_bytes or parse_words, these both return something that looks like Result<(ModuleHeader, Instructions<_>)>. The ModuleHeader is exactly what would have been received by the consume_header method but now the types prove that you only get one per file and you get it first; Instructions should only have one public method next (from Iterator) which returns the next fully parsed instruction (or an error).

antiagainst · 2019-08-24T13:08:09Z

I see. So actually you'd like to make the decoder internal to the parser (which now becomes Instructions) entirely and only expose APIs for working with DR instructions. Isn't that still having the layering problem? I like the decoder and parser separation because they are modelling different concepts at different levels. So IMO it's cleaner to use two classes. It's also easier to test and has the additional benefits that others have access to the decoder API. Putting all of them in one class is creating a class that operates at multiple levels and serves multiple purposes... I'm still not quite seeing the real benefits here?

Given this is quite outdated right now (sorry I wasn't able to reply promptly!), would you mind to pocket this for now and let's focus on having SR fleshed out first? I don't want the world to be entirely broken when we cut the next release and folks suddenly find the codebase is entirely different. :)

Skepfyr · 2019-08-26T22:40:19Z

Yes, to me the decoder and parser are essentially the same, if you look at the current code the design of each are heavily intertwined. Although I do like elegant abstractions, I can't really see one here or why it would be useful, I would like to hear of any alternative use case for the decoder without the parser, I can't see one existing because in order to decode what comes next in the stream you need to have parsed what you already have. There is no reason it is harder to test because a unit testing module can access all the methods, they still exist but are private.

The benefit I see of doing this is to clean up the users API, it turns from a few functions, a trait implementaton, and a whole load of methods that the user doesn't really need to touch to two functions (parse_bytes and parse_words).

I'm actually happy to just leave this here, it turns out I was wrong about it getting outdated as all of the sr work doesn't touch any of this code. I'm not sure about whether it is more sensible to wait for the next release, the whole thing is essentially changing anyway it might be sensible to just get it over and done with in one go, rather than making large changes to APIs over multiple revisions.

Skepfyr · 2019-08-28T16:43:13Z

I attempted to implement some more of this just to see where it would take me. I'm now no longer convinced that this change specifically is worth it.
However, I did work out what is bothering me about the current API, we have 4 separate parsers exposed that the user has to feed the data through one-by-one (although there are some functions that do multiple steps of this), Decoder, Parser, dr, and sr. While I can see the benefits of this, it increases the complexity of the library and increases the amount of allocation. I might open a separate issue on this at some point but I agree that the change in this PR is not a good idea right now and I'm happy to leave this alone while we flesh out sr.

Skepfyr changed the title ~~Data representation parser change~~ Binary parser change Jul 26, 2019

antiagainst reviewed Aug 15, 2019

View reviewed changes

tmp

08c39d6

Skepfyr force-pushed the parser-overhaul branch from 46177b7 to 08c39d6 Compare August 15, 2019 17:20

Skepfyr closed this Aug 28, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Binary parser change #60

Binary parser change #60

Skepfyr commented Jul 26, 2019

Skepfyr commented Aug 9, 2019

antiagainst left a comment

antiagainst Aug 15, 2019

antiagainst Aug 15, 2019

antiagainst Aug 15, 2019

antiagainst Aug 15, 2019

antiagainst Aug 15, 2019

Skepfyr commented Aug 15, 2019

antiagainst commented Aug 16, 2019

Skepfyr commented Aug 16, 2019

antiagainst commented Aug 24, 2019

Skepfyr commented Aug 26, 2019

Skepfyr commented Aug 28, 2019

Binary parser change #60

Binary parser change #60

Conversation

Skepfyr commented Jul 26, 2019

Skepfyr commented Aug 9, 2019

antiagainst left a comment

Choose a reason for hiding this comment

antiagainst Aug 15, 2019

Choose a reason for hiding this comment

antiagainst Aug 15, 2019

Choose a reason for hiding this comment

antiagainst Aug 15, 2019

Choose a reason for hiding this comment

antiagainst Aug 15, 2019

Choose a reason for hiding this comment

antiagainst Aug 15, 2019

Choose a reason for hiding this comment

Skepfyr commented Aug 15, 2019

antiagainst commented Aug 16, 2019

Skepfyr commented Aug 16, 2019

antiagainst commented Aug 24, 2019

Skepfyr commented Aug 26, 2019

Skepfyr commented Aug 28, 2019