Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request for Enhancement: Support for top-level "add-ons" for map #213

Open
v1gnesh opened this issue May 15, 2021 · 7 comments
Open

Request for Enhancement: Support for top-level "add-ons" for map #213

v1gnesh opened this issue May 15, 2021 · 7 comments

Comments

@v1gnesh
Copy link

v1gnesh commented May 15, 2021

Please consider adding support for custom encodings.
In #212, what I've described is the conversion of bytes from one encoding into String.
The morph referred there is essentially a codepage conversion function.

If codepage can be added as a top-level attribute, that'll save me loads & loads of LoC.
That is, if the codepage conversion can optionally be applied to all struct fields that are String.

EDIT: A simple addition will be to say, 'if you supply your own codepage conversion function, then deku will allow the use a top-level setting to point to that function'.

Thanks again for this excellent library and all the work from the contributors!

@sharksforarms
Copy link
Owner

Understood, this does seem a little tedious. Not sure I'm in favor of a default map function, it would only be able to accept a single type? What happens to the a field in this case?

Have you considered the alternative of splitting it out into a separate struct to hide some of these details?

use deku::prelude::*;
use std::convert::TryFrom;

#[derive(Debug, DekuRead)]
pub struct H {
  a: u16,
  #[deku(bytes = "4")]
  b: MorphedString,
}

#[derive(Debug, DekuRead)]
#[deku(ctx = "bytes: deku::ctx::Size")]
pub struct MorphedString(#[deku(map = "morph", bytes_read = "bytes.byte_size().unwrap()")] String);

pub fn morph(bytes: &[u8]) -> Result<String, DekuError> {
    let str = std::str::from_utf8(&bytes).unwrap().to_string();
    Ok(str)
}

fn main() {
    let input = b"\xAB\xCDDEKU";
    let the_h = H::try_from(input.as_ref()).unwrap();

    dbg!(&the_h);
    assert_eq!(the_h.a, 0xCDAB);
    assert_eq!(the_h.b.0, "DEKU");
}

You could implement Deref, PartialEq, etc. to make this type more transparent.

Possibly related: #174

@v1gnesh
Copy link
Author

v1gnesh commented May 17, 2021

Not sure I'm in favor of a default map function, it would only be able to accept a single type?

Yes, ideally configurable in a top-level setting such as #[deku(conv="morph", types="String")]

What happens to the a field in this case?

That'll be processed as normal; in this case, #deku(endian="little")

It will be really excellent to not have to code the bytes_read or any tagging for each of the String (in this case) fields.
Yes, it's only my use case right now.
However, this will increase the simplicity & readability manifold.
It'll look just like a normal struct but with deku magic all over it.

#[deku(conv="morph", types="String", sizes="tuple")], where tuple points to a tuple where I code how many bytes for each String in the struct.

Obviously, I'm biased, but simplicity will surely help with adoption.
The only reason I came to this library is 'cause it's far simpler than other parsers (or manual rolling) out there.

@v1gnesh
Copy link
Author

v1gnesh commented May 17, 2021

Keeping it general...

Top-level functions:
map - works as it currently does.

map_types - apply map on these types
map_fields - apply map on these fields

map_sizes - bits/bytes of the fields to apply map on, in a Tuple array or something.

map_types & map_fields are mutually exclusive.

#[deku(endian="little", map="morph", map_types="String", map_sizes="tuple")]
pub struct H {
  a: u16,
  b: String,
  c: String,
  d: String,
  e: String,
}

@v1gnesh v1gnesh changed the title Request for Enhancement: Support for codepage conversion Request for Enhancement: Support for top-level "add-ons" for map May 20, 2021
@sharksforarms
Copy link
Owner

I feel like a top-level map attribute would imply "after the struct is read, apply this function, this isn't the case here. I also don't see the benefit of having a map_sizes?

Is there any reasons why you'd rather this approach vs attributes on each field? I feel like this is adding a unnecessary level on indirection with not much benefit.

I'm not convinced of this addition. I believe the current attributes provide enough flexibility while keeping things explicit. I recommend explicitly on each field via the attributes, or abstracted as described above.

I'm open to discuss further.

@v1gnesh
Copy link
Author

v1gnesh commented May 21, 2021

Hi, perhaps something like rust-lang/rust#83366 can help with the implementation.

after the struct is read, apply this function, this isn't the case here.

This is the case though... for all struct fields of a certain type (String here), do this particular map function, before creating the morphed struct.

I also don't see the benefit of having a map_sizes?

It's just to have a single place to provide bytes_read at the top.

I recommend explicitly on each field via the attributes, or abstracted as described above.
I tried the MorphedString example.. it didn't work. Also, the attribute bytes still claims a LoC.

Overall, it's no doubt that as the lib exists currently, it works.
I'm only looking to find out if it can be made to look far simpler.

Before/After examples in this comment - #213 (comment)

@v1gnesh
Copy link
Author

v1gnesh commented Jun 18, 2021

Any further thoughts on this, @sharksforarms?

@sharksforarms
Copy link
Owner

Similar to #217 (comment)

I rather avoid having multiple ways of accomplishing the same thing, this sort of abstraction could be done in a separate proc-macro crate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants