Exposing the absolute position of objects after parsing #331

Open
hugsy opened this issue Jan 30, 2018 · 11 comments

@hugsy

hugsy commented Jan 30, 2018

Hi there!

I would like to suggest a minor improvement to kaitai-struct: it would be quite beneficial to have objects exposing their absolute position (from the _root stream), by showing their start and end positions.

According to @GreyCat , this feature already exists via the debug mode but is only available in Ruby, JS and Java. If this could be accessible to the other languages (at least Python and/or C++ 😄 ), that'd be awesome.
And ksv already uses that property (the highlight string at the bottom).

There would be many potential uses for such a feature. Notably, it would make it possible to generate, from a template, precisely mutated versions of one particular section declared in the KSY file.

Thanks again for the fantastic work!

@GreyCat
Member

GreyCat commented Jan 30, 2018

BTW, objects won't be exposing absolute positions, only the positions within their current stream. Absolute positions actually make little sense, as in many cases (like navigating inside a zlib-compressed stream) current stream positions won't really map to anything in the absolute world.

@hugsy
Author

hugsy commented Jan 30, 2018

Good point hehe. The absolute position in their stream would work just as well.

@arekbulski changed the title from "[Enhancement] Exposing the absolute position of objects after parsing" to "Exposing the absolute position of objects after parsing" on Jan 30, 2018
@arekbulski
Member

objects won't be exposing absolute positions, only the positions from their current stream

For a moment I thought that those are the same thing, and hugsy seems to be using a non-term that is a mixup of both. I think what @GreyCat meant was that those offsets are positions as reported by the stream's tell(), not pre-computed from the pure schema. Did I get that right?

@GreyCat
Member

GreyCat commented Feb 8, 2018

All that position recording stuff is actually very simple. Instead of just doing read + member assignment, we do something like:

this._debug['foo']['start'] = _io.pos()
this.foo = _io.read_something()
this._debug['foo']['end'] = _io.pos()

That's it, no magic. So, if that _io is actually not a root IO, but some substream and/or processed stream, it will obviously return positions relative to that stream, i.e.:

seq:
  - id: header
    size: 8
    # => would report [0, 8)
  - id: buffer
    size: 8 # creates substream
    type: my_buffer
    # => would report [8, 16)
types:
  my_buffer:
    seq:
      - id: foo
        type: u4
        # => would report [0, 4), as this is a substream;
        # if we were talking about absolute positions, one might expect [8, 12)
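To make the substream behavior concrete, here is a minimal hand-written Python sketch of the recording pattern described above. It is not generated Kaitai Struct code: the class names `Root` and `MyBuffer`, the use of `io.BytesIO` in place of a Kaitai stream, and the little-endian reading of `foo` are all assumptions made for illustration.

```python
import io
import struct


class MyBuffer:
    """Hypothetical stand-in for the `my_buffer` type; not generated code."""

    def __init__(self, stream):
        self._io = stream
        self._debug = {}
        # Record the position before and after the read, relative to
        # whatever stream this object was handed.
        self._debug["foo"] = {"start": self._io.tell()}
        # Assumption: u4 is read as little-endian here.
        (self.foo,) = struct.unpack("<I", self._io.read(4))
        self._debug["foo"]["end"] = self._io.tell()


class Root:
    """Hypothetical stand-in for the top-level type; not generated code."""

    def __init__(self, stream):
        self._io = stream
        self._debug = {}

        self._debug["header"] = {"start": self._io.tell()}
        self.header = self._io.read(8)
        self._debug["header"]["end"] = self._io.tell()

        self._debug["buffer"] = {"start": self._io.tell()}
        raw = self._io.read(8)
        self._debug["buffer"]["end"] = self._io.tell()
        # `size: 8` creates a substream: a fresh stream over those 8 bytes,
        # so positions inside it start again from 0.
        self.buffer = MyBuffer(io.BytesIO(raw))


root = Root(io.BytesIO(bytes(range(16))))
print(root._debug["header"])      # positions in the root stream
print(root._debug["buffer"])      # positions in the root stream
print(root.buffer._debug["foo"])  # positions in the substream
```

In this sketch `foo` reports a start of 0 and an end of 4, even though its bytes sit at offsets 8 to 12 of the original input, because `MyBuffer` only ever sees its own substream.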

@arekbulski
Member

Also pointer fields would report offsets like those specified, right?

@GreyCat
Member

GreyCat commented Feb 8, 2018

What do you mean by "pointer fields"?

@arekbulski
Member

Oops, I forgot what it's called in Kaitai, so I used Construct terminology. I meant an instance with "pos".

@GreyCat
Member

GreyCat commented Feb 8, 2018

Yeah, they would, with the difference that they might not be using the current _io directly, but have it overridden somehow, for example:

  def bar
    return @bar unless @bar.nil?
    io = foo._io # <= selects which io to use and remembers it
    _pos = io.pos
    io.seek(42)
    (@_debug['bar'] ||= {})[:start] = io.pos # <= uses `io`, not `_io`
    @bar = io.read_u4le
    (@_debug['bar'] ||= {})[:end] = io.pos # <= uses `io`, not `_io`
    io.seek(_pos)
    @bar
  end
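For readers hoping for the same behavior in Python, the Ruby accessor above could be mirrored roughly as follows. This is only a sketch under assumptions: `Parent` and `Foo` are hypothetical classes, `io.BytesIO` stands in for a Kaitai stream, and none of this is what the compiler emits for Python.

```python
import io
import struct


class Foo:
    """Hypothetical object that owns the stream the instance reads from."""

    def __init__(self, stream):
        self._io = stream


class Parent:
    """Hypothetical type with a lazy `bar` instance at `pos: 42`."""

    def __init__(self, stream, foo):
        self._io = stream
        self.foo = foo
        self._debug = {}
        self._bar = None

    @property
    def bar(self):
        if self._bar is not None:
            return self._bar
        stream = self.foo._io  # selects which stream to use, not self._io
        saved = stream.tell()  # remember where that stream was
        stream.seek(42)
        self._debug["bar"] = {"start": stream.tell()}
        (self._bar,) = struct.unpack("<I", stream.read(4))
        self._debug["bar"]["end"] = stream.tell()
        stream.seek(saved)     # restore the previous position
        return self._bar


foo_stream = io.BytesIO(bytes(range(64)))
foo_stream.seek(5)
parent = Parent(io.BytesIO(b""), Foo(foo_stream))
value = parent.bar
print(parent._debug["bar"])
```

The recorded start and end are positions in `foo`'s stream, and the stream position is restored afterwards, so reading the instance does not disturb sequential parsing.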

@arekbulski
Member

Which reminds me, should processxor return a stream instead of another byte array? It would avoid double memory consumption, but it would also be elaborate in implementation, and not faster either.

@GreyCat
Member

GreyCat commented Feb 8, 2018

I'm not sure what you're calling "double memory consumption" here. Indeed, a processor with a stream-like API (i.e. one that consumes a stream and produces a stream) is a nice goal, but it is also not a trivial solution (i.e. it complicates things a lot, it is not always faster, it is not always better in terms of memory consumption, etc.).
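The two designs being contrasted can be sketched in Python as follows. This is an illustration only, not the Kaitai Struct runtime API: `xor_eager` and `XorStream` are hypothetical names, and a single-byte key is assumed for brevity.

```python
import io


def xor_eager(data: bytes, key: int) -> bytes:
    """Byte-array style: decode everything up front into a second buffer."""
    return bytes(b ^ key for b in data)


class XorStream:
    """Stream style: wrap the source stream and decode on each read."""

    def __init__(self, inner, key):
        self._inner = inner
        self._key = key

    def read(self, size=-1):
        return bytes(b ^ self._key for b in self._inner.read(size))

    def seek(self, pos):
        return self._inner.seek(pos)

    def tell(self):
        return self._inner.tell()


raw = bytes([0x10, 0x20, 0x30, 0x40])
eager = io.BytesIO(xor_eager(raw, 0xFF))  # holds a full decoded copy
lazy = XorStream(io.BytesIO(raw), 0xFF)   # decodes only what is read

print(eager.read(2) == lazy.read(2))
```

The wrapper avoids holding a second decoded copy, but every read pays the decoding cost again, and the approach only stays this simple because XOR keeps a one-to-one mapping between input and output positions. A compressing process such as zlib has no such mapping, which is the same reason absolute positions do not carry over to processed streams.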

@andreasgrosche

I would like to see --debug support for C#.
Until then, I will try an approach using aspect-oriented programming to weave the necessary code into the generated classes.
