repeat: size, repeat: pos #24

Closed
ghost opened this issue Aug 23, 2016 · 8 comments

ghost commented Aug 23, 2016

Sometimes one needs to describe repeated blocks whose count is unknown, where only the total size of all blocks or the position where they end is known. Consider the following .ksy file:

meta:
  id: test
  file-extension: ext
  endian: le

seq:
  - id: table_1_size
    type: u4
  - id: table_1
    type: table_1_entry
  - id: garbage
    size-eos: true

types:
  table_1_entry:
    seq:
      - id: len
        type: u4
      - id: data
        size: len

Here we want to read the full table_1 but don't know how many entries exist. My suggestion is to add a repeat: size attribute that reads as many table_1 entries as fit in the specified size:

seq:
  - id: table_1_size
    type: u4
  - id: table_1
    type: table_1_entry
    repeat: size
    repeat-size: table_1_size
  - id: garbage
    size-eos: true

The other option is to read up to a given position in the stream. It might look like this:

seq:
  - id: offset_to_garbage
    type: u4
  - id: table_1
    type: table_1_entry
    repeat: pos
    repeat-pos: offset_to_garbage
  - id: garbage
    size-eos: true
@GreyCat GreyCat self-assigned this Aug 23, 2016
Member

GreyCat commented Aug 23, 2016

Your format can be described perfectly well in the current language by adding an extra level of nesting:

seq:
  - id: table_1_size
    type: u4
  - id: table_1
    size: table_1_size
    type: table_1
types:
  table_1:
    seq:
      - id: entries
        type: table_1_entry
        repeat: eos
  table_1_entry:
    seq:
      - id: len
        type: u4
      - id: data
        size: len

In this example, table_1 is just a container with known size, which actually creates a substream with a fixed size (table_1_size). Then, in table_1 type you can just set entries to repeat until the end of this substream.

The only downside of this is that you'll have to write this extra level of indirection in your code, i.e. not just table_1[42], but table_1.entries[42].
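
To make the substream idea concrete, here is a minimal sketch in plain Python (standard library only, not code generated by the compiler). The function names and the sample bytes are hypothetical and exist only for illustration. It slices exactly table_1_size bytes into an independent stream and then reads entries until that stream is exhausted, which is what repeat: eos does inside the container type.

```python
import io
import struct


def stream_size(stream):
    """Return the total size of a seekable stream without moving its position."""
    pos = stream.tell()
    size = stream.seek(0, io.SEEK_END)
    stream.seek(pos)
    return size


def parse_table_1_entry(stream):
    """One entry: a little-endian u4 length followed by that many bytes."""
    (length,) = struct.unpack("<I", stream.read(4))
    return stream.read(length)


def parse_table_1(substream):
    """Container type: read entries until the substream ends (repeat: eos)."""
    size = stream_size(substream)
    entries = []
    while substream.tell() < size:
        entries.append(parse_table_1_entry(substream))
    return {"entries": entries}


def parse_file(data):
    stream = io.BytesIO(data)
    (table_1_size,) = struct.unpack("<I", stream.read(4))
    # Cut exactly table_1_size bytes into a separate, fixed-size substream.
    substream = io.BytesIO(stream.read(table_1_size))
    table_1 = parse_table_1(substream)
    garbage = stream.read()  # size-eos: everything left in the outer stream
    return {"table_1_size": table_1_size, "table_1": table_1, "garbage": garbage}


# Hypothetical sample: two entries ("ab", "xyz") followed by two trailing bytes.
entries_blob = struct.pack("<I", 2) + b"ab" + struct.pack("<I", 3) + b"xyz"
sample = struct.pack("<I", len(entries_blob)) + entries_blob + b"\xff\xff"
parsed = parse_file(sample)
```

Note the extra level of indirection in the result: the second entry is parsed["table_1"]["entries"][1], not parsed["table_1"][1].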

Will that work for you?

Author

ghost commented Aug 23, 2016

That works, thank you! Could you also tell me whether there is a way to do repeat: pos? I have something like this:

list_entry:
  seq:
    - id: next_offset
      type: u4
    - id: table
      type: table
      size: next_offset - _io.pos

table:
  seq:
    - id: table_entry
      type: table_entry
      repeat: eos

table_entry:
  seq:
    - id: data
      size: 0x800

Here I need to retrieve the current position in the stream to calculate the size. I tried _io.pos and _io.pos(), but neither works.
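
For illustration, the intended computation can be sketched in plain Python (standard library only). This assumes that next_offset is an absolute offset within the same stream that list_entry is read from, which the description above does not state explicitly. The function name and the sample data are hypothetical.

```python
import io
import struct

ENTRY_SIZE = 0x800  # fixed size of one table_entry, as in the description above


def parse_list_entry(stream):
    (next_offset,) = struct.unpack("<I", stream.read(4))
    # Size of the table = absolute end offset minus the current position.
    table_size = next_offset - stream.tell()
    table = io.BytesIO(stream.read(table_size))
    entries = []
    while True:
        chunk = table.read(ENTRY_SIZE)
        if not chunk:  # end of the table substream reached
            break
        entries.append(chunk)
    return next_offset, entries


# Hypothetical sample: two full entries, then unrelated trailing bytes.
payload = b"\x01" * ENTRY_SIZE + b"\x02" * ENTRY_SIZE
data = struct.pack("<I", 4 + len(payload)) + payload + b"tail"
next_offset, entries = parse_list_entry(io.BytesIO(data))
```

This sketch does not validate the input: if the table size is not a multiple of 0x800, the last chunk is simply shorter than the others.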

Member

GreyCat commented Aug 23, 2016

Personally, I've never encountered a format where it would be needed. Like, in your example above, _io.pos is always 4 (unless there is no substream for list_entry).
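
A small Python sketch (standard library only, with a hypothetical 10-byte header in front of the entry) shows why the position reads as 4 inside a substream: a substream starts counting from zero, so after the u4 field its position is 4 regardless of where the entry sits in the parent stream.

```python
import io
import struct

# Hypothetical layout: 10 header bytes, then a list_entry body.
header = b"\x00" * 10
list_entry_body = struct.pack("<I", 0x20) + b"\xaa" * 8
parent = io.BytesIO(header + list_entry_body)

# Case 1: list_entry is parsed from its own substream.
parent.seek(len(header))
sub = io.BytesIO(parent.read(len(list_entry_body)))
(next_offset,) = struct.unpack("<I", sub.read(4))
pos_in_substream = sub.tell()  # relative to the start of the substream

# Case 2: list_entry is parsed directly from the parent stream.
parent.seek(len(header))
parent.read(4)
pos_in_parent = parent.tell()  # absolute position in the parent stream
```

Here pos_in_substream is 4 while pos_in_parent is 14, so an expression such as next_offset - pos only yields a meaningful size when both values refer to the same stream.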

But, yeah, given that we have _io.size, we should probably implement _io.pos and _io.eof for the sake of completeness. Could you add an issue for that?

Author

ghost commented Aug 23, 2016

Before I create it, could you tell me what you mean by _io.eof? What should it return: true/false, or the position of the end of the file?

Member

GreyCat commented Aug 23, 2016

_io.eof should just call Kaitai Stream's eof method, returning true/false. _io.size is a method that returns the full size of the stream, i.e. the position of its end.
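
The relationship between the three values can be sketched with a tiny wrapper in Python. ByteStream is a hypothetical class written for this illustration, not the Kaitai Stream runtime API: pos is the current offset, size is the total length, and eof is true once pos has reached size.

```python
import io


class ByteStream:
    """Hypothetical minimal stream wrapper exposing pos, size and eof."""

    def __init__(self, data):
        self._io = io.BytesIO(data)
        self._size = len(data)

    def read(self, n):
        return self._io.read(n)

    @property
    def pos(self):
        return self._io.tell()

    @property
    def size(self):
        return self._size

    @property
    def eof(self):
        # Boolean: true once the read position has reached the end.
        return self.pos >= self.size


s = ByteStream(b"\x01\x02\x03")
before = (s.pos, s.size, s.eof)
s.read(3)
after = (s.pos, s.size, s.eof)
```

In this sketch size never changes while reading, and at the end of the stream pos equals size.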

@LogicAndTrick
Collaborator

If we're adding _io.eof as a boolean expression: Does the expression syntax have good support for booleans at the moment? I may be wrong (haven't tested) but it seems like the only boolean operation you can do is the ternary expression, i.e. test ? result_if_true : result_if_false.

Might be a good idea to add support/tests for true/false literals and boolean operations such as:

# Example: I have a format that has an optional info block at the end of the file
seq:
  - id: info
    type: optional_info_type
    if: !_io.eof

# Example: I'm doing something crazy and need this for a work-around or whatever
seq:
  - id: never
    type: u1
    if: false

Member

GreyCat commented Aug 24, 2016

Our "expression language" was originally a rip-off from a Python grammar, so I believe there's fairly good support for boolean expressions - but they are all named in English (i.e. not, and, or, etc). We have tests like 1 < 2, 1.0 < 2, a != 2 and a != 5. Probably a good idea to add boolean literals tests, though :)
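
Since the operators follow Python's English spelling, the optional trailing block from the example above can be sketched directly in Python (standard library only). The 4-byte body and the function name are hypothetical; the point is the not-at-end-of-stream condition guarding the optional field.

```python
import io


def parse(data):
    stream = io.BytesIO(data)
    body = stream.read(4)  # hypothetical fixed-size body
    at_eof = stream.tell() >= len(data)
    # Read the optional info block only if the stream is not at its end.
    info = stream.read() if not at_eof else None
    return body, info


with_info = parse(b"BODYinfo")
without_info = parse(b"BODY")
```

The condition is spelled not at_eof, using the word operator rather than a ! prefix.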

@ghost ghost mentioned this issue Aug 24, 2016
Member

GreyCat commented Sep 5, 2016

Given that all questions seem to be answered, closing this one.

@GreyCat GreyCat closed this as completed Sep 5, 2016
DL4PD pushed a commit to DL4PD/kaitai_struct that referenced this issue Mar 13, 2019
Added a test to map value instances to an enum.
krisutofu pushed a commit to krisutofu/kaitai_struct that referenced this issue Jan 2, 2022