[RFC] Update .mrb file format #944

masuidrive · 2013-03-04T15:46:10Z

This is a proposal for a new .mrb file format.
The current .mrb file is in hex format, which makes it too large.

A new .mrb file can have sections. For now it has only an irep section; I'll add a debug section later.
It is designed with #880 in mind. The irep section has an 'endianness' field, so big or little endian can be chosen per irep section.

I'm working on https://github.com/masuidrive/mruby/tree/binary

matz · 2013-03-04T16:39:51Z

I agree with making the .mrb file compact.

I still think the .mrb format should be endian neutral. There's no reason for a "file format" to be endian sensitive, unless you REALLY want to read it through mmap. All we need is an in-memory counterpart of the .mrb file, which should consume less memory.

FYI, I have a vague plan to remove the irep array from mrb_state; instead, I'd add a small irep array to each irep.

masuidrive · 2013-03-04T17:30:29Z

The .mrb loader supports both byte orders,
so a little-endian CPU can load a big-endian .mrb file.
The 'endianness' flag is there so that the ISEQ can be referenced directly in a .mrb file stored in ROM.

What do you think about a ROM .mrb?
Do I need to create a new section in the .mrb file for it?

mattn · 2013-03-05T00:18:24Z

Why did you store the compiler name/version in the irep section? I'd prefer it to be in an upper layer.

mattn · 2013-03-05T00:19:03Z

The same goes for the bytecode version and the endianness.

matz · 2013-03-05T02:11:58Z

In my opinion, the ROM-stored data format should be separate from the .mrb file.

mrb should be endian neutral; ROM is not.
mrb should be error tolerant (e.g. CRC); ROM need not be.
mrb should be read/written via I/O; ROM is not.

Designing something that serves both purposes has no merit.
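
As an illustration of the "error tolerant (e.g. CRC)" point, here is a minimal sketch of a checksum a loader could verify before trusting a file. The thread does not say which CRC is meant, so CRC-16/CCITT-FALSE is assumed here purely as an example, and `example_crc16` is a hypothetical name, not an mruby function.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * no bit reflection, no final XOR.
 * Assumption: the choice of this CRC variant is for illustration only. */
uint16_t example_crc16(const uint8_t *data, size_t len)
{
  uint16_t crc = 0xFFFF;
  for (size_t i = 0; i < len; i++) {
    /* Fold the next byte into the high half of the register. */
    crc ^= (uint16_t)((uint16_t)data[i] << 8);
    for (int bit = 0; bit < 8; bit++) {
      if (crc & 0x8000) {
        crc = (uint16_t)((crc << 1) ^ 0x1021);
      } else {
        crc = (uint16_t)(crc << 1);
      }
    }
  }
  return crc;
}
```

A loader would compute this over the file body and compare it with the value stored in the file; a mismatch means the data was corrupted in storage or transfer.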

skandhas · 2013-03-05T02:15:14Z

I agree with what @mattn said. The compiler name/version, bytecode version and endianness are redundant in the irep section. That information can be stored in the .mrb file header.

miura1729 · 2013-03-05T03:56:44Z

I think members whose size is 16 or 32 bits should not be placed at odd addresses. Also, for embedded systems I think the .mrb file needs a reserved area for extending the format. So I propose adding a reserved member after the endianness field, or at the top of IREP record 'B'.

monaka · 2013-03-05T04:22:30Z

I prefer @matz's opinion.
I believe there is a need to ROMize.
But there are so many types of ROM-CPU connection on real targets that we, the mruby core team, can't follow all of them.
To add to @matz's examples: alignment and endianness are important for parallel ROMs, but they are not always important for serial ROMs such as SPI-connected ones.
(BTW, I think a CRC check is required even for a programmable ROM. ...Back on topic.)

The solution depends on why a binary format is required.
We can use compression if the reason is compact size.
(This may bring another benefit: we could pack media content together with the bytecode.)

And I think we should provide a pluggable dump/load framework if the reason is ROM.

monaka · 2013-03-05T04:55:47Z

I basically agree with what @mattn said.
It's useful to store {compiler name | version | bytecode ver. | endian} in the section header instead of the section binary. (And strictly speaking, I don't support adding endianness to the portable .mrb file format.)

monaka · 2013-03-05T05:08:51Z

How about adding a file-format-type field to the RITE file header?
If it exists, the loader framework (assuming that framework exists too) can dispatch to each reader subsystem.

Just a first plan:
file-format-type is uint32_t.
0 means the traditional mrb format by @matz.
1 means the new(?) mrb ASCII hex format by @masuidrive and others.
2 means the new(?) mrb binary format, similar to 1.
3 to 255 are reserved for future use.
256 to UINT32_MAX are free for applications to use at their own responsibility.
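
The value ranges in this first plan could be sketched as a small classifier that a loader framework consults before picking a reader. All names below (`example_format_kind`, `example_classify_format`) are made up for illustration; only the numeric ranges come from the plan above.

```c
#include <stdint.h>

/* Hypothetical categories for the proposed file-format-type field. */
enum example_format_kind {
  EXAMPLE_FORMAT_TRADITIONAL,  /* 0: traditional mrb format */
  EXAMPLE_FORMAT_ASCII_HEX,    /* 1: new mrb ASCII hex format */
  EXAMPLE_FORMAT_BINARY,       /* 2: new mrb binary format */
  EXAMPLE_FORMAT_RESERVED,     /* 3..255: reserved for future use */
  EXAMPLE_FORMAT_APPLICATION   /* 256..UINT32_MAX: application defined */
};

/* Map a raw file-format-type value to one of the categories above. */
enum example_format_kind example_classify_format(uint32_t type)
{
  switch (type) {
  case 0:
    return EXAMPLE_FORMAT_TRADITIONAL;
  case 1:
    return EXAMPLE_FORMAT_ASCII_HEX;
  case 2:
    return EXAMPLE_FORMAT_BINARY;
  default:
    return (type <= 255) ? EXAMPLE_FORMAT_RESERVED
                         : EXAMPLE_FORMAT_APPLICATION;
  }
}
```

A loader would reject `EXAMPLE_FORMAT_RESERVED` values and hand `EXAMPLE_FORMAT_APPLICATION` values to whatever reader the application registered.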

masuidrive · 2013-03-05T10:50:12Z

@monaka
I don't think you need file-format-type.
You can have an application-specific section in this file.

monaka · 2013-03-05T11:06:43Z

OK. Let's go back to the root of this issue, then.

Could you tell us again why we need the new format?
For compaction? For archiving IREPs? For machine readability?

There is more than one possible benefit, so the discussion would be easier if it were focused.

masuidrive · 2013-03-05T11:08:24Z

I moved the compiler name/version to the file header.

But the bytecode version is still in the irep section.
A file can contain several irep sections with different bytecode versions.

I agree to remove the endianness field.
I now see .mrb and ROM as separate.

masuidrive · 2013-03-05T11:15:25Z

I think the point of the new format is extensibility.
The new format can contain more data than just IREPs.

After that I'll work on including debug information in the file.

masuidrive · 2013-03-05T11:21:52Z

Updated image https://creative.adobe.com/file/4b4b9a69-db93-4120-9cdf-38964b307042

monaka · 2013-03-05T11:25:23Z

I think this work is well worth developing incrementally.

We should focus on IREP archiving for now if that is regarded as the top priority.

If so, is it acceptable for the new file format to be based on "ASCII hex format" and "endian neutral"? (for now)

matsumotory · 2013-03-05T11:55:28Z

Do you think an IREP record should have an IREP record header that includes nlocals, nregs, npools, nsyms and so on? I prefer one IREP record header per IREP record. If we have an IREP record header structure, the IREP record section becomes simple to use, because the raw data can be cast to the IREP record structure or the header structure.

masuidrive · 2013-03-05T12:34:54Z

@monaka
I agree with "endian neutral", but I don't understand the ASCII hex format part. Why do you want to use ASCII?

monaka · 2013-03-05T12:56:10Z

There are 2 + 1 reasons why I suggest ASCII.

1: Working step by step. The current version of mrb is ASCII based.

2: Easy to debug. This helps until the format is stable; e.g. we can't paste binary here.

3: Target loading in embedded systems. ASCII-based formats are still in active use in the embedded area. Typical examples are Intel hex and Motorola S-record.

Actually 3 is not important. Embedded developers will probably choose another format anyway.
My concerns are 1 and 2.

masuidrive · 2013-03-06T01:58:04Z

The binary generator code is simpler than the hex generator. And I want to remove the hex generation code from the current code; it's messy.
You can paste a hexdump of the binary, or upload the binary to a gist. Either way, we need to write a format verification tool for debugging, because bin/hex data is hard for humans to read.

monaka · 2013-03-06T05:29:49Z

If you say so, I have no reason to recommend it strongly.

mattn · 2013-03-06T05:40:09Z

@masuidrive that link is 404

monaka · 2013-03-06T06:10:44Z

Going back to the attached figure.

An IREP section table is probably required, even though we can determine the offset of the next section using the section size.
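
To make "determine the offset of the next section using the section size" concrete, here is a rough sketch of a sequential section walk. The layout is an assumption for illustration: each section starts with a 4-byte identifier followed by a 4-byte big-endian size that counts the whole section including this header. The identifiers and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header: 4-byte identifier + 4-byte big-endian total size. */
#define EXAMPLE_SECTION_HEADER_SIZE 8

/* Assemble a 32-bit big-endian value byte by byte. */
static uint32_t example_read_u32be(const uint8_t *p)
{
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Walk the sections in buf and return the offset of the first one whose
 * identifier matches ident, or -1 if it is absent or the data is malformed. */
long example_find_section(const uint8_t *buf, size_t len, const char ident[4])
{
  size_t off = 0;
  while (len - off >= EXAMPLE_SECTION_HEADER_SIZE) {
    uint32_t size = example_read_u32be(buf + off + 4);
    /* A section must at least hold its header and must fit in the buffer. */
    if (size < EXAMPLE_SECTION_HEADER_SIZE || size > len - off) {
      return -1;
    }
    if (memcmp(buf + off, ident, 4) == 0) {
      return (long)off;
    }
    off += size;  /* the section size gives the next section's offset */
  }
  return -1;
}
```

The walk is linear in the number of sections, which is the cost a section table would avoid: with a table, a loader can jump to a section without touching the ones before it.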

And it would be better to be alignment conscious. This makes analysis with binary editors easy.
You'll fall into tool-making hell if you play down alignment. ;)

I think 4-byte alignment fits this format. 2-byte alignment is also possible.
So the magic of the IREP record should be expanded to 2/4 bytes, or 1/3 bytes should be marked as reserved.
"pool size" and "sym size" should also be considered for alignment.
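
The reserved-byte counts above follow from rounding an offset up to the alignment. A small sketch of that arithmetic, with hypothetical helper names and assuming the alignment is a power of two:

```c
#include <stddef.h>

/* Round offset up to the next multiple of align.
 * Assumption: align is a power of two (2 or 4 in this discussion). */
size_t example_align_up(size_t offset, size_t align)
{
  return (offset + (align - 1)) & ~(align - 1);
}

/* Number of reserved/padding bytes needed after offset to reach alignment. */
size_t example_padding(size_t offset, size_t align)
{
  return example_align_up(offset, align) - offset;
}
```

For a 1-byte magic, `example_padding(1, 2)` gives 1 reserved byte and `example_padding(1, 4)` gives 3, matching the "1/3 bytes" mentioned above.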

matz · 2013-03-06T08:15:50Z

I propose having two separate irep representations: one for the mrb file format (new mrb), the other for an in-memory packed representation (packed irep).

The new mrb format should be:

endian neutral
binary (should be more compact than the current mrb)
with a CRC checksum

The packed irep should be:

representable as a C array (a la mrbc -B)
ROM-able
endian aware, so that an irep can refer to the iseq section in the packed irep

monaka · 2013-03-06T08:56:00Z

I have no right to stop someone's creation.
But I think we can't decide on a spec for ROM-able data. This is not for lack of skill, of course, but because of the diversity of embedded targets.
So my counterproposal is that we concentrate on the new (not packed) mrb format.

If I understand correctly, the non-packed version of the new mrb format can be converted to a C array and linked in the same way as the current mrb format. Is this right?
If so, it is a progressive enough feature even if the target is a small embedded system.

masuidrive · 2013-03-07T12:20:37Z

I updated the new .mrb format.

All uint* are big endian.
It is binary.
It has a CRC in the file header.
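
Storing every uint* as big endian keeps the file endian neutral as long as readers and writers assemble values byte by byte instead of copying host integers. A minimal sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Store a 32-bit value as four big-endian bytes, regardless of host order. */
void example_write_u32be(uint8_t *out, uint32_t value)
{
  out[0] = (uint8_t)(value >> 24);
  out[1] = (uint8_t)(value >> 16);
  out[2] = (uint8_t)(value >> 8);
  out[3] = (uint8_t)value;
}

/* Rebuild a 32-bit value from four big-endian bytes. */
uint32_t example_read_u32be(const uint8_t *in)
{
  return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
         ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

Because only shifts and byte accesses are used, the same code gives the same file bytes on little-endian and big-endian CPUs, and it has no alignment requirement on the buffer.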

beoran · 2013-03-09T18:40:03Z

That image is too messy, so I made a diagram in Dia:

Download the Dia file here (use "save as" functionality of your browser):
http://www.beoran.net/eruta/uploads/diagrams/new_mrb_format.dia

mattn · 2013-03-11T00:21:09Z

@beoran thank you. it's cool.

masuidrive mentioned this issue Mar 7, 2013

New .mrb format. #964

Merged

kyab mentioned this issue Jul 3, 2013

ROM-able ISEQ revisited #1338

Closed

takahashim pushed a commit to takahashim/mruby that referenced this issue Nov 3, 2013

New mrb format. The detail is in mruby#944

2842583
