Consider addition RTTI to words #103

fomkin · 2018-06-04T12:41:08Z

Overview

The need for RTTI (runtime type information) has now become obvious. It is required for .NET transaction support and for debugging purposes such as stack and heap printing and transaction introspection. It will also remove the need for type-bound arithmetic instructions. I see two approaches to the implementation.

1. Use BSON or MessagePack. Pros: they are well-known standards with stable third-party implementations. Cons: these are serialization formats, both don't support references, and they are hard to mutate.
2. Design a more domain-specific format. Pros: we can exploit the advantages of our VM without compromises. Cons: we need to write a lot of code, bugs will follow, and NIH is a shame.

I'm inclined toward the second approach.

Proposal

This is a pseudo-BNF specification of the word format.

length := 0b00<6 bits of data>
       | 0b01<6 bits length>
       | 0b10<14 bits length>
       | 0b11<22 bits length>

bytes := length byte[&length]

int8    := 0x01
int16   := 0x02
int32   := 0x03
bigint  := 0x04
uint8   := 0x05
uint16  := 0x06
uint32  := 0x07
double  := 0x08
boolean := 0x09
ref     := 0x0A
utf8    := 0x0B
array   := 0x0C
struct  := 0x0D

primitive_type := int8
               | int16
               | int32
               | bigint
               | uint8
               | uint16
               | uint256
               | double
               | boolean
               | ref

type := primitive_type
      | struct
      | array
      | utf8

int8_data    := bytes~1
int16_data   := bytes~2
int32_data   := bytes~4
bigint_data  := length byte[&length]
uint8_data   := bytes~1
uint16_data  := bytes~2
uint32_data  := bytes~4
double_data  := bytes~8 # strict IEEE-754 floating point number
ref_data     := bytes~4
boolean_data := bytes~1

primitive_data := int8 int8_data
                | int16 int16_data
                | int32 int32_data
                | bigint bigint_data
                | uint8 uint8_data
                | uint16 uint16_data
                | uint32 uint32_data
                | double double_data
                | ref ref_data
                | boolean boolean_data

data := primitive_data
      | array primitive_type length data(primitive_type)[&length]
      | struct bytes length (bytes, primitive_data)[&length] # struct_name, [(field, field_value)]
      | utf8 bytes

smth[num] means that the smth structure is repeated num times; for example, byte[8] means 8 bytes.
bytes~num means a bytes value (whose length is otherwise dynamic) that is expected to hold exactly num bytes.

&length refers to the given length field and denotes the integer value of that field.
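
As a rough illustration of how a length field could be turned into the integer that &length denotes, here is a minimal Python sketch. It assumes a big-endian layout in which the top two bits of the first byte select the form, and it covers only the 0b01, 0b10 and 0b11 forms; the 0b00 form is left out. The names encode_length and decode_length are hypothetical and not part of the proposal.

```python
from typing import Tuple

# (tag, total bytes, payload bits); byte order and layout are assumptions.
_FORMS = [(0b01, 1, 6), (0b10, 2, 14), (0b11, 3, 22)]


def encode_length(n: int) -> bytes:
    """Encode n using the smallest form whose payload can hold it."""
    if n < 0:
        raise ValueError("length must be non-negative")
    for tag, size, bits in _FORMS:
        if n < (1 << bits):
            return ((tag << bits) | n).to_bytes(size, "big")
    raise ValueError("length does not fit in 22 bits")


def decode_length(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (length, next position) for the length field at buf[pos]."""
    tag = buf[pos] >> 6
    for t, size, bits in _FORMS:
        if t == tag:
            raw = int.from_bytes(buf[pos:pos + size], "big")
            return raw & ((1 << bits) - 1), pos + size
    raise ValueError("the 0b00 form is not handled in this sketch")
```

Under these assumptions a length of 5 fits in one byte, while 64 already needs the two-byte form.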

(a, b) denotes a pair type, i.e. a value of a and a value of b written consecutively.

data(primitive_type) means the corresponding *_data structure for primitive_type.
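
The tag-to-*_data dispatch can be pictured with a small Python sketch. The tag values are copied from the listing; everything else is an assumption made for illustration: fixed-width payloads are written big-endian with no length prefix, only a subset of the integer types is covered, and the helper names encode_primitive and decode_primitive are hypothetical.

```python
import struct

# name -> (type tag from the listing, assumed struct format for the payload)
PRIMITIVES = {
    "int8": (0x01, ">b"),
    "int16": (0x02, ">h"),
    "int32": (0x03, ">i"),
    "uint8": (0x05, ">B"),
    "uint16": (0x06, ">H"),
    "uint32": (0x07, ">I"),
    "ref": (0x0A, ">I"),
}
TAG_TO_NAME = {tag: name for name, (tag, _) in PRIMITIVES.items()}


def encode_primitive(name: str, value: int) -> bytes:
    """Write the type tag followed by the fixed-width payload."""
    tag, fmt = PRIMITIVES[name]
    return bytes([tag]) + struct.pack(fmt, value)


def decode_primitive(buf: bytes, pos: int = 0):
    """Read the tag, pick the matching payload format, return (name, value, next position)."""
    name = TAG_TO_NAME[buf[pos]]
    fmt = PRIMITIVES[name][1]
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    return name, value, pos + 1 + struct.calcsize(fmt)
```

Because the tag travels with the value, a reader can recover the type without any external schema, which is the point of adding RTTI to words.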

pankratov · 2018-06-04T13:33:02Z

both don't support references

MessagePack supports extension types, so you can define domain-specific types with it. However, its spec is not versioned (see msgpack/msgpack#165 and msgpack/msgpack#195).

fomkin · 2018-06-04T13:57:50Z

@pankratov MessagePack extension types look good. But the more I look at it, the less I like this format. The lack of a format version is a big concern. Also, I think we are trying to use tools (BSON/MP) that look suitable but actually are not.

vovapolu · 2018-06-04T14:04:11Z

I don't think it would be hard to implement our own protocol. It seems quite simple, and it's an essential part of our VM.

pankratov · 2018-06-04T18:35:36Z

struct length (utf8, primitive_data)[&length]

It seems like you meant data here, not primitive_data, i.e. struct length (utf8_data, data)[&length]. However, maybe I'm wrong. Either way, data looks good here, IMHO.

@fomkin Do you consider binary data (a hash, for example) to be an array of uint8?

pankratov · 2018-06-04T19:09:31Z

I think tuples (i.e. tuple length (type data(type))[&length]) might also be useful.

pankratov · 2018-06-04T19:17:15Z

And here

array primitive_type length data(primitive_type)[&length]

Why primitive? How can I define an array of arrays, for example? Should I use refs?

pankratov · 2018-06-04T19:41:39Z

Domain-specific types (address or code, for example) could be defined as separate types.

sherzodv · 2018-06-05T06:21:03Z

What about the name of the struct itself? I mean, how are we going to distinguish different struct types that have the same structure?

Another concern here is an array of variable-length data, e.g. array utf8 .... We will be able neither to give its length nor to have constant-time indexed access to its values.

If the intent is that only refs are used for variable-length data in arrays, I think we need to state it in the spec.

This is complementary to Vasilii's question.
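
To make the indexing concern concrete, here is a small Python sketch comparing the two cases. It assumes big-endian 4-byte refs (the ref_data size in the proposal) and, purely for brevity, a one-byte length prefix in front of each variable-length item instead of the full length encoding. The function names are hypothetical.

```python
import struct

REF_SIZE = 4  # ref_data is bytes~4 in the proposal


def ref_at(payload: bytes, index: int) -> int:
    """Constant-time access: element i starts at offset i * REF_SIZE."""
    (value,) = struct.unpack_from(">I", payload, index * REF_SIZE)
    return value


def utf8_at(payload: bytes, index: int) -> bytes:
    """Linear-time access: every earlier length-prefixed item must be skipped."""
    pos = 0
    for _ in range(index):
        pos += 1 + payload[pos]  # one-byte length prefix is an assumption
    return payload[pos + 1:pos + 1 + payload[pos]]
```

With fixed-width elements the offset of any element is a single multiplication; with inline variable-length items the reader has to walk the array from the start.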

The ref_data size looks very modest to me. Although this will reduce the size of a program significantly, we will have very little room for future changes.

pankratov · 2018-06-05T07:25:49Z

I mean, how are we going to distinguish different struct types that have the same structure?

@sherzodv are we going to distinguish them? I thought it's something like a dictionary or map.

sherzodv · 2018-06-05T07:35:49Z

@sherzodv are we going to distinguish them? I thought it's something like a dictionary or map.

External mapping is OK too if we can guarantee that our data never moves. Otherwise, things like dynamic cast will be in trouble.

fomkin · 2018-06-05T08:24:01Z

Why primitive? How can I define an array of arrays, for example? Should I use refs?

@pankratov Because I want to mutate without copying. Yes, you need to use refs.

What about the name of the struct itself? I mean, how are we going to distinguish different struct types that have the same structure?

@sherzodv +1 great idea.

Another concern here is an array of variable-length data, e.g. array utf8 .... We will be able neither to give its length nor to have constant-time indexed access to its values.

@sherzodv thanks, I'll fix it.

The ref_data size looks very modest to me. Although this will reduce the size of a program significantly, we will have very little room for future changes.

@sherzodv OK, you're right. Let it be 32 bits.

fomkin · 2018-06-07T10:54:49Z

Changed decimal to double (strict IEEE-754 floating point number)
Changed int64 and uint64 to int256 and uint256

vovapolu · 2018-06-19T11:32:21Z

We need to somehow encode the type signature of a word. It's needed in the meta method signature and will definitely be needed in future meta information. So I suggest introducing a new signature structure:

signature := primitive_type
           | struct bytes length (bytes, primitive_type)[&length] 
           | array primitive_type
           | utf8

In struct, the first bytes is the name of the structure and the second bytes is the name of the corresponding field.
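
A minimal Python sketch of how such a signature could be serialized is given below. The tags come from the original listing; the helper names are hypothetical, and the sketch assumes that every name and field count is shorter than 64 so that a single-byte 0b01 length form is enough. It does not check that array elements or struct fields are actually primitive types.

```python
# Type tags from the original listing (only the ones used here).
TAGS = {"int8": 0x01, "int16": 0x02, "int32": 0x03, "bigint": 0x04,
        "uint8": 0x05, "uint16": 0x06, "boolean": 0x09, "ref": 0x0A,
        "utf8": 0x0B, "array": 0x0C, "struct": 0x0D}


def short_length(n: int) -> bytes:
    """Single-byte length: 0b01 followed by 6 bits (assumed layout)."""
    if not 0 <= n < 64:
        raise ValueError("this sketch only handles lengths below 64")
    return bytes([0b01000000 | n])


def short_bytes(raw: bytes) -> bytes:
    """bytes := length byte[&length], restricted to short payloads."""
    return short_length(len(raw)) + raw


def array_sig(element: str) -> bytes:
    """array primitive_type"""
    return bytes([TAGS["array"], TAGS[element]])


def struct_sig(name: str, fields: list) -> bytes:
    """struct bytes length (bytes, primitive_type)[&length]"""
    out = bytes([TAGS["struct"]]) + short_bytes(name.encode("utf-8"))
    out += short_length(len(fields))
    for field_name, field_type in fields:
        out += short_bytes(field_name.encode("utf-8")) + bytes([TAGS[field_type]])
    return out
```

For example, struct_sig("P", [("x", "int32")]) yields the struct tag, the name "P", a field count of one, the field name "x" and the int32 tag, in that order.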

Probably we're also going to add something for class representation (e.g. for methods in struct). What do you think?

fomkin · 2018-06-19T11:44:39Z

@vovapolu looks good.

fomkin mentioned this issue Jun 4, 2018

Typed words in dotnet #99

Closed

vovapolu mentioned this issue Jun 15, 2018

Program method signature #121

Closed

vovapolu added a commit that referenced this issue Jun 29, 2018

Fix dotnet translation tests, add _ in vm-asm tests #103

c8c63df

This was referenced Jul 2, 2018

Tests for asm #42

Closed

Proper address of a stored program #69

Closed

Standard library #134

Closed
