Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

union - C++ bug when initializing elements with values greater than 255 #4209

Closed
slavslavov opened this issue Mar 8, 2017 · 13 comments
Closed

Comments

@slavslavov
Copy link

Having the following IDL, in the code try to compare

YourHeaderPointer->Command_type() with Commands_Command1

Command_type() returns 232 (which is the 8 bit representation of 1000)
Commands_Command1 is 1000

IDL:
union Commands {Command1 = 1000, Command2 = 1001}
table Header {
Command: Commands;
}

root_type Header;

table Command1{
Data1:uint;
Data2:uint;
}

table Command2{
Data4:ulong;
Data5:ulong;
}

@slavslavov
Copy link
Author

The same behavior is observed if you add more than 254 union elements without initializing them. The bug will be observed for element 255 and above.

@FrankStain
Copy link
Contributor

FrankStain commented Mar 8, 2017

In fact, uint8_t is used currently to store the type of union value.

return static_cast<Any>(GetField<uint8_t>(VT_TEST_TYPE, 0));

@slavslavov , can you confirm that no warning/error was produced by 'flatc' while parsing your schema?

@aardappel
Copy link
Collaborator

Yes, union types are always represented by a byte. The schema compiler should give you an error when using values like that.. this needs to be fixed.

@slavslavov
Copy link
Author

slavslavov commented Mar 9, 2017

When I execute the following, the header_generated.h file is created with no errors and warnings.
flatc -c --gen-object-api header.fbs

I've created an IDL file with 260 union elements (see attached). flatc generates 1meg file, 32k lines but again with no errors and warnings.
header_large.txt

forgot to mention the version I am using - 1.6.0

@aardappel
Copy link
Collaborator

I will make it generate an error. For now you'll have to manually ensure your values are inside the [0..255] range

@zejal
Copy link
Contributor

zejal commented Mar 19, 2017

Quick question: is the uint8_t type for storing a union type a "hard" requirement please ? For example is changing it to uint16_t (and rebuild) safe ? It is likely I'd have cases where I'd have unions with more than 256 types (esp as member of a root type). Thanks.

@aardappel
Copy link
Collaborator

It is fairly hard, since all implementations in all languages make this assumption. So if you changed it to a uint16_t, you'd create an incompatible FlatBuffers fork for your own use.

We could support this in the future by it being specified per union, but it is a non-trivial change as all languages and parsers etc will have to support it.

@zejal
Copy link
Contributor

zejal commented Mar 20, 2017

Thanks, would a table containing 1 member per type be an alternative (only 1 is set) ? Is there also a 256 limit for members in a table ? Thanks a lot

@aardappel
Copy link
Collaborator

Tables can have around 16K fields, but a table with hundreds of fields and only one set will be very inefficient, so no, not recommended.

@zejal
Copy link
Contributor

zejal commented Mar 21, 2017

Thanks a lot for replying. In what aspect will it be inefficient ? Performance or space (or ...) ? If I understood correctly the encoding: spacewise hundreds of fields with only 1 set should only make the vtable large, and given the vtable only contain uint16_t IIRC of 2 bytes each even with 1000 fields we have roughly 2K, not a big deal (and there is only 1 in the whole message). Data itself is only 1 offset (assuming all types are tables). Performancewise I could store the id of the field that's set (a "which" field). After it should be only a matter to lookup the vtable directly to that id (and check it is not 0 of course). This is a bit an adhoc use of encoding I must admit.
Thanks a lot.

@aardappel
Copy link
Collaborator

The inefficiency is mostly the larger wire size, caused by big vtables for all those unused fields. If having 2K of unused data just to store a single value is "no big deal", then I guess so :)

You could also consider a hierarchical union, i.e. a set of sub-unions all <256 elements, with one root union to decide between them.

@zejal
Copy link
Contributor

zejal commented Mar 23, 2017

IMO, 2K of unused data is not a big deal if the total size of FB buffer is in order of magnitude of 10's of megabytes or even 100's or gigabyte. If smaller yes I agree it's more of a problem.

@aardappel
Copy link
Collaborator

union values out of range is now an error: 9d01bfa

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants