Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"enum value does not fit" for uint64 enums #5161

Closed
iceboy233 opened this issue Feb 7, 2019 · 18 comments
Closed

"enum value does not fit" for uint64 enums #5161

iceboy233 opened this issue Feb 7, 2019 · 18 comments

Comments

@iceboy233
Copy link
Contributor

Example:

enum Foo : uint64 { a }

Result:

error: a.fbs:1: 23: error: enum value does not fit [0; -1]

Looks like this overflowed:

int64_t min_value = static_cast<int64_t>( \

Introduced in 5c0f914.

@iceboy233
Copy link
Contributor Author

At the same time we should also allow empty enums.

@vglavnyy
Copy link
Contributor

flatbuffers::numeric_limits<CTYPE>::max()) returns 0xFFFFFFFF interpreted as -1.
The 0 is greater than -1.

This issue has a simple solution, but there is a hidden problem.
Internally, the underlying type of any enums is int64_t (see struct EnumVal).
Therefore it is impossible to use enum values large then max(long) even if the fbs-enum has ulong underlying type.
static_cast<int64_t> from uint64_t is implementation-defined (platform dependent) since C++03:

If the destination type is signed, the value is unchanged if it can be represented in the destination type > (and bit-field width); otherwise, the value is implementation-defined.

@aardappel
Copy link
Collaborator

Hah, that's unfortunate.. looks like we need a special case there?

@iceboy233
Copy link
Contributor Author

I think we can split the current EnumVal into two parts:

struct EnumValBase {
  std::string name;
  std::vector<std::string> doc_comment;
  Type union_type;
};

template <typename T>
struct EnumVal {
  EnumValBase base;
  T value;
};

Not sure on the reflection side. I feel we can use union-like type there.

@aardappel
Copy link
Collaborator

I'm not sure how that helps, since the type of the enum is only known at runtime (in the parser).

I think the changes can be entirely in code. We can obtain the desired data with reinterpret_cast<uint64_t> or a union.

@iceboy233
Copy link
Contributor Author

In this way, every code that needs to access EnumVal<T>::value needs to know T at compile time (by template param). We still need a runtime dispatch at some level, but I think that might already partially exist, like the current CTYPE.

@aardappel
Copy link
Collaborator

@iceb0y I don't think this can work.. Now EnumDef needs to be templated as well (since it contains EnumVals), and now we can't have a type for the list of EnumDefs in Parser since each is different. You'll end up dispatching and casting all over the place. Better to just cast the actual value storage.

@vglavnyy
Copy link
Contributor

@iceb0y do you already have a solution?

I think it is acceptable to use simple static_cast<int64_t>(u64_val) to store a new enum value to the vector without type punning.

Implicit signed to unsigned cast is modulo 2n cast by the C++ standard.
But unsigned to signed cast is implementation-defined by the C++ standard if a u64 value doesn't fit int64.
De facto the unsigned to signed cast is modulo 2n too.
I hope, the C++20 standard will make this cast well-defined:
https://en.cppreference.com/w/cpp/language/implicit_conversion
Proposal:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r3.html

@iceboy233
Copy link
Contributor Author

Looking at code I agree that it's better to dispatch at value level than higher level. We can continue to use int64_t to store the value, while casting from T to int64_t, we can branch on value > max(int64_t) to use defined behavior. For longer term we can introduce a type utility that makes this easier.

@vglavnyy
Copy link
Contributor

vglavnyy commented Mar 9, 2019

@aardappel
I have got a draft for this issue.
The solution will be ready in one or two weeks.
Probably #5194 may depend on this issue.

@aardappel
Copy link
Collaborator

Thanks @vglavnyy !

vglavnyy added a commit to vglavnyy/flatbuffers that referenced this issue Mar 17, 2019
vglavnyy added a commit to vglavnyy/flatbuffers that referenced this issue Mar 17, 2019
@vglavnyy
Copy link
Contributor

@aardappel what do you think about unsigned bit_flags?
Now, an enum with bit_flag attribute can have any integer underlying type.
Problem:
If the underlying type is int32 then bit_31 is less than bit_30 by the value.

In C++ it is possible to set "sign" bit using static_cast from an unsigned value or using an explicit negative value. For other languages, explicit negation may be the only solution.

In C++ there are no bit_flags like 1 << bit_pos. One should specify each flag explicitly:

enum test_i8 : int8_t {
 b0 = 1 << 0,
 b7_err = 1<< 7,     // this left-shift is UB
 b7_ok = -128 // well-defined
}

C# has [Flags] attribute, but also doesn't allow (1<<7):

[Flags]
enum test_i8 : sbyte
{
  b0 = 1 << 0,
  b7_err = 1 << 7, // compile time error, 128 is out of range
  b7_ok = -128      // valid
}

Can we limit the signed bif_flags by [1 << (sizeof(T)*8-2)] value to avoid complexity and ambiguity?
Grammar restriction: bit_flags end value shall be non-negative.

@vglavnyy
Copy link
Contributor

vglavnyy commented Mar 21, 2019

Any enum values can be combined by OR:

flatbuffers/tests/test.cpp

Lines 1391 to 1407 in f93d0f6

void EnumStringsTest() {
flatbuffers::Parser parser1;
TEST_EQ(parser1.Parse("enum E:byte { A, B, C } table T { F:[E]; }"
"root_type T;"
"{ F:[ A, B, \"C\", \"A B C\" ] }"),
true);
flatbuffers::Parser parser2;
TEST_EQ(parser2.Parse("enum E:byte { A, B, C } table T { F:[int]; }"
"root_type T;"
"{ F:[ \"E.C\", \"E.A E.B E.C\" ] }"),
true);
}
void EnumNamesTest() {
TEST_EQ_STR("Red", EnumNameColor(Color_Red));
TEST_EQ_STR("Green", EnumNameColor(Color_Green));
TEST_EQ_STR("Blue", EnumNameColor(Color_Blue));

Grammar:

For enums representing flags, you may place multiple inside a string
separated by spaces to OR them, e.g.
field: "EnumVal1 EnumVal2" or field: "Enum.EnumVal1 Enum.EnumVal2".

What does "represent the flags" mean in the grammar?
Probably, this is the wrong expected values in the test.
Is this test correct for enums without bit_flags attribute?

@aardappel
Copy link
Collaborator

I am hoping programmers can make informed choices about what integer type works for them.. if you really are actually using all 32-bits I hope you're smart enough to choose unsigned :)
I am not sure if its a problem worth preventing.. though I guess we can error that the value doesn't fit if more than 31 bits are used.

"representing flags" means "are intended as flags, either because they use bit_flags or because they manually use powers of two values"
As for enum strings, it doesn't make sense for non-flags, but its hard to avoid. Do you think this is a problem?

vglavnyy added a commit to vglavnyy/flatbuffers that referenced this issue Mar 24, 2019
vglavnyy added a commit to vglavnyy/flatbuffers that referenced this issue Mar 26, 2019
- fix enum with uint64
- hide the implementation of enums from code generators
@vglavnyy
Copy link
Contributor

I have implemented it without restrictions.
But integers with OR-initialization using enums looks very strange.

enum E1: int8 {
 v = -128, // stored as int64=-128=0xFFFFFFFFFFFFFF80
 w = 127 // stored as int64=-128=0x7F
}

enum E2: uint8 {
 v = 127 // stored as int64=0x7F
}

table X{
 x :int32 = E1.v E2.v; // signed OR unsigned => 0xFFFFFF80 | 0x7F = -1.
 y : E1 = v w; // two signed => (-128 | 127) = -1
}

Perhaps it is better to add the character | as a bitwise-or operator?

@aardappel
Copy link
Collaborator

You should also be able to put "" around the initializer.

I agree the syntax is strange, and an explicit | would have been better. We can add it, as long as we retain backwards compatibility with the current syntax.

@vglavnyy
Copy link
Contributor

vglavnyy commented May 8, 2019

Close?

@iceboy233
Copy link
Contributor Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants