-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Flatbuffer Schema Support #70
Conversation
- tested compilation is successful
There are no well-known types.
Note that the "default defaults" are zero, which is typically reasonable. I'd suggest explicit defaults for enums (because sometimes there is no 0 value for the enum; it's not required, but it helps make things clearer).
File structure is generally whatever. Namespaces should definitely be consistent.
It doesn't strictly matter if they are in separate files or not. My typical philosophy is to not put them into separate files unless they are logically separate or there is a particular need to (e.g., enums are shared across many different types). Since it looks like you are doing this based on codegen, I think your current approach is fine.
I think that in order to maintain consistency with the schemas in your other languages, you want to leave it as-is. The fbs compiler will sometimes mess with the names for certain codegen, which is why it watches for this (e.g., for Java, a |
ROS 2 does not support naming a field |
Are the panels that consume these fields on the Studio side case-insensitive? If so, then yeah just make everything lowercase. |
Some notable exceptions:
|
Also re: defaults: note that, like protobufs, flatbuffers do have a concept of a field not being populated. It is also the case that the serialization for flatbuffers by default is configured to not serialize fields if the field is set to its default (so |
Is there some compile step we can run in CI to validate these generated files? For example, we do this for protobuf: schemas/.github/workflows/ci.yml Lines 39 to 40 in b36014a
|
Protobuf has special names because the types are nested inside message types:
The enum which is called This was done partly because the way protobuf c++ code is generated, the enum names pollute their parent namespace, rather than being namespaced under the enum type itself. So multiple top-level enums with I don't know if flatbuffers would have the same problem or not? |
It depends on where/how the types are used. In C++, |
- final tweaks to generate script to support nested bytes - update time and duration to be more consistent with protobuf types
Is it fine to just add
So as it stands right now all of the enums except for the pre-determined (ByteVectorForNesting, Time, Duration) ones all live in the foxglove namespace.
It's a little unclear to me what this uses to determine scoping. Is it just if it's in the file with a |
- lowercase field names for compliance with flatc warnings
looked into this and can't find a github action for it like there is for protobuf unfortunately. Is there somewhere else I should look or do you know how easy it would be to write one? I just don't know that much about making github actions |
In C++ it generates a Scoped Enum I.e., with
Then the generated C++ code will be namespace foo {
enum class Abc : int8_t {
A = 0,
B = 1,
MIN = A,
MAX = B
};
} Without namespace foo {
enum Abc : int8_t {
Abc_A = 0,
Abc_B = 1,
Abc_MIN = A,
Abc_MAX = B
};
} They also seem to have a flag to strip those prefixes off, but I doubt most people use it. Other languages I think tend to have some sort of scoping to their enums by default, so this is less of an issue outside of C++. |
ah I see, thanks for clarifying! Wasn't sure if it was related to the namespacing of enums |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does someone go about merging flatbuffer schemas and their includes into one blob, in order to put them in an mcap/websocket schema? Is it acceptable that each file has root_type
in it? I would guess not, and we might want to remove root_type
from our generated fbs files, but maybe there is some convenient way of merging them (like protoc's --descriptor_set_out
)?
scripts/updateGeneratedFiles.ts
Outdated
path.join(outDir, "flatbuffer/foxglove", "ByteVectorForNesting.fbs"), | ||
BYTE_VECTOR_FB, | ||
); | ||
await fs.writeFile(path.join(outDir, "flatbuffer/foxglove", "Time.fbs"), TIME_FB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it normal to use the foxglove
package dir for flatbuffers?
Should this be named flatbuffer
or flatbuffers
? Given that the project website starts with "FlatBuffers is an efficient cross platform serialization library..." I'm guessing probably flatbuffers
is the more accepted name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The github repo is https://github.com/google/flatbuffers and the C++ includes are in a flatbuffers
folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no real prescribed folder structure for flatbuffers as far as I can tell. I thought I'd follow the protobuf pattern of making it follow the namespace, but that's probably not necessary. Removed. Good callout on flatbuffer->flatbuffers though
internal/generateFlatbuffer.ts
Outdated
// Same as protobuf wellknown types | ||
export const TIME_FB = `struct Time { | ||
sec:long; | ||
nsec:int; | ||
} | ||
`; | ||
|
||
export const DURATION_FB = `struct Duration { | ||
sec:long; | ||
nsec:int; | ||
} | ||
`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use the int64
and int32
aliases for clarity?
Maybe worth copying/adapting this from protobuf docs to give more explanation on how negative durations work?
On the other hand, an option would be to simply use a single uint64 (time) or int64 (duration). That would give a wider range of representable values too, although it probably doesn't matter because the range is so large already.
Note that ROS does it differently:
ROS 2:
int32 sec, uint32 nsec: https://github.com/ros2/rcl_interfaces/blob/master/builtin_interfaces/msg/Time.msg
int32 sec, uint32 nsec: https://github.com/ros2/rcl_interfaces/blob/master/builtin_interfaces/msg/Duration.msg
C++ API: Time (int32_t seconds, uint32_t nanoseconds
https://docs.ros2.org/beta3/api/rclcpp/classrclcpp_1_1Time.html
C++ API: Duration (int32_t seconds, uint32_t nanoseconds)
https://docs.ros2.org/dashing/api/rclcpp/classrclcpp_1_1Duration.html
C API: uint64 time, int64 duration, both storing nanoseconds: https://docs.ros2.org/beta1/api/rcl/time_8h.html#a5957493a6bde7c3878f947398f048a24
ROS 1:
uint32 sec, uint32 nsec http://docs.ros.org/en/latest/api/rostime/html/classros_1_1TimeBase.html
int32 sec, int32 nsec http://docs.ros.org/en/latest/api/rostime/html/classros_1_1DurationBase.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good call, added the comment in line with the protobuf.
What kind of consideration should be made to the ROS times schemas in this context? Is this difference going to cause issues at read time?
internal/generateFlatbuffer.ts
Outdated
// fields that would benefit from having a default of 1 | ||
const defaultOneNumberFields = new Set(["x", "y", "z", "r", "g", "b", "a", "w"]); | ||
function isDefaultOneField(field: FoxgloveMessageField): boolean { | ||
return ( | ||
field.type.type === "primitive" && | ||
field.type.name === "float64" && | ||
defaultOneNumberFields.has(field.name) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't belong here because it's not really specific to flatbuffers. If we want to add defaults to some fields, we need to add the default values to schemas.ts
and just read them here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the schema under defaultValue?: string
under the MessageSchemaField
type if that seems appropriate
internal/generateFlatbuffer.ts
Outdated
}); | ||
|
||
// `///` comments required to show up in compiled flatbuffer schemas | ||
definition = `/// ${schema.description}\nenum ${schema.name} : byte {\n ${fields.join( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use ubyte for enums? Might want to add validation that the values all fit within the range of the type we choose. I assume the compiler would validate this, so if we can get the compiler to be tested in CI, then that would be sufficient.
Looks like So it would be possible to simply download and run them in the CI step. Take a look at the "before" side here for an example of downloading & running binaries as part of GitHub CI: https://github.com/foxglove/mcap/pull/604/files (If you go this route, might also want to add a SHA check for security as was suggested in foxglove/studio#4745 (review)) |
Yes, at least I would say they probably shouldn't be floating around in the top level namespace. I don't know how name/import conflicts work in flatbuffers, but if someone is mixing their own custom schema with our schemas, we wouldn't want our Time/Duration to conflict with one they might already have. Maybe @jameskuszmaul-brt can give more insight on whether that would actually be a problem.
Not sure I fully understand the different options here – what would it look like to use the enums in the different scenarios you describe? |
Having root types for all the messages is fine (and probably what you want). Honestly not entirely sure why flatbuffers has the concept, since I'm pretty sure all it does is generate some extra code for that type, when it would be fine to just always generate the extra code.
I'd prefer they be in the |
From what I can tell the
after some testing, adding the parent proto types to the namespace doesn't work because the schemas referencing those enums don't have access to know what the parent type is 😅. So the options are just have the enums under the |
- remove `foxglove` subdir - use more readable type aliases - comment Time, Duration types
|
.github/workflows/ci.yml
Outdated
curl -LO https://github.com/google/flatbuffers/releases/download/v22.10.26/Linux.flatc.binary.clang++-12.zip | ||
echo "0821af82a3a736b0ba9235c02219df24d1f042dd Linux.flatc.binary.clang++-12.zip" | shasum -a 1 -c | ||
unzip Linux.flatc.binary.clang++-12.zip | ||
./flatc --ts -o /dev/null schemas/flatbuffers/*.fbs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there were warnings/errors would it actually affect the exit code of this flatc invocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there are errors, it returns an exit code of 1, so I believe it should exit out. On warnings, however, it still returns 0. I can add some logic to this to fail out when there's a warning though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think if that's not too hard, it would be nice to catch warnings as well. (Assuming the warnings are actually useful and not just noisy/pedantic)
Just updated to have the enums included in their parent table files again. I think this makes a bit more sense going forward and prevents empty files being generated for the enums that were alone in their schema files. On another note, I've been working on setting up an example file to test in studio and realized (as far as I know) that the schema has to be in binary format to be attached to the mcap schema. Do we want to also hold those binary schemas (.bfbs files) in this repo, since those are what will need to be passed to the mcap schemas when they are registered? Not sure what the ideals would be for usage of these schemas |
Correct, the MCAP definition presumes the binary schemas. Whatever you do, it seems like it should be able to be consistent with however you do it for protobufs. |
scripts/updateGeneratedFiles.ts
Outdated
generateFlatbuffers(schema), | ||
// want enums with their corresponding parent tables for usage | ||
const enums = Object.values(foxgloveEnumSchemas).filter( | ||
(enumSchema) => enumSchema.protobufParentMessageName === schema.name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's rename these protobuf
things now that they are not exclusive to protobuf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this wasn't addressed yet
Do the binary schemas include all the dependencies so they are fully self-contained? How does one create a merged binary schema from the individual .fbs files? With Protobuf we originally recommended people use |
Nice. Does this mean we do not support an enum being used in two different message schemas? For a moment I thought we already had that with NumericType, but I realized it is nested under the PackedElementField message and that gets used from both Grid and PointCloud. If it's not supported then maybe we should add an assertion in the unit tests to check that an enum only gets referenced by its parent message type and no others. |
They are fully self-contained.
You don't merge them. If you, e.g., wanted to include to disparate types in the same binary you'd have to wrap the normal reflection object into a vector or something else custom to represent the list. Since MCAP only needs the individual schemas for each channel, you don't need to merge things.
There isn't a great equivalent (that I'm aware of) for flatbuffers (for some reason flatbuffers has two different reflection mechanisms, one of which produces the nice easy-to-store object we use here; the other which is more useful at runtime with generated code). |
I guess my question was, what processes the I guess storing the .bfbs in the repo could make sense. We never stored the protobuf binary FileDescriptorSets here because it seemed normal for people to copy/paste .proto files into their project and integrate with their own build system. And also because there's no standard file extension for a binary proto schema. But since the .bfbs extension is standard maybe it would make sense here. cc @defunctzombie @amacneil for thoughts. |
Yeah, |
Currently Studio would only support the former, we don't support pulling out visualizable types from a nested field (see foxglove/community#203) |
My thoughts immediately go to a miss-match between the |
The only thing there is I don't think there's really a way to generate these flatbuffer schema files in Typescript at runtime at least not in the same way that the bag2mcap example. On demand beforehand with Aside from that if they're writing or reading these from these schemas, they'll also need the generated readers/writers for their specific language if they're doing something other than parsing to/from JSON. I was going to say I don't want to push the responsibility of keeping the |
Ok. Let's start without the bfbs files. We can always revisit this decision once we have more information from users. |
Turns out that, for C++ (and C++ alone, currently) the |
This should be ready for a final pass now. |
internal/generateFlatbufferSchema.ts
Outdated
namespace foxglove; | ||
|
||
struct Time { | ||
/// Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z | |
/// Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z |
internal/generateFlatbufferSchema.ts
Outdated
sec:uint64; | ||
/// Nano-second fractions from 0 to 999,999,999 inclusive | ||
nsec:uint32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want/need uint64+uint32 – the other things I've seen are either a single uint64, or two (u)int32s
internal/types.ts
Outdated
@@ -30,6 +30,7 @@ export type FoxgloveMessageField = { | |||
required?: true; | |||
description: string; | |||
protobufFieldNumber?: number; | |||
defaultValue?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the default value for numeric fields probably shouldn't be a string, it should be the actual type of the field (number
in the cases you've added, although I suppose we could have defaults for string fields). Then it can be up to each generator to know how to translate a number into valid syntax in its own language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I honestly am not sure what would be the right way to type that aligned with the I just realized we also need this for possible boolean translations 🤦 I'll just make this type
type, without putting it inside of that, but maybe that makes more sense since we might not want defaults of nested schemas?number | string | boolean
scripts/updateGeneratedFiles.ts
Outdated
generateFlatbuffers(schema), | ||
// want enums with their corresponding parent tables for usage | ||
const enums = Object.values(foxgloveEnumSchemas).filter( | ||
(enumSchema) => enumSchema.protobufParentMessageName === schema.name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this wasn't addressed yet
.fbs
.To test compilation: install
cmake
andflatbuffer
(via homebrew or other means) then run this command from the repo root directory:flatc --ts -o ./schemas/flatbuffer/output ./schemas/flatbuffer/foxglove/**.fbs
and see that it generates the schema files in the
schemas/flatbuffer/output
directory.Few Open Questions about this implementation:
should I use the protobuf enums names?just to be sure there aren’t any wellknown types for time and duration like protobuf/ros?currently using something similar to the way we handle typescript generationDo we want defaults for primitive types?should we copy protobuf file structure and namespaces (foxglove.etc)?