AVRO-3984 [C++] Improve code generated for unions #3047
…ead of a value to avoid calling copy constructor of large classes
…e reference. This allows the user to modify values in union branches after creation (apache#3047)
…re efficient way to set a value (apache#3047)
…s for unions to avoid a copy (apache#3047)
…anch names to the corresponding index. This allows the user to avoid checks against "magic numbers" (apache#3047)
}
os_ << " };\n";

os_ << " size_t idx() const { return idx_; }\n";
Do you think it is worth including a branch method which returns static_cast<Branch>(idx_)? In code I have that uses Avro I always end up doing this anyway to switch over the enum for branches.
Good point, I added the method. Thanks for the quick feedback!
I would! It means I can get rid of the enums that are manually maintained alongside the Avro file today. I think the release note is sufficient documentation though; anyone new to using Avro will already end up looking at the generated header and see it.
…h enum directly, this avoids a manual static_cast (apache#3047)
I don't see anything alarming from the code. But I never used avro in projects, so my vote can be considered as a half of the vote :)
What is the purpose of the change
AVRO-3984
This pull request should improve the code generated by avrogencpp for union types. The improvements are:
Getters return a reference
Previously the getters for a union branch returned a value. For a map branch, this led to a complete copy of the map, which can be very expensive depending on the use case.
In this pull request the getters return a reference instead, which lets the user choose whether a copy should be made.
Additionally, a getter is generated that returns a mutable reference. This can be useful if a value needs to be added to a map or array.
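The difference between the old and new getter styles can be sketched as follows. This is a minimal illustration, not the code avrogencpp emits: the class MapUnionSketch, its method names (get_map_by_value, get_map, set_map) and the helper demo_getters are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a generated union of null and map<string, int>.
// Class and method names are illustrative, not the actual avrogencpp output.
class MapUnionSketch {
public:
    using Map = std::map<std::string, int>;

    std::size_t idx() const { return idx_; }

    // Old style: returns by value, so every call copies the whole map.
    Map get_map_by_value() const { require_map(); return map_; }

    // New style: const reference, the caller decides whether to copy.
    const Map &get_map() const { require_map(); return map_; }

    // New style: mutable reference for in-place modification.
    Map &get_map() { require_map(); return map_; }

    void set_map(const Map &v) { idx_ = 1; map_ = v; }

private:
    void require_map() const {
        if (idx_ != 1) throw std::logic_error("union does not hold a map");
    }
    std::size_t idx_ = 0; // 0 = null, 1 = map
    Map map_;
};

// Exercises the three getters and returns the size of the stored map.
inline std::size_t demo_getters() {
    MapUnionSketch u;
    u.set_map({{"a", 1}});
    u.get_map()["b"] = 2; // edits the stored map in place

    const MapUnionSketch &cu = u;
    const MapUnionSketch::Map &view = cu.get_map();   // no copy
    MapUnionSketch::Map copy = cu.get_map_by_value(); // full copy
    copy["c"] = 3;                                    // does not touch the union

    assert(copy.size() == 3);
    return view.size();
}
```

With a reference-returning getter, a copy only happens when the caller asks for one, for example by assigning the result to a non-reference variable.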
Setters can take an r-value reference
Currently the setter of a union takes a const reference to the value that should be set. This forces the user to copy the value here.
With this pull request an alternative overload is provided that takes an r-value reference. Copies can now be avoided, since the cheaper move assignment is called.
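A rough sketch of the two setter overloads, assuming a union with an array branch. The names ArrayUnionSketch, set_array, get_array and demo_setters are hypothetical and only illustrate the overload pair, not the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a generated union of null and array<string>.
class ArrayUnionSketch {
public:
    using Array = std::vector<std::string>;

    // Existing overload: copies the argument into the union.
    void set_array(const Array &v) { idx_ = 1; array_ = v; }

    // Additional overload: moves from the argument, avoiding the copy.
    void set_array(Array &&v) { idx_ = 1; array_ = std::move(v); }

    const Array &get_array() const { return array_; }
    std::size_t idx() const { return idx_; }

private:
    std::size_t idx_ = 0; // 0 = null, 1 = array
    Array array_;
};

// Calls both overloads and reports whether each union received all elements.
inline bool demo_setters() {
    ArrayUnionSketch::Array big(1000, "payload");

    ArrayUnionSketch a;
    a.set_array(big); // lvalue: const& overload, big is copied and stays intact
    assert(big.size() == 1000);

    ArrayUnionSketch b;
    b.set_array(std::move(big)); // rvalue: && overload, contents are moved
    // big is now in a valid but unspecified state and should not be read.

    return a.get_array().size() == 1000 && b.get_array().size() == 1000;
}
```

The caller opts into the move explicitly with std::move, so existing code that passes lvalues keeps its copy semantics.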
Additional Branch enum for each union type
Currently the idx() method is available to check which branch of a union is set. The issue is that the user needs to know which size_t value corresponds to which branch. This might lead to issues if values are inserted in the middle of a union and the indices change.
To avoid such issues, an enum is generated for each union that maps the branch types to their indices, for example for a union of null, map and float.
The user can then cast the value returned by idx() to this enum and write a switch statement over the branches.
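For a union of null, map and float, the enum and a switch over it might look roughly like the sketch below. The enumerator names (null, map, float_), the underlying type and the struct Union3Sketch are assumptions made for illustration; the generated header is the authoritative source for the real names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch of a generated union of null, map and float.
// Only the index bookkeeping is modelled; the branch values are omitted.
struct Union3Sketch {
    // Assumed enumerator names; "float_" avoids the C++ keyword "float".
    enum class Branch : std::size_t { null = 0, map = 1, float_ = 2 };

    std::size_t idx_ = 0;
    std::size_t idx() const { return idx_; }
};

// Switches over named branches instead of comparing idx() to magic numbers.
inline std::string describe(const Union3Sketch &u) {
    switch (static_cast<Union3Sketch::Branch>(u.idx())) {
        case Union3Sketch::Branch::null:   return "null";
        case Union3Sketch::Branch::map:    return "map";
        case Union3Sketch::Branch::float_: return "float";
    }
    return "unknown"; // reached only if idx_ holds an out-of-range value
}

// Small self-check of the mapping between indices and branch names.
inline void demo_branch_switch() {
    Union3Sketch u;
    assert(describe(u) == "null");
    u.idx_ = 2;
    assert(describe(u) == "float");
}
```

If a branch is later inserted in the middle of the union, code written against the enumerators keeps referring to the intended branch, whereas code comparing against literal indices would silently pick the wrong one.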
Verifying this change
I added unit tests for the newly generated functions and the enum class. The existing tests still pass.
Documentation