Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-37148: [C++] Explicitly list the integer values of the Type::type enum #37149

Merged
merged 2 commits into from
Aug 16, 2023

Conversation

felipecrv
Copy link
Contributor

@felipecrv felipecrv commented Aug 14, 2023

Rationale for this change

The enum type in C++ leaves the integer values implicitly defined and let the compiler assign the values to them automatically. This means an insertion of a new entry that is not at the end, causes implementations that rely on specific values (like R) to break with confusing error messages [1].

Assigning the values explicitly can communicate that these enum entry values are relied upon and can allow a more natural ordering of the list that is different from the numeric order the entries receive.

[1] #37091

What changes are included in this PR?

  • Setting numeric values of the enum entries explicitly
  • Completing an equivalent enum on the R side

Are these changes tested?

N/A

Are there any user-facing changes?

No.

As these are relied upon by code in R and Go implementations.
@felipecrv felipecrv changed the title GH-37148 [C++ GH-37148 [C++]: Explicitly list the integer values of the Type::type enum Aug 14, 2023
@github-actions github-actions bot added the awaiting review Awaiting review label Aug 14, 2023
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Aug 14, 2023
@pitrou
Copy link
Member

pitrou commented Aug 14, 2023

Would should we do about MAX_ID then?

@felipecrv
Copy link
Contributor Author

Would should we do about MAX_ID then?

Leave it unset, so the compiler sets it to the maximum.

@pitrou pitrou requested a review from bkietz August 14, 2023 15:43
@@ -79,6 +79,8 @@ Type <- enum("Type::type",
LARGE_STRING = 34L,
LARGE_BINARY = 35L,
LARGE_LIST = 36L
INTERVAL_MONTH_DAY_NANO = 37L,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this! Just missing a comma on the line above this one, and then hopefully tests should pass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I fixed it and pushed again.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting merge Awaiting merge labels Aug 14, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Aug 14, 2023
@kou kou changed the title GH-37148 [C++]: Explicitly list the integer values of the Type::type enum GH-37148: [C++] Explicitly list the integer values of the Type::type enum Aug 14, 2023
@kou kou changed the title GH-37148: [C++] Explicitly list the integer values of the Type::type enum GH-37148: [C++] Explicitly list the integer values of the Type::type enum Aug 14, 2023
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Aug 16, 2023
@thisisnic
Copy link
Member

CI failures look to be a timeout and something with the dev script - I've re-triggered them both, but I think we can probably merge this once we're happy that's all they are?

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit values for a large enumeration seems more readable to me, thanks!

@felipecrv
Copy link
Contributor Author

@bkietz can you merge this?

@bkietz bkietz merged commit 8bdfc8c into apache:main Aug 16, 2023
38 of 39 checks passed
@bkietz bkietz removed the awaiting merge Awaiting merge label Aug 16, 2023
@felipecrv felipecrv deleted the explicit_type_enum branch August 17, 2023 13:58
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 8bdfc8c.

There were 9 benchmark results indicating a performance regression:

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…:type enum (apache#37149)

### Rationale for this change

The enum type in C++ leaves the integer values implicitly defined and let the compiler assign the values to them automatically. This means an insertion of a new entry that is not at the end, causes implementations that rely on specific values (like R) to break with confusing error messages [1].

Assigning the values explicitly can communicate that these enum entry values are relied upon and can allow a more natural ordering of the list that is different from the numeric order the entries receive.

[1] apache#37091

### What changes are included in this PR?

 - Setting numeric values of the enum entries explicitly
 - Completing an equivalent enum on the R side 
 
### Are these changes tested?

N/A

### Are there any user-facing changes?

No.
* Closes: apache#37148

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] Implementation (like R) rely on implicit numeric assignment of enum Type entries
5 participants