Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-28046: Using serdeConstants fields instead of string literals in hive-exec module #5072

Closed
wants to merge 8 commits into from

Conversation

mlorek
Copy link
Contributor

@mlorek mlorek commented Feb 7, 2024

Using serdeConstants fields instead of string literals in hive-exec module

Why are the changes needed?
to improve the use of existing constants

Does this PR introduce any user-facing change?
No

Is the change a dependency upgrade?
No

How was this patch tested?
using existing tests

@mlorek mlorek marked this pull request as ready for review February 7, 2024 11:16
Copy link

sonarcloud bot commented Feb 7, 2024

Quality Gate Passed Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

4 New issues
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

@mlorek
Copy link
Contributor Author

mlorek commented Feb 7, 2024

@abstractdog can you take a look?

@mlorek
Copy link
Contributor Author

mlorek commented Feb 9, 2024

cc @ayushtkn

@mlorek
Copy link
Contributor Author

mlorek commented Feb 21, 2024

@zabetak amy idea who can review this PR?

Copy link
Contributor

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance it looks reasonable. I will take a better look tomorrow.

One thing that I am not sure if it's beneficial is using the serdeConstants in tests. I assume that we don't want to change the values of such constants cause we risk breaking compatibility. By hardcoding the string values in tests we can guard against such changes.

@abstractdog
Copy link
Contributor

At first glance it looks reasonable. I will take a better look tomorrow.

One thing that I am not sure if it's beneficial is using the serdeConstants in tests. I assume that we don't want to change the values of such constants cause we risk breaking compatibility. By hardcoding the string values in tests we can guard against such changes.

looks good to me

I don't think we're about to change these constants, and I cannot think of a dev scenario where we risk compatibility due to the fact that we're using the constants in tests

Copy link
Contributor

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch is a good starting point but it seems somewhat incomplete.

I attempted a replace based on the following semi-automatic procedure and found more places where serdeConstants should be used.

  1. Generate a sed script file using serdeConstants.java for the possible replacements
grep "public static final java.lang.String" ./serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java | sed 's/.\+java\.lang\.String//'| sed 's/ //g' | sed 's/\(.*\)=\(.*\);/s#\2#serdeConstants.\1#/' | sed 's/\./\\./g'  > sedrepscript
  1. Apply the script file to all java files in production code
find ql/src/java -name "*.java" -exec sed -i -f sedrepscript {} \;
  1. Manual review of change files adding import when necessary and reverting irrelevant changes.

All the changes can be found here:
6a338bb

Please review those changes and update this PR accordingly.

The replacement in test files is a controversial topic so I would suggest to exclude those changes from the patchset and target exclusively production files.

@zabetak
Copy link
Contributor

zabetak commented Feb 22, 2024

I don't think we're about to change these constants, and I cannot think of a dev scenario where we risk compatibility due to the fact that we're using the constants in tests

If a developer changes serdeConstants#INT_TYPE_NAME from "int" to "integer" then if the tests are using the constant they will all pass without issues hiding the fact that the type name change and this may be a breaking change for end-users. Doing a holistic replace in tests would require very careful review without a significant gain.

Also note that when we attempted a replace literals with constants in ql/src/test I got 84 files changed, 745 insertions(+), 745 deletions(-) which is much more than what is currently is covered by this PR.

Copy link

sonarcloud bot commented Mar 22, 2024

Quality Gate Passed Quality Gate passed

Issues
11 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

@mlorek
Copy link
Contributor Author

mlorek commented Mar 27, 2024

@zabetak could you have another look?

Copy link
Contributor

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will merge soon!

@zabetak zabetak closed this in a537000 Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants