
HIVE-29474: DESCRIBE FORMATTED for columns produces wrong output when… #6333

Open
tanishq-chugh wants to merge 6 commits into apache:master from tanishq-chugh:desc_for_testing

@tanishq-chugh tanishq-chugh commented Feb 23, 2026

… the column datatype is STRUCT

What changes were proposed in this pull request?

Fix DESCRIBE FORMATTED for STRUCT datatype columns output

Why are the changes needed?

Currently, when a table has a STRUCT column, running DESCRIBE FORMATTED on that column produces wrong output.
For example:
CREATE TABLE tbl_t (id int, point STRUCT<x:INT, y:INT>);
DESCRIBE FORMATTED tbl_t point;

gives the following output:
[image]

Here, col_name and data_type are wrong; they should be point and struct<x:int,y:int>, respectively.

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

Manual Testing & Qtest


@soumyakanti3578 soumyakanti3578 left a comment


HiveMetaStoreUtils#getFieldsFromDeserializer should be agnostic of where it was called from, so adding a boolean variable in its args is not ideal.

Also, I don't like having to add false at every existing call site, since those changes are unrelated, just so we can pass true in a couple of places.

It seems we are running into this issue because the tableName in method getFieldsFromDeserializer is default.[tbl].point. This is then split into names:

    String[] names = tableName.split("\\.");
    String last_name = names[names.length - 1];
    for (int i = 2; i < names.length; i++) {

and since the length of names is 3, we go into the for loop to iterate through the fields of point.

You could probably just pass the colName instead of desc.getColumnPath() in:

Hive.getFieldsFromDeserializer(desc.getColumnPath(), deserializer, context.getConf()));

in DescTableOperation.java. This will force it to skip over the for loop as there will just be 1 item in names.

Probably you won't need to change anything anywhere else. Please try this and let's see if all the tests pass.
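
To make the split behaviour concrete, here is a minimal standalone Java sketch (not the Hive code itself). The class name ColumnPathSplitDemo and the helper componentsVisitedByLoop are hypothetical, and tbl_t stands in for the table name; only the split("\\.") call and the loop bounds mirror the snippet quoted above.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnPathSplitDemo {

    // Hypothetical helper: returns the path components that a loop
    // starting at index 2 (as in the quoted snippet) would visit.
    static List<String> componentsVisitedByLoop(String path) {
        String[] names = path.split("\\.");
        List<String> visited = new ArrayList<>();
        for (int i = 2; i < names.length; i++) {
            visited.add(names[i]);
        }
        return visited;
    }

    public static void main(String[] args) {
        // Fully qualified column path: three components, so the loop runs once.
        System.out.println(componentsVisitedByLoop("default.tbl_t.point")); // prints [point]

        // Bare column name: one component, so the loop body never runs.
        System.out.println(componentsVisitedByLoop("point")); // prints []
    }
}
```

With the qualified path, the loop treats point as a nested field to descend into; with the bare column name there is nothing at index 2 or beyond, so the loop is skipped.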


@soumyakanti3578 soumyakanti3578 left a comment


Please look into the SonarQube issues too, as I think they are valid.

DESCRIBE FORMATTED tbl_part Point;

DESCRIBE tbl_part id;
DESCRIBE tbl_part Point;
No newline at end of file
Contributor


I think you're missing a new line here.

Contributor Author


Addressed SonarQube issues in commit d05c394

Added new line in commit ce5bd65


sonarqubecloud bot commented Mar 4, 2026


@soumyakanti3578 soumyakanti3578 left a comment


LGTM. I will merge this soon.
