Query with map field got wrong result. #1321

Closed
wants to merge 5 commits into from

Conversation

JackyWoo
Member

@JackyWoo JackyWoo commented Aug 19, 2019

At my company we use Hive 1.x and Presto 315. We have a table defined in Hive as follows:

CREATE TABLE T.A(
ip string,
event_info map<string,string>)
PARTITIONED BY (
dayno string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
WITH SERDEPROPERTIES (
'colelction.delim'=',',
'field.delim'='\t',
'mapkey.delim'=':',
'serialization.format'='\t')
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
'hdfs://xxx'

When I send this query to Presto:

select cardinality(event_info), event_info from T.A limit 1

I get a wrong result:

1 {fk_1=1,fk_2:2,fk_3:3}

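I eventually found the cause: for the collection delimiter of a map field, Hive 1.x uses the misspelled config key "colelction.delim", while Hive 3.x uses "collection.delim".

PrestoSQL 315 depends on Hive 3.x, so it only reads the "collection.delim" key. The table above only defines "colelction.delim", so the ',' delimiter is never applied and the whole value is read as a single map entry.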
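
To illustrate why the query reports a cardinality of 1, here is a minimal Java sketch. It is not the Presto or Hive code: the class and method names are hypothetical, and it assumes that a reader which finds no collection delimiter falls back to '\u0002' and that each entry is split on the first map-key delimiter.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Presto or Hive implementation.
class MapFieldParseSketch {
    // Assumption: delimiter used when no collection delimiter is configured.
    static final char DEFAULT_COLLECTION_DELIM = '\u0002';

    // Splits raw text into entries on collectionDelim, then splits each
    // entry into key and value on the first occurrence of mapKeyDelim.
    static Map<String, String> parseMap(String raw, char collectionDelim, char mapKeyDelim) {
        Map<String, String> result = new LinkedHashMap<>();
        if (raw.isEmpty()) {
            return result;
        }
        String entrySeparator = Pattern.quote(String.valueOf(collectionDelim));
        for (String entry : raw.split(entrySeparator)) {
            int split = entry.indexOf(mapKeyDelim);
            if (split < 0) {
                // No key/value delimiter: keep the text as a key without a value.
                result.put(entry, null);
            } else {
                result.put(entry.substring(0, split), entry.substring(split + 1));
            }
        }
        return result;
    }
}
```

Parsing "fk_1:1,fk_2:2,fk_3:3" with ',' as the collection delimiter gives three entries. Parsing it with the assumed default gives a single entry with key fk_1 and value 1,fk_2:2,fk_3:3, which has the same shape as the wrong result shown above.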

Resolution: when looking up the delimiter, read "collection.delim" first; if it is null, fall back to "colelction.delim".

This way PrestoSQL 315 can correctly query tables defined by both Hive 1.x and Hive 3.x.
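
The fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged change: the class name, method name, and the '\u0002' default are hypothetical, and the sketch simply takes the first character of the configured value.

```java
import java.util.Map;

// Hypothetical sketch of the fallback lookup; not the actual Presto code.
class CollectionDelimiterLookup {
    static final String COLLECTION_DELIM = "collection.delim";
    // Misspelled key written by older Hive versions.
    static final String LEGACY_COLLECTION_DELIM = "colelction.delim";
    // Assumption: delimiter used when neither key is present.
    static final char DEFAULT_DELIM = '\u0002';

    static char collectionDelimiter(Map<String, String> serdeProperties) {
        // Prefer the correctly spelled key.
        String value = serdeProperties.get(COLLECTION_DELIM);
        if (value == null) {
            // Fall back to the misspelled key.
            value = serdeProperties.get(LEGACY_COLLECTION_DELIM);
        }
        if (value == null || value.isEmpty()) {
            return DEFAULT_DELIM;
        }
        // Simplification: only the first character is used.
        return value.charAt(0);
    }
}
```

With the SERDEPROPERTIES from the table above, the lookup returns ',' through the misspelled key. If both keys are present, the correctly spelled one wins.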

@cla-bot

cla-bot bot commented Aug 19, 2019

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@prestosql.io. For more information, see https://github.com/prestosql/cla.

@findepi
Member

findepi commented Aug 19, 2019

Thanks for reporting this & fixing!

Finally I find that for map field collection deliminator config key hive 1.x use "collection.delim" but hive 3.x use "colelction.delim".

@JackyWoo is it reversed?

@findepi
Member

findepi commented Aug 19, 2019

@findepi findepi requested a review from electrum August 19, 2019 15:28
@JackyWoo
Member Author

@electrum @findepi Thanks for the advice; I have corrected it.

@findepi
Member

findepi commented Aug 20, 2019

@JackyWoo would you be able to sign the CLA? See https://github.com/prestosql/cla for more info.

@JackyWoo
Member Author

@findepi I have submitted the CLA to cla@prestosql.io.

@martint
Member

martint commented Aug 20, 2019

@cla-bot check

@cla-bot cla-bot bot added the cla-signed label Aug 20, 2019
@cla-bot

cla-bot bot commented Aug 20, 2019

The cla-bot has been summoned, and re-checked this pull request!

Member

@findepi findepi left a comment

LGTM. @JackyWoo could you please squash the commits?

JackyWoo and others added 2 commits August 21, 2019 10:07
Co-Authored-By: Piotr Findeisen <piotr.findeisen@gmail.com>
Co-Authored-By: Piotr Findeisen <piotr.findeisen@gmail.com>
@JackyWoo
Member Author

@findepi Thanks for the help; it is done.

@findepi
Member

findepi commented Aug 21, 2019

Merged as aaf43e7, thanks!

@findepi findepi closed this Aug 21, 2019
@findepi findepi mentioned this pull request Aug 21, 2019
@findepi findepi added this to the 318 milestone Aug 21, 2019
Anurag870 added a commit to Anurag870/presto that referenced this pull request Oct 12, 2019
…nts#COLLECTION_DELIM, but also introduced a breaking change.

trinodb#1321 fixed this compatibility issue for RC files, while leaving TEXT files out.
@findepi findepi mentioned this pull request Oct 15, 2019

5 participants