An unsupported nested encoding was found. #10397

Open
JkSelf opened this issue Jul 4, 2024 · 5 comments
Labels: bug, parquet, triage

Comments

@JkSelf
Collaborator

JkSelf commented Jul 4, 2024

Bug description

When we execute the SQL statements below using Gluten, the write fails with the exception message 'An unsupported nested encoding was found.'

sql("create table map_table(a map<bigint, string>) using parquet")
sql("insert into table map_table select map(1, 'hello')")

System information

Velox System Info v0.0.2
Commit: 467812f
CMake Version: 3.28.3
System: Linux-5.4.0-167-generic
Arch: x86_64
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 9.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 9.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr/local/lib/python3.8/dist-packages/cmake/data;/usr/local;/usr/X11R6;/usr/pkg;/opt

Relevant logs

Caused by: org.apache.gluten.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
 Error Source: RUNTIME
 Error Code: INVALID_STATE
 Reason: An unsupported nested encoding was found.
 Retriable: False
 Expression: vec.valueVector() == nullptr || vec.wrappedVector()->isFlatEncoding()
 Context: Operator: TableWrite[2] 2
 Function: exportFlattenedVector
 File: /__w/incubator-gluten/incubator-gluten/ep/build-velox/build/velox_ep/velox/vector/arrow/Bridge.cpp
 Line: 884
 Stack trace:
 # 0  _ZN8facebook5velox7process10StackTraceC1Ei
 # 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
 # 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorEPKcEEvRKNS1_18VeloxCheckFailArgsET0_
 # 3  _ZN8facebook5velox12_GLOBAL__N_121exportFlattenedVectorERKNS0_10BaseVectorERKNS1_9SelectionERK12ArrowOptionsR10ArrowArrayPNS0_6memory10MemoryPoolERNS1_24VeloxToArrowBridgeHolderE
 # 4  _ZN8facebook5velox12_GLOBAL__N_117exportToArrowImplERKNS0_10BaseVectorERKNS1_9SelectionERK12ArrowOptionsR10ArrowArrayPNS0_6memory10MemoryPoolE
 # 5  _ZN8facebook5velox12_GLOBAL__N_117exportToArrowImplERKNS0_10BaseVectorERKNS1_9SelectionERK12ArrowOptionsR10ArrowArrayPNS0_6memory10MemoryPoolE
 # 6  _ZN8facebook5velox13exportToArrowERKSt10shared_ptrINS0_10BaseVectorEER10ArrowArrayPNS0_6memory10MemoryPoolERK12ArrowOptions
 # 7  _ZN8facebook5velox7parquet6Writer5writeERKSt10shared_ptrINS0_10BaseVectorEE
 # 8  _ZN8facebook5velox9connector4hive12HiveDataSink5writeEmSt10shared_ptrINS0_9RowVectorEE
 # 9  _ZN8facebook5velox9connector4hive12HiveDataSink10appendDataESt10shared_ptrINS0_9RowVectorEE
 # 10 _ZN8facebook5velox4exec11TableWriter8addInputESt10shared_ptrINS0_9RowVectorEE
 # 11 _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE
 # 12 _ZN8facebook5velox4exec6Driver4nextERSt10shared_ptrINS1_13BlockingStateEE
 # 13 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE
 # 14 _ZN6gluten24WholeStageResultIterator4nextEv
 # 15 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext
 # 16 0x00007f8833c9cfb0
JkSelf added the bug and triage labels Jul 4, 2024
@JkSelf
Collaborator Author

JkSelf commented Jul 4, 2024

@mbasmanova @majetideepak @rui-mo Do you have any input? Thanks.

@rui-mo
Collaborator

rui-mo commented Jul 4, 2024

@JkSelf I hit the same issue. Please check the discussion in #9821.

@yingsu00
Collaborator

yingsu00 commented Jul 5, 2024

I think we will need to support writing non-flat vectors in the Parquet writer. @majetideepak what do you think?

@majetideepak
Collaborator

We do support writing regular dictionaries to Parquet. See #7025
I am curious as to why complex types have a problem. I'll look into this.
We know constant vectors are flattened until Arrow supports writing REE encoding to Parquet. There is some issue somewhere for this.
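
To make the encodings discussed above concrete, here is a minimal toy model in Python. It is not Velox or Arrow code: the class names `Flat`, `Dictionary`, and `Constant`, and the helpers `flatten` and `exportable`, are all hypothetical. The `exportable` check is only a loose analogue of the failing assertion in the log (a wrapper is accepted only when it sits directly on a flat vector), and flattening a constant mirrors the workaround described above.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Flat:
    values: List[Any]      # one stored value per row


@dataclass
class Dictionary:
    indices: List[int]     # row i reads base[indices[i]]
    base: Any              # wrapped vector; may itself be encoded


@dataclass
class Constant:
    value: Any             # the single value repeated for every row
    size: int


def size_of(vec) -> int:
    if isinstance(vec, Flat):
        return len(vec.values)
    if isinstance(vec, Dictionary):
        return len(vec.indices)
    if isinstance(vec, Constant):
        return vec.size
    raise TypeError(f"unknown vector type: {type(vec).__name__}")


def value_at(vec, i: int):
    """Logical value at row i, looking through any wrappers."""
    if isinstance(vec, Flat):
        return vec.values[i]
    if isinstance(vec, Dictionary):
        return value_at(vec.base, vec.indices[i])
    if isinstance(vec, Constant):
        return vec.value
    raise TypeError(f"unknown vector type: {type(vec).__name__}")


def flatten(vec) -> Flat:
    """Materialize every row into a plain flat vector."""
    return Flat([value_at(vec, i) for i in range(size_of(vec))])


def exportable(vec) -> bool:
    """Toy analogue of the check: flat, or a dictionary directly over flat."""
    if isinstance(vec, Flat):
        return True
    if isinstance(vec, Dictionary):
        return isinstance(vec.base, Flat)
    return False  # constants are flattened first in this toy model


base = Flat(["a", "b"])
dict_vec = Dictionary([1, 0, 1], base)
nested = Dictionary([0, 2], dict_vec)   # a wrapper over a wrapper

print(exportable(dict_vec))             # regular dictionary
print(exportable(nested))               # nested encoding
print(flatten(nested))                  # flattening removes the nesting
```

In this model a regular dictionary passes the check, while a dictionary over another dictionary does not until it is flattened.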

majetideepak self-assigned this Jul 6, 2024
@Yohahaha
Contributor

Yohahaha commented Jul 8, 2024

#9406

I have a draft PR to support flattening complex vectors.
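
As a rough illustration of what flattening a complex vector involves, the sketch below recursively flattens a toy map vector in Python. It does not reflect the draft PR or any Velox API: `Flat`, `MapVec`, `Dictionary`, `take`, and `flatten` are hypothetical names, and the layout (per-row `offsets` and `sizes` into key and value children) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Flat:
    values: List[Any]


@dataclass
class MapVec:
    offsets: List[int]     # start of each row's entries in keys/values
    sizes: List[int]       # number of entries in each row
    keys: Any              # child vector (may be encoded)
    values: Any            # child vector (may be encoded)


@dataclass
class Dictionary:
    indices: List[int]     # row i reads base[indices[i]]
    base: Any


def take(vec, rows):
    """Copy the given rows of an already-flattened vector."""
    if isinstance(vec, Flat):
        return Flat([vec.values[r] for r in rows])
    # MapVec: gather the entries of each selected row, in order.
    entry_rows, offsets, sizes = [], [], []
    for r in rows:
        offsets.append(len(entry_rows))
        sizes.append(vec.sizes[r])
        entry_rows.extend(range(vec.offsets[r], vec.offsets[r] + vec.sizes[r]))
    return MapVec(offsets, sizes,
                  take(vec.keys, entry_rows),
                  take(vec.values, entry_rows))


def flatten(vec):
    """Remove every Dictionary wrapper, at the top level and in children."""
    if isinstance(vec, Flat):
        return vec
    if isinstance(vec, MapVec):
        return MapVec(vec.offsets, vec.sizes,
                      flatten(vec.keys), flatten(vec.values))
    if isinstance(vec, Dictionary):
        return take(flatten(vec.base), vec.indices)
    raise TypeError(f"unknown vector type: {type(vec).__name__}")


# One row holding map(1 -> 'hello'), with the values child dictionary-encoded
# and the whole map wrapped in a second dictionary that repeats the row.
inner = MapVec([0], [1], Flat([1]), Dictionary([1], Flat(["x", "hello"])))
wrapped = Dictionary([0, 0], inner)

print(flatten(wrapped))
```

The design choice here is to flatten children first and then gather rows, so `take` only ever sees flat inputs.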
