[ADBDEV-5595] - Add support for new types in PXF #100

Open
wants to merge 27 commits into
base: pxf-6.x

Conversation

xardazzzzzz

  • Refactor read and write parquet process
  • Support time, decimal, uuid, interval and their respective array types
  • Support list of lists for reading
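
For reviewers less familiar with how Parquet stores these logical types, here is a minimal sketch of the decoding side. It is not the code from this PR: the class and method names are hypothetical, and it relies only on the Parquet format's documented encodings (DECIMAL as a big-endian two's-complement unscaled value, UUID as 16 big-endian bytes, TIME(MICROS) as an INT64 count of microseconds after midnight).

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.time.LocalTime;
import java.util.UUID;

// Hypothetical helper, not part of PXF: illustrates the raw Parquet encodings only.
public class ParquetLogicalValueSketch {

    // DECIMAL in BINARY / FIXED_LEN_BYTE_ARRAY: big-endian two's-complement unscaled value.
    public static BigDecimal decodeDecimal(byte[] unscaled, int scale) {
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    // UUID: 16-byte FIXED_LEN_BYTE_ARRAY, most significant byte first.
    public static UUID decodeUuid(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("UUID must be 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes); // ByteBuffer is big-endian by default
        long mostSignificant = buf.getLong();
        long leastSignificant = buf.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    // TIME(MICROS): INT64 microseconds after midnight.
    public static LocalTime decodeTimeMicros(long micros) {
        return LocalTime.ofNanoOfDay(micros * 1_000L);
    }

    public static void main(String[] args) {
        System.out.println(decodeDecimal(new byte[] {0x30, 0x39}, 2));
        System.out.println(decodeTimeMicros(3_661_000_001L));
    }
}
```

The array variants of these types would apply the same per-element decoding to each item of a Parquet LIST group.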

RomaZe and others added 10 commits June 6, 2024 19:47
- Add integration tests for fragments distribution;
- Modify test environment adding additional segment host;
- Fix other integration tests because of the changing test environment.
- Refactor read and write parquet process
- Support time, decimal, uuid, interval and their respective array types
- Support list of lists for reading
xardazzzzzz requested a review from RomaZe June 10, 2024 11:09
xardazzzzzz and others added 9 commits June 10, 2024 17:16
- Refactor read and write parquet process
- Support time, decimal, uuid, interval and their respective array types
- Support list of lists for reading
RomaZe
RomaZe previously approved these changes Jun 14, 2024
- Add tests
- Add binary various parsing
- Add bson to dependencies
xardazzzzzz and others added 3 commits June 19, 2024 14:25
Refactor ListConstToStr function

The original implementation of the list_const_to_str (ListConstToStr) function
was intended to extract and format values from array constants into a string
buffer, supporting data types like int2[], int4[], int8[], and text[]. It
checked for null constants, logged exceptions, extracted the array, processed
each supported type with specific code blocks to deconstruct the array,
converted each element to a string, and appended the formatted data to a buffer.
However, the approach led to code duplication, maintenance challenges, and
reduced readability due to scattered, repetitive logic.

The refactored function consolidates the common logic into a single,
generalized procedure: it retrieves the element type information and
deconstructs the array once, with no per-type code blocks. getTypeOutputInfo
looks up the element type's output function in the catalog, and
OidOutputFunctionCall then converts each element to a string in the same way
for every data type.

Additionally, the refactored function simplifies its logic by centralizing the
handling of array elements and reducing the switch case complexity. This
enhances readability and maintainability, making it easier to understand and
modify. Adding new array types now involves minimal changes, creating a more
elegant, less error-prone codebase.
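
The idea in this commit message can be sketched in Java as an analogy only; the real function is C code inside the extension and is not reproduced here. In the sketch, a hypothetical map from element type name to output function stands in for the catalog lookup that getTypeOutputInfo performs, and one generic loop replaces the per-type branches. The `{...}` output format and the `NULL` marker are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.function.Function;

// Hypothetical Java analogy of the refactoring; not the PXF C implementation.
public class ArrayConstToStringSketch {

    // Stand-in for the catalog: element type name -> output function.
    private static final Map<String, Function<Object, String>> OUTPUT_FUNCTIONS = new HashMap<>();

    static {
        OUTPUT_FUNCTIONS.put("int2", Object::toString);
        OUTPUT_FUNCTIONS.put("int4", Object::toString);
        OUTPUT_FUNCTIONS.put("int8", Object::toString);
        // Assumed quoting rule for text elements, for illustration only.
        OUTPUT_FUNCTIONS.put("text", v -> "\"" + v.toString().replace("\"", "\\\"") + "\"");
    }

    // One generic path for every element type: look up the output function once,
    // then apply it to each element.
    public static String arrayToString(String elementType, List<?> elements) {
        Function<Object, String> output = OUTPUT_FUNCTIONS.get(elementType);
        if (output == null) {
            throw new IllegalArgumentException("unsupported element type: " + elementType);
        }
        StringJoiner joined = new StringJoiner(",", "{", "}");
        for (Object element : elements) {
            joined.add(element == null ? "NULL" : output.apply(element));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(arrayToString("int4", Arrays.asList(1, 2, 3)));
        System.out.println(arrayToString("text", Arrays.asList("a", null)));
    }
}
```

Supporting another element type in this sketch means adding one map entry, which mirrors the commit's point that new array types need minimal changes.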
- Fix handling of arrays. Array predicates are currently not supported at the Parquet library level: a filter such as <array-column> = array[val1] matches 0 rows
iamlapa
iamlapa previously approved these changes Jul 3, 2024

@Slf4j
@UtilityClass
public class ParquetIntervalUtilities {
Member

Magic class. Shouldn't we add a unit test for it?

Author

I'm a magician. Added a test.
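
For context on what such a utility has to handle: the Parquet format stores INTERVAL as a 12-byte FIXED_LEN_BYTE_ARRAY holding three little-endian unsigned 32-bit integers (months, days, milliseconds). The sketch below decodes that layout. The class and method names are hypothetical, and it is not the ParquetIntervalUtilities code under review.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of the Parquet INTERVAL layout; not the class from this PR.
public class ParquetIntervalSketch {

    // Returns {months, days, milliseconds} read from the 12-byte interval value.
    public static long[] decodeInterval(byte[] bytes) {
        if (bytes.length != 12) {
            throw new IllegalArgumentException("INTERVAL must be 12 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        // Each component is an unsigned 32-bit integer, so widen to long.
        long months = Integer.toUnsignedLong(buf.getInt());
        long days = Integer.toUnsignedLong(buf.getInt());
        long millis = Integer.toUnsignedLong(buf.getInt());
        return new long[] {months, days, millis};
    }

    public static void main(String[] args) {
        byte[] raw = {1, 0, 0, 0, 2, 0, 0, 0, (byte) 0xE8, 0x03, 0, 0};
        long[] parts = decodeInterval(raw);
        System.out.println(parts[0] + " months, " + parts[1] + " days, " + parts[2] + " ms");
    }
}
```

A unit test for the real class could start from byte arrays like the one in `main`, including the unsigned edge case where a component has its high bit set.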

@@ -1,5 +1,7 @@
package org.greenplum.pxf.api.filter;

import org.greenplum.pxf.api.io.DataType;
Member

unused import

Author

fixed

import org.apache.parquet.schema.Type;
import org.greenplum.pxf.api.error.UnsupportedTypeException;
Member

Please check for unused imports.

Author

fixed

4 participants