This repository was archived by the owner on May 10, 2024. It is now read-only.

Conversation

@majetideepak

This PR adds support for INT96 and FIXED_LEN_BYTE_ARRAY types.
It modifies the examples and DebugPrint to handle these types.
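For context, a rough sketch of the value shapes involved and the kind of handling a debug printer needs for them; the struct names and fields below are illustrative assumptions, not the definitions used in this PR.

#include <cstdint>
#include <cstdio>

// Illustrative layouts only: INT96 is a 12-byte value stored as three 32-bit
// words, and a FIXED_LEN_BYTE_ARRAY value points at exactly `len` bytes,
// where `len` comes from the schema.
struct Int96 {
  uint32_t value[3];
};

struct FixedLenByteArray {
  const uint8_t* ptr;
};

// A debug printer can show INT96 as its three words and FLBA as hex bytes.
void PrintInt96(const Int96& v) {
  std::printf("%u %u %u", v.value[0], v.value[1], v.value[2]);
}

void PrintFixedLenByteArray(const FixedLenByteArray& v, int len) {
  for (int i = 0; i < len; ++i) {
    std::printf("%02X", static_cast<unsigned>(v.ptr[i]));
  }
}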

@majetideepak
Author

The first check passed on Travis CI: https://travis-ci.org/apache/parquet-cpp/builds/105457807

Member


How about dictionary encoding (see the BYTE_ARRAY specialization there)? It's hard to verify that it works without tests, though; we should try to address that soon.

Author


I will add that now. Actually, it's rare for these types to be dictionary encoded.
I tested this on my side and it seems to work okay. We should definitely add tests after PARQUET-435.

@wesm
Member

wesm commented Jan 28, 2016

Can you create src/parquet/types-test.cc and add a unit test asserting ASSERT_EQ(12, sizeof(Int96))? We can add more unit tests relating to types as we go.
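Something along these lines, as a minimal sketch (the header path, namespace, and gtest wiring here are assumptions):

// src/parquet/types-test.cc
#include <gtest/gtest.h>

#include "parquet/types.h"  // assumed location of the Int96 definition

TEST(TestInt96, SizeMatchesParquetSpec) {
  // INT96 occupies exactly 12 bytes in the file format, so the in-memory
  // struct must have the same size.
  ASSERT_EQ(12, sizeof(parquet_cpp::Int96));  // namespace is an assumption
}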

@majetideepak
Author

The macro STRUCT_END(Int96, 12) includes a static assert. I checked that this works locally.
Do you still want to add a test?
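For illustration, the kind of compile-time check the macro expands to; this is a sketch under assumed definitions, not the actual macro from the codebase:

#include <cstdint>

// Sketch only: a 12-byte layout plus the compile-time size check a macro like
// STRUCT_END(Int96, 12) provides. A mismatch fails the build, which covers the
// same property as the proposed runtime ASSERT_EQ.
struct Int96 {
  uint32_t value[3];  // 3 x 4 bytes = 12 bytes
};

static_assert(sizeof(Int96) == 12, "Int96 must be exactly 12 bytes");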

@wesm
Member

wesm commented Jan 28, 2016

Ah good point, thanks. I think it's fine to omit the test, then. We should eventually move parquet_type_to_string to types.h and add tests that verify the type names (and other properties and utilities we add).
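Something like the following as a rough sketch; the exact signature, return type, and namespaces of parquet_type_to_string are assumptions here:

#include <gtest/gtest.h>

#include "parquet/types.h"  // assumed future home of parquet_type_to_string

TEST(TypeNames, NewPhysicalTypes) {
  // Assumed signature: std::string parquet_type_to_string(Type::type);
  // namespace qualification is omitted for brevity.
  ASSERT_EQ("INT96", parquet_type_to_string(Type::INT96));
  ASSERT_EQ("FIXED_LEN_BYTE_ARRAY",
      parquet_type_to_string(Type::FIXED_LEN_BYTE_ARRAY));
}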

@julienledem
Member

Is this one good to go?
@wesm @majetideepak

@majetideepak
Author

Rebased and good to go from my side.

@wesm
Member

wesm commented Jan 28, 2016

LGTM, +1. @asandryh will need to fix up minor conflicts in #29.

@asfgit closed this in 89f5a54 on Jan 29, 2016
@Syed-SnapLogic

I am using the Java Hadoop and Parquet libraries. My Parquet schema is:
message myschema {
optional fixed_len_byte_array(5) decimal_field (DECIMAL(10,6));
}

My value is of type BigDecimal, and I am writing it to the Parquet file as follows:

value = value.toString();
// recordConsumer is an instance of org.apache.parquet.io.api.RecordConsumer
recordConsumer.addBinary(Binary.fromString((String) value));

I am getting this exception:

java.lang.IllegalArgumentException: Fixed Binary size 11 does not match field type length 5

Any ideas how to resolve it?

@wesm
Member

wesm commented Oct 7, 2019

You're in the wrong place -- this is the original repo for the Parquet C++ project
