Conversation

@rahul123mth

No description provided.

@julienledem
Member

This will happen in the next release, not in a PR. Could you close this?

@hubot hubot deleted the release-2.2.0-rc2 branch April 28, 2017 16:11
@hubot hubot restored the release-2.2.0-rc2 branch April 28, 2017 21:06
@pono pono closed this Jun 15, 2017
lekv pushed a commit to lekv/parquet-format that referenced this pull request Jul 31, 2017
…-schema utility

Several inter-related things here:

* Added SchemaDescriptor and ColumnDescriptor types to hold computed structure
  information (e.g. max repetition/definition levels) about the file schema.
  These are now used in the FileReader and ColumnReader.
* I also added a logical schema node class structure, very similar to the one
  in parquet-mr (though slimmed down), which can be used for both file reading
  and writing.
* Added FlatSchemaConverter to convert Parquet flat schema metadata into a
  nested logical schema
* Added a SchemaPrinter tool and parquet-dump-schema CLI tool to visit a nested
  schema and print it to the console.
* Another big thing here is that per PARQUET-446 and related work in
  parquet-mr, it's important for both the public API of this project and
  internal development to limit our coupling to the compiled Thrift headers. I
  added `Type`, `Repetition`, and `LogicalType` enums to the `parquet_cpp`
  namespace and inverted the dependency between the column readers, scanners,
  and encoders to use these enums.
* A bunch of unit tests.
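
To make the flat-to-nested conversion and the computed level information above more concrete, here is a minimal C++ sketch. It is not the code from this commit: `FlatElement`, `Node`, `Convert`, `FromFlat`, `ColumnLevels`, and `CollectLevels` are hypothetical names chosen for illustration, and the sketch assumes the flat schema is a depth-first list in which each element records its number of children. It applies the standard Parquet rules: every non-required node below the root adds one to the maximum definition level, and every repeated node adds one to the maximum repetition level.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the parquet-cpp API.
enum class Repetition { REQUIRED, OPTIONAL, REPEATED };

// One entry of the flat, depth-first schema list stored in file metadata.
struct FlatElement {
  std::string name;
  Repetition repetition;
  int num_children;  // 0 for a primitive (leaf) column
};

// Nested logical schema node. A node without children is treated as a leaf.
struct Node {
  std::string name;
  Repetition repetition;
  std::vector<std::unique_ptr<Node>> children;
  bool is_leaf() const { return children.empty(); }
};

// Consume the element at `pos` and, recursively, all of its descendants.
// Throws instead of reading out of bounds when the list runs out of children.
std::unique_ptr<Node> Convert(const std::vector<FlatElement>& flat,
                              std::size_t& pos) {
  if (pos >= flat.size()) {
    throw std::runtime_error("Malformed schema: not enough elements");
  }
  const FlatElement& elem = flat[pos++];
  auto node = std::make_unique<Node>();
  node->name = elem.name;
  node->repetition = elem.repetition;
  for (int i = 0; i < elem.num_children; ++i) {
    node->children.push_back(Convert(flat, pos));
  }
  assert(pos <= flat.size());
  return node;
}

// Convert a whole flat schema and reject trailing, unreachable elements.
std::unique_ptr<Node> FromFlat(const std::vector<FlatElement>& flat) {
  std::size_t pos = 0;
  auto root = Convert(flat, pos);
  if (pos != flat.size()) {
    throw std::runtime_error("Malformed schema: unused trailing elements");
  }
  return root;
}

// Per-leaf structure information, in the spirit of a column descriptor.
struct ColumnLevels {
  std::string path;
  int max_definition_level;
  int max_repetition_level;
};

// Walk the tree and record the maximum levels for each leaf column.
// The root node itself does not contribute to either level.
void CollectLevels(const Node& node, const std::string& prefix, int def,
                   int rep, bool is_root, std::vector<ColumnLevels>& out) {
  std::string path;
  if (!is_root) {
    if (node.repetition != Repetition::REQUIRED) ++def;
    if (node.repetition == Repetition::REPEATED) ++rep;
    path = prefix.empty() ? node.name : prefix + "." + node.name;
  }
  if (node.is_leaf()) {
    out.push_back({path, def, rep});
    return;
  }
  for (const auto& child : node.children) {
    CollectLevels(*child, path, def, rep, false, out);
  }
}
```

Calling `FromFlat` on the flat list and then `CollectLevels(*root, "", 0, 0, true, out)` yields one `ColumnLevels` entry per leaf column. Throwing on a truncated list mirrors the "catch and throw exception (instead of core dump)" behaviour mentioned in the commit log below.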

Author: Wes McKinney <wes@cloudera.com>

Closes apache#38 from wesm/PARQUET-442 and squashes the following commits:

9ca0219 [Wes McKinney] Add a unit test for SchemaPrinter
fdd37cd [Wes McKinney] Comment re: FLBA node ctor
3a15c0c [Wes McKinney] Add some SchemaDescriptor and ColumnDescriptor tests
27e1805 [Wes McKinney] Don't squash supplied CMAKE_CXX_FLAGS
76dd283 [Wes McKinney] Refactor Make* methods as static member functions
2fae8cd [Wes McKinney] Trim some includes
b2e2661 [Wes McKinney] More doc about the parquet_cpp enums
bd78d7c [Wes McKinney] Move metadata enums to parquet/types.h and add rest of parquet:: enums. Add NONE value to Compression
415305b [Wes McKinney] cpplint
4ac84aa [Wes McKinney] Refactor to make PrimitiveNode and GroupNode ctors private. Add MakePrimitive and MakeGroup factory functions. Move parquet::SchemaElement function into static FromParquet ctors so can set private members
3169b24 [Wes McKinney] NewPrimitive should set num_children = 0 always
954658e [Wes McKinney] Add a comment for TestSchemaConverter.InvalidRoot and uncomment tests for root nodes of other repetition types
55d21b0 [Wes McKinney] Remove schema-builder-test.cc
71c1eab [Wes McKinney] Remove crufty builder.h, will revisit
7ef2dee [Wes McKinney] Fix list encoding comment
8c5af4e [Wes McKinney] Remove old comment, unneeded cast
6b041c5 [Wes McKinney] First draft SchemaDescriptor::Init. Refactor to use ColumnDescriptor. Standardize on parquet_cpp enums instead of Thrift metadata structs. Limit #include from Thrift
841ae7f [Wes McKinney] Don't export SchemaPrinter for now
834389a [Wes McKinney] Add Node::Visitor API and implement a simple schema dump CLI tool
a8bf5c8 [Wes McKinney] Catch and throw exception (instead of core dump) if run out of schema children. Add a Node::Visitor abstract API
bde8b18 [Wes McKinney] Can compare FLBA type metadata in logical schemas
f0df0ba [Wes McKinney] Finish a nested schema conversion test
0af0161 [Wes McKinney] Check that root schema node is repeated
5df00aa [Wes McKinney] Expose GroupConverter API, add test for invalid root
beaa99f [Wes McKinney] Refactor slightly and add an FLBA test
6e248b8 [Wes McKinney] Schema tree conversion first cut, add a couple primitive tests
9685c90 [Wes McKinney] Rename Schema -> RootSchema and add another unit test
f7d0487 [Wes McKinney] Schema types test coverage, move more methods into compilation unit
d746352 [Wes McKinney] Better isolate thrift dependency. Move schema/column descriptor into its own header
a8e5a0a [Wes McKinney] Tweaks
fb9d7ad [Wes McKinney] Draft of flat to nested schema conversion. No tests yet
3015063 [Wes McKinney] More prototyping. Rename Type -> Node. PrimitiveNode factory functions
a8a7a01 [Wes McKinney] Start drafting schema types