jerolba/parquet-for-java

Parquet for Java

Sample code for the "Parquet for Java" talk.

Load data using:

  • Apache Avro:
    • Generic Records
    • Reflection
    • Generated Code
  • Protocol Buffers
  • Carpet
  • DuckDB

Write data using:

  • Apache Avro:
    • Generic Records
    • Reflection
    • Generated Code
  • Protocol Buffers
  • Carpet

Commands

To fetch the sample data set:

wget https://d37ci6vzurychx.cloudfront.net/trip-data/fhvhv_tripdata_2022-01.parquet
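A quick sanity check after the download can catch a truncated or failed transfer before any of the loaders run. The sketch below relies on one fact about the format: a regular (non-encrypted-footer) Parquet file starts and ends with the 4-byte ASCII marker `PAR1`. The class name `ParquetMagicCheck` and the method `hasParquetMagic` are hypothetical helpers written for this illustration and are not part of the sample code in this repository.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of this repository's sample code.
public class ParquetMagicCheck {

    // Parquet files start and end with the 4-byte ASCII marker "PAR1".
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    // Returns true when both the first and the last four bytes equal "PAR1".
    public static boolean hasParquetMagic(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            // Header magic (4) + footer length (4) + footer magic (4) is the bare minimum.
            if (length < 12) {
                return false;
            }
            byte[] head = new byte[4];
            byte[] tail = new byte[4];
            raf.readFully(head);
            raf.seek(length - 4);
            raf.readFully(tail);
            return Arrays.equals(head, MAGIC) && Arrays.equals(tail, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name produced by the wget command above.
        Path file = Path.of(args.length > 0 ? args[0] : "fhvhv_tripdata_2022-01.parquet");
        String verdict = hasParquetMagic(file)
                ? " has the Parquet magic bytes"
                : " does not look like a Parquet file";
        System.out.println(file + verdict);
    }
}
```

This only inspects eight bytes, so it does not validate the footer metadata or any row group. It is meant as a cheap guard against an incomplete download, not as a replacement for opening the file with a real reader.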

To generate Protocol Buffers classes:

docker run --rm -v $(pwd):$(pwd) -w $(pwd) znly/protoc --java_out=./src/main/java -I=./src/main/resources ./src/main/resources/trips.proto2

To generate Avro classes:

java -jar avro-tools-1.10.0.jar compile schema ./src/main/resources/organizations.avsc ./src/main/java/
docker run --rm -v $(pwd)/src:/avro/src kpnnv/avro-tools:1.10.0 compile schema /avro/src/main/resources/trips.avsc /avro/src/main/java/
