parquet
Here are 53 public repositories matching this topic...
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Updated Mar 23, 2024 - Scala
Schema registry for CSV, TSV, JSON, Avro and Parquet schemas. Supports schema inference and a GraphQL API.
Updated Mar 5, 2020 - Scala
GCS support for avro-tools, parquet-tools and protobuf
Updated May 8, 2024 - Scala
A set of connectors for Monix. 🔛
Updated May 7, 2024 - Scala
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project contains sample Spark programs in Scala.
Updated Nov 16, 2022 - Scala
WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Updated Apr 19, 2024 - Scala
A collection of Apache Parquet add-on modules
Updated May 8, 2024 - Scala
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Updated Jun 7, 2021 - Scala
Embulk (https://github.com/embulk/embulk/) output plugin to dump records as Apache Parquet (https://parquet.apache.org/) files on S3.
Updated Feb 14, 2023 - Scala