Add Apache parquet as an output format for enriched data #87

christoph-buente · 2017-06-26T12:38:20Z

To use S3 as a queryable data lake, it would be beneficial to store the enriched data in a columnar format such as Apache Parquet [1]. We ran some performance tests with Athena, and it seems to perform better on Parquet than on TSV.

Thanks!

[1] https://parquet.apache.org/

darrenhaken · 2018-08-23T08:42:46Z

I'm interested in picking this up. Would a PR for this work get merged?

BenFradet · 2018-08-23T09:08:45Z

yup sure 👍

yoelb · 2018-10-19T08:03:38Z

Any update? @darrenhaken are you working on it?

darrenhaken · 2018-10-19T08:24:44Z

I’ve been away on leave. I can take a look when I’m back in a few weeks. Feel free to pick it up if you want it sooner, though!

ollikaven · 2018-12-18T09:34:57Z

@darrenhaken any updates?

sumit-saurabh · 2018-12-24T10:49:35Z

Any update here?

itgd-techsupport · 2022-02-25T10:22:49Z

Is there any progress on this?
