Import apache-arrow from spark overlay #1192
Closed
These commits pull dev-libs/apache-arrow, dev-python/pyarrow, and their dependencies from the spark overlay. They were originally introduced into the spark overlay because they are runtime dependencies of pyspark, where they handle the Parquet data format.
We have since found that the Parquet data format is suitable for generic scientific computing and that pyspark is not its only use case, so I picked up these ebuilds, bumped their versions, made some QA enhancements, and created this PR at ::science.
apache-arrow provides many optional features, and I do not have time to turn all of them into USE flags, so I added only the ones useful to me (mainly Parquet IO). Apache Arrow also supports many programming languages; these two ebuilds cover C++ and Python. Further contributions are always welcome.