[Website] January 2023 Rust Apache Arrow 32.0.0 Highlights #305
I plan to review this carefully today
Looking great -- thank you @tustvold
What would you think about breaking this into three smaller posts (arrow/arrow-flight, parquet, and object_store)? I think we may end up hiding some of the parquet and object store content with one massive omnibus post...
_posts/2023-01-26-rust-32.0.0.md
Outdated
* **Arbitrarily Nested Schema**: arbitrarily nested schemas can be read to and written from arrow, see [here](https://arrow.apache.org/blog/2022/10/05/arrow-parquet-encoding-part-1/) for more information, thanks to @tustvold
* **Predicate Pushdown**: the arrow reader now supports advanced predicate pushdown, including late materialization, see [here](https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/) for more information, thanks to @tustvold, @Ted-Jiang, @thinkharderdev, and @liukun4515
* **Bloom Filter Support**: support for both reading and writing bloom filters has been added, thanks to @Jimexist and @viirya
* **Additional CLI Tools**: additional CLI tools have been added for introspecting and manipulating parquet data, thanks to @tustvold, @Jimexist, @crepererum and @bmmeijers
It would be neat to add a link here to a readme that explained the tools and how to use them
Update: I am going to try to help accelerate getting this post ready for publishing.
I took the liberty of pushing several changes to this branch; I now think it is ready for review by the larger community.
Mostly just some minor nits, thank you for picking this up. I can't approve, as it is technically still my PR, but I would if I could
_posts/2023-01-26-rust-32.0.0.md
Outdated
* **Support for Copy-On-Write**: Arrow arrays now support copy-on-write, via the [`into_builder`](https://docs.rs/arrow/32.0.0/arrow/array/struct.ArrayData.html#method.into_builder) methods
* **Comparable Row Format**: [Much faster multi-column Sorting and Grouping](https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-1/) is now possible with the the new spillable, comparable [row-format](https://docs.rs/arrow-row/32.0.0/arrow_row/index.html)
* **FlightSQL Support**: [FlightSQL](https://arrow.apache.org/docs/format/FlightSql.html) [support](https://docs.rs/arrow-flight/32.0.0/arrow_flight/sql/index.html) has been expanded
* **Mid-Level Flight Client**: A new [FlightClight](https://docs.rs/arrow-flight/32.0.0/arrow_flight/client/struct.FlightClient.html) is available that handles lower level protocol details, and easier to use [encoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/encode/struct.FlightDataEncoderBuilder.html) and [decoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/decode/struct.FlightDataDecoder.html) APIs.
* **Mid-Level Flight Client**: A new [FlightClight](https://docs.rs/arrow-flight/32.0.0/arrow_flight/client/struct.FlightClient.html) is available that handles lower level protocol details, and easier to use [encoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/encode/struct.FlightDataEncoderBuilder.html) and [decoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/decode/struct.FlightDataDecoder.html) APIs. |
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
I updated the date on this blog post to reflect the publishing date and will then merge.
The publish job did not complete and thus #305 is not yet visible: https://github.com/apache/arrow-site/actions

<img width="1398" alt="Screenshot 2023-02-14 at 10 20 23 AM" src="https://user-images.githubusercontent.com/490673/218780518-423f7b68-fd60-478c-a5a3-8460f5e2ba57.png">

```
https://github.com/apache/arrow-site/actions/runs/4175072450/jobs/7229492372#step:10:63
Run git config user.name "$(git log -1 --pretty=format:%an)"
From https://github.com/apache/arrow-site
 * [new branch] asf-site -> deploy/asf-site
 * [new branch] master -> deploy/master
Switched to a new branch 'asf-site'
branch 'asf-site' set up to track 'deploy/asf-site'.
[asf-site 5e2e8da16ba] Updating built site (build c4fe669)
 44 files changed, 726 insertions(+), 387 deletions(-)
 create mode 100644 blog/2023/02/13/rust-32.0.0/index.html
To https://github.com/apache/arrow-site.git
 ! [rejected] asf-site -> asf-site (fetch first)
error: failed to push some refs to 'https://github.com/apache/arrow-site.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
```

This may be because I merged multiple PRs without the previous job completing. I am making some minor changes to the blog post to see if I can get the next job to deploy.
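For reference, the `fetch first` rejection above is the usual symptom of two jobs racing to push to the same branch. A rough sketch of how a push step could tolerate that race follows. Note that `push_with_retry`, its arguments, and the retry count are hypothetical and not taken from the arrow-site workflow; the only git commands used are `git push` and `git pull --rebase`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the arrow-site workflow):
# push a branch and, if the remote rejects the push because it has
# newer commits, rebase onto the remote branch and try again.
#
#   $1 = remote name or URL
#   $2 = branch name
#   $3 = maximum number of push attempts (defaults to 3)
push_with_retry() {
  remote="$1"
  branch="$2"
  attempts="${3:-3}"
  i=1
  while :; do
    # A successful push ends the loop immediately.
    if git push "$remote" "$branch"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    # Integrate the work the remote already has, then retry.
    # A rebase conflict aborts the helper with a non-zero status.
    git pull --rebase "$remote" "$branch" || return 1
    i=$((i + 1))
  done
}
```

A call such as `push_with_retry origin asf-site 3` would retry up to three times. For a branch that holds generated site output, the rebase may well conflict, so simply re-running the job (as attempted here) can be the simpler option.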
This is a pretty rough first draft, but wanted to get something up to solicit feedback

Update: this is ready for review