[Website] January 2023 Rust Apache Arrow 32.0.0 Highlights #305

Merged: 8 commits, Feb 14, 2023

tustvold
Contributor

@tustvold tustvold commented Jan 26, 2023

This is a pretty rough first draft, but I wanted to get something up to solicit feedback.

Update: this is ready for review

@alamb
Contributor

alamb commented Jan 30, 2023

I plan to review this carefully today

Contributor

@alamb alamb left a comment

Looking great -- thank you @tustvold

What would you think about breaking this into three smaller posts (arrow/arrow-flight, parquet, and object_store)? I think we may end up hiding some of the parquet and object store content with one massive omnibus post...

_posts/2023-01-26-rust-32.0.0.md Outdated Show resolved Hide resolved
* **Arbitrarily Nested Schema**: arbitrarily nested schemas can be read to and written from arrow, see [here](https://arrow.apache.org/blog/2022/10/05/arrow-parquet-encoding-part-1/) for more information, thanks to @tustvold
* **Predicate Pushdown**: the arrow reader now supports advanced predicate pushdown, including late materialization, see [here](https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/) for more information, thanks to @tustvold, @Ted-Jiang, @thinkharderdev, and @liukun4515
* **Bloom Filter Support**: support for both reading and writing bloom filters has been added, thanks to @Jimexist and @viirya
* **Additional CLI Tools**: additional CLI tools have been added for introspecting and manipulating parquet data, thanks to @tustvold, @Jimexist, @crepererum and @bmmeijers
Contributor

It would be neat to add a link here to a readme that explained the tools and how to use them

@alamb alamb changed the title January 2023 Rust Apache Arrow 32.0.0 Highlights [Website] January 2023 Rust Apache Arrow 32.0.0 Highlights Feb 7, 2023
@alamb
Contributor

alamb commented Feb 11, 2023

Update: I am going to try to help accelerate getting this post ready for publishing.

@alamb
Contributor

alamb commented Feb 11, 2023

I took the liberty of pushing several changes to this branch; I now think it is ready for review by the larger community.

@alamb alamb marked this pull request as ready for review February 11, 2023 17:27
Contributor Author

@tustvold tustvold left a comment

Mostly just some minor nits, thank you for picking this up. I can't approve, as it is technically still my PR, but I would if I could

* **Support for Copy-On-Write**: Arrow arrays now support copy-on-write, via the [`into_builder`](https://docs.rs/arrow/32.0.0/arrow/array/struct.ArrayData.html#method.into_builder) methods
* **Comparable Row Format**: [Much faster multi-column Sorting and Grouping](https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-1/) is now possible with the new spillable, comparable [row-format](https://docs.rs/arrow-row/32.0.0/arrow_row/index.html)
* **FlightSQL Support**: [FlightSQL](https://arrow.apache.org/docs/format/FlightSql.html) [support](https://docs.rs/arrow-flight/32.0.0/arrow_flight/sql/index.html) has been expanded
* **Mid-Level Flight Client**: A new [FlightClient](https://docs.rs/arrow-flight/32.0.0/arrow_flight/client/struct.FlightClient.html) is available that handles lower level protocol details, and easier to use [encoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/encode/struct.FlightDataEncoderBuilder.html) and [decoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/decode/struct.FlightDataDecoder.html) APIs.
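
For readers unfamiliar with the idea behind the comparable row format bullet above, here is a minimal sketch of the general technique: encode each row as bytes so that a plain byte-wise comparison gives the same order as comparing the columns one by one. This is only an illustration under stated assumptions. It uses the standard library alone, handles just two nullable `i32` columns, and the names `encode_i32`, `encode_row`, and `sort_rows` are hypothetical. It is not the `arrow-row` API and makes no claim about that crate's exact byte layout.

```rust
/// Append an order-preserving encoding of an optional i32 to `out`.
/// Assumption for this sketch: nulls sort before all non-null values.
fn encode_i32(value: Option<i32>, out: &mut Vec<u8>) {
    match value {
        None => {
            // Validity byte 0 marks null; pad so every key has the same width.
            out.push(0);
            out.extend_from_slice(&[0u8; 4]);
        }
        Some(v) => {
            // Validity byte 1 marks a present value.
            out.push(1);
            // Flipping the sign bit makes negatives sort before positives
            // once the value is written in big-endian byte order.
            let flipped = (v as u32) ^ 0x8000_0000;
            out.extend_from_slice(&flipped.to_be_bytes());
        }
    }
}

/// Encode one row of a hypothetical two-column schema into a single key.
fn encode_row(a: Option<i32>, b: Option<i32>) -> Vec<u8> {
    let mut out = Vec::with_capacity(10);
    encode_i32(a, &mut out);
    encode_i32(b, &mut out);
    out
}

/// Sort rows by (a, b) using only byte comparisons on the encoded keys.
fn sort_rows(rows: &mut [(Option<i32>, Option<i32>)]) {
    rows.sort_by_key(|&(a, b)| encode_row(a, b));
}

fn main() {
    let mut rows = vec![
        (Some(2), Some(-1)),
        (None, Some(5)),
        (Some(-3), None),
        (Some(2), Some(-7)),
    ];
    sort_rows(&mut rows);
    println!("{:?}", rows);
}
```

The point of the design is that once rows are encoded, sorting no longer needs to dispatch on column types for every comparison; the comparator is a single byte-slice comparison regardless of how many columns there are.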
Contributor Author

Suggested change
* **Mid-Level Flight Client**: A new [FlightClient](https://docs.rs/arrow-flight/32.0.0/arrow_flight/client/struct.FlightClient.html) is available that handles lower level protocol details, and easier to use [encoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/encode/struct.FlightDataEncoderBuilder.html) and [decoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/decode/struct.FlightDataDecoder.html) APIs.
* **Mid-Level Flight Client**: A new [FlightClient](https://docs.rs/arrow-flight/32.0.0/arrow_flight/client/struct.FlightClient.html) is available that handles lower level protocol details, and easier to use [encoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/encode/struct.FlightDataEncoderBuilder.html) and [decoding](https://docs.rs/arrow-flight/32.0.0/arrow_flight/decode/struct.FlightDataDecoder.html) APIs.
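
As a side note on the copy-on-write bullet in the diff above, the pattern can be sketched with the standard library alone: reclaim a buffer for in-place mutation when nothing else references it, and fall back to copying when it is shared. The type `SharedArray` and the function `increment` are hypothetical stand-ins invented for this sketch, and the `into_builder` method here only imitates the shape I understand the arrow method to have (returning the array back when the buffer is shared). It is not the arrow implementation.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an immutable array backed by a shared buffer.
#[derive(Debug, Clone)]
struct SharedArray {
    values: Arc<Vec<i32>>,
}

impl SharedArray {
    fn new(values: Vec<i32>) -> Self {
        Self { values: Arc::new(values) }
    }

    /// Return the underlying buffer if this is the only reference to it;
    /// otherwise hand the array back unchanged so the caller can copy.
    fn into_builder(self) -> Result<Vec<i32>, Self> {
        Arc::try_unwrap(self.values).map_err(|values| Self { values })
    }
}

/// Add one to every element, mutating in place when the buffer is uniquely
/// owned and copying it first when it is shared.
fn increment(array: SharedArray) -> SharedArray {
    let mut buffer = match array.into_builder() {
        Ok(buffer) => buffer,                  // unique: reuse the allocation
        Err(shared) => shared.values.to_vec(), // shared: copy before writing
    };
    for v in buffer.iter_mut() {
        *v += 1;
    }
    SharedArray::new(buffer)
}

fn main() {
    // Uniquely owned: the allocation is reused.
    let unique = SharedArray::new(vec![1, 2, 3]);
    println!("{:?}", increment(unique).values);

    // Shared: `original` still holds a reference, so the data is copied
    // and `original` is left untouched.
    let original = SharedArray::new(vec![10, 20]);
    let updated = increment(original.clone());
    println!("{:?} {:?}", original.values, updated.values);
}
```

The caller decides what to do when the buffer is shared, which keeps the copy explicit instead of hiding it behind a mutable accessor.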

_posts/2023-01-26-rust-32.0.0.md Outdated Show resolved Hide resolved
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
alamb and others added 3 commits February 12, 2023 17:45
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
@alamb
Contributor

alamb commented Feb 14, 2023

I updated the date on this blog post to reflect the publishing date and will then merge.

@alamb alamb merged commit c4fe669 into apache:master Feb 14, 2023
alamb added a commit that referenced this pull request Feb 14, 2023
The publish job did not complete and thus
#305 is not yet visible

https://github.com/apache/arrow-site/actions

<img width="1398" alt="Screenshot 2023-02-14 at 10 20 23 AM"
src="https://user-images.githubusercontent.com/490673/218780518-423f7b68-fd60-478c-a5a3-8460f5e2ba57.png">

```
https://github.com/apache/arrow-site/actions/runs/4175072450/jobs/7229492372#step:10:63
Run git config user.name "$(git log -1 --pretty=format:%an)"
From https://github.com/apache/arrow-site
 * [new branch]              asf-site   -> deploy/asf-site
 * [new branch]              master     -> deploy/master
Switched to a new branch 'asf-site'
branch 'asf-site' set up to track 'deploy/asf-site'.
[asf-site 5e2e8da16ba] Updating built site (build c4fe669)
 44 files changed, 726 insertions(+), 387 deletions(-)
 create mode 100644 blog/2023/02/13/rust-32.0.0/index.html
To https://github.com/apache/arrow-site.git
 ! [rejected]                asf-site -> asf-site (fetch first)
error: failed to push some refs to 'https://github.com/apache/arrow-site.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

```

This may be because I merged multiple PRs without the previous job
completing.

I am making some minor changes to the blog post to see if I can get the
next job to deploy.
@alamb alamb mentioned this pull request Feb 14, 2023
alamb added a commit that referenced this pull request Feb 14, 2023
🤦  I introduced in #305