Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add ArrowArray and ArrowArrayStream C++ iterators #404

Merged
merged 20 commits into from
May 2, 2024

Conversation

bkietz
Copy link
Member

@bkietz bkietz commented Mar 18, 2024

All of this should be c++11

-  auto arrow_array = ImportArray(&array, large_utf8());
-  ARROW_EXPECT_OK(arrow_array);
-
-  auto builder = LargeStringBuilder();
-  ARROW_EXPECT_OK(builder.Append("1234"));
-  ARROW_EXPECT_OK(builder.AppendNulls(2));
-  ARROW_EXPECT_OK(builder.Append("56789"));
-  ARROW_EXPECT_OK(builder.AppendEmptyValue());
-  auto expected_array = builder.Finish();
-
-  EXPECT_TRUE(arrow_array.ValueUnsafe()->Equals(expected_array.ValueUnsafe()));
+  EXPECT_THAT(nanoarrow::ViewAsBytes<64>(&array),
+              ElementsAre("1234"_v, kNull, kNull, "56789"_v, ""_v));

fixes #403

@bkietz bkietz requested a review from paleolimbot March 18, 2024 18:24
@codecov-commenter
Copy link

codecov-commenter commented Mar 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.78%. Comparing base (dc50114) to head (89e73bc).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #404      +/-   ##
==========================================
+ Coverage   88.74%   88.78%   +0.03%     
==========================================
  Files          81       81              
  Lines       14398    14486      +88     
==========================================
+ Hits        12778    12861      +83     
- Misses       1620     1625       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you're still working on this...just a few early comments!

src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In file included from /home/runner/work/arrow-nanoarrow/arrow-nanoarrow/dev/benchmarks/c/array_benchmark.cc:22:
  /home/runner/work/arrow-nanoarrow/arrow-nanoarrow/dev/benchmarks/c/util.h:25:2: error: #error "Zip cannot be supported without C++17 features"
     25 | #error "Zip cannot be supported without C++17 features"
        |  ^~~~~

...might be fixed by set(CMAKE_CXX_STANDARD 17) in the dev/benchmarks/CMakeLists.txt (unless you were checking that it failed on purpose, in which case ignore me 🙂 ).

src/nanoarrow/nanoarrow.hpp Show resolved Hide resolved
@bkietz bkietz marked this pull request as ready for review April 19, 2024 23:42
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this!

Just small comments from me...the big one is that I'd prefer to put the ViewArrayAs helper in nanoarrow_testing.hpp if we don't have the bandwidth to dedicate test coverage for it right now.

dist/nanoarrow_testing.hpp Outdated Show resolved Hide resolved
src/nanoarrow/array_stream_test.cc Outdated Show resolved Hide resolved
src/nanoarrow/array_test.cc Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Show resolved Hide resolved
src/nanoarrow/nanoarrow.hpp Outdated Show resolved Hide resolved
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Pushes to the branch at Ben's invitation 🙂 )

Thank you for bearing with me on this PR and thank you for the tests! I made a few changes to satisfy the constraints enforced by CI here:

  • Because we test against the bundled version (as defined by cmake . -NANOARROW_BUNDLE=ON && cmake --install .), the extra file had to be bundled in the same way as all the others.
  • The benchmarks are specifically designed to build against previous versions (so that they can be run by downstream users to determine if they should upgrade their version pin). This means that ViewArrayAs would have to be guarded with an #ifdef, but then different versions would be benchmarking different code. It seems as though the use of Zip is mostly useful alongside ViewArrayAs, so I removed that particular change from this PR (but saved those changes and pushed them to chore(dev/benchmarks): Benchmark ViewArrayAs for versions that contain the helper #443). I'm sorry for not catching that at an earlier stage in the review 😬

Before this merges I think the new helpers need at least a cursory documentation section since this is a user-facing change (I'm happy to circle back and add this and will definitely do so before the next release).

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@paleolimbot paleolimbot changed the title add some range-for helpers feat: Add ArrowArray and ArrowArrayStream C++ iterators May 2, 2024
@paleolimbot paleolimbot merged commit 1724785 into apache:main May 2, 2024
31 checks passed
@bkietz bkietz deleted the range-for-helpers branch May 2, 2024 21:27
@paleolimbot paleolimbot added this to the nanoarrow 0.5.0 milestone May 22, 2024
paleolimbot added a commit that referenced this pull request Jun 21, 2024
In #404, the `_v` literal
was added, making it much more compact to define an `ArrowStringView`
and compare its equality to something else.

This PR refactors tests that used `EXPECT/ASSERT_EQ()` to be more
compact, and replaces `ArrowCharView()` with the string literal.

From reading
https://en.cppreference.com/w/cpp/string/basic_string_view/operator%22%22sv
and https://json.nlohmann.me/api/macros/json_use_global_udls/#notes it
seems like constraining the user-defined literal to a namespace is the
"modern" way to do this...I am not sure that `"_sv"` is the best choice
(maybe too close to `"sv"`?), but it did seem more descriptive than
`"_v` for those reading the tests without knowing that these literals
are defined.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add c++ range-for loop helpers
4 participants