[BUG] ORC writer hit cudaErrorInvalidValue exception with ZSTD compression #14932

Closed
ttnghia opened this issue Jan 30, 2024 · 5 comments · Fixed by #14947
Labels: bug, cuIO

Comments

ttnghia commented Jan 30, 2024

After making some changes to the current ORC unit tests, I got a CUDA exception:

unknown file: Failure
C++ exception with description "CUDA error encountered at: ../../src/io/orc/writer_impl.cu:1591: 1
 cudaErrorInvalidValue invalid argument" thrown in the test body.
[  FAILED  ] OrcWriterNumericTypeTest/0.SingleColumn, where TypeParam = signed char (382 ms)

The changes are below:

diff --git a/cpp/tests/io/orc_test.cpp b/cpp/tests/io/orc_test.cpp
index 2ae6edc6c7..a8bfed6400 100644
--- a/cpp/tests/io/orc_test.cpp
+++ b/cpp/tests/io/orc_test.cpp
@@ -224,16 +224,17 @@ struct SkipRowTest {
 
 TYPED_TEST(OrcWriterNumericTypeTest, SingleColumn)
 {
-  auto sequence = cudf::detail::make_counting_transform_iterator(0, [](auto i) { return i; });
+  auto sequence = cudf::detail::make_counting_transform_iterator(0, [](auto i) { return i % 100; });
 
-  constexpr auto num_rows = 100;
+  constexpr auto num_rows = 1000000;
   column_wrapper<TypeParam, typename decltype(sequence)::value_type> col(sequence,
                                                                          sequence + num_rows);
   table_view expected({col});
 
   auto filepath = temp_env->get_temp_filepath("OrcSingleColumn.orc");
   cudf::io::orc_writer_options out_opts =
-    cudf::io::orc_writer_options::builder(cudf::io::sink_info{filepath}, expected);
+    cudf::io::orc_writer_options::builder(cudf::io::sink_info{filepath}, expected)
+      .compression(cudf::io::compression_type::ZSTD);
   cudf::io::write_orc(out_opts);
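
For reference, the input that the modified test produces can be sketched in plain C++ without cudf. `make_test_values` is a hypothetical helper introduced here only for illustration; it is not part of cudf or of the test file. With `i % 100`, the one million rows cycle through 0..99, all of which fit in `signed char` (the failing `TypeParam`). That the modulo was added for this reason is an assumption, not something stated in the report.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not a cudf API): builds the values the modified test
// feeds to the writer, i.e. i % 100 for each of num_rows rows.
template <typename T>
std::vector<T> make_test_values(std::size_t num_rows)
{
  // Every value is in [0, 99], so it must be representable in T.
  static_assert(std::numeric_limits<T>::max() >= 99, "i % 100 must fit in T");

  std::vector<T> values(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) {
    values[i] = static_cast<T>(i % 100);
  }
  return values;
}
```

The resulting column is large and highly repetitive, unlike the original 100-row sequence.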

ttnghia added the bug, Needs Triage, and cuIO labels Jan 30, 2024

ttnghia commented Jan 30, 2024

CC @vuule.

ttnghia commented Jan 30, 2024

Note that the exception is thrown with ZSTD compression but not with SNAPPY. I didn't test the other compression types.
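
One way to cover the untested compression types is to run the same write once per type and record which ones throw. The following is a minimal standalone sketch: the `compression` enum is a stand-in for `cudf::io::compression_type`, and `sweep_compressions` and `sweep_result` are hypothetical names introduced here, not cudf APIs. It assumes the writer reports failures as exceptions derived from `std::exception`, which the log above suggests.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for cudf::io::compression_type; only the names matter in this sketch.
enum class compression { NONE, SNAPPY, ZLIB, ZSTD };

// Outcome of one write attempt (hypothetical type, for illustration only).
struct sweep_result {
  compression type;
  bool ok;
  std::string error;  // empty when ok is true
};

// Calls write_fn once per compression type and records whether it threw.
std::vector<sweep_result> sweep_compressions(
  std::vector<compression> const& types,
  std::function<void(compression)> const& write_fn)
{
  std::vector<sweep_result> results;
  for (auto const type : types) {
    try {
      write_fn(type);
      results.push_back({type, true, ""});
    } catch (std::exception const& e) {
      results.push_back({type, false, e.what()});
    }
  }
  return results;
}
```

In the actual test, `write_fn` would build the writer options with `.compression(...)` set to the corresponding type and call `cudf::io::write_orc`.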

ttnghia commented Jan 30, 2024

I also ran the same roundtrip example with the Parquet writer/reader without any issues, so the problem is likely in the ORC writer itself.

  cudf::io::parquet_writer_options args =
    cudf::io::parquet_writer_options::builder(cudf::io::sink_info{filepath}, expected)
      .compression(cudf::io::compression_type::ZSTD);
  cudf::io::write_parquet(args);

  cudf::io::parquet_reader_options read_opts =
    cudf::io::parquet_reader_options::builder(cudf::io::source_info{filepath});
  auto result = cudf::io::read_parquet(read_opts);

  CUDF_TEST_EXPECT_TABLES_EQUAL(expected, result.tbl->view());

ttnghia changed the title from "[BUG] ORC writer hit cudaErrorInvalidValue exception" to "[BUG] ORC writer hit cudaErrorInvalidValue exception with ZSTD compression" Jan 30, 2024

vuule commented Jan 30, 2024

Thank you for filing the issue. I'll try to get a local repro today.

vuule commented Jan 31, 2024

Got a local repro; will look into this tomorrow.

vuule self-assigned this Jan 31, 2024
bdice removed the Needs Triage label Mar 4, 2024