Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][Parquet] Debug Build might fail in DictEncoder dtor checking #38117

Closed
mapleFU opened this issue Oct 7, 2023 · 1 comment · Fixed by #38118
Closed

[C++][Parquet] Debug Build might fail in DictEncoder dtor checking #38117

mapleFU opened this issue Oct 7, 2023 · 1 comment · Fixed by #38118

Comments

@mapleFU
Copy link
Member

mapleFU commented Oct 7, 2023

Describe the enhancement requested

This is the case described in #35520 . But I didn't find it when previous meet the bug because I just use release build.

  void Close() override {
    if (!closed_) {
      closed_ = true;
      CheckRowsWritten();

      // Avoid invalid state if ColumnWriter::Close() throws internally.
      auto column_writers = std::move(column_writers_);
      for (size_t i = 0; i < column_writers.size(); i++) {
        if (column_writers[i]) {
          total_bytes_written_ += column_writers[i]->Close();
          total_compressed_bytes_written_ +=
              column_writers[i]->total_compressed_bytes_written();
        }
      }

      // Ensures all columns have been written
      metadata_->set_num_rows(num_rows_);
      metadata_->Finish(total_bytes_written_, row_group_ordinal_);
    }
  }

When close file with 10 columns, assume column writer 2 has exception on Close, it throw an exception. and the third one is a DictEncoder, so it would DCHECK failed:

  ~DictEncoderImpl() override { DCHECK(buffered_indices_.empty()); }

Component(s)

C++, Parquet

@mapleFU
Copy link
Member Author

mapleFU commented Oct 7, 2023

@pitrou I've draft a basic fixing for this #38118

I change DCHECK to ARROW_LOG(WARNING). This might not the best, because it might cause user forget to flush the DictEncoder. Another way is provide an abort.

      // Avoid invalid state if ColumnWriter::Close() throws internally.
      auto column_writers = std::move(column_writers_);
      for (size_t i = 0; i < column_writers.size(); i++) {
        if (column_writers[i]) {
           column_writers[i]->abort(); // abort might not throw exception.
        }
      }
      for (size_t i = 0; i < column_writers.size(); i++) {
        if (column_writers[i]) {
          total_bytes_written_ += column_writers[i]->Close();
          ...
        }
      }

And change DCHECK to:

DCHECK(aborted_ || !buffered_indices_.empty());

Any idea is ok for me :-)

pitrou pushed a commit that referenced this issue Oct 19, 2023
…log (#38118)

### Rationale for this change

This is a minor change, just change `DCHECK` to warning log.

### What changes are included in this PR?

change `DCHECK` to warning log.

### Are these changes tested?

no, I don't know how to test this, any idea is welcome.

### Are there any user-facing changes?

no

* Closes: #38117

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
@pitrou pitrou added this to the 15.0.0 milestone Oct 19, 2023
JerAguilon pushed a commit to JerAguilon/arrow that referenced this issue Oct 23, 2023
…rning log (apache#38118)

### Rationale for this change

This is a minor change, just change `DCHECK` to warning log.

### What changes are included in this PR?

change `DCHECK` to warning log.

### Are these changes tested?

no, I don't know how to test this, any idea is welcome.

### Are there any user-facing changes?

no

* Closes: apache#38117

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…rning log (apache#38118)

### Rationale for this change

This is a minor change, just change `DCHECK` to warning log.

### What changes are included in this PR?

change `DCHECK` to warning log.

### Are these changes tested?

no, I don't know how to test this, any idea is welcome.

### Are there any user-facing changes?

no

* Closes: apache#38117

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…rning log (apache#38118)

### Rationale for this change

This is a minor change, just change `DCHECK` to warning log.

### What changes are included in this PR?

change `DCHECK` to warning log.

### Are these changes tested?

no, I don't know how to test this, any idea is welcome.

### Are there any user-facing changes?

no

* Closes: apache#38117

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants