[C++][Parquet] Minor: Declare lifetime of shared_ptr<Page> on PageReader #34722
Comments
@pitrou @wjones127 would you mind taking a look?
It might be easier to review if you could paste a link to the code here.
The interface:

// Abstract page iterator interface. This way, we can feed column pages to the
// ColumnReader through whatever mechanism we choose
class PARQUET_EXPORT PageReader {
  using DataPageFilter = std::function<bool(const DataPageStats&)>;

 public:
  virtual ~PageReader() = default;

  // @returns: shared_ptr<Page>(nullptr) on EOS, std::shared_ptr<Page>
  // containing new Page otherwise
  virtual std::shared_ptr<Page> NextPage() = 0;

The actual implementation:

// This subclass delimits pages appearing in a serialized stream, each preceded
// by a serialized Thrift format::PageHeader indicating the type of each page
// and the page metadata.
class SerializedPageReader : public PageReader {
 public:
  SerializedPageReader(std::shared_ptr<ArrowInputStream> stream, int64_t total_num_values,
                       Compression::type codec, const ReaderProperties& properties,
                       const CryptoContext* crypto_ctx, bool always_compressed)
      : properties_(properties),
        stream_(std::move(stream)),
        decompression_buffer_(AllocateBuffer(properties_.memory_pool(), 0)),
        decryption_buffer_(AllocateBuffer(properties_.memory_pool(), 0)) {}

It (implicitly) depends on
wjones127 added a commit that referenced this issue May 9, 2023
…35368)
### Rationale for this change
This change updates the wording for Parquet `NextPage`. It reuses the same decompression/decryption buffers internally, so users should be aware of this behavior.
### What changes are included in this PR?
No code changes; only comments are added.
### Are these changes tested?
No need.
### Are there any user-facing changes?
No.
* Closes: #34722
Lead-authored-by: mwish <maplewish117@gmail.com>
Co-authored-by: mwish <1506118561@qq.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
Signed-off-by: Will Jones <willjones127@gmail.com>
Describe the enhancement requested
In C++ Parquet, `SerializedPageReader` holds a `decryption_buffer_` and a `decompression_buffer_`, and each call to `NextPage` reuses those buffers. So a `Page`'s buffer data is bound to `SerializedPageReader::NextPage`: when `NextPage` is called again, the data of the previously returned page might be reset.
I have two problems here:
Component(s)
C++, Parquet
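To make the lifetime concern concrete, here is a minimal sketch of the buffer-reuse pattern described in this issue. It does not use the Parquet library: `ToyPage`, `ToyPageReader`, `CopyBytes`, and `FirstPageWasOverwritten` are hypothetical names made up for illustration, and the toy reader only assumes what the issue states, namely that one internal buffer is reused on every `NextPage` call. The internal buffer is sized once so that the old pointer stays valid and only its contents change; a real reader may also resize its buffers, which would be worse.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a page: a non-owning view into the reader's buffer.
struct ToyPage {
  const uint8_t* data;
  std::size_t size;
};

// Hypothetical reader that, like the behavior described in this issue,
// reuses a single internal buffer for every page it returns.
class ToyPageReader {
 public:
  explicit ToyPageReader(std::vector<std::string> pages) : pages_(std::move(pages)) {
    // Size the buffer once so the pointer handed out stays valid in this sketch.
    std::size_t max_size = 0;
    for (const std::string& p : pages_) max_size = std::max(max_size, p.size());
    buffer_.resize(max_size);
  }

  // Returns nullptr at end of stream. The returned page points into buffer_,
  // whose contents are overwritten by the next call.
  std::shared_ptr<ToyPage> NextPage() {
    if (next_ >= pages_.size()) return nullptr;
    const std::string& src = pages_[next_++];
    std::copy(src.begin(), src.end(), buffer_.begin());
    return std::make_shared<ToyPage>(ToyPage{buffer_.data(), src.size()});
  }

 private:
  std::vector<std::string> pages_;
  std::size_t next_ = 0;
  std::vector<uint8_t> buffer_;
};

// Copies the page bytes into an owning string. Doing this before the next
// NextPage() call is the safe way to keep a page's contents around.
std::string CopyBytes(const ToyPage& page) {
  return std::string(reinterpret_cast<const char*>(page.data), page.size);
}

// Returns true if the first page's view shows the second page's bytes
// after the reader has advanced.
bool FirstPageWasOverwritten() {
  std::vector<std::string> pages = {"AAAA", "BBBB"};
  ToyPageReader reader(pages);

  std::shared_ptr<ToyPage> first = reader.NextPage();
  std::string before = CopyBytes(*first);  // copied while still valid

  std::shared_ptr<ToyPage> second = reader.NextPage();
  std::string after = CopyBytes(*first);  // same view, new contents

  return before == "AAAA" && after == "BBBB" && second != nullptr;
}
```

Holding the `shared_ptr` keeps the page object alive but says nothing about the bytes it points to, which is why the contract is worth documenting on `NextPage`. A caller that needs the data past the next call would copy it first, as `CopyBytes` does in this sketch.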