Write bloom filters #213

Open
ozgrakkurt opened this issue Dec 30, 2022 · 1 comment
@ozgrakkurt

PR #99 mentions that this requires big changes, but it seems like an important feature to implement for performance. How doable is it in the current state of the library? I would like to work on it if possible.

ozgrakkurt commented Feb 25, 2023

Hey! @jorgecarleitao, can you give guidance on this? I started working on it. What I came up with is something like this:

/// Creates a bloom filter from the bitset and writes it into the `writer`.
pub fn write<W: Write + Seek>(
    column_metadata: &mut ColumnChunkMetaData,
    writer: &mut W,
    bitset: &[u8],
) -> Result<(), Error> {
    // create bloom filter header
    // create a TCompactOutputProtocol over the writer to serialize the header
    // write the offset to column_metadata
    // write the bloom filter to the writer
    todo!() // skeleton only: the steps above are not implemented yet
}

Does it look correct?
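For reference, the ordering of those steps can be sketched with only the standard library. This is a rough illustration, not this library's API: `ColumnMetaSketch` and `write_bloom_filter` are hypothetical names, the header is assumed to arrive already serialized as bytes, and the `bloom_filter_offset` field name is borrowed from the Parquet Thrift definition of column metadata as I understand it.

```rust
use std::io::{self, Seek, Write};

/// Hypothetical stand-in for the one piece of column metadata this sketch touches.
#[derive(Debug, Default)]
pub struct ColumnMetaSketch {
    /// File offset at which the bloom filter header starts, if one was written.
    pub bloom_filter_offset: Option<i64>,
}

/// Writes `header` followed by `bitset` and records where the header starts.
/// `header` is assumed to be an already-serialized bloom filter header.
pub fn write_bloom_filter<W: Write + Seek>(
    meta: &mut ColumnMetaSketch,
    writer: &mut W,
    header: &[u8],
    bitset: &[u8],
) -> io::Result<()> {
    // Capture the position before writing so a reader can seek straight to the header.
    let offset = writer.stream_position()?;
    // The cast assumes the file is smaller than i64::MAX bytes.
    meta.bloom_filter_offset = Some(offset as i64);
    // The header goes first, immediately followed by the raw bitset.
    writer.write_all(header)?;
    writer.write_all(bitset)?;
    Ok(())
}
```

Recording the offset before the write is the main design point here: the metadata has to point at the start of the header, and the writer's position is only known to be correct at that moment.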

Edit: actually, I found that it should be something like this:

/// Creates a bloom filter from the bitset and writes it into the `writer`.
pub fn write(
    protocol: &mut TCompactOutputProtocol,
    bitset: &[u8],
) -> Result<(), Error> {
    // create bloom filter header
    // serialize the header through the TCompactOutputProtocol
    // write the offset to column_metadata
    // write the bloom filter to the protocol
    todo!() // skeleton only: the steps above are not implemented yet
}
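To make the "create bloom filter header" step concrete, here is a hand-rolled sketch of the bytes a compact output protocol would be expected to emit for the header. It assumes the `BloomFilterHeader` layout from the Parquet Thrift definition as I understand it (field 1 `numBytes: i32`, then three single-variant unions for algorithm, hash, and compression, each wrapping an empty struct) and the standard Thrift compact encoding. `encode_header` and `write_varint` are hypothetical helpers; real code would go through the generated Thrift types instead.

```rust
/// Appends `value` as an unsigned LEB128 varint, as used by the Thrift compact protocol.
fn write_varint(out: &mut Vec<u8>, mut value: u32) {
    while value >= 0x80 {
        // Low seven bits with the continuation bit set.
        out.push(((value & 0x7f) as u8) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Encodes a bloom filter header for an uncompressed split-block filter
/// hashed with xxHash, under the layout assumptions stated above.
pub fn encode_header(num_bytes: i32) -> Vec<u8> {
    const I32_FIELD: u8 = 0x15; // field id delta 1, compact type 5 (i32)
    const STRUCT_FIELD: u8 = 0x1c; // field id delta 1, compact type 12 (struct)
    const STOP: u8 = 0x00; // end-of-struct marker

    let mut out = vec![I32_FIELD];
    // i32 values are zigzag-encoded before being written as a varint.
    let zigzag = ((num_bytes << 1) ^ (num_bytes >> 31)) as u32;
    write_varint(&mut out, zigzag);

    // Fields 2, 3 and 4: each is a union whose field 1 is an empty struct,
    // so each contributes: union field, inner field, inner stop, union stop.
    for _ in 0..3 {
        out.extend_from_slice(&[STRUCT_FIELD, STRUCT_FIELD, STOP, STOP]);
    }
    out.push(STOP); // end of the header struct
    out
}
```

Under these assumptions the header is only a handful of bytes, and the `bitset` follows it directly in the file.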
