Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ORC bloom filter support #914

Merged
merged 3 commits into from Jun 12, 2019

Conversation

3 participants
@dain
Copy link
Member

commented Jun 5, 2019

No description provided.

@cla-bot cla-bot bot added the cla-signed label Jun 5, 2019

@martint

martint approved these changes Jun 5, 2019

Copy link
Member

left a comment

Just a couple of minor comments.

falsePositive++;
}
}
if (falsePositive != 0 && 1.0 * falsePositive / 100_000_000 > 0.5) {

This comment has been minimized.

Copy link
@martint

martint Jun 5, 2019

Member

This should be 100_000, too. Maybe pull it into a variable and reuse to avoid future discrepancies.

private static OrcRecordReader createCustomOrcRecordReader(TempFile tempFile, OrcPredicate predicate, Type type, int initialBatchSize)
throws IOException
{
OrcDataSource orcDataSource = new FileOrcDataSource(tempFile.getFile(), new DataSize(1, MEGABYTE), new DataSize(1, MEGABYTE), new DataSize(1, MEGABYTE), true);

This comment has been minimized.

Copy link
@martint

martint Jun 5, 2019

Member

Add a version with the bloom filter flag set to false so we can show that rows were filtered because of the bloom filter.

@@ -382,16 +386,22 @@ static boolean isIndexStream(Stream stream)
private Map<Integer, List<BloomFilter>> readBloomFilterIndexes(Map<StreamId, Stream> streams, Map<StreamId, OrcChunkLoader> streamsData)
throws IOException
{
ImmutableMap.Builder<Integer, List<BloomFilter>> bloomFilters = ImmutableMap.builder();
HashMap<Integer, List<BloomFilter>> bloomFilters = new HashMap<>();

This comment has been minimized.

Copy link
@martint

martint Jun 5, 2019

Member

Use Map for the declaration

This comment has been minimized.

Copy link
@findepi

findepi Jun 5, 2019

Member

btw. this looks like Multimap<Integer, BloomFilter>

@Praveen2112 Praveen2112 referenced this pull request Jun 5, 2019

Closed

Fix ORC Bloom Filter #921

@dain dain referenced this pull request Jun 5, 2019

Merged

Fix ORC Bloom Filter #921 #923

@dain dain force-pushed the dain:fix-orc-bloom-filter branch 2 times, most recently from d9f7f50 to 3f8f4f6 Jun 7, 2019

@dain dain force-pushed the dain:fix-orc-bloom-filter branch from 3f8f4f6 to 4f92ae8 Jun 7, 2019

@dain dain merged commit b47ddb9 into prestosql:master Jun 12, 2019

2 checks passed

Travis CI - Pull Request Build Passed
Details
verification/cla-signed
Details

@dain dain deleted the dain:fix-orc-bloom-filter branch Jun 12, 2019

@dain dain added this to the 315 milestone Jun 12, 2019

@dain dain referenced this pull request Jun 12, 2019

Closed

Release notes for 315 #948

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.