fix: fix hash func in bloom filter implementation #2412

Merged: Snailclimb merged 1 commit into Snailclimb:main from codesssss:patch-1 on Jun 19, 2024

codesssss (Contributor) commented Jun 19, 2024

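Title

Fix the bitmap out-of-range overflow caused by the hash computation in the Bloom filter

Description

Problem:
In the existing Bloom filter implementation, the hash computation can push the bitmap (BitSet) out of range. Specifically, generated hash values exceed the bitmap's capacity (cap), which causes unexpected behavior in the filter's bitmap operations and eventually overflows the bitmap.

Fix:
Adjust the hash computation so that every generated hash value falls within the range of cap:

  1. Add parentheses to make the order of operations explicit, so the hash computation does not overflow.
  2. Add a bounds check: masking with cap - 1 keeps the hash value within the valid range.

Corrected code: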

public int hash(Object value) {
    int h;
    return (value == null) ? 0 : Math.abs(((cap - 1) & (seed * ((h = value.hashCode()) ^ (h >>> 16)))));
}
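
To see why the position of the mask matters, here is a small sketch. The PR text does not show the pre-fix line, so the `unmasked` form below is a hypothetical reconstruction, and the class name `HashMaskDemo`, the constant `CAP` (taken from the 33554432 size in the test output), and the sample input are assumptions for illustration. In Java, `*` binds tighter than `&`, so `seed * (cap - 1) & x` is parsed as `(seed * (cap - 1)) & x`, and the result is no longer limited to `cap - 1`.

```java
// Sketch only: compares a hypothetical pre-fix expression with the corrected one.
class HashMaskDemo {
    // Assumed capacity: 2 << 24 == 33554432, the BitSet size shown in the test output.
    static final int CAP = 2 << 24;

    // Fold the high 16 bits of hashCode into the low bits, as the corrected method does.
    static int spread(Object value) {
        int h = value.hashCode();
        return h ^ (h >>> 16);
    }

    // HYPOTHETICAL pre-fix shape: parsed as (seed * (CAP - 1)) & spread(value),
    // so the "mask" is seed * (CAP - 1) and may have bits set above CAP - 1.
    static int unmasked(int seed, Object value) {
        return Math.abs(seed * (CAP - 1) & spread(value));
    }

    // Corrected shape from the PR: multiply first, then mask with CAP - 1.
    static int masked(int seed, Object value) {
        return (CAP - 1) & (seed * spread(value));
    }

    public static void main(String[] args) {
        Object sample = 1 << 26; // boxed Integer; its hashCode is the int value itself
        System.out.println("unmasked = " + unmasked(3, sample)); // 67109888, which is >= CAP
        System.out.println("masked   = " + masked(3, sample));   // 3072, which is < CAP
    }
}
```

Because `cap - 1` is non-negative, the masked result is never negative, so the `Math.abs` call in the corrected method is harmless but is not what keeps the index in range.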

Test case and results:

  1. Test case:

    String value1 = "https://javaguide.cn/";
    String value2 = "https://github.com/Snailclimb";
    MyBloomFilter filter = new MyBloomFilter();
    System.out.println(filter.contains(value1)); // false
    System.out.println(filter.contains(value2)); // false
    filter.add(value1);
    filter.add(value2);
    System.out.println(filter.contains(value1)); // true
    System.out.println(filter.contains(value2)); // true
  2. Result before the fix:

    Size before set: 33554432
    Size after set: -1582569344
    Size before set: -1582569344
    Size after set: -1582569344
    
  3. Result after the fix:

    Size before set: 33554432
    Size after set: 33554432
    Size before set: 33554432
    Size after set: 33554432
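
The logging code behind the "Size before set" and "Size after set" lines is not shown in the PR. Assuming those values come from `java.util.BitSet.size()`, the sketch below illustrates the underlying behavior: setting a bit inside the allocated range leaves the size unchanged, while setting a bit beyond it makes the BitSet grow. The class and method names are made up for this example.

```java
import java.util.BitSet;

// Sketch only: shows how BitSet.size() reacts to in-range and out-of-range indices.
class BitSetGrowthDemo {
    // Returns {size before set, size after set} for a single bit index.
    static int[] sizesAround(int initialBits, int index) {
        BitSet bits = new BitSet(initialBits);
        int before = bits.size();
        bits.set(index); // the BitSet expands silently if index >= current size
        return new int[]{before, bits.size()};
    }

    public static void main(String[] args) {
        int[] inRange = sizesAround(64, 10);
        int[] outOfRange = sizesAround(64, 1000);
        System.out.println("in range:     " + inRange[0] + " -> " + inRange[1]);
        System.out.println("out of range: " + outOfRange[0] + " -> " + outOfRange[1]);
    }
}
```

This is why a hash value larger than cap does not fail immediately: the BitSet simply reallocates, and the reported size changes.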
    

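With the corrected hash computation, hash values always stay within the range of cap, which avoids the bitmap out-of-range overflow. The fixed Bloom filter behaves normally, and all test cases produce the expected results.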
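
For readers who want to reproduce the check, here is a minimal sketch of a Bloom filter built around the corrected hash. The source of `MyBloomFilter` is not part of this PR text, so the class below (`SketchBloomFilter`), its seed values, and its default size are assumptions rather than the repository's actual code.

```java
import java.util.BitSet;

// Sketch only: a small Bloom filter using the masked hash from this PR.
class SketchBloomFilter {
    // Assumed capacity: 2 << 24 == 33554432 bits, matching the size in the test output.
    static final int DEFAULT_SIZE = 2 << 24;
    // Assumed seeds; each one acts as a separate hash function.
    private static final int[] SEEDS = {3, 13, 46, 71, 91, 134};

    private final BitSet bits = new BitSet(DEFAULT_SIZE);

    // Corrected hash shape: multiply by the seed first, then mask into [0, DEFAULT_SIZE).
    static int hash(int seed, Object value) {
        if (value == null) {
            return 0;
        }
        int h = value.hashCode();
        return (DEFAULT_SIZE - 1) & (seed * (h ^ (h >>> 16)));
    }

    void add(Object value) {
        for (int seed : SEEDS) {
            bits.set(hash(seed, value));
        }
    }

    // May return a false positive, never a false negative.
    boolean contains(Object value) {
        for (int seed : SEEDS) {
            if (!bits.get(hash(seed, value))) {
                return false;
            }
        }
        return true;
    }

    int size() {
        return bits.size();
    }

    public static void main(String[] args) {
        SketchBloomFilter filter = new SketchBloomFilter();
        String value1 = "https://javaguide.cn/";
        System.out.println("Size before set: " + filter.size());
        filter.add(value1);
        System.out.println("Size after set: " + filter.size());
        System.out.println(filter.contains(value1));
    }
}
```

Since every index is masked into the allocated range, the BitSet never needs to grow, so the size printed before and after `add` is the same.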

Related issue: #2413

Snailclimb (Owner) commented

Thanks for the improvement 👍

@Snailclimb Snailclimb merged commit 3c92443 into Snailclimb:main Jun 19, 2024
@codesssss codesssss deleted the patch-1 branch June 20, 2024 00:45