Skip to content

Commit

Permalink
Explicitly cast char to signed char in Hash()
Browse files Browse the repository at this point in the history
Summary:
The compilers we use treat char as signed. However, this is not guarantee of C standard and some compilers (for ARM platform for example), treat char as unsigned. Code that assumes that char is either signed or unsigned is wrong.

This change explicitly casts the char to signed version. This will not break any of our use cases on x86, which, I believe are all of them. In case somebody out there is using RocksDB on ARM AND using bloom filters, they're going to have a bad time. However, it is very unlikely that this is the case.

Test Plan: sanity test with previous commit (with new sanity test)

Reviewers: yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D22767
  • Loading branch information
igorcanadi committed Sep 9, 2014
1 parent 5231146 commit 88841bd
Show file tree
Hide file tree
Showing 4 changed files with 43 additions and 20 deletions.
2 changes: 2 additions & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
# Rocksdb Change Log

## Unreleased (will be released with 3.6)
### Disk format changes
* If you're using RocksDB on ARM platforms and you're using default bloom filter, there is a disk format change you need to be aware of. There are three steps you need to do when you convert to new release: 1. turn off filter policy, 2. compact the whole database, 3. turn on filter policy

### Behavior changes
* We have refactored our system of stalling writes. Any stall-related statistics' meanings are changed. Instead of per-write stall counts, we now count stalls per-epoch, where epochs are periods between flushes and compactions. You'll find more information in our Tuning Perf Guide once we release RocksDB 3.6.
Expand Down
10 changes: 10 additions & 0 deletions tools/auto_sanity_test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -37,13 +37,23 @@ echo "Running db sanity check with commits $commit_new and $commit_old."

echo "============================================================="
echo "Making build $commit_new"
git checkout $commit_new
if [ $? -ne 0 ]; then
echo "[ERROR] Can't checkout $commit_new"
exit 1
fi
makestuff
mv db_sanity_test new_db_sanity_test
echo "Creating db based on the new commit --- $commit_new"
./new_db_sanity_test $dir_new create

echo "============================================================="
echo "Making build $commit_old"
git checkout $commit_old
if [ $? -ne 0 ]; then
echo "[ERROR] Can't checkout $commit_old"
exit 1
fi
makestuff
mv db_sanity_test old_db_sanity_test
echo "Creating db based on the old commit --- $commit_old"
Expand Down
29 changes: 14 additions & 15 deletions tools/db_sanity_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,15 @@
#include <vector>
#include <memory>

#include "include/rocksdb/db.h"
#include "include/rocksdb/options.h"
#include "include/rocksdb/env.h"
#include "include/rocksdb/slice.h"
#include "include/rocksdb/status.h"
#include "include/rocksdb/comparator.h"
#include "include/rocksdb/table.h"
#include "include/rocksdb/slice_transform.h"
#include "include/rocksdb/filter_policy.h"
#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/env.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "rocksdb/comparator.h"
#include "rocksdb/table.h"
#include "rocksdb/slice_transform.h"
#include "rocksdb/filter_policy.h"

namespace rocksdb {

Expand Down Expand Up @@ -50,7 +50,7 @@ class SanityTest {
return s;
}
}
return Status::OK();
return db->Flush(FlushOptions());
}
Status Verify() {
DB* db;
Expand Down Expand Up @@ -149,18 +149,17 @@ class SanityTestPlainTableFactory : public SanityTest {

class SanityTestBloomFilter : public SanityTest {
public:
explicit SanityTestBloomFilter(const std::string& path)
: SanityTest(path) {
table_options_.filter_policy.reset(NewBloomFilterPolicy(10));
options_.table_factory.reset(NewBlockBasedTableFactory(table_options_));
explicit SanityTestBloomFilter(const std::string& path) : SanityTest(path) {
BlockBasedTableOptions table_options;
table_options.filter_policy.reset(NewBloomFilterPolicy(10));
options_.table_factory.reset(NewBlockBasedTableFactory(table_options));
}
~SanityTestBloomFilter() {}
virtual Options GetOptions() const { return options_; }
virtual std::string Name() const { return "BloomFilter"; }

private:
Options options_;
BlockBasedTableOptions table_options_;
};

namespace {
Expand Down
22 changes: 17 additions & 5 deletions util/hash.cc
Original file line number Diff line number Diff line change
Expand Up @@ -31,14 +31,26 @@ uint32_t Hash(const char* data, size_t n, uint32_t seed) {

// Pick up remaining bytes
switch (limit - data) {
// Note: It would be better if this was cast to unsigned char, but that
// would be a disk format change since we previously didn't have any cast
// at all (so gcc used signed char).
// To understand the difference between shifting unsigned and signed chars,
// let's use 250 as an example. unsigned char will be 250, while signed char
// will be -6. Bit-wise, they are equivalent: 11111010. However, when
// converting negative number (signed char) to int, it will be converted
// into negative int (of equivalent value, which is -6), while converting
// positive number (unsigned char) will be converted to 250. Bitwise,
// this looks like this:
// signed char 11111010 -> int 11111111111111111111111111111010
// unsigned char 11111010 -> int 00000000000000000000000011111010
case 3:
h += data[2] << 16;
// fall through
h += static_cast<uint32_t>(static_cast<signed char>(data[2]) << 16);
// fall through
case 2:
h += data[1] << 8;
// fall through
h += static_cast<uint32_t>(static_cast<signed char>(data[1]) << 8);
// fall through
case 1:
h += data[0];
h += static_cast<uint32_t>(static_cast<signed char>(data[0]));
h *= m;
h ^= (h >> r);
break;
Expand Down

0 comments on commit 88841bd

Please sign in to comment.