New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(file cache): introduce hummock file cache system prototype #3556
Conversation
Codecov Report
@@ Coverage Diff @@
## main #3556 +/- ##
==========================================
+ Coverage 73.91% 74.02% +0.10%
==========================================
Files 810 816 +6
Lines 114457 115130 +673
==========================================
+ Hits 84606 85227 +621
- Misses 29851 29903 +52
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 890 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
887 | 2 | 1 | 0 |
Click to see the invalid file list
- src/storage/src/hummock/file_cache/filter.rs
- src/storage/src/hummock/file_cache/test_utils.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 890 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
888 | 1 | 1 | 0 |
Click to see the invalid file list
- src/storage/src/hummock/file_cache/meta.rs
pub dir: String, | ||
pub capacity: usize, | ||
pub cache_file_fallocate_unit: usize, | ||
pub filters: Vec<Arc<dyn Filter>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the filters
used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for user to customize some rule to judge if the key can be inserted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline. LGTM.
@Little-Wallace @wenym1 PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest generally LGTM
src/common/src/cache.rs
Outdated
@@ -305,6 +305,19 @@ impl<K: LruKey, T: LruValue> LruHandleTable<K, T> { | |||
assert_eq!(count, self.elems); | |||
self.list = new_list; | |||
} | |||
|
|||
unsafe fn fill_with<F>(&self, f: &mut F) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name fill_with
is kind of confusing. IIUC, it's more like for_all
?
// See the License for the specific language governing permissions and | ||
// limitations under the License. | ||
|
||
macro_rules! for_all_unsigned_type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need these macros to generate code for all the unsigned types.
We can first define a trait with name like UnsignedType
with definition as
trait UnsignedType: Sub, BitWiseAnd ... {}
. You may refer to https://doc.rust-lang.org/std/ops/index.html for corresponding traits of the operators you use.
And then you impl UnsignedType for
all the listed unsigned type.
In the end, you can define your utils function as, for example
pub fn align_up<T: UnsignedType>(align: T, v: T) -> T {
debug_assert_pow2(align);
(v + align - 1) & !(align - 1)
}
In this way, when you call the method, you don't have to always specify the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, actually I tried it before. But I found some problems:
- With operation like
(v + aligned - 1) & (aligned - 1) == 0
, how to treat0
and1
asU
without extra cost? - Sometimes the compiler cannot correctly infer the type of
U::Output
asU
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved! Thanks a lot for your help! ❤️
// Avoid allocate a new buffer. | ||
self.buffer.swap(); | ||
} else { | ||
let slots = self.store.insert(&batch).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something wrong happen during inserting into the store, the method run
will return and will not run again? I think we may need more careful failure handling. Same for all the ?
used in the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is mostly caused by IO error, which is rare and hard to handle (and most error cannot be recovered without a restart, refer to RocksDB io error handling framework).
For now, I prefer directly raise the error. We can introduce an error handling framework later.
block_size: usize, | ||
buffer_capacity: usize, | ||
|
||
mf: Arc<RwLock<MetaFile<K>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use clearer naming instead of abbreviation for mf
and cf
.
} | ||
|
||
#[inline(always)] | ||
pub fn assert_pow2<U: Unsigned>(v: U) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we use
assert(is_pow2(v))
to replace it ? avoid that whenever you change the logic of is_pow2(maybe), but forget to change the assert logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both are okay to me.
pub fn assert_aligned<U: Unsigned>(align: U, v: U) { | ||
debug_assert_pow2(align); | ||
assert_eq!( | ||
v & (align - U::from(1)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can use the same logic for is_aligned to replace it like last comment
/// Size of `st_blocks` with `fstat(2)`. | ||
const ST_BLOCK_SIZE: usize = 512; | ||
|
||
const LRU_SHARD_BITS: usize = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Do we need to make it configurable for LRU_SHARD_BITS ?
-
how about use DIO_BUFFER_ALLOCATOR to simply the define of DioBuffer
static DIO_BUFFER_ALLOCATOR: _ = alloc::AlignedAllocator::<LOGICAL_BLOCK_SIZE>;
type DioBuffer = Vec<u8, &'static DIO_BUFFER_ALLOCATOR>;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Do we need to make it configurable for LRU_SHARD_BITS ?
I think it's okey to hardcode it. I've benched sharded hash table before in my side project. And 32 / 64 shards are reasonable parameter for most cases.
- how about use DIO_BUFFER_ALLOCATOR to simply the define of DioBuffer
It's a static variable, not a type.
Most issues solved, others commented. PTAL @wenym1 🥰 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
// TODO(MrCroxx): Graceful shutdown. | ||
let _handle = tokio::task::spawn(async move { | ||
if let Err(e) = buffer_flusher.run().await { | ||
tracing::error!("error raised within file cache buffer flusher: {}", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the buffer flusher stops with error, should we panic here instead of only printing a log? As you mentioned above, when the buffer flusher stops with error, something unrecoverable has happened on disk. In this case, I think panic can warn us earlier about the accident and otherwise the error log is likely to be buried.
…ngwavelabs#3556) * intorduce filter * feat(file cache): implement memory mapped cache meta file, refactor cache data file * fix license header * fix test * refactor mmapped meta file, discard fixed capacity, use growable size * fix license header * impl meta file slot free * impl basic write/read path * tiny fix * impl basic write/read path * extract buffer mod * add flush hook for test or other usages * add some uts * make clippy happy, fix ut * fix bugs, support custom hasher for test * remove ut output * remove erase listener, it is not correct when concurrent deleting and dropping * replace macro with trait in utils, rename fill_with * fix typo and rename some fields * tiny modify * panic if flusher raise an error Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR implement the most parts of the Hummock File Cache System.
There is a brief design doc on Quip.
State:
This PR crosses several experiments and the design changed several times based on the experiments. So the PR volume is a little large. 🙏🏻
Checklist
./risedev check
(or alias,./risedev c
)Refer to a related PR or issue link (optional)
#198