Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curvefs/client: improve metadata performance. #2387

Merged
merged 1 commit into from Jun 1, 2023

Conversation

Wine93
Copy link
Contributor

@Wine93 Wine93 commented Apr 11, 2023

No description provided.

@Wine93 Wine93 force-pushed the improve/perf_meta branch 5 times, most recently from 38f9fdd to 76d2633 Compare April 16, 2023 14:09
@Wine93 Wine93 force-pushed the improve/perf_meta branch 7 times, most recently from 3edc197 to 2a864cf Compare April 25, 2023 09:45
@Wine93
Copy link
Contributor Author

Wine93 commented Apr 25, 2023

cicheck

playground:
@bash util/playground.sh

check:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does 'make check' do?

Copy link
Contributor Author

@Wine93 Wine93 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does 'make check' do?

It does what cpplint does, see util.check.sh.

fs.maxNameLength=255
fs.disableXattr=true
fs.accessLogging=true
fs.kernelCache.attrTimeout=3600
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add unit to timeout conf item

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add unit to timeout conf item

fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the latest code not pushed up?
fs.kernelCache.attrTimeoutSec or fs.kernelCache.attrTimeoutMs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the latest code not pushed up? fs.kernelCache.attrTimeoutSec or fs.kernelCache.attrTimeoutMs?

fs.kernelCache.attrTimeoutSec.

fs.kernelCache.entryTimeout=3600
fs.kernelCache.dirEntryTimeout=3600
fs.lookupCache.negativeTimeout=0
fs.lookupCache.lruSize=100000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit if **Size is byte?

Copy link
Contributor Author

@Wine93 Wine93 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit if **Size is byte?

The number of directory entries.

@@ -418,8 +419,6 @@ CURVEFS_ERROR RenameOperator::UpdateInodeCtime() {
}

void RenameOperator::UpdateCache() {
dentryManager_->DeleteCache(parentId_, name_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this step unnecessary?

Copy link
Contributor Author

@Wine93 Wine93 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this step unnecessary?

yep, because:

  • it works well if you only wants to show files under directory, like ls -l does.
  • we control all cache consistent by cache timeout, you can adjust cache timeout for some specified cases if the attribute is sensitive.


// { filesystem option
struct KernelCacheOption {
uint32_t entryTimeout;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add unit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add unit

fixed.

ino, name, size, flags, StrErr(rc));
});

if (IsSpecialReq(name)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 'IsWarmupReq' is better, some other special requests may support later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 'IsWarmupReq' is better, some other special requests may support later.

fixed.

@@ -130,8 +102,6 @@ class DentryCacheManagerImpl : public DentryCacheManager {

private:
std::shared_ptr<MetaServerClient> metaClient_;
// key is parentId + name
std::shared_ptr<TimedLRUCache<std::string, Dentry>> dCache_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the dCache is needn't? where the dentry will be cached?

Copy link
Contributor Author

@Wine93 Wine93 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the dCache is needn't? where the dentry will be cached?

All dir entries will be cached in kernel.

lru_->Put(parent, entries);
nentries_ += entries->Size();

LOG(INFO) << "Insert directory cache: parent = " << parent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log level is suitable?

Copy link
Contributor Author

@Wine93 Wine93 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log level is suitable?

Maybe VLOG(1) is better.

@@ -0,0 +1,2 @@
container_name: curve-build-playground-master
container_image: opencurvedocker/curve-base:build-debian9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this new file for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this new file for?

The manifest for build.

fs.cto=true
fs.maxNameLength=255
fs.disableXattr=true
fs.accessLogging=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

fs.maxNameLength=255
fs.disableXattr=true
fs.accessLogging=true
fs.kernelCache.attrTimeout=3600
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the latest code not pushed up?
fs.kernelCache.attrTimeoutSec or fs.kernelCache.attrTimeoutMs?

&o->dirAttrTimeout);
c->GetValueFatalIfFail("fs.kernelCache.entryTimeout",
&o->entryTimeout);
c->GetValueFatalIfFail("fs.kernelCache.dirEntryTimeout",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if fs.kernelCache.entryTimeout=true, fs.kernelCache.dirEntryTimeout=false, dir entry will not cache, but file entry will be cached?

if fs.kernelCache.entryTimeout=false, fs.kernelCache.dirEntryTimeout=false, only dir entry cached?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the diffrence between fs.kernelCache.dirEntryTimeout and fs.kernelCache.entryTimeout?

Copy link
Contributor Author

@Wine93 Wine93 May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if fs.kernelCache.entryTimeout=true, fs.kernelCache.dirEntryTimeout=false, dir entry will not cache, but file entry will be cached?

Yep.

if fs.kernelCache.entryTimeout=false, fs.kernelCache.dirEntryTimeout=false, only dir entry cached?

All entries will not cached.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the diffrence between fs.kernelCache.dirEntryTimeout and fs.kernelCache.entryTimeout?

This designed for assgin different timeout for different types of entris. If the entry is file, we will reply to kernel with timeout fs.kernelCache.entryTimeout, otherwise reply with fs.kernelCache.dirEntryTimeout.

void FileSystem::SetEntryTimeout(EntryOut* entryOut) {
    auto option = option_.kernelCacheOption;
    if (IsDir(entryOut->attr)) {
        entryOut->entryTimeout = option.dirEntryTimeout;
        entryOut->attrTimeout = option.dirAttrTimeout;
    } else {
        entryOut->entryTimeout = option.entryTimeout;
        entryOut->attrTimeout = option.attrTimeout;
    }
}

# spdlog
http_archive(
name = "spdlog",
urls = ["https://github.com/gabime/spdlog/archive/refs/tags/v1.11.0.tar.gz"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a new logging component needs to be introduced?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a new logging component needs to be introduced?

Why we need spdlog:

  • it faster than glog
  • we need a separate file to output access log. if we use glog, it may not be so easy to achieve
  • we hope the access log have no performance impact

+----------------+
```

(1) kernel cache timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

| Kernel Cache | (L1 Cache)
+-------------------------------------------------------+
^ (update length)
|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This figure does not reflect which layer the curve-fuse is on

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This figure does not reflect which layer the curve-fuse is on

Actually, everything except the MetaServer and Kernerl Cache is belongs to curve-fuse.

|
+--------------------------------------------------------+
| Open File Cache | (L2 Cache)
| (only length and chunk) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the basis for caching at different levels?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the basis for caching at different levels?

It depends on our solution design.

@wuhongsong wuhongsong added the repairing bug repairing label May 22, 2023
@Wine93
Copy link
Contributor Author

Wine93 commented May 24, 2023

cicheck

@Wine93
Copy link
Contributor Author

Wine93 commented May 25, 2023

cicheck

Signed-off-by: Wine93 <wine93.info@gmail.com>
@Wine93
Copy link
Contributor Author

Wine93 commented May 25, 2023

cicheck

fs.lookupCache.negativeTimeoutSec=0
fs.lookupCache.minUses=1
fs.lookupCache.lruSize=100000
fs.dirCache.lruSize=5000000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there are more lru caches, have you paid attention to their memory usage during the test?

Copy link
Contributor Author

@Wine93 Wine93 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there are more lru caches, have you paid attention to their memory usage during the test?

For reference only: I've seen that if all 5,000,000 directory entries are cached, the fuse process will consume about 8~9GB.

BTW: we can watching it in the next self-test for v2.6.

FsFileType type, dev_t rdev, bool internal,
fuse_entry_param *e) {
if (strlen(name) > option_.maxNameLength) {
CURVEFS_ERROR FuseClient::MakeNode(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MakeNode creates inode and denty. Why not use the filesystem to handle, and cache dentries and attrs into the corresponding cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MakeNode creates inode and denty. Why not use the filesystem to handle, and cache dentries and attrs into the corresponding cache.

To ensure the reliability of the file system in this version, we prefer to minimize changes and use the original function unless it is necessary to rewrite the function.

void Destory();

// fuse request
CURVEFS_ERROR Lookup(Request req,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the following FuseOpXX are handled through the filesystem, and others not?

Copy link
Contributor Author

@Wine93 Wine93 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the following FuseOpXX are handled through the filesystem, and others not?

The fuse requests which should been rewrite will be taken over by the filesystem, actually, the part of those requests are usually related to cache. on the other hand, i think we should rewrite all request in filesystem, but to ensure the reliability of the file system, i haven't done it yet.

<< ", mtime = " << InodeMtime(file->inode);
}

void OpenFiles::Evit(size_t size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the file has been opened from the upper layer, but it is better than the limit of LRU, the open state here is cleared, what is the impact?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will reopen the file which will fetch inode from metaserver, maybe there is a pefomance skew, but actually this is not often happend.

std::shared_ptr<OpenFile> file;
bool yes = files_->Get(ino, &file);
if (yes) {
file->refs++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the meaning of ref++ for this variable, and it does not return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the meaning of ref++ for this variable, and it does not return.

The refs is used for the file which had opened multi-times in the same time, for this we shouldn't really close it.

}

if (file->refs == 0) {
Delete(ino, file, false); // file already flushed before close
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file already flushed before close
Why not Delete(ino, file, true);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file already flushed before close Why not Delete(ino, file, true);

Because the inode already sync to metaserver in FuseOpFlush which trigger by close() system call, so it is no need to sync again here.

auto key = CacheKey(parent, name);
bool yes = lru_->Get(key, &entry);
if (yes) {
entry.uses++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances will there be multiple puts, making uses++?

Will only not found entry be cached in lookup?

Copy link
Contributor Author

@Wine93 Wine93 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances will there be multiple puts, making uses++?

This is a fairly general design, and one that appears in some caching systems, like nginx: proxy_cache_min_uses, we originally introduced this to solve kernel compilation problems.

Will only not found entry be cached in lookup?

Because the positive result had been cached in kernel.


void DeferSync::Start() {
if (!running_.exchange(true)) {
thread_ = std::thread(&DeferSync::SyncTask, this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single thread doing sync may have a performance bottleneck.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single thread doing sync may have a performance bottleneck.

Actually we invoke the inode flush by asynchronously in this thread, so i think we can accpet this model.

std::shared_ptr<DirEntryList> entries,
bool evit) {
nentries_ -= entries->Size();
mq_->Publish(entries); // clear entries in background
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to manually clean up the content in the Dentrylist

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to manually clean up the content in the Dentrylist

For free the memory and does not affect performance, the time complexity of the vector.clear() is O(n), so if the entries contains a lot of directory entry which will be an overhead.

@Wine93 Wine93 merged commit 4ae45e5 into opencurve:master Jun 1, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
repairing bug repairing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants