Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a file system parameter: --fs_uri to db_stress and db_bench #6878

Closed
wants to merge 3 commits into from

Conversation

yhr
Copy link
Contributor

@yhr yhr commented May 25, 2020

Summary:

This pull request adds the parameter --fs_uri to db_bench and db_stress, creating a composite env combining the default env with a specified registered rocksdb file system.

This makes it easier to develop and test new RocksDB FileSystems.

The pull request also registers the posix file system for testing purposes.

Examples:

$./db_bench --fs_uri=posix:// --benchmarks=fillseq

$./db_stress --fs_uri=zenfs://nullb1

zenfs is a RocksDB FileSystem I'm developing to add support for zoned block devices, and in that case the zoned block device is specified in the uri (a zoned null block device in the above example).

@yhr
Copy link
Contributor Author

yhr commented May 26, 2020

Is that failing CI test for visual studio 2014 anything to worry about? It does not seem related to my patch.

env/env_posix.cc Outdated
@@ -528,4 +532,12 @@ std::unique_ptr<Env> NewCompositeEnv(std::shared_ptr<FileSystem> fs) {
return std::unique_ptr<Env>(new PosixEnv(default_env, fs));
}

std::unique_ptr<Env> NewCompositeEnv(std::shared_ptr<FileSystem> fs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems off to me. Why are we potentially creating two Envs in order to create/return one?

Copy link
Contributor Author

@yhr yhr May 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that we create a base env that can be shared among composite envs, and that we can control the order of destruction of the base env and its dependent composite envs in a separate compilation unit.

raw_env = new ROCKSDB_NAMESPACE::HdfsEnv(FLAGS_hdfs);
} else if (!FLAGS_env_uri.empty()) {
Status s = Env::LoadEnv(FLAGS_env_uri, &raw_env, &env_guard);
if (raw_env == nullptr) {
fprintf(stderr, "No Env registered for URI: %s\n", FLAGS_env_uri.c_str());
exit(1);
}
} else if (!FLAGS_fs_uri.empty()) {
#ifndef OS_WIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in an OS_WIN block? Can't we load FileSystems on Windows?

Copy link
Contributor Author

@yhr yhr May 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment in the source below: - // NewCompositeEnv is not available in Windows
Linking would fail on windows if we would not preprocess the call out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking over this again, I found that there is is an implementation of NewCompositeEnv for OS_WIN in env/env.cc. I'll add support for the new variant there as well, and db_bench/db_stress should compile just fine for windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented NewCompositeEnv for windows and removed the ifdef.

Status s = Env::LoadEnv(FLAGS_env_uri, &FLAGS_env, &env_guard);
if (FLAGS_env == nullptr) {
fprintf(stderr, "No Env registered for URI: %s\n", FLAGS_env_uri.c_str());
exit(1);
}
} else if (!FLAGS_fs_uri.empty()) {
#ifndef OS_WIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the OS_WIN block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment in the source below: - // NewCompositeEnv is not available in Windows
Linking would fail on windows if we would preprocess the call out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented NewCompositeEnv for windows and removed the ifdef.

@yhr yhr force-pushed the fs_uri branch 2 times, most recently from 2fcd0ba to 12a9c70 Compare June 1, 2020 08:09
Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm overall, but I wonder whether the functionality can be achieved without introducing the new NewCompositeEnv() overload

env/env.cc Outdated
std::unique_ptr<Env> NewCompositeEnv(std::shared_ptr<FileSystem> fs,
std::shared_ptr<Env>* shared_env) {
assert(shared_env != nullptr);
if (!*shared_env) shared_env->reset(Env::Default());
Copy link
Contributor

@ajkr ajkr Jun 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this cause Env::Default() to be double deleted (once when the last shared_ptr is destroyed, again when the static Env it points to goes out of scope)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The windows implementation does this (port/win/env_default.cc) : std::call_once(winenv_once_flag, []() { envptr = new WinEnv(); });

So it should work as long as Env::Default() is only used for NewComposite envs, right?

Maybe a better solution would be to add a new Env:Default() that returns a shared pointer, avoiding the out-of-scope-racing entirely?

Thanks for reviewing!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it should work as long as Env::Default() is only used for NewComposite envs, right?

For Windows yes, but not for Linux as its Env::Default() uses a function-local static:

rocksdb/env/env_posix.cc

Lines 508 to 509 in 722ebba

static PosixEnv default_env;
return &default_env;
.

Can we call the NewCompositeEnv(std::shared_ptr<FileSystem> fs) overload in case of !*shared_env? It seems we do not need a shared_ptr to guard the CompositeEnvWrapper's Env when it is Env::Default().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now removed the new NewCompositeEnv overload, and added a getter in stead. This allows for composite envs to share thread pools and for the base env to join those threads before the composite envs are destroyed(avoiding the race) - without destroying Env::Default(). Does this make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @ajkr , could you have a look at the new revision of the branch? Looks neater to me. Cheers!

@@ -29,6 +29,8 @@

namespace ROCKSDB_NAMESPACE {
namespace {
static std::shared_ptr<ROCKSDB_NAMESPACE::FileSystem> fs_guard;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we introduce getter functions for these, we can then make them static local variables. By calling Env::Default() before initializing our static local variables, we make sure the Env on which they depend is destroyed later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. sounds like a better solution. I'll revise the patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a getter now, GetCompositeBaseEnv

@@ -29,6 +29,8 @@

namespace ROCKSDB_NAMESPACE {
namespace {
static std::shared_ptr<ROCKSDB_NAMESPACE::FileSystem> fs_guard;
static std::shared_ptr<ROCKSDB_NAMESPACE::Env> composite_env_guard;
Copy link
Contributor

@ajkr ajkr Jun 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think composite_env_guard depends on env_guard for enforcing its dependencies outlive it when --fs_uri is provided. Then env_guard should be defined above composite_env_guard to ensure it is destroyed after.

But actually, this problem only arises because we have composite_env_guard as a non-local static. If instead it were a static local variable (maybe in a Env* GetCompositeEnv() which could call FileSystem::Load() and NewCompositeEnv() on first invocation) , the proper destruction order should be enforced without any guards.

edit: Also I think in that case we wouldn't even need to replace Env::Default() with GetCompositeBaseEnv().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkr - Its not very intuitive :)

This is my view of this:

Composite envs depend on the "base" environment to join all threads that it has started, and that is done in the base env destructor (at least for the posix env)

env_guard must be destroyed before the composite_env_guard to avoid segfaults in any threads running with a reference back to the composite env. If the base env is destroyed first, joining all threads, it is safe to destroy the composite env. This is the problem i'm addressing. Unless an app can't order the destruction of the composite and base envs, this race happens.

The test needs to be updated since CompositeEnvs and Env::Default does not share thread pools - they cannot as we can't destroy Env::Default, and we need to destroy the base env for the reason above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @anand1976 , since you did the refactoring of the env interface and the file system split, do you have time to comment on this?

Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not clear to me why there would be threads running with a dangling reference back to the composite env. The app should be responsible for cancelling/waiting for any such background jobs before destroying the composite env. IMO as @ajkr suggested, we should be able to do without GetCompositeBaseEnv and use a getter with a static variable instead of composite_env_guard.

Copy link
Contributor Author

@yhr yhr Aug 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to preserve the same behavior as for normal envs to avoid having new weird segfaults crop up when NewCompositeEnv would be used. What I saw(it i remember correctly) was that if something unexpected happens(like running out of disk space) with background work running (like compaction), a thread might continue running for some time with a reference back to the env it is operating in while the app(db_bench) is exiting. The only way(without changing the app code) is to safely destroy that reference is to destroy it's base env, joining all threads.

Does that make sense as a background?

If the app is doing it wrong, then a getter with a static variable is much more straightforward. I'll rework this pull request accordingly - then fix any dangling-reference/unclean shutdown issues as I discover them as separate issues.

Cheers!

Copy link
Contributor

@ajkr ajkr Aug 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood the concern now. The way it's supposed to work in a clean shutdown is the app calls DB::Close() or delete DB on all its DBs. The DB shutdown process involves waiting for all its jobs in its Env's thread pool to finish (here, finish usually just means failing quickly due to noticing the shutting_down_ flag is set). Following that, the thread pool should no longer have any opportunity to reference the DB's Env.

env/env_posix.cc Outdated
static std::shared_ptr<PosixEnv> shared_env;
// Call Env::Default to make sure the thread singletons have been setup
(void)Env::Default();
if (!shared_env) shared_env.reset(new PosixEnv());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will create a new set of thread pools, which I don't think we want. The expectation users have is one thread pool per process. Is the intent to allow multiple thread pools? If so, then we should add a new API instead of changing the behavior of NewCompositeEnv.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to create a new PosixEnv instance(along with new thread pools) to allow it to be destroyed. The destruction was needed to join the threads before the composite env was destroyed. Since Env::Default is not reference counted it can't be destroyed safely.

@@ -29,6 +29,8 @@

namespace ROCKSDB_NAMESPACE {
namespace {
static std::shared_ptr<ROCKSDB_NAMESPACE::FileSystem> fs_guard;
static std::shared_ptr<ROCKSDB_NAMESPACE::Env> composite_env_guard;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not clear to me why there would be threads running with a dangling reference back to the composite env. The app should be responsible for cancelling/waiting for any such background jobs before destroying the composite env. IMO as @ajkr suggested, we should be able to do without GetCompositeBaseEnv and use a getter with a static variable instead of composite_env_guard.

Summary:
Register the posix file system so it can be loaded with the uri posix://

This is useful when testing custom file system support in i.e.
db_bench and db_stress.
@yhr yhr force-pushed the fs_uri branch 2 times, most recently from 1b3392e to 3596cdd Compare August 6, 2020 11:51
yhr added 2 commits August 6, 2020 11:57
Summary:
Add the parameter --fs_uri to db_bench, creating a composite
env combining the default env with a specified registered
rocksdb file system.
Summary:
Add the parameter --fs_uri to db_stress, creating a composite
env combining the default env with a specified registered
rocksdb file system.
@yhr
Copy link
Contributor Author

yhr commented Aug 7, 2020

@ajkr, @anand1976 : i've now introduced static getters in db_bench / db_stress in stead. Is this what you had in mind?

Cheers!

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ajkr merged this pull request in 2a0d3c7.

facebook-github-bot pushed a commit that referenced this pull request Sep 3, 2020
Summary:
Delete database instances to make sure there are no loose threads
running before exit(). This fixes segfaults seen when running
workloads through CompositeEnvs with custom file systems.

For further background on the issues arising when using CompositeEnvs, see the discussion in:
#6878

Pull Request resolved: #7327

Reviewed By: cheng-chang

Differential Revision: D23433244

Pulled By: ajkr

fbshipit-source-id: 4e19cf2067e3fe68c2a3fe1823f24b4091336bbe
codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
…book#6878)

Summary:
This pull request adds the parameter --fs_uri to db_bench and db_stress, creating a composite env combining the default env with a specified registered rocksdb file system.

This makes it easier to develop and test new RocksDB FileSystems.

The pull request also registers the posix file system for testing purposes.

Examples:
```
$./db_bench --fs_uri=posix:// --benchmarks=fillseq

$./db_stress --fs_uri=zenfs://nullb1
```

zenfs is a RocksDB FileSystem I'm developing to add support for zoned block devices, and in that case the zoned block device is specified in the uri (a zoned null block device in the above example).

Pull Request resolved: facebook#6878

Reviewed By: siying

Differential Revision: D23023063

Pulled By: ajkr

fbshipit-source-id: 8b3fe7193ce45e683043b021779b7a4d547af247
codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
Summary:
Delete database instances to make sure there are no loose threads
running before exit(). This fixes segfaults seen when running
workloads through CompositeEnvs with custom file systems.

For further background on the issues arising when using CompositeEnvs, see the discussion in:
facebook#6878

Pull Request resolved: facebook#7327

Reviewed By: cheng-chang

Differential Revision: D23433244

Pulled By: ajkr

fbshipit-source-id: 4e19cf2067e3fe68c2a3fe1823f24b4091336bbe
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 18, 2021
Summary:
This pull request adds the parameter --fs_uri to db_bench and db_stress, creating a composite env combining the default env with a specified registered rocksdb file system.

This makes it easier to develop and test new RocksDB FileSystems.

The pull request also registers the posix file system for testing purposes.

Examples:
```
$./db_bench --fs_uri=posix:// --benchmarks=fillseq

$./db_stress --fs_uri=zenfs://nullb1
```

zenfs is a RocksDB FileSystem I'm developing to add support for zoned block devices, and in that case the zoned block device is specified in the uri (a zoned null block device in the above example).

Pull Request resolved: facebook/rocksdb#6878

Reviewed By: siying

Differential Revision: D23023063

Pulled By: ajkr

fbshipit-source-id: 8b3fe7193ce45e683043b021779b7a4d547af247
Signed-off-by: Changlong Chen <levisonchen@live.cn>
mm304321141 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 23, 2021
Summary:
This pull request adds the parameter --fs_uri to db_bench and db_stress, creating a composite env combining the default env with a specified registered rocksdb file system.

This makes it easier to develop and test new RocksDB FileSystems.

The pull request also registers the posix file system for testing purposes.

Examples:
```
$./db_bench --fs_uri=posix:// --benchmarks=fillseq

$./db_stress --fs_uri=zenfs://nullb1
```

zenfs is a RocksDB FileSystem I'm developing to add support for zoned block devices, and in that case the zoned block device is specified in the uri (a zoned null block device in the above example).

Pull Request resolved: facebook/rocksdb#6878

Reviewed By: siying

Differential Revision: D23023063

Pulled By: ajkr

fbshipit-source-id: 8b3fe7193ce45e683043b021779b7a4d547af247
Signed-off-by: Changlong Chen <levisonchen@live.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants