Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert ArgsManager::GetDataDir to a read-only function #27073

Merged
merged 2 commits into from
Feb 23, 2023

Conversation

willcl-ark
Copy link
Member

@willcl-ark willcl-ark commented Feb 10, 2023

Fixes #20070

Currently ArgsManager::GetDataDir() ensures it will always return a datadir by creating one if necessary. The function is shared between bitcoind bitcoin-qt and bitcoin-cli which results in the undesirable behaviour described in #20070.

This PR splits out the part of the function which creates directories and adds it as a standalone function, only called as part of bitcoind and bitcoin-qt init, but not bitcoin-cli.

ReadConfigFiles' behavior is changed to use the absolute path of the config file in error and warning messages instead of a relative path.

This was inadvertantly the form being tested here, whilst we were not testing that a relative path was returned by the message even though we passed a relative path in as argument.

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 10, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK hebasto, TheCharlatan, achow101, ryanofsky
Concept ACK stickies-v, pinheadmz

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #26177 (refactor / kernel: Move non-gArgs chainparams functionality to kernel by TheCharlatan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK - having the cli generate multiple datadirs is not great. Suggested some simplifications, I hope I'm not missing some nuance there?

src/util/system.cpp Outdated Show resolved Hide resolved
src/util/system.cpp Outdated Show resolved Hide resolved
src/util/system.cpp Outdated Show resolved Hide resolved
Copy link
Member

@pinheadmz pinheadmz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concept ACK

I reproduced the steps from the issue and confirmed that the patch prevents creating new Bitcoin/wallets directory when running bitcoin-cli -getinfo from a new user on the same machine running bitcoind

src/util/system.cpp Outdated Show resolved Hide resolved
src/util/system.cpp Outdated Show resolved Hide resolved
@willcl-ark
Copy link
Member Author

Thanks @stickies-v & @pinheadmz , I addressed your comments in the new push.

src/util/system.cpp Outdated Show resolved Hide resolved
src/util/system.cpp Outdated Show resolved Hide resolved
src/bitcoind.cpp Outdated Show resolved Hide resolved
src/qt/bitcoin.cpp Outdated Show resolved Hide resolved
@hebasto
Copy link
Member

hebasto commented Feb 13, 2023

Concept ACK.

src/util/system.cpp Show resolved Hide resolved
}

return path;
}

void ArgsManager::EnsureDataDir(bool net_specific)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe introduce two functions: one for net_specific == true and the other for net_specific == true?

It will make code cleaner on the caller sites.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree there could be two wrapper functions here if desired, EnsureDataDir and EnsureNetDataDir perhaps? It might just means more duplicated code in the end though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some digging, and I think (would be nice to get more confirmation on this) we can scrap the net_specific == false version entirely, we don't seem to need it. EnsureDataDir is called 3 times: 2 times right before reading config files - but it looks like we don't actually need to ensure a datadir there. If datadir doesn't exist, by definition no config files can be read (and that's handled gracefully). The only time we need it is where net_specific==true.

Intuitively, this makes sense - it can never hurt to ensure that all the necessary datadirs (both base as well as the more specific net) exist. However, there actually is a pitfall, because GetDataDir() does some caching, we're not allowed to call GetDatDir(true) before we've determined the network we're on (which requires us to first read the basedir and check if there's a config file there).

For that reason, I think calling it EnsureDataDirNet is probably better - and aligns with GetDataDirNet naming.

git diff (quick patch just to prototype)
diff --git a/src/bitcoind.cpp b/src/bitcoind.cpp
index 3a36ff700..07002cc8a 100644
--- a/src/bitcoind.cpp
+++ b/src/bitcoind.cpp
@@ -153,7 +153,6 @@ static bool AppInit(NodeContext& node, int argc, char* argv[])
         if (!CheckDataDirOption()) {
             return InitError(Untranslated(strprintf("Specified data directory \"%s\" does not exist.\n", args.GetArg("-datadir", ""))));
         }
-        args.EnsureDataDir(/*net_specific=*/false);
         if (!args.ReadConfigFiles(error, true)) {
             return InitError(Untranslated(strprintf("Error reading configuration file: %s\n", error)));
         }
diff --git a/src/qt/bitcoin.cpp b/src/qt/bitcoin.cpp
index 80e607f77..59f433749 100644
--- a/src/qt/bitcoin.cpp
+++ b/src/qt/bitcoin.cpp
@@ -596,7 +596,6 @@ int GuiMain(int argc, char* argv[])
     try {
         /// 6b. Parse bitcoin.conf
         /// - Do not call gArgs.GetDataDirNet() before this step finishes
-        gArgs.EnsureDataDir(/*net_specific=*/false);
         if (!gArgs.ReadConfigFiles(error, true)) {
             InitError(strprintf(Untranslated("Error reading configuration file: %s\n"), error));
             QMessageBox::critical(nullptr, PACKAGE_NAME,
diff --git a/src/util/system.cpp b/src/util/system.cpp
index 329100008..75fd91516 100644
--- a/src/util/system.cpp
+++ b/src/util/system.cpp
@@ -439,9 +439,9 @@ const fs::path& ArgsManager::GetDataDir(bool net_specific) const
     return path;
 }
 
-void ArgsManager::EnsureDataDir(bool net_specific)
+void ArgsManager::EnsureDataDirNet()
 {
-    fs::path path = GetDataDir(net_specific);
+    fs::path path = GetDataDir(true);
     if (!fs::exists(path)) {
         fs::create_directories(path / "wallets");
     }
@@ -492,7 +492,7 @@ bool ArgsManager::IsArgSet(const std::string& strArg) const
 
 bool ArgsManager::InitSettings(std::string& error)
 {
-    EnsureDataDir(/*net_specific=*/true);
+    EnsureDataDirNet();
     if (!GetSettingsPath()) {
         return true; // Do nothing if settings file disabled.
     }
diff --git a/src/util/system.h b/src/util/system.h
index 3bcbd5c8e..23c15228e 100644
--- a/src/util/system.h
+++ b/src/util/system.h
@@ -478,10 +478,8 @@ protected:
 
     /**
      * Create data directory if it doesn't exist
-     *
-     * @param net_specific Create a network-identified data directory
      */
-    void EnsureDataDir(bool net_specific);
+    void EnsureDataDirNet();
 
 private:
     /**

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have updated to EnsureDataDirNet as suggested

@fanquake
Copy link
Member

Does/can this also close #16220?

@willcl-ark
Copy link
Member Author

Does/can this also close #16220?

I will check this out, specifically w.r.t this comment which I had not been considering:

It would be, however the reason that wallets/ is not created when the .bitcoin directory already exists is backwards compatibility. If there is an existing datadir but no wallets directory it's assumed to be an old data directory, and creating wallets would effectively hide the existing wallets in the top level dir, possibly leading to people to suspect loss of funds.

@willcl-ark
Copy link
Member Author

I will check this out, specifically w.r.t this comment which I had not been considering:

I don't believe this fully closes the loop on #16220, but I am also not sure that we ever will close that fully whilst also providing backwards-compatibility...

There can exist currently two different working wallet configurations:

  1. Legacy: wallets in subfolders in top-level, e.g. ~/.bitcoin/test-wallet/wallet.dat or ~/.bitcoin/regtest/test-wallet/wallet.dat
  2. Current: wallets in wallets/ subfolder, e.g. ~/.bitcoin/wallets/test-wallet/wallet.dat or ~/.bitcoin/regtest/wallets/test-wallet/wallet.dat

The current init logic (preserved by this PR) is that we will check for the presence of .bitcoin/ dir, and if it exists then use it as DataDir and take no further action. If it does not exist, then we will create it, along with a wallets/ subdirectory. This allows old installs to continue without "disappearing wallets", as described by #16220.

Without implementing some brittle filesystem-checking code as laanwj described, I don't see this as really "fixable" (and probably that issue should be closed). One way of doing it could be to recursively check each subdir not called wallets/ in the datadir for the presence of *.dat files, but I don't much like that idea.


sidenote: I notice in testing this that loadwallet RPC says that it can take a path to a wallet.dat file, to load the wallet, which I found not to work at all, whether or not the wallet was in a wallets/ subdir or not (and IIRC from another recent issue it is not supposed to work like this, so could do with helptext being updated):

{"filename", RPCArg::Type::STR, RPCArg::Optional::NO, "The wallet directory or .dat file."},

Details
will@ubuntu in ~/src/bitcoin on  2022-02-cli-datadir [$⇕] via 🐍 v3.7.16
✗ /home/will/src/bitcoin/src/bitcoin-cli loadwallet ~/.bitcoin/test/wallet.dat
error code: -4
error message:
Wallet file verification failed. Invalid -wallet path '/home/will/.bitcoin/test/wallet.dat'. -wallet path should point to a directory where wallet.dat and database/log.?????????? files can be stored, a location where such a directory could be created, or (for backwards compatibility) the name of an existing data file in -walletdir ("/home/will/.bitcoin")

will@ubuntu in ~/src/bitcoin on  2022-02-cli-datadir [$⇕] via 🐍 v3.7.16

####
# Moved wallet from ~/.bitcoin/test/wallet.dat to ~/.bitcoin/wallet.dat
####

✗ /home/will/src/bitcoin/src/bitcoin-cli loadwallet ~/.bitcoin/wallet.dat
error code: -4
error message:
Wallet file verification failed. Invalid -wallet path '/home/will/.bitcoin/wallet.dat'. -wallet path should point to a directory where wallet.dat and database/log.?????????? files can be stored, a location where such a directory could be created, or (for backwards compatibility) the name of an existing data file in -walletdir ("/home/will/.bitcoin")

@willcl-ark
Copy link
Member Author

I don't particularly want to feature-creep too much, but another adjacent issue is #19990

Which could potentially be addressed in this PR too if desired?

@ryanofsky
Copy link
Contributor

ryanofsky commented Feb 14, 2023

I could be wrong, but the current version of this 41e9a81 appears to change behavior by only creating the <datadir>/<netdir>/wallets directory, and no longer creating the <datadir>/wallets one level up if <datadir> did not exist.

This seems like a potentially bad thing, because if testnet or signet or regtest bitcoin is started before mainnet bitcoin, mainnet wallets will go be created in <datadir> not <datadir>/wallets as intended. I think more ideally this PR would just refactor the code as desired without changing behavior.

@willcl-ark
Copy link
Member Author

Thanks @ryanofsky that was unintentional and is now fixed. The only behaviour change is now in bitcoin-cli.

This does still leave us with the (undesirable) behaviour described in #27073 (comment). It seems likely a likely flow that a user might download bitcoin core, create bitcoin.conf and then run the application, which results in not creating a wallets/ subdir on mainnet (and technically other networks, but it's unlikely a new user would manually create those subdirectories). E.g. the following:

$ mkdir ~/.bitcoin
$ echo "all_my_settings" >> ~/.bitcoin/bitcoin.conf
$ bitcoind    # first run
$ find .bitcoin -type d
.bitcoin
.bitcoin/chainstate
.bitcoin/blocks
.bitcoin/blocks/index   # no wallets/ subdir created as datadir already existed

$ bitcoind -regtest
$ find .bitcoin -type d
.bitcoin
.bitcoin/chainstate
.bitcoin/regtest
.bitcoin/regtest/wallets    # created as regtest subdir did not already exist, even though datadir did
.bitcoin/regtest/chainstate
.bitcoin/regtest/blocks
.bitcoin/regtest/blocks/index
.bitcoin/blocks
.bitcoin/blocks/index

But I think this is something for a different PR to fix.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Light code review ACK 64b812e. I want to look a little more closely at this and test a little more to be sure it works as expected, but this seems like a very nice change. Avoids confusing side effect of functions like GetDataDirNet, AbsPathForConfigVal, and GetConfigFile creating directories in the background.

src/util/system.cpp Show resolved Hide resolved
src/util/system.cpp Show resolved Hide resolved
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Light code review ACK 64b812e. I want to look a little more closely at this and test a little more to be sure it works as expected, but this seems like a very nice change. Avoids confusing side effect of functions like GetDataDirNet, AbsPathForConfigVal, and GetConfigFile creating directories in the background.

@TheCharlatan
Copy link
Contributor

Concept ACK

You can mark both GetConfigFilePath and EnsureDataDir as const.

@willcl-ark
Copy link
Member Author

Concept ACK

You can mark both GetConfigFilePath and EnsureDataDir as const.

Updated in 5626101

Copy link
Contributor

@TheCharlatan TheCharlatan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 0af17e1

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 0af17e1, tested on Ubuntu 22.04.

Fixes #20070

Confirming the bitcoin-cli does not create a data directory and its subdirectories.

I've noticed another behavior change which looks good for me:

  • master:
$ ./src/bitcoind rubbish
Error: Command line contains unexpected token 'rubbish', see bitcoind -h for a list of options.

$ tree ~/.bitcoin
/home/hebasto/.bitcoin
└── wallets

1 directory, 0 files
$ ./src/bitcoind rubbish
Error: Command line contains unexpected token 'rubbish', see bitcoind -h for a list of options.

$ ls ~/.bitcoin
ls: cannot access '/home/hebasto/.bitcoin': No such file or directory

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 0af17e1. Nice change that makes initialization more straightforward and gets rid of strange behavior. Only change since last review is adding a comment. I left some small suggestions, but they are not important.


I think part of the PR description #27073 (comment) is inaccurate.

As part of the refactoring it introduces a slight behaviour change to GetConfigFilePath, which now always returns the absolute path of the provided -conf argument, specifically when surfacing an error.

The GetConfigFilePath() function is a new function so it's behavior isn't changing. I think it would be more accurate to say "ReadConfigFiles behavior is changed to use the absolute path of the config file in error and warning messages instead of a relative path."

@@ -475,6 +476,17 @@ class ArgsManager
*/
void LogArgs() const;

/** If datadir does not exist, create it along with wallets/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "util: make GetDataDir read-only & create datadir.." (0af17e1)

Would be good to move this comment from the EnsureDataDir declaration to the EnsureDataDir implementation. The comment is trying to explain the confusing create_directories("wallets") calls there, so probably more helpful next to those calls.

It would be good to have API documentation for the two new functions though. I would move the function descriptions you wrong in the commit message out of the commit message and into system.h comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 56e370f

* Add ArgsManager::EnsureDataDir()
  Creates data directory if it doesn't exist
* Add ArgsManager::GetConfigFilePath()
  Return config file path (read-only)
.. only in bitcoind and bitcoin-qt

This changes behaviour of GetConfigFilePath which now always returns the
absolute path of the provided -conf argument.
Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 64c1054, only comments have been adjusted as requsted since my previous review.

Copy link
Contributor

@TheCharlatan TheCharlatan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-ACK 64c1054

@achow101
Copy link
Member

ACK 64c1054

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 64c1054. Only comment changes since last review

@achow101 achow101 merged commit 1258af4 into bitcoin:master Feb 23, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Feb 25, 2023
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Feb 27, 2023
New function was introduced by willcl-ark <will@256k1.dev> in commit
56e370f from
bitcoin#27073 and removes some duplicate code.
stickies-v added a commit to stickies-v/bitcoin that referenced this pull request Feb 28, 2023
Since bitcoin#27073, the behaviour of GetDataDir changed to only return
the datadir path, but not create it. This also changed the behaviour
of GetDataDirNet and GetDataDirBase but the docs do not yet reflect
that.
fanquake added a commit that referenced this pull request Feb 28, 2023
…adir

fb0dbe9 docs: GetDataDirNet and GetDataDirBase don't create datadir (stickies-v)

Pull request description:

  Since #27073, the behaviour of `GetDataDir()` [changed](https://github.com/bitcoin/bitcoin/pull/27073/files#diff-19427b0dd1a791adc728c82e88f267751ba4f1c751e19262cac03cccd2822216L435-L443) to only return the datadir path, but not create it if non-existent. This also changed the behaviour of `GetDataDirNet()` and `GetDataDirBase()` but the docs do not yet reflect that.

ACKs for top commit:
  TheCharlatan:
    ACK fb0dbe9
  theStack:
    ACK fb0dbe9
  willcl-ark:
    ACK fb0dbe9

Tree-SHA512: 3f10f4871df59882f3649c6d3b2362cae2f8a01ad0bd0c636c5608b0d177d279a2e8712930b819d6d3912e91fa6447b9e54507c33d8afe427f7f39002b013bfb
achow101 added a commit that referenced this pull request Feb 28, 2023
9a9d5da refactor: Stop using gArgs global in system.cpp (Ryan Ofsky)
b20b34f refactor: Use new GetConfigFilePath function (Ryan Ofsky)

Pull request description:

  Most of the code in `util/system.cpp` that was hardcoded to use the global `ArgsManager` instance `gArgs` has been changed to stop using it (for example in #20092). But a few hardcoded references to `gArgs` remain. This commit removes the last ones so these functions aren't reading or writing global state.

  Noticed these `gArgs` references while reviewing #27073

ACKs for top commit:
  achow101:
    ACK 9a9d5da
  stickies-v:
    ACK 9a9d5da
  willcl-ark:
    tACK 9a9d5da

Tree-SHA512: 2c74b0d5fc83e9ed2ec6562eb26ec735512f75db8876a11a5d5f04e6cdbe0cd8beec19894091aa2cbf29319194d2429ccbf8036f5520ecc394f6fe89a0079a7b
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 1, 2023
…ate datadir

fb0dbe9 docs: GetDataDirNet and GetDataDirBase don't create datadir (stickies-v)

Pull request description:

  Since bitcoin#27073, the behaviour of `GetDataDir()` [changed](https://github.com/bitcoin/bitcoin/pull/27073/files#diff-19427b0dd1a791adc728c82e88f267751ba4f1c751e19262cac03cccd2822216L435-L443) to only return the datadir path, but not create it if non-existent. This also changed the behaviour of `GetDataDirNet()` and `GetDataDirBase()` but the docs do not yet reflect that.

ACKs for top commit:
  TheCharlatan:
    ACK fb0dbe9
  theStack:
    ACK fb0dbe9
  willcl-ark:
    ACK fb0dbe9

Tree-SHA512: 3f10f4871df59882f3649c6d3b2362cae2f8a01ad0bd0c636c5608b0d177d279a2e8712930b819d6d3912e91fa6447b9e54507c33d8afe427f7f39002b013bfb
achow101 added a commit that referenced this pull request Mar 7, 2023
802cc1e Deduplicate bitcoind and bitcoin-qt init code (Ryan Ofsky)
d172b5c Add InitError(error, details) overload (Ryan Ofsky)
3db2874 Extend bilingual_str support for tinyformat (Ryan Ofsky)
c361df9 scripted-diff: Remove double newlines after some init errors (Ryan Ofsky)

Pull request description:

  Add common InitConfig function to deduplicate bitcoind and bitcoin-qt code reading config files and creating the datadir.

  Noticed the duplicate code while reviewing #27073 and want to remove it because difference in bitcoin-qt and bitcoind behavior make it hard to evaluate changes like #27073

  There are a few minor changes in behavior:

  - In bitcoin-qt, when there is a problem reading the configuration file, the GUI error text has changed from "Error: Cannot parse configuration file:" to "Error reading configuration file:" to be consistent with bitcoind.
  - In bitcoind, when there is a problem reading the settings.json file, the error text has changed from "Failed loading settings file" to "Settings file could not be read" to be consistent with bitcoin-qt.
  - In bitcoind, when there is a problem writing the settings.json file, the error text has changed from "Failed saving settings file" to "Settings file could not be written" to be consistent with bitcoin-qt.
  - In bitcoin-qt, if there datadir is not accessible (e.g. no permission to read), there is an normal error dialog showing "Error: filesystem error: status: Permission denied [.../settings.json]", instead of an uncaught exception.

ACKs for top commit:
  Sjors:
    Light review ACK 802cc1e
  TheCharlatan:
    ACK 802cc1e
  achow101:
    ACK 802cc1e

Tree-SHA512: 9c78d277e9ed595fa8ce286b97d2806e1ec06ddbbe7bd3434bd9dd7b456faf8d989f71231e97311f36edb9caaec645a50c730bd7514b8e0fe6e6f7741b13d981
janus pushed a commit to BitgesellOfficial/bitgesell that referenced this pull request Sep 3, 2023
New function was introduced by willcl-ark <will@256k1.dev> in commit
56e370fbb9413260723c598048392219b1055ad0 from
bitcoin/bitcoin#27073 and removes some duplicate code.
@bitcoin bitcoin locked and limited conversation to collaborators Feb 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bitcoin-cli needlessly creates empty ~/.bitcoin/wallets directory
9 participants