New -includeconf argument for including external configuration files #10267

Open
wants to merge 3 commits into
from

Conversation

Projects
None yet
6 participants
Contributor

kallewoof commented Apr 24, 2017 edited

Fixes: #10071.

Done:

  • adds -includeconf=<path>, where <path> is relative to datadir or to the path of the file being read, if in a file
  • protects against circular includes
  • updates help docs

Thoughts:

  • I am not sure how to test this in a neat manner. Feedback on this would be nice. Will dig/think though.
Member

jonasschnelli commented Apr 24, 2017

Yeah. Why not. This can be useful.

  • I would recommend to use -addconf= (otherwise user may think it replaces the bitcoin.conf configuration file).
  • If I follow GetConfigFile() correctly, you can also use absolut paths, right?
Member

NicolasDorier commented Apr 24, 2017

Why not making the existing -config a repeatable argument.

Contributor

kallewoof commented Apr 24, 2017

@jonasschnelli Good point - will switch to -addconf=. Yes, you can use absolute paths. My worry above is for when a user presumes the path is relative to the config file when it is in fact not.

@NicolasDorier I think -config simply tells what name to use and defaults to bitcoin.conf -- it doesn't actually load the file. This feature lets you load other files arbitrarily from within bitcoin.conf.

Contributor

kallewoof commented Apr 24, 2017 edited

Owner

laanwj commented Apr 24, 2017

Concept ACK

My worry above is for when a user presumes the path is relative to the config file when it is in fact not.

Yes, making it relative to the data directory is a good choice. I think we should handle all relative paths in bitcoind that way.

Why not making the existing -config a repeatable argument.

That was also my first thought, but it may just be confusing as it changes the meaning of the option slightly. It's possible that some setups already use multiple -conf options, and rely on the overriding behavior.

So I'm good with making it an explicit option. Another suggestion for the name would be -includeconf.

Member

jtimon commented Apr 24, 2017

Concept ACK. Don't care much about the name, but what about -extraconf ?

Contributor

kallewoof commented Apr 25, 2017 edited

From the given suggestions I think includeconf is the most clear so I'll switch to that.

@laanwj:

Yes, making it relative to the data directory is a good choice. I think we should handle all relative paths in bitcoind that way.

To clarify, you mean that the relative path inside /dir/file.conf should be /dir/, not [bitcoin datadir], right? It will require some lines of code I bet but I think that makes sense too.

Contributor

kallewoof commented Apr 25, 2017 edited

The includeconf feature now defines the path relative to the file being read, if any. For command line, it is [datadir], for /dir/abc.conf it is /dir/. I tested this with

src/testreadconfig/bitcoin.conf: [...] includeconf=../global.conf
src/global.conf: includeconf=secrets.conf
src/secrets.conf: rpcpassword=foo

with bitcoind -datadir=testreadconfig. Ensured bitcoin-cli with password foo worked and password bar did not.

[...]: → 5⊱16⊱2

kallewoof changed the title from New -readconfig argument for including external configuration files to New -includeconf argument for including external configuration files Apr 25, 2017

Owner

laanwj commented Apr 25, 2017 edited

To clarify, you mean that the relative path inside /dir/file.conf should be /dir/, not [bitcoin datadir], right? It > will require some lines of code I bet but I think that makes sense too.

Yes, seems good to me too. So it's like C's include "" - I wasn't thinking about relative includes in other includes.

Contributor

kallewoof commented Apr 25, 2017

To clarify, the code now does what @laanwj suggested.

laanwj added the RPC/REST/ZMQ label Apr 25, 2017

Owner

laanwj commented May 1, 2017

I am not sure how to test this in a neat manner. Feedback on this would be nice. Will dig/think though.

Some ideas:

  • RPC test that creates a tree of bitcoin config files including each other beneath the data directory
  • Starts a node w/ -includeconf=<path>
  • Then interrogate node over RPC to verify the files got included, in the right order

To achieve the latter the option uacomment= is useful, as these will be added to an array, then querying getnetworkinfo to see if the (...) part of the subversion matches expected content and order.

Member

jnewbery commented May 1, 2017

@laanwj's suggested test method seems sensible. I'm happy to review that or lend a hand implementing it. Feel free to reach me on IRC.

Contributor

kallewoof commented May 2, 2017 edited

Thanks for the suggestion! I added a test that checks for load order and ensures circular include is guarded against. @jnewbery review would be wonderful :)

Edit: If anyone has ideas why travis is failing I'd appreciate it. It works fine on all the machines I test it on locally (mac, linux).

Member

jnewbery commented May 2, 2017

If anyone has ideas why travis is failing I'd appreciate it

You've made ReadConfigFile() recursive (through ProcessSetting()). ReadConfigFile() locks cs_args, and then at the end calls ClearDatadirCache(), which locks csPathCached. That means that the bottom-most ReadConfigFile() locks csPathCached while cs_args is still held.

There's already a function that locks in the other order: GetDataDir() locks csPathCached and then locks cs_args (in its call to IsArgSet()).

If those two functions are called in different threads, we'd have a deadlock.

There's a CPP_FLAG option that checks lock ordering CPPFLAGS=-DDEBUG_LOCKORDER, which is used in Travis build 5. That's why that build is failing. You can repro locally by running configure with that option.

You can fix this by not locking cs_args recursively.

Contributor

kallewoof commented May 8, 2017

@jnewbery Thanks a lot for the explanation! I should've paid closer attention to locks considering the added recursiveness.

97ee63b fixes this by moving the conditionally-locked code into a new ReadConfigStream function which is called with locking/clearing in one case and without in the other, based on a bool lockAndClear added to ReadConfigFile.

(Also had to tweak tests a tiny bit; 8fb6511.)

Member

jnewbery commented May 15, 2017

I think you've introduced a subtle bug here. If -datadir is configured in one of the additional config files, then the datadir cache won't be cleared, which means that bitcoind will continue to use the old datadir.

I think you should try to not make ReadConfigFile recursive. For me, it would be acceptable to only allow one level of redirection here (ie the "base" config file can specify -includeconf config file, but those included config files cannot themselves include other config files). I think that would be a simpler model and would remove a whole bunch of potential bugs (circular references, blowing the stack through too many -includeconf redirects, etc).

Contributor

kallewoof commented May 16, 2017 edited

@jnewbery Hm, no the datadir cache is cleared after any recursions happen, which means it is always cleared, just not directly after the config file has been parsed. There are two cases:

  1. ParseParameters (util.cpp:407 called from bitcoind.cpp:75) -- this will work as normal and does not require cache clearing.
  2. Nested ReadConfigs from the initial bitcoin.conf file: bitcoind:107 calls ReadConfigFile with the lockAndClear flag set; recursion then happens in ProcessSetting (L401) via ReadConfigStream (L622) with lockAndClear unset. Eventually this gets back to original caller which leaves ReadConfigStream and gets to util.cpp:649 which clears the datadir cache.
    Or am I missing something?

As for forbidding multiple levels of recursion, I think the value outweighs the issues personally (and I addressed circular refs I believe), but if people think it's not worth it I'll restrict it to one include.

Member

jnewbery commented May 16, 2017

@kallewoof yes you're right. datadir cache is cleared after all files are read. My mistake.

I still don't like the recursion and the fact that there can be multiple levels of imports. It means there are more edge cases and unexpected behaviour. For example, if -includeconf is included as a command line parameter, then the includeconf file is read before the regular conf file, and so takes precedence. If an includeconf line is included in the regular conf file, then it is read during the regular config file, and which settings are taken from the regular conf file and which are taken from the includeconf file depend on the ordering of settings in the regular conf file.

The new warnOnFailure and lockAndClear bool arguments to ReadConfigFile() seem pretty strange to me. They're only used when ReadConfigFile() is being called recursively, and they control a large chunk of the behaviour within ReadConfigFile(). That's a clue to me that maybe the functionality isn't split up correctly - perhaps the locking/clearing should be in an outer function which calls an inner function for each of the config files?

Finally, you've introduced a new crash bug. If -conf or -includeconf don't refer to a valid file, the bitcoind will crash on startup. Here's the backtrace:

Core was generated by `bitcoind -datadir=/tmp/user/1000/test6p7xn_xt/856/node0 -server -keypool=1 -dis'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f3b77df0428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007f3b77df0428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f3b77df202a in __GI_abort () at abort.c:89
#2  0x00007f3b7873284d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f3b787306b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f3b78730701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f3b78730919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f3b79f29f82 in boost::filesystem::detail::canonical(boost::filesystem::path const&, boost::filesystem::path const&, boost::system::error_code*) () from /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.58.0
#7  0x000055e5c58b2ae2 in boost::filesystem::canonical (base=..., p=...) at /usr/include/boost/filesystem/operations.hpp:459
#8  GetConfigFile (confPath="global.conf", relativePath="") at util.cpp:603
#9  0x000055e5c58b4353 in ArgsManager::ProcessSetting (this=this@entry=0x55e5c5db1fa0 <gArgs>, strKey="-includeconf", strValue="global.conf", relativePath="") at util.cpp:386
#10 0x000055e5c58b487d in ArgsManager::ParseParameters (this=0x55e5c5db1fa0 <gArgs>, argc=<optimized out>, argv=<optimized out>) at util.cpp:424
#11 0x000055e5c5653d4a in ParseParameters (argv=0x7fff5492e618, argc=14) at util.h:263
#12 AppInit (argc=14, argv=0x7fff5492e618) at bitcoind.cpp:75
#13 0x000055e5c5648bef in main (argc=14, argv=0x7fff5492e618) at bitcoind.cpp:196
(gdb) quit

If I use an invalid filename for -conf on master. I don't see this crash.

Contributor

kallewoof commented May 17, 2017 edited

@jnewbery Thanks a lot for all the feedback.

I still don't like the recursion and the fact that there can be multiple levels of imports. It means there are more edge cases and unexpected behaviour. For example, if -includeconf is included as a command line parameter, then the includeconf file is read before the regular conf file, and so takes precedence. If an includeconf line is included in the regular conf file, then it is read during the regular config file, and which settings are taken from the regular conf file and which are taken from the includeconf file depend on the ordering of settings in the regular conf file.

That is the case without recursion as well, unless we forbid command line case. (Which I can do already, even while keeping recursion.)

The new warnOnFailure and lockAndClear bool arguments to ReadConfigFile() seem pretty strange to me. They're only used when ReadConfigFile() is being called recursively, and they control a large chunk of the behaviour within ReadConfigFile(). That's a clue to me that maybe the functionality isn't split up correctly - perhaps the locking/clearing should be in an outer function which calls an inner function for each of the config files?

Hm.. the warnOnFailure was just a nice-to-have to inform the user when an explicitly included file didn't actually exist, but I can remove it for cleanliness. Since I moved most of ReadConfigFile into ReadConfigStream, the only remaining stuff was the caching stuff, which doesn't feel odd to me. I'm not actually sure why you consider this to be a problem: the ReadConfigFile is mostly there to do or not do the locking and data cache clearing, and the ReadConfigStream is there to do the actual reading/parsing part.

That said, I'm not overly attached to the idea of allowing recursion, so unless someone speaks for it I am going to try to simplify the code to only allow one single include and to only allow it in the file, i.e. not from command line. I believe that would address most of your concerns.

Edit: Oh, and thanks for finding the crash -- I was sure I tested that, but I guess not.

Edit 2: Yeah, I never tested the case where the path was not a valid path, only when it was a non-existent one.

@jnewbery

This looks much much better. Thanks @kallewoof! I much prefer this implementation, which restricts the way --includeconf can be used and cuts out lots of edge cases.

If it turns out that people need more flexibility, we can extend this later to allow multiple includeconfs, including multiple levels. That's much easier than going the other way (from less restrictive to more restrictive).

I have a few nits, mainly around naming.

I think there's one bug, or at least confusing behaviour. You said in your comment that you don't allow -includeconf from the command line, but your new code is allowing that. Further, if an -includeconf is included on the command line and in the base config file, then the command line -importconf takes precedence. I don't think that's what we want. To fix that, I think the first thing ReadConfigFile() should do clear the -includeconf argument if it's set.

src/util.cpp
@@ -592,26 +592,41 @@ fs::path GetConfigFile(const std::string& confPath)
return pathConfigFile;
}
+void ArgsManager::ReadConfigStream(fs::ifstream& streamConfig)
@jnewbery

jnewbery May 31, 2017

Member

I'd prefer to name this function ReadConfigFile(). At the moment, the only type of input it can read is a file, so naming it ReadInputStream() is a little misleading.

@kallewoof

kallewoof Jun 1, 2017

Contributor

I named if for the fact it got anifstream as argument, but you're right that it's probably better to just name it ReadConfigFile.

src/util.cpp
@@ -592,26 +592,41 @@ fs::path GetConfigFile(const std::string& confPath)
return pathConfigFile;
}
+void ArgsManager::ReadConfigStream(fs::ifstream& streamConfig)
+{
+ std::set<std::string> setOptions;
@jnewbery

jnewbery May 31, 2017

Member

Is it possible to add a log to this function to output which file it's reading, or is it too early to start logging?

Sorry - by my earlier comment, I didn't mean you should remove the logging on failure, just that the structure of the function suggested to me that it shouldn't be called recursively.

@kallewoof

kallewoof Jun 1, 2017

Contributor

I am logging before the call to this method now. That should cover it I think.

src/util.cpp
+ mapMultiArgs[strKey].push_back(strValue);
+ }
+}
+
void ArgsManager::ReadConfigFile(const std::string& confPath)
@jnewbery

jnewbery May 31, 2017

Member

I think it'd be clearer to rename this to ReadConfigFiles(). You could remove the confPath argument since it's a property of the ArgsManager class. The three places where this is called can just call gArgs.ReadConfigFiles().

The responsibilities of this function then becomes very clear: read all the config files.

@kallewoof

kallewoof Jun 1, 2017

Contributor

Done!

src/util.cpp
+ if (mapArgs.count("-includeconf")) includeconf = mapArgs["-includeconf"];
+ }
+ if (includeconf != "") {
+ fs::path includeFile = GetConfigFile(includeconf);
@jnewbery

jnewbery May 31, 2017

Member

Why not directly:

fs::ifstream includeConfig(GetConfigFile(includeconf));

(like above)

src/util.cpp
+ if (includeconf != "") {
+ fs::path includeFile = GetConfigFile(includeconf);
+ fs::ifstream includeConfig(includeFile);
+ if (includeConfig.good()) {
@jnewbery

jnewbery May 31, 2017

Member

Can we have an else clause that prints an error message if we fail to open an includeconfig file?

@@ -0,0 +1,43 @@
+#!/usr/bin/env python3
@jnewbery

jnewbery May 31, 2017

Member

It'd be nice to have a couple of other subtests:

  • including --includeconf in the command line argument has no effect
  • including includeconf in an includeconf file has no effect.
@kallewoof

kallewoof Jun 1, 2017 edited

Contributor

I'm not sure what a neat and tidy way to test that would look like. Should I just make multiple classes, one for each, and then call them one at a time in the bottom if __name__ == '__main__':?

@jnewbery

jnewbery Jun 1, 2017

Member

These don't actually need to be separate tests. You can just have the following in setup_chain():

        with open(os.path.join(self.options.tmpdir + "/node0", "relative.conf"), "w", encoding="utf8") as f:
            f.write("uacomment=relative\nincludeconf=relativeofrelative.conf\n")
        with open(os.path.join(self.options.tmpdir + "/node0", "bitcoin.conf"), 'a', encoding='utf8') as f:
            f.write("uacomment=main\nincludeconf=relative.conf\n")
        with open(os.path.join(self.options.tmpdir + "/node0", "commandline.conf"), "w", encoding="utf8") as f:
            f.write("uacomment=commandline\n")
        with open(os.path.join(self.options.tmpdir + "/node0", "relativeofrelative.conf"), "w", encoding="utf8") as f:
            f.write("uacomment=relativeofrelative\n")

and the following line in __init__():

        self.extra_args = [['-includeconf=commandline.conf']]

and then keep the same assert:

        assert subversion.endswith("main; relative)/")

to check that neither comandline.conf nor relativeofrelative.conf have been included.

(I recommend you also update the docstring to describe what's happening)

test/functional/includeconf.py
+ super().setup_chain()
+ # Create additional config file
+ # - tmpdir/node0/relative.conf
+ with open(os.path.join(self.options.tmpdir+"/node0", "relative.conf"), "w", encoding="utf8") as f:
@jnewbery

jnewbery May 31, 2017

Member

nit: spaces around + please

test/functional/includeconf.py
+ self.setup_clean_chain = False
+ self.num_nodes = 1
+
+ def setup_network(self):
@jnewbery

jnewbery May 31, 2017

Member

not required. setup_network() in the base class just calls setup_nodes() when there's only one node.

test/functional/includeconf.py
+ def setup_network(self):
+ self.setup_nodes()
+
+ def run_test (self):
@jnewbery

jnewbery May 31, 2017

Member

no space before open parentheses please.

test/functional/includeconf.py
+ self.setup_nodes()
+
+ def run_test (self):
+ '''
@jnewbery

jnewbery May 31, 2017

Member

My personal preference is to have the description of the test in the module-level doc string (since that's the first thing people see when they open the file).

Contributor

kallewoof commented Jun 1, 2017

Updated and squashed. Unsquashed history.

Member

jnewbery commented Jun 1, 2017

Looks great. Tested ACK bc4f7a4

One suggestion for adding to the testcase. Up to you whether you want to take it.

Member

jtimon commented Jun 1, 2017

This conflicts a little bit with 7246fae

There I use the old ArgsManager::ReadConfigFile(path) which this PR remove.
Could you conserve that method even if it's temporarily unused (although preferrably using it internally like "ArgsManager::ReadConfigFile(fs::ifstream& streamConfig)")?

void ArgsManager::ReadConfigFile(const std::string& confPath)
{
     fs::ifstream streamConfig(GetConfigFile(confPath));
      if (!streamConfig.good())
          return; // No bitcoin.conf file is OK
// ...
}

Or perhaps I should restore it later if this PR gets merged first.
utACK bc4f7a4 beyond that.

+ LOCK(cs_args);
+ ReadConfigFile(includeConfig);
+ } else {
+ LogPrintf("Failed to include configuration file %s\n", includeconf.c_str());
}
}
// If datadir is changed in .conf file:
@jtimon

jtimon Jun 1, 2017 edited

Member

Shouldn't ClearDatadirCache() be called from ArgsManager::ReadConfigFile(fs::ifstream& streamConfig) ?

@kallewoof

kallewoof Jun 5, 2017

Contributor

ReadConfigFile is only called from ReadConfigFiles, which clears the data cache already, so it should be fine I think.

@jtimon

jtimon Jun 5, 2017

Member

Yes, unless you conserve a ArgsManager::ReadConfigFile(const std::string& confPath), then that will presumably call call ArgsManager::ReadConfigFile(fs::ifstream& streamConfig too.

Contributor

kallewoof commented Jun 5, 2017

@jtimon The method was not removed, it was renamed. If you change to ReadConfigFiles you should get the exact same result after this is merged. Let me know if that is not the case!

Member

jtimon commented Jun 5, 2017

The method that is removed is the one that allows you to call with a path, ArgsManager::ReadConfigFile(const std::string& confPath). The new ReadConfigFiles() will work just fine here, but in #8994 I cannot use it, because want I want is precisely to load from a different file (and not allow -includeconf or command line for "custom chainparams" configuration). If I use AgsManager::ReadConfigFile(fs::ifstream& streamConfig) directly from chainparams.cpp, not only I will duplicate code, but also call things like GetConfigFile(), which I would prefer not to call from chainparams.cpp. So my conclusion is that I would just restore ArgsManager::ReadConfigFile(const std::string& confPath). That's why I ask that you maintain it even if you don't need it for anything (but it can also be removed and then restored, it's not a big deal)

jnewbery referenced this pull request Jun 5, 2017

Open

Testchains: Introduce custom chain whose constructor... #8994

5 of 5 tasks complete
Contributor

kallewoof commented Jun 6, 2017 edited

Ohh, damn... It seems the removal of the path may have been premature. What do you think of ReadConfigFiles()ReadConfigFiles(std::string path = "", bool allowIncludes = true) whose default does exactly what ReadConfigFiles does now? @jtimon @jnewbery

Member

jnewbery commented Jun 6, 2017

I think ReadConfigFiles() shouldn't take any arguments and should be responsible for finding and reading all config files. I don't understand why you'd want to read config files from other places in the codebase in #8994. It seems to me to be much simpler to reason about what config is loaded if it all happens in one place.

Member

jtimon commented Jun 6, 2017

@kallewoof yeah, I think that would work too, and you could still call it without parameters. That solution is very simple for me to "restore" on #8994 if people don't like it here. There's no need to slow this down if other people don't like my request. Thank you for offering a good and simple solution to my concern.

@jnewbery I don't want the chain custom parameters to be perceived as "config". They select the chain you will be on, it is mostly intended to create new testnets, and sharing a "testnet config file" for a newly created one could be a useful thing. But perhaps that's something to discuss on #8994 rather than here. I still have it on a separated commit in case people prefer to allow consensus critical parameters to be passed from command line or the other config files that con be loaded.

Member

jnewbery commented Jun 14, 2017

@jtimon

I don't want the chain custom parameters to be perceived as "config".

I actually think having the customchain config file contain general config could be useful. There seemed to be some enthusiasm for #9374 , which is similar in nature - it allows a separate config file for each separate chain.

This is a bit of a sidetrack from this PR though, which I think is a good and useful improvement. This needs rebasing because of a conflict in test_runner.py. Assuming just that needs changing, then I still ACK this. I'll give some more feedback in #9374.

Member

jnewbery commented Jun 14, 2017

alternative: this PR could move the:

fs::ifstream streamConfig(GetConfigFile(confPath));
if (!streamConfig.good())

lines into the new ReadConfigFile() function instead of leaving them in ReadConfigFiles(), and have ReadConfigFile() take a std::string instead of a fs::ifstream&. That would remove some code duplication (since that's called for both the 'base' config file and the -includeconf file).

#9374 could then call ReadConfigFile() exactly as before.

Contributor

kallewoof commented Jun 15, 2017 edited

@jnewbery I would need to either change the return value to be a success flag for streamConfig.good(), or add an additional bool warnOnFailure flag, like I had before, as we currently don't warn for the main config missing, but we do warn for includeconfs.

Rebased, btw.

kallewoof added some commits Apr 24, 2017

@kallewoof kallewoof -includeconf=<path> support in config handler, for including external…
… configuration files. Fixes #10071
d28a638
@kallewoof kallewoof [doc] Added -includeconf CLI help, translation strings, man page entr…
…ies.
6a7b90c
@kallewoof kallewoof [test] Test includeconf parameter. 9e280e3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment