configparser support for reading from strings and dictionaries #53697
Comments
Overview It's a fairly common need in configuration parsing to take configuration from a string or a Python data structure (most commonly, a dictionary). The attached patch introduces two new methods to RawConfigParser that serve this purpose: readstring and readdict. In the process, two behavioral bugs were detected and fixed. Detailed information about the patch
Test changes:
Documentation changes:
|
There goes the patch. |
Patch updated after review by Ezio Melotti. To answer a common question that came up in the review: all atypical names and implementation details are there due to consistency with existing configparser code, e.g.:
API won't change so this has to remain that way. Exceptions may be refactored in one go at a later stage. |
Although you say this is fairly common, I haven't heard of anyone using or requesting this type of feature. Do you have any real-world use cases for this? Before we start adding more read methods I think we should know who wants them and why. I'm not sure duplicates should raise exceptions. To me, the current behavior of using the last read section/option is fine. It's predictable and it works. Halting a program's operation due to duplicate sections/options seems a bit harsh to me. |
Good questions, thanks! The answers will come in handy for documentation and later hype :)
READING CONFIGURATION FROM A DATA STRUCTURE
This is all about templating a decent set of default values. The major use case I'm using this for (with a homebrew SafeConfigParser subclass at the moment) is to provide in one place a set of defaults for the whole configuration. The so-called
[name-server] [workflow-manager] [legacy-integration] # there were about 15 of these
[company1-report] [company2-report] # and so on for ~10 entries As you can see, in both examples
I personally like the dictionary reading method but this is a matter of taste. Plus, .fromstring() is already used in unit tests :)
DUPLICATE OPTION VALIDATION
Firstly, I'd like to stress that this validation does NOT mean that we cannot update keys once they appear in configuration. Duplicate option detection works only while parsing a single file, string or dictionary. In this case duplicates are a configuration error and should be reported to the user. You are right that for a programmer, accepting the last value provided is acceptable. In this case, though, the concern should be the user, who might not feel the same. If their configuration is ambiguous, it's best to use the Zen: "In the face of ambiguity, refuse the temptation to guess." This is very much the case for large configuration files (take /etc/ssh/sshd_config or any real-life ftpd config, etc.), where users might miss the fact that one option is uncommented in the body or thrown in at the end of the file by another admin or even by the users themselves. Users might also be unaware of the case-insensitivity. These two problems are even more likely to cause headaches for the dictionary reading algorithm, where there actually isn't an order in the keys within a section and you can specify a couple of values that represent the same key because of the case-insensitivity. Plus, this is going to be even more visible once we introduce mapping protocol access, when you can add a whole section with keys using the dictionary syntax. Another argument is that there is already section duplicate validation but it doesn't work when reading from files. This means that the user might add two sections of the same name with contradicting options.
SUMMARY
In terms of validation, after your remark and some thought, I think that the best path may be to let programmers choose during parser initialization whether they want validation or not.
This would also be a good place to include section duplicate validation during file reading. Should I provide an updated patch? After a couple of years of experience with external customers configuring software, I find it better for the software to aid me in customer support. The best solution is one where users can help themselves. And customers (and we ourselves, too!) do stupid things all the time. So specifying a default set of sane values AND checking for duplicates within the same section helps with that. |
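For readers following along, the opt-in validation proposed above can be sketched with the `strict` argument of the configparser module that later shipped in Python 3.2. The comment above does not name the argument, so the name comes from the released module, not from this patch revision; the `[server]` section and its values are made up for illustration.

```python
import configparser

# Illustrative input: "port" and "Port" collide because option names
# are case-insensitive by default.
DUPLICATED = """\
[server]
port = 8080
Port = 9090
"""

# Validation switched off: the last value read silently wins.
lenient = configparser.ConfigParser(strict=False)
lenient.read_string(DUPLICATED)
last_value = lenient["server"]["port"]

# Validation switched on: the same input is rejected while parsing.
strict_parser = configparser.ConfigParser(strict=True)
rejected = None
try:
    strict_parser.read_string(DUPLICATED)
except configparser.DuplicateOptionError as exc:
    rejected = exc

print(last_value)
print(rejected)
```

The check applies only within a single read call, so a later source can still override an option that an earlier source set.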
Reading from a string is certainly fairly common, though I'm pretty happy with using an io.StringIO for that; it seems reasonable and straightforward. I've never stumbled over the need to "read" from dictionaries as described. |
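A minimal sketch of the two approaches being compared here, assuming the Python 3.2+ method names `read_file` and `read_string` (the original patch called the latter `readstring`). The `[paths]` section is made up for illustration.

```python
import configparser
import io

TEXT = "[paths]\nhome = /srv/app\n"

# Approach 1: wrap the string in a file-like object and use the file reader.
via_file = configparser.ConfigParser()
via_file.read_file(io.StringIO(TEXT))

# Approach 2: hand the string to the parser directly.
via_string = configparser.ConfigParser()
via_string.read_string(TEXT)

same = via_file["paths"]["home"] == via_string["paths"]["home"]
print(same)
```

Both routes produce the same parsed result; the dedicated method mainly saves the wrapping step.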
Corrected a simple mistake in the patch. |
Updated patch after discussion on #python-dev:
|
FTR, some people questioned the purpose of read_dict(). Let me summarize this very briefly here:
|
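The defaults use case described earlier in the thread (one place holding sane values for many sections, with user configuration layered on top) might look roughly like this with `read_dict()` as it exists in Python 3.2+. The section names are taken from the example above; the options and values are hypothetical.

```python
import configparser

# Hypothetical defaults, kept in one place as a plain dictionary.
DEFAULTS = {
    "name-server": {"host": "localhost", "port": "53"},
    "workflow-manager": {"host": "localhost", "port": "8000"},
}

# Hypothetical user configuration that overrides a single option.
USER_CONFIG = """\
[name-server]
host = ns1.example.com
"""

parser = configparser.ConfigParser()
parser.read_dict(DEFAULTS)        # load the defaults first
parser.read_string(USER_CONFIG)   # then let the user's values win

print(parser["name-server"]["host"])
print(parser["name-server"]["port"])
print(parser.getint("workflow-manager", "port"))
```

Unlike `defaults={...}`, which feeds a single shared default section, this gives each section its own defaults.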
Rietveld review link: http://codereview.appspot.com/1924042/edit |
I agree that the existing defaults={...} should never have been added to the stdlib. It made sense in the originating application, but should have been implemented differently to keep application-specific behavior out of what eventually was added to the stdlib. Will think further about the rest of this when I'm on my own computer and can read & play with the patch in a more usable environment. |
Patch updated after review by Ezio Melotti and Éric Araujo. Thanks guys. |
(Apparently I don't have the right permissions on Rietveld.)
I think this should have been several separate patches:
Don't change that at this point, but please consider smaller chunks in |
Updated patch after review by Fred Drake. Thanks, it was terrific! Status:
Corrected where applicable. Is it OK if the one-sentence summary is occasionally longer than one line? Check out DuplicateSectionError: could it have a summary as complete as this that would fit on a single line? On a similar note, an inconsistency of configparser.py is that sometimes a blank line is placed after the docstring and sometimes there is none. How would you want this to look in the end?
Are you sure about that?
Updated. All in all, some major unit test lipsticking should be done as a separate patch.
Corrected, even made my Vim show me these kinds of formatting problems. I also corrected a couple of these which were there before the change.
Good point, thanks.
Corrected.
All of these arguments are new in trunk so there is no backwards compatibility here to think about. The arguments that are not new (defaults and dict_type) are placed before the asterisk.
I will, thanks. |
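The "before the asterisk" convention mentioned above can be illustrated with a small stand-in class. `SketchParser` is hypothetical and is not the real configparser class; `strict` and `delimiters` are used only as examples of newer, keyword-only arguments.

```python
class SketchParser:
    """Hypothetical stand-in used only to show the signature layout."""

    def __init__(self, defaults=None, dict_type=dict, *,
                 strict=True, delimiters=("=", ":")):
        # Older arguments stay positional for backwards compatibility;
        # everything after the bare * must be passed by keyword.
        self.defaults = dict_type(defaults or {})
        self.strict = strict
        self.delimiters = tuple(delimiters)


# Older arguments positionally, newer ones by keyword: accepted.
ok = SketchParser({"a": "1"}, dict, strict=False)

# Passing a newer argument positionally is rejected with TypeError.
try:
    SketchParser({"a": "1"}, dict, False)
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(ok.strict, positional_rejected)
```

Keeping new arguments keyword-only leaves room to add or reorder them later without breaking callers.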
Ah, forgot to remind you that I don't have commit privileges yet. |
It’s a one-line summary, not one sentence. PEP-257 has all the details you’re looking for, and more. |
    def __init__(self, section, option):
        Error.__init__(
            self,
            "Option %r in section %r already exists" % (option, section))
        self.section = section
        self.option = option
        self.args = (section, option) |
Patch updated. All docstrings now have a one-line summary. All multiline docstrings now have a newline character before the closing """. No method docstrings now include any additional newlines between them and the All read_* methods now have a source= attribute. read_file defaults to <???>, As for Duplicate*Error, I've added source= and lineno= to both. Didn't add line= The
Documentation and unit tests updated. BTW, if PS. I made Vim show me too long lines with a delicate red background. Way better |
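As a rough sketch of what the `source=` and `lineno=` additions give a caller, the snippet below inspects a `DuplicateOptionError` raised by the configparser module in Python 3.2+. The `[report]` input and the `"<example>"` source label are made up for illustration.

```python
import configparser

TEXT = """\
[report]
title = Q1
title = Q2
"""

parser = configparser.ConfigParser()
caught = None
try:
    # source= labels where the data came from in error messages.
    parser.read_string(TEXT, source="<example>")
except configparser.DuplicateOptionError as exc:
    caught = exc

if caught is not None:
    # The exception carries enough context to point the user at the problem.
    print(caught.section, caught.option, caught.source, caught.lineno)
```

With this information an application can tell the user which input and which line to fix instead of only reporting that a duplicate exists.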
Patch committed on py3k branch in r83889. |