ConfigParser should never write broken configurations #69909

spaceone · 2015-11-24T15:31:44Z

BPO	25723
Nosy	@terryjreedy, @vstinner, @bitdancer, @ambv, @spaceone

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2015-11-24.15:31:44.255>
labels = ['type-bug', 'library']
title = 'ConfigParser should never write broken configurations'
updated_at = <Date 2015-11-27.00:18:56.091>
user = 'https://github.com/spaceone'

bugs.python.org fields:

activity = <Date 2015-11-27.00:18:56.091>
actor = 'spaceone'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2015-11-24.15:31:44.255>
creator = 'spaceone'
dependencies = []
files = []
hgrepos = []
issue_num = 25723
keywords = []
message_count = 12.0
messages = ['255270', '255272', '255305', '255324', '255360', '255373', '255374', '255387', '255388', '255400', '255424', '255442']
nosy_count = 5.0
nosy_names = ['terry.reedy', 'vstinner', 'r.david.murray', 'lukasz.langa', 'spaceone']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue25723'
versions = ['Python 3.6']

spaceone · 2015-11-24T15:31:44Z

>>> from configparser import ConfigParser
>>> from io import StringIO
>>> from configparser import ConfigParser
>>> c = ConfigParser()
>>> c.add_section('foo]\nbar=baz\n[bar')
>>> fd = StringIO()
>>> c.write(fd)
>>> print(fd.getvalue())
[foo]
bar=baz
[bar]

User input should always be validated.

At least a ValueError should be raised if add_section() is called with a string containing anything like ']\x00\n[' or any other non-printable string. As this will always create a broken configuration or might lead to ini-injections.

Otherwise ConfigParser cannot be used to write new config files without having deeper knowledge about the implementation.

See also:
http://bugs.python.org/issue23301
http://bugs.python.org/issue20923

spaceone · 2015-11-24T15:32:52Z

I would have expected something like ValueError('A section must not contain any of ...') or at least that the characters are simply stripped.

terryjreedy · 2015-11-24T23:42:09Z

This strikes me as an overt bug. .add_section should add one new empty section, not a section with an item and a second section. Since a section name cannot contain \n, I would agree that passing a string containing \n should raise ValueError("Section names cannot contain newlines."). Since anything else without ']' is valid and since even that can be accommodated with a custom section name re, that is the only check that should be done.

vstinner · 2015-11-25T07:57:38Z

Terry: "Since anything else without ']' is valid (...)"

A Python script can be used to generate a configuration read by another application. This application can more more strict on the configuration format than Python, so I would prefer to deny '\n', '[' and ']' characters in section names.

I'm not sure that it's ok to modify Python < 3.6 since it can break applications relying on this ugly "feature". I propose to only modify Python 3.6.

If you need strict ConfigParser, you can inherit from the class to override add_section() to add checks on the section name.

@spaceone: Are you interested to work on a patch?

spaceone · 2015-11-25T14:42:22Z

I added also a rejection of '\r' and '\x00':
spaceone/cpython@41d6e27

It's only my opinion but I would prefer to reject all of these non printable characters:
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f[]\x7f'

But as the _parse currently detects them as valid and your opinion is probably a very different one I will not add these characters.

terryjreedy · 2015-11-25T18:03:47Z

Newline (\n) and possibly \x00, if it necessarily causes an actual problem, are the only characters that we might reject by default. Rejecting anything else is unwarrented editorializing about what people 'should' use. As you said, people who want something stricter can replace .add_section. (This would be most useful in a group or corporate setting which might, for instance, want to severely limit the character set allowed.)

In particular, I showed in bpo-20923 how to replace .SECTCRE to make "[Test[2]_foo]" yield "Test[2]_foo". The OP for that issue filed it after seeing an application that used such section names and they are not at all unreasonable. Python should be able to continue writing .ini files for that application as well as any others.

Victor, your point that even a minimal change could break working code is a good one. It suggests to me that we should maybe do nothing. This issue is motivated by a theoretical principle "User input should always be validated" that is not a Python design principle (what is valid, who decides), not by actual problems in the field.

spaceone · 2015-11-25T18:14:43Z

Isn't is an actual problem in the field? We had a vulnerability in our code due to this as we only sanitized the config values and didn't recognized that add_section() does no validation of input.

terryjreedy · 2015-11-25T20:47:58Z

Can you explain how passing \n createda vulnerability (to who, doing what) that flagging \n would prevent? Your opening example (nicely presented, by the way) shows that passing \n allows one to do with one call what would otherwise take (in the case of the example) three calls. But the result is identical, so the shortcut seems harmless.

bitdancer · 2015-11-25T21:36:15Z

The issue would be if the section name came from user input. Then an attacker could craft a section name that would result in inserting arbitrary names and values into the config file.

terryjreedy · 2015-11-26T04:56:35Z

We all know that blindly inserting external data into a database can be a bad idea. But raising ValueError if the data contains \n barely scratches the surface of a real defense. The external data should be checked before passing it to .add_section or as part of a derived method in a subclass. I already suggested the possibility of allowing only a restricted set of letter characters. Such a check comes after defending against the possibility of someone submitting 'a'*1000000 as, in this case, a section name.

configparser is permissive by design, not by accident. The un-abbreviated verbose re for ConfigParser.SECTCRE say so.
(?P<header>[^]]+) # very permissive!

bitdancer · 2015-11-26T16:56:07Z

I view this as similar to the corresponding issue with email headers, where we fixed a similar security issue. The special danger of \n is that it allows you to create a *new* header, or in this case section, with an arbitrary value, possibly overriding an existing section and thus changing the behavior of the program in an exploitable way. This is *far* easier to exploit than the ability to introduce arbitrary data into the section name itself. Good security involves concentric rings of defense, and one should almost always be more secure by default when it has a small usability impact. In this case, there is no legitimate use for \n in a section name, so the only usability impact would be if some weird program out there was actually making use of this for some reason, against all reasonable logic :). Which is why we are suggesting changing it only in 3.6.

\x00 is problematic (though somewhat less so) for the same reason, as many file readers will treat it as equivalent to end of line and allow a similar exploit. \r, \f, and \x1c-\x1e should also be blocked, but otherwise we should probably ignore non-printables for backward compatibility reasons (there we move further into the usability impact area).

spaceone · 2015-11-27T00:18:55Z

Of course both of you have reasonable arguments.
For compatibility with overridden SECTRE attributes it should not raise ValueError for characters like [ and ]. (too bad that SECTRE is a public attribute otherwise it could also be used to validate the name (SECTRE.match('[%s]')). What if somebody changed SECTRE to multiline? Then even rejecting '\n' would break compatibility.
But: How often does this happen? In open source projects it seems none. A nullege.com and google search exposed that no project does this.

Terry, I completely agree with your argument "that blindly inserting external input into a database is bad idea". But in the real world it happens that there are many applications out which doesn't validate what they pass into .add_section(). (Do you want me to give you a list of python projects which are either broken or vulnerable?). In my opinion this is dangerous, as well as not validating HTTP/Mail/MIME headers for such characters and so on.
What's the goal of python here? Giving programmers nice utilities which have security considerations in its software design by default or giving everything up to the programmer which is forced to never trust the stdlib and always have to read the source code it uses?

As I understand when I read the documentation is that config parser is loosely based on M$ INI files and as the name says it is for configuration files. Usually(!) configuration files are human readable files editable with an editor. Disallowing non-printable characters would have been the best option in the first release of config parser.
From my experience it is good to restrict things from the beginning and make them overrideable to be more relaxed if this is really needed.

And regarding bpo-20923: I think it would be a great feature to include the code change instead of changing the documentation. In my research about add_section() I found some projects which uses URI's as section name. As you know the WWW is evolving and actually http://[::1]/ is a valid URI nowadays. If this would be changed these implementations will not have to overwrite SECTRE in the future and they also won't run into that bug one day.

I adapted my commit to only disallow \r \n and \x00. [ ] are allowed for customization of SECTRE.
spaceone/cpython@a0cdb85

spaceone mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Nov 24, 2015

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ConfigParser should never write broken configurations #69909

ConfigParser should never write broken configurations #69909

spaceone mannequin commented Nov 24, 2015

spaceone mannequin commented Nov 24, 2015

spaceone mannequin commented Nov 24, 2015

terryjreedy commented Nov 24, 2015

vstinner commented Nov 25, 2015

spaceone mannequin commented Nov 25, 2015

terryjreedy commented Nov 25, 2015

spaceone mannequin commented Nov 25, 2015

terryjreedy commented Nov 25, 2015

bitdancer commented Nov 25, 2015

terryjreedy commented Nov 26, 2015

bitdancer commented Nov 26, 2015

spaceone mannequin commented Nov 27, 2015

ConfigParser should never write broken configurations #69909

ConfigParser should never write broken configurations #69909

Comments

spaceone mannequin commented Nov 24, 2015

spaceone mannequin commented Nov 24, 2015

spaceone mannequin commented Nov 24, 2015

terryjreedy commented Nov 24, 2015

vstinner commented Nov 25, 2015

spaceone mannequin commented Nov 25, 2015

terryjreedy commented Nov 25, 2015

spaceone mannequin commented Nov 25, 2015

terryjreedy commented Nov 25, 2015

bitdancer commented Nov 25, 2015

terryjreedy commented Nov 26, 2015

bitdancer commented Nov 26, 2015

spaceone mannequin commented Nov 27, 2015