Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConfigParser should never write broken configurations #69909

Open
spaceone mannequin opened this issue Nov 24, 2015 · 12 comments
Open

ConfigParser should never write broken configurations #69909

spaceone mannequin opened this issue Nov 24, 2015 · 12 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@spaceone
Copy link
Mannequin

spaceone mannequin commented Nov 24, 2015

BPO 25723
Nosy @terryjreedy, @vstinner, @bitdancer, @ambv, @spaceone

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2015-11-24.15:31:44.255>
labels = ['type-bug', 'library']
title = 'ConfigParser should never write broken configurations'
updated_at = <Date 2015-11-27.00:18:56.091>
user = 'https://github.com/spaceone'

bugs.python.org fields:

activity = <Date 2015-11-27.00:18:56.091>
actor = 'spaceone'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2015-11-24.15:31:44.255>
creator = 'spaceone'
dependencies = []
files = []
hgrepos = []
issue_num = 25723
keywords = []
message_count = 12.0
messages = ['255270', '255272', '255305', '255324', '255360', '255373', '255374', '255387', '255388', '255400', '255424', '255442']
nosy_count = 5.0
nosy_names = ['terry.reedy', 'vstinner', 'r.david.murray', 'lukasz.langa', 'spaceone']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue25723'
versions = ['Python 3.6']

@spaceone
Copy link
Mannequin Author

spaceone mannequin commented Nov 24, 2015

>>> from configparser import ConfigParser
>>> from io import StringIO
>>> from configparser import ConfigParser
>>> c = ConfigParser()
>>> c.add_section('foo]\nbar=baz\n[bar')
>>> fd = StringIO()
>>> c.write(fd)
>>> print(fd.getvalue())
[foo]
bar=baz
[bar]

User input should always be validated.

At least a ValueError should be raised if add_section() is called with a string containing anything like ']\x00\n[' or any other non-printable string. As this will always create a broken configuration or might lead to ini-injections.

Otherwise ConfigParser cannot be used to write new config files without having deeper knowledge about the implementation.

See also:
http://bugs.python.org/issue23301
http://bugs.python.org/issue20923

@spaceone spaceone mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Nov 24, 2015
@spaceone
Copy link
Mannequin Author

spaceone mannequin commented Nov 24, 2015

I would have expected something like ValueError('A section must not contain any of ...') or at least that the characters are simply stripped.

@terryjreedy
Copy link
Member

This strikes me as an overt bug. .add_section should add one new empty section, not a section with an item and a second section. Since a section name cannot contain \n, I would agree that passing a string containing \n should raise ValueError("Section names cannot contain newlines."). Since anything else without ']' is valid and since even that can be accommodated with a custom section name re, that is the only check that should be done.

@vstinner
Copy link
Member

Terry: "Since anything else without ']' is valid (...)"

A Python script can be used to generate a configuration read by another application. This application can more more strict on the configuration format than Python, so I would prefer to deny '\n', '[' and ']' characters in section names.

I'm not sure that it's ok to modify Python < 3.6 since it can break applications relying on this ugly "feature". I propose to only modify Python 3.6.

If you need strict ConfigParser, you can inherit from the class to override add_section() to add checks on the section name.

@spaceone: Are you interested to work on a patch?

@spaceone
Copy link
Mannequin Author

spaceone mannequin commented Nov 25, 2015

I added also a rejection of '\r' and '\x00':
spaceone/cpython@41d6e27

It's only my opinion but I would prefer to reject all of these non printable characters:
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f[]\x7f'

But as the _parse currently detects them as valid and your opinion is probably a very different one I will not add these characters.

@terryjreedy
Copy link
Member

Newline (\n) and possibly \x00, if it necessarily causes an actual problem, are the only characters that we might reject by default. Rejecting anything else is unwarrented editorializing about what people 'should' use. As you said, people who want something stricter can replace .add_section. (This would be most useful in a group or corporate setting which might, for instance, want to severely limit the character set allowed.)

In particular, I showed in bpo-20923 how to replace .SECTCRE to make "[Test[2]_foo]" yield "Test[2]_foo". The OP for that issue filed it after seeing an application that used such section names and they are not at all unreasonable. Python should be able to continue writing .ini files for that application as well as any others.

Victor, your point that even a minimal change could break working code is a good one. It suggests to me that we should maybe do nothing. This issue is motivated by a theoretical principle "User input should always be validated" that is not a Python design principle (what is valid, who decides), not by actual problems in the field.

@spaceone
Copy link
Mannequin Author

spaceone mannequin commented Nov 25, 2015

Isn't is an actual problem in the field? We had a vulnerability in our code due to this as we only sanitized the config values and didn't recognized that add_section() does no validation of input.

@terryjreedy
Copy link
Member

Can you explain how passing \n createda vulnerability (to who, doing what) that flagging \n would prevent? Your opening example (nicely presented, by the way) shows that passing \n allows one to do with one call what would otherwise take (in the case of the example) three calls. But the result is identical, so the shortcut seems harmless.

@bitdancer
Copy link
Member

The issue would be if the section name came from user input. Then an attacker could craft a section name that would result in inserting arbitrary names and values into the config file.

@terryjreedy
Copy link
Member

We all know that blindly inserting external data into a database can be a bad idea. But raising ValueError if the data contains \n barely scratches the surface of a real defense. The external data should be checked before passing it to .add_section or as part of a derived method in a subclass. I already suggested the possibility of allowing only a restricted set of letter characters. Such a check comes after defending against the possibility of someone submitting 'a'*1000000 as, in this case, a section name.

configparser is permissive by design, not by accident. The un-abbreviated verbose re for ConfigParser.SECTCRE say so.
(?P<header>[^]]+) # very permissive!

@bitdancer
Copy link
Member

I view this as similar to the corresponding issue with email headers, where we fixed a similar security issue. The special danger of \n is that it allows you to create a *new* header, or in this case section, with an arbitrary value, possibly overriding an existing section and thus changing the behavior of the program in an exploitable way. This is *far* easier to exploit than the ability to introduce arbitrary data into the section name itself. Good security involves concentric rings of defense, and one should almost always be more secure by default when it has a small usability impact. In this case, there is no legitimate use for \n in a section name, so the only usability impact would be if some weird program out there was actually making use of this for some reason, against all reasonable logic :). Which is why we are suggesting changing it only in 3.6.

\x00 is problematic (though somewhat less so) for the same reason, as many file readers will treat it as equivalent to end of line and allow a similar exploit. \r, \f, and \x1c-\x1e should also be blocked, but otherwise we should probably ignore non-printables for backward compatibility reasons (there we move further into the usability impact area).

@spaceone
Copy link
Mannequin Author

spaceone mannequin commented Nov 27, 2015

Of course both of you have reasonable arguments.
For compatibility with overridden SECTRE attributes it should not raise ValueError for characters like [ and ]. (too bad that SECTRE is a public attribute otherwise it could also be used to validate the name (SECTRE.match('[%s]')). What if somebody changed SECTRE to multiline? Then even rejecting '\n' would break compatibility.
But: How often does this happen? In open source projects it seems none. A nullege.com and google search exposed that no project does this.

Terry, I completely agree with your argument "that blindly inserting external input into a database is bad idea". But in the real world it happens that there are many applications out which doesn't validate what they pass into .add_section(). (Do you want me to give you a list of python projects which are either broken or vulnerable?). In my opinion this is dangerous, as well as not validating HTTP/Mail/MIME headers for such characters and so on.
What's the goal of python here? Giving programmers nice utilities which have security considerations in its software design by default or giving everything up to the programmer which is forced to never trust the stdlib and always have to read the source code it uses?

As I understand when I read the documentation is that config parser is loosely based on M$ INI files and as the name says it is for configuration files. Usually(!) configuration files are human readable files editable with an editor. Disallowing non-printable characters would have been the best option in the first release of config parser.
From my experience it is good to restrict things from the beginning and make them overrideable to be more relaxed if this is really needed.

And regarding bpo-20923: I think it would be a great feature to include the code change instead of changing the documentation. In my research about add_section() I found some projects which uses URI's as section name. As you know the WWW is evolving and actually http://[::1]/ is a valid URI nowadays. If this would be changed these implementations will not have to overwrite SECTRE in the future and they also won't run into that bug one day.

I adapted my commit to only disallow \r \n and \x00. [ ] are allowed for customization of SECTRE.
spaceone/cpython@a0cdb85

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants