Be systematic about parsing and validating hostnames and addresses #12165
See also the later discussion in #12108 regarding defining @jimi-c said that we don't want to support this; the code in this PR already rejects the notation. (Aside: if desired, however, "proper" support could easily be added on the basis of this PR, i.e. to detect the user@ prefix, remove it, and set
@amenonsen agreed, the support could be added if someone wanted to, in which case we would support it.
Force-pushed from e62f1b0 to 69da4d6.
I did a bit more testing and implemented more precise validation of hostname labels (e.g. to enforce the DNS rule that a label can't end with a hyphen, i.e. At this point, my suggestion would be to merge this during the 2.0 alpha period so that we can see if there are any more obscure/undocumented/broken inventory definitions in the wild (such as
Force-pushed from 69da4d6 to 3f07986.
FYI I checked this PR on my large inventory by dumping everything into a json file. Dump from
Force-pushed from 78ecc48 to fab850e.
@chrrrles I believe you were looking at this PR. Did you come to any conclusions?
range_tests = {
    '192.0.2.[3:10]': ['192.0.2.[3:10]', None],
    '192.0.2.[3:10]:23': ['192.0.2.[3:10]', 23],
    'abcd:ef98::7654:[1:9]': ['abcd:ef98::7654:[1:9]', None],
We should also test ranges of IPv6 addresses with a port, i.e.:
'[abcd:ef98::7654:[1:9]]:2222': ['[abcd:ef98::7654:[1:9]]:2222', None]
(I've added this test in the latest version.)
Force-pushed from 9f4b98a to 9d29bc9.
I've squashed this PR down to three commits (but if it's merged, I request that it not be squashed further; I believe there's value in having the hostname parsing changes separated for better understanding and future debugging).
Force-pushed from e2bfc33 to b49f17f.
@chrrrles mentioned on IRC that this patch apparently caused an integration test failure. I had a quick look. The test in question is from
First, there's a trivial missing import in If I change this to use, e.g.
I can understand supporting Unicode word characters in inventory hostnames for convenience (though it's not very convenient in practice, because you have to set
Force-pushed from b49f17f to dd7b218.
@chrrrles @abadger I've rebased, fixed the import error, made the (BTW, I can't help but feel a bit pleased that the sum total of changes required to make the new code support unicode inventory hostnames was two lines.) AFAIK, I've addressed all the substantive comments on this PR, and it's ready to merge now.
Force-pushed from 70b9f7c to 7c11740.
This adds a parse_address(pattern) utility function that returns (host,port), and uses it wherever we accept IPv4 and IPv6 addresses and hostnames (or host patterns): the inventory parser and the add_host action plugin. It also introduces a more extensive set of unit tests that supersedes the old add_host unit tests (which didn't actually test add_host, but only the parsing function).
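For readers who want a feel for the (host, port) split described above, here is a minimal sketch. It is not the code from this PR: the name parse_address_sketch and both regular expressions are hypothetical, it assumes Python 3, and it only splits the string without validating the host part.

```python
import re

# Hypothetical, simplified patterns (assumptions, not the PR's own).
# "[host]" or "[host]:port"; the brackets wrap the entire host part.
_BRACKETED = re.compile(r'^\[(?P<host>.+)\](?::(?P<port>\d+))?$')

# "host" or "host:port", where a colon may appear in the host only
# inside a bracketed range such as "[3:10]".
_HOST_PORT = re.compile(
    r'^(?P<host>(?:[^:\[\]]|\[[^\[\]]+\])+)(?::(?P<port>\d+))?$'
)


def parse_address_sketch(pattern):
    """Split an address or host pattern into (host, port).

    port is an int, or None when the pattern does not carry one.
    """
    m = _BRACKETED.match(pattern) or _HOST_PORT.match(pattern)
    if m is None:
        # Several colons outside brackets: treat the whole string as a
        # bare IPv6 address (or IPv6 range) with no port attached.
        if pattern.count(':') >= 2:
            return (pattern, None)
        raise ValueError('cannot parse address: %r' % pattern)
    port = m.group('port')
    return (m.group('host'), int(port) if port is not None else None)


print(parse_address_sketch('192.0.2.[3:10]:23'))
print(parse_address_sketch('[abcd:ef98::7654:[1:9]]:2222'))
```

In this sketch a bare IPv6 address is never split at its last colon, so a port on an IPv6 address has to use the bracketed form.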
Labels must start with an alphanumeric character, may contain alphanumeric characters or hyphens, but must not end with a hyphen. We enforce those rules, but allow underscores wherever hyphens are accepted, and allow alphanumeric ranges anywhere. We relax the definition of "alphanumeric" to include Unicode characters even though such inventory hostnames cannot be used in practice unless an ansible_ssh_host is set for each of them. We still don't enforce length restrictions—the fact that we have to accept ranges makes it more complex, and it doesn't seem especially worthwhile.
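The label rules above can be sketched as a single regular expression. This is an illustration under stated assumptions rather than the pattern used in this PR: the names ALNUM, LABEL, HOSTNAME and is_valid_hostname are hypothetical, Python 3 is assumed (so \w is Unicode-aware for str patterns), "underscores wherever hyphens are accepted" is read literally to mean a label may not start or end with an underscore, and alphanumeric ranges are left out to keep it short.

```python
import re

# Assumption: Python 3, where \w matches Unicode word characters.
# [^\W_] is "a word character that is not an underscore", i.e. a
# (Unicode) letter or digit.
ALNUM = r'[^\W_]'

# A label starts and ends with an alphanumeric character; hyphens and
# underscores are only allowed in between.
LABEL = r'{a}(?:[\w-]*{a})?'.format(a=ALNUM)

# A hostname is one or more labels separated by single dots.
HOSTNAME = re.compile(r'{l}(?:\.{l})*'.format(l=LABEL))


def is_valid_hostname(name):
    """Return True if every dot-separated label follows the rules."""
    return HOSTNAME.fullmatch(name) is not None


for candidate in ('foo-bar.example.com', 'foo_bar', 'foo-', '-foo'):
    print(candidate, is_valid_hostname(candidate))
```

As in the description, no length limits are enforced here.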
1. The test did "name: '{{hostnames}}.{{item}}'" inside a with_sequence loop, which didn't do what was intended: it expanded hostnames into an array, appended ".1", and set name to the resulting string. This can be converted to a simple with_items loop.
2. Some of the entries in hostnames contained punctuation characters, which I see no reason to support in inventory hostnames anyway.
3. Once the add_host failures are fixed, the playbook later fails when the unicode hostnames are interpolated into debug output in ssh.py due to an encoding error. This is only one of the many places that may fail when using unicode inventory hostnames; we work around it by providing an ansible_ssh_host setting.
Force-pushed from 7c11740 to 88a20e7.
Hi @amenonsen - thanks for hunting down the unicode bug and expanding test_addresses. The code looks good, merging!
Be systematic about parsing and validating hostnames and addresses
This adds a parse_address(pattern) utility function that returns
(host,port), and uses it wherever we accept IPv4 and IPv6
addresses and hostnames (or host patterns).
This is a proof of concept for how I think it should be done. Some changes are needed: for example, we may want to be a bit more restrictive about hostnames (e.g. "foo-" should be disallowed), and a few other details may need to be cleaned up. But this is a solid foundation, and it should be backwards-compatible as well.
See also discussion in #11747
I welcome feedback/reviews/testing.
cc: @abadger @srvg @ryanpetrello