New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Py3] ipa-server-install fixes (working NTP, DS, CA install steps) #382
Conversation
| @@ -886,7 +886,8 @@ def encode(self, val): | |||
| elif isinstance(val, tuple): | |||
| return tuple(self.encode(m) for m in val) | |||
| elif isinstance(val, dict): | |||
| dct = dict((self.encode(k), self.encode(v)) for k, v in val.items()) | |||
| # key in dict must be str not bytes | |||
| dct = dict((unicode(k), self.encode(v)) for k, v in val.items()) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be just k, LDAPEntry already guarantees the keys are unicode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| @@ -499,7 +500,7 @@ def __spawn_instance(self): | |||
| # Directory server | |||
| config.set("CA", "pki_ds_ldap_port", "389") | |||
| config.set("CA", "pki_ds_password", self.dm_password) | |||
| config.set("CA", "pki_ds_base_dn", self.basedn) | |||
| config.set("CA", "pki_ds_base_dn", six.text_type(self.basedn)) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the real issue here that DN does not have a well-defined __str__()? See how it's done for DNSName for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see difference between str implementation in DNSName and DN, I will investigate OptionParser, maybe it changed in Py3 to enforce strings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks that it really takes only string in py3
| @@ -582,14 +582,15 @@ def __update_dse_ldif(self): | |||
| 'dse.ldif' | |||
| ) | |||
|
|
|||
| with tempfile.NamedTemporaryFile(delete=False) as new_dse_ldif: | |||
| with tempfile.NamedTemporaryFile( | |||
| mode='w', delete=False) as new_dse_ldif: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default mode is w+b, so I guess if you just want to change from binary to text mode, mode should be w+t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'w' mode is fine. LDIFWriter never reads from its input file. There is no point opening new_dse_ldif for read and write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
| f = os.open(pwd_file, os.O_CREAT | os.O_RDWR) | ||
| os.write(f, password) | ||
| os.close(f) | ||
| with io.open(pwd_file, 'w') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, but could you be more explicit and use wt as mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'wt' is an uncommon way to spell out write in text mode. 'w' is the canonical way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicit is better than implicit, except when it isn't, okay.
| @@ -209,7 +209,7 @@ def _ldap_mod(self, ldif, sub_dict=None, raise_on_err=True, | |||
|
|
|||
| if dm_password: | |||
| [pw_fd, pw_name] = tempfile.mkstemp() | |||
| os.write(pw_fd, dm_password) | |||
| os.write(pw_fd, dm_password.encode()) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make also this code more Pythonic and replace mkstemp() with NamedTemporaryFile()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for getting rid of mkstemp() and os.write(). grep found a couple of more places where a raw FD and os.write(). Feel free to address them in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments on @HonzaCholasta change requests.
| for f in (agent_file, chain_file): | ||
| try: | ||
| os.remove(f.name) | ||
| except IOError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.remove() raises an OSError, not an IOError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks
| os.close(agent_fd) | ||
| agent_file = tempfile.NamedTemporaryFile( | ||
| mode="w", dir=paths.VAR_LIB_IPA, delete=False) | ||
| agent_file.write(self.admin_password) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use with statement here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To use with I have to indent or split a lot of code in that functions, is it worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a single function, so IMO it's OK.
|
|
||
| # create a temp pem file storing the CA chain | ||
| (chain_fd, chain_file) = tempfile.mkstemp(dir=paths.VAR_LIB_IPA) | ||
| os.close(chain_fd) | ||
| chain_file = tempfile.NamedTemporaryFile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same as above
| mode='w', delete=False) as pw_file: | ||
| pw_name = pw_file.name | ||
| pw_file.write(dm_password) | ||
| auth_parms = ["-x", "-D", "cn=Directory Manager", "-y", pw_name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the auth_param part out of the context manager. I got confused by it for a sec. It looked like you omitted pw_file.flush() like in other places. After a closer look it turns out the file name is used after the file has been closed.
with tempfile.NamedTemporaryFile(mode='w', delete=False) as pw_file:
pw_file.write(dm_password)
pw_name = pw_file.name
auth_parms = ["-x", "-D", "cn=Directory Manager", "-y", pw_name]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
@HonzaCholasta @tiran bump for review |
| @@ -135,7 +135,7 @@ def call_handler(_handler, *args, **kwargs): | |||
| cookie = os.environ.pop('CERTMONGER_CA_COOKIE', None) | |||
| if cookie is not None: | |||
| try: | |||
| context = json.loads(cookie) | |||
| context = json.loads(ipautil.decode_json(cookie)) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not necessary because in Py3 cookie is always an instance of str here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| @@ -522,7 +522,7 @@ def _write_schema(self, fingerprint): | |||
|
|
|||
| def _read(self, path): | |||
| with zipfile.ZipFile(self._file, 'r') as zf: | |||
| return json.loads(zf.read(path).decode('utf-8')) | |||
| return json.loads(decode_json(zf.read(path))) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to guess the encoding here, as ZipFile.read() always returns bytes. The original code is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bytes doesn't say anything about encoding, but AFAIK it is part of IPA so we can assume it is always 'utf-8' encoded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's right.
| @@ -1101,7 +1101,8 @@ def __request(self, name, args): | |||
| ) | |||
|
|
|||
| try: | |||
| response = json_decode_binary(json.loads(response.decode('ascii'))) | |||
| response = json_decode_binary( | |||
| json.loads(ipautil.decode_json(response))) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
response is always bytes, so we don't need to guess the encoding here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite curious why it is only ascii in original code
| @@ -969,7 +970,7 @@ def forward(self, *args, **options): | |||
| json_vault_data = decoding_ctx.cipher_op(wrapped_vault_data)\ | |||
| + decoding_ctx.digest_final() | |||
|
|
|||
| vault_data = json.loads(json_vault_data) | |||
| vault_data = json.loads(decode_json(json_vault_data)) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like json_vault_data is always an instance of bytes, so instead of guessing the encoding, we should always decode it as UTF-8. Ditto for encoding json_vault_data in vault_archive.forward().
| if isinstance(value, bytes): | ||
| if six.PY3: | ||
| value = value.decode('raw_unicode_escape') | ||
| if isinstance(value, six.binary_type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if you kept bytes here, it's the same as six.binary_type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| @@ -493,7 +493,7 @@ def marshal(self, result, error, _id=None, | |||
|
|
|||
| def unmarshal(self, data): | |||
| try: | |||
| d = json.loads(data) | |||
| d = json.loads(ipautil.decode_json(data)) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data is read from wsgi.input and according to PEP 3333, reading from wsgi.input yields bytes, so we don't need to guess here and can always decode as UTF-8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are assuming that bytes always means UTF-8 encoded text. Although it's the common case for JSON, it is not necessarily correct. JSON can also be encoded as UTF-16 LE/BE or UTF-32 LE/BE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently support only UTF-8:
$ echo '{"method": "ping", "params": [[], {}]}' | iconv -f utf-8 -t utf-16 | curl https://$HOSTNAME/ipa/json --cacert /etc/ipa/ca.crt --negotiate -u : -e https://$HOSTNAME/ipa/json -H 'Content-Type: application/json' --data-binary @-
{
"error": {
"code": 909,
"data": {
"error": "No JSON object could be decoded"
},
"message": "Invalid JSON-RPC request: No JSON object could be decoded",
"name": "JSONError"
},
"id": null,
"principal": "admin@ABC.IDM.LAB.ENG.BRQ.REDHAT.COM",
"result": null,
"version": "4.4.90.dev201701171151+git7aee0d4"
}
Adding support for other encodings is not in the scope of this effort, and should be based on HTTP headers rather than guesswork anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tiran do you agree to hardcode everything into 'utf-8' for now? (we can open a ticket to support more encodings)
|
We had discussion with @HonzaCholasta, and IPA framework only expects that everything is UTF-8 only, so even in case we parse UTF-32 properly, framework will answer by UTF-8 encoding. Maybe we should rather validate if input is utf-8 compatible earlier and send proper public error. |
|
Let's reiterate. It's obviously wrong to assume that request data such as JSON are encoded as UTF-8. It can be just any encoding. Outside the Western world, JSON and XML are often encoded as UTF-16. That doesn't mean we have to support other encodings than UTF-8 right now. It's fine to restrict requests and responses to UTF-8 as only supported encoding. The check should be performed early in the WSGI layer. A client sends can send the request type as part of the content type. The framework should check for the presence of an encoding hint and refuse encodings that are |
|
I left json_decode() only in places where external JSON request are coming, all other internal usages of JSON should be in utf-8 encoding. EDIT: only for dogtag interaction Other requests are out of scope of this PR and should be resolved in separate tickets/PRs |
|
I have opened ticket https://fedorahosted.org/freeipa/ticket/6624 to track the matter. |
Py3 expect bytes to be writed using os.write. Instead of that using io module is more pythonic. https://fedorahosted.org/freeipa/ticket/4985
NamedTemporaryfile can be used in more pythonic way and file can be opened in textual mode that is required with PY3 https://fedorahosted.org/freeipa/ticket/4985
ldif parser uses file in text mode, so we have to open it in text mode in py3 Also values passed to parser should be bytes https://fedorahosted.org/freeipa/ticket/4985
Code in ipautlis works with text, so tempfiles should be open in textmode otherwise TypeErrors are raised https://fedorahosted.org/freeipa/ticket/4985
basedn is DN object it has to be converted to string before it can be used with config parser https://fedorahosted.org/freeipa/ticket/4985
config parser writes data as text so CA/KRA should be opened in textual mode otherwise type errors are raised from installer https://fedorahosted.org/freeipa/ticket/4985
With Python3 files must be opened in textual mode to write text, and best practise is to use fileobject instead fo os.write() and manual encodig https://fedorahosted.org/freeipa/ticket/4985
There is no need to encode hostname to bytes. UTF-8 characters must be encoded in different format in URL anyway and it causes only error in Py3. String must be unicode to support Py2. https://fedorahosted.org/freeipa/ticket/4985
There is no 'dict' attribute in 'msg', but 'msg' attribute is dict-like object in both py2/3, so it can be used instead. https://fedorahosted.org/freeipa/ticket/4985
With bytes as attribute name pyldap raises type error https://fedorahosted.org/freeipa/ticket/4985
Using raw pyldap interface we have to keep vaules as bytes. Is easier to migrate to ipaldap and use strings without decoding and encoding. https://fedorahosted.org/freeipa/ticket/4985
Method escape_filter_chars() requires string as parameter instead of bytes. 'value_to_utf8' returns bytes thus this code has to be removed. https://fedorahosted.org/freeipa/ticket/4985
due perfomance improvement in e4930b3 we have to decode value before it can be used in DN() constructor. https://fedorahosted.org/freeipa/ticket/4985
'read_ca' and 'create_ca' have no logging when exception happened and it masks real reason why it failed.
In py 3.5 json.loads requires to have string as input, all bytes must be decoded. Note: python 3.6 supports bytes for json.loads() https://fedorahosted.org/freeipa/ticket/4985
Let's review all password temp files later.
This PR should allow to install server with py3
Up to "Done configuring certificate server (pki-tomcatd)."
I decided to split this effort as it looke that there will be too many patches