Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Py3] ipa-server-install fixes (working NTP, DS, CA install steps) #382

Closed
wants to merge 16 commits into from
Closed

[Py3] ipa-server-install fixes (working NTP, DS, CA install steps) #382

wants to merge 16 commits into from

Conversation

MartinBasti
Copy link
Contributor

@MartinBasti MartinBasti commented Jan 9, 2017

This PR should allow to install server with py3

Up to "Done configuring certificate server (pki-tomcatd)."

I decided to split this effort as it looke that there will be too many patches

@@ -886,7 +886,8 @@ def encode(self, val):
elif isinstance(val, tuple):
return tuple(self.encode(m) for m in val)
elif isinstance(val, dict):
dct = dict((self.encode(k), self.encode(v)) for k, v in val.items())
# key in dict must be str not bytes
dct = dict((unicode(k), self.encode(v)) for k, v in val.items())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be just k, LDAPEntry already guarantees the keys are unicode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -499,7 +500,7 @@ def __spawn_instance(self):
# Directory server
config.set("CA", "pki_ds_ldap_port", "389")
config.set("CA", "pki_ds_password", self.dm_password)
config.set("CA", "pki_ds_base_dn", self.basedn)
config.set("CA", "pki_ds_base_dn", six.text_type(self.basedn))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the real issue here that DN does not have a well-defined __str__()? See how it's done for DNSName for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see difference between str implementation in DNSName and DN, I will investigate OptionParser, maybe it changed in Py3 to enforce strings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks that it really takes only string in py3

@@ -582,14 +582,15 @@ def __update_dse_ldif(self):
'dse.ldif'
)

with tempfile.NamedTemporaryFile(delete=False) as new_dse_ldif:
with tempfile.NamedTemporaryFile(
mode='w', delete=False) as new_dse_ldif:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default mode is w+b, so I guess if you just want to change from binary to text mode, mode should be w+t.

Copy link
Member

@tiran tiran Jan 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'w' mode is fine. LDIFWriter never reads from its input file. There is no point opening new_dse_ldif for read and write.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

f = os.open(pwd_file, os.O_CREAT | os.O_RDWR)
os.write(f, password)
os.close(f)
with io.open(pwd_file, 'w') as f:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, but could you be more explicit and use wt as mode?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'wt' is an uncommon way to spell out write in text mode. 'w' is the canonical way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit is better than implicit, except when it isn't, okay.

@@ -209,7 +209,7 @@ def _ldap_mod(self, ldif, sub_dict=None, raise_on_err=True,

if dm_password:
[pw_fd, pw_name] = tempfile.mkstemp()
os.write(pw_fd, dm_password)
os.write(pw_fd, dm_password.encode())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make also this code more Pythonic and replace mkstemp() with NamedTemporaryFile()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for getting rid of mkstemp() and os.write(). grep found a couple of more places where a raw FD and os.write(). Feel free to address them in another PR.

Copy link
Member

@tiran tiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments on @HonzaCholasta change requests.

tiran
tiran previously requested changes Jan 11, 2017
for f in (agent_file, chain_file):
try:
os.remove(f.name)
except IOError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.remove() raises an OSError, not an IOError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thanks

os.close(agent_fd)
agent_file = tempfile.NamedTemporaryFile(
mode="w", dir=paths.VAR_LIB_IPA, delete=False)
agent_file.write(self.admin_password)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use with statement here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To use with I have to indent or split a lot of code in that functions, is it worth it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a single function, so IMO it's OK.


# create a temp pem file storing the CA chain
(chain_fd, chain_file) = tempfile.mkstemp(dir=paths.VAR_LIB_IPA)
os.close(chain_fd)
chain_file = tempfile.NamedTemporaryFile(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with statement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same as above

mode='w', delete=False) as pw_file:
pw_name = pw_file.name
pw_file.write(dm_password)
auth_parms = ["-x", "-D", "cn=Directory Manager", "-y", pw_name]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move the auth_param part out of the context manager. I got confused by it for a sec. It looked like you omitted pw_file.flush() like in other places. After a closer look it turns out the file name is used after the file has been closed.

with tempfile.NamedTemporaryFile(mode='w', delete=False) as pw_file:
    pw_file.write(dm_password)
pw_name = pw_file.name
auth_parms = ["-x", "-D", "cn=Directory Manager", "-y", pw_name]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@MartinBasti MartinBasti changed the title [WIP] Py3 ipa-server-install fixes [Py3] ipa-server-install fixes (working NTP, DS, CA install steps) Jan 11, 2017
@MartinBasti
Copy link
Contributor Author

@HonzaCholasta @tiran bump for review

@@ -135,7 +135,7 @@ def call_handler(_handler, *args, **kwargs):
cookie = os.environ.pop('CERTMONGER_CA_COOKIE', None)
if cookie is not None:
try:
context = json.loads(cookie)
context = json.loads(ipautil.decode_json(cookie))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary because in Py3 cookie is always an instance of str here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -522,7 +522,7 @@ def _write_schema(self, fingerprint):

def _read(self, path):
with zipfile.ZipFile(self._file, 'r') as zf:
return json.loads(zf.read(path).decode('utf-8'))
return json.loads(decode_json(zf.read(path)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to guess the encoding here, as ZipFile.read() always returns bytes. The original code is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytes doesn't say anything about encoding, but AFAIK it is part of IPA so we can assume it is always 'utf-8' encoded

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's right.

@@ -1101,7 +1101,8 @@ def __request(self, name, args):
)

try:
response = json_decode_binary(json.loads(response.decode('ascii')))
response = json_decode_binary(
json.loads(ipautil.decode_json(response)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

response is always bytes, so we don't need to guess the encoding here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm quite curious why it is only ascii in original code

@@ -969,7 +970,7 @@ def forward(self, *args, **options):
json_vault_data = decoding_ctx.cipher_op(wrapped_vault_data)\
+ decoding_ctx.digest_final()

vault_data = json.loads(json_vault_data)
vault_data = json.loads(decode_json(json_vault_data))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like json_vault_data is always an instance of bytes, so instead of guessing the encoding, we should always decode it as UTF-8. Ditto for encoding json_vault_data in vault_archive.forward().

if isinstance(value, bytes):
if six.PY3:
value = value.decode('raw_unicode_escape')
if isinstance(value, six.binary_type):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if you kept bytes here, it's the same as six.binary_type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -493,7 +493,7 @@ def marshal(self, result, error, _id=None,

def unmarshal(self, data):
try:
d = json.loads(data)
d = json.loads(ipautil.decode_json(data))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data is read from wsgi.input and according to PEP 3333, reading from wsgi.input yields bytes, so we don't need to guess here and can always decode as UTF-8.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are assuming that bytes always means UTF-8 encoded text. Although it's the common case for JSON, it is not necessarily correct. JSON can also be encoded as UTF-16 LE/BE or UTF-32 LE/BE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently support only UTF-8:

$ echo '{"method": "ping", "params": [[], {}]}' | iconv -f utf-8 -t utf-16 | curl https://$HOSTNAME/ipa/json --cacert /etc/ipa/ca.crt --negotiate -u : -e https://$HOSTNAME/ipa/json -H 'Content-Type: application/json' --data-binary @-
{
    "error": {
        "code": 909, 
        "data": {
            "error": "No JSON object could be decoded"
        }, 
        "message": "Invalid JSON-RPC request: No JSON object could be decoded", 
        "name": "JSONError"
    }, 
    "id": null, 
    "principal": "admin@ABC.IDM.LAB.ENG.BRQ.REDHAT.COM", 
    "result": null, 
    "version": "4.4.90.dev201701171151+git7aee0d4"
}

Adding support for other encodings is not in the scope of this effort, and should be based on HTTP headers rather than guesswork anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tiran do you agree to hardcode everything into 'utf-8' for now? (we can open a ticket to support more encodings)

@MartinBasti
Copy link
Contributor Author

We had discussion with @HonzaCholasta, and IPA framework only expects that everything is UTF-8 only, so even in case we parse UTF-32 properly, framework will answer by UTF-8 encoding. Maybe we should rather validate if input is utf-8 compatible earlier and send proper public error.

@tiran
Copy link
Member

tiran commented Jan 19, 2017

Let's reiterate. It's obviously wrong to assume that request data such as JSON are encoded as UTF-8. It can be just any encoding. Outside the Western world, JSON and XML are often encoded as UTF-16.

That doesn't mean we have to support other encodings than UTF-8 right now. It's fine to restrict requests and responses to UTF-8 as only supported encoding. The check should be performed early in the WSGI layer. A client sends can send the request type as part of the content type. The framework should check for the presence of an encoding hint and refuse encodings that are encoding.lower() not in {'utf8', 'utf-8'}.

@MartinBasti
Copy link
Contributor Author

MartinBasti commented Jan 19, 2017

I left json_decode() only in places where external JSON request are coming, all other internal usages of JSON should be in utf-8 encoding.

EDIT: only for dogtag interaction

Other requests are out of scope of this PR and should be resolved in separate tickets/PRs

@tiran
Copy link
Member

tiran commented Jan 19, 2017

I have opened ticket https://fedorahosted.org/freeipa/ticket/6624 to track the matter.

Py3 expect bytes to be writed using os.write. Instead of that using
io module is more pythonic.

https://fedorahosted.org/freeipa/ticket/4985
NamedTemporaryfile can be used in more pythonic way and file can be
opened in textual mode that is required with PY3

https://fedorahosted.org/freeipa/ticket/4985
ldif parser uses file in text mode, so we have to open it in text mode
in py3

Also values passed to parser should be bytes

https://fedorahosted.org/freeipa/ticket/4985
Code in ipautlis works with text, so tempfiles should be open in
textmode otherwise TypeErrors are raised

https://fedorahosted.org/freeipa/ticket/4985
basedn is DN object it has to be converted to string before it can be
used with config parser

https://fedorahosted.org/freeipa/ticket/4985
config parser writes data as text so CA/KRA should be opened in textual
mode otherwise type errors are raised from installer

https://fedorahosted.org/freeipa/ticket/4985
With Python3 files must be opened in textual mode to write text, and
best practise is to use fileobject instead fo os.write() and manual
encodig

https://fedorahosted.org/freeipa/ticket/4985
There is no need to encode hostname to bytes. UTF-8 characters must be
encoded in different format in URL anyway and it causes only error in
Py3. String must be unicode to support Py2.

https://fedorahosted.org/freeipa/ticket/4985
There is no 'dict' attribute in 'msg', but 'msg' attribute is dict-like object
in both py2/3, so it can be used instead.

https://fedorahosted.org/freeipa/ticket/4985
Using raw pyldap interface we have to keep vaules as bytes. Is easier to
migrate to ipaldap and use strings without decoding and encoding.

https://fedorahosted.org/freeipa/ticket/4985
Method escape_filter_chars() requires string as parameter instead of
bytes. 'value_to_utf8' returns bytes thus this code has to be removed.

https://fedorahosted.org/freeipa/ticket/4985
due perfomance improvement in e4930b3
we have to decode value before it can be used in DN() constructor.

https://fedorahosted.org/freeipa/ticket/4985
'read_ca' and 'create_ca' have no logging when exception happened and it
masks real reason why it failed.
In py 3.5 json.loads requires to have string as input, all bytes must be
decoded.

Note: python 3.6 supports bytes for json.loads()

https://fedorahosted.org/freeipa/ticket/4985
@tiran tiran dismissed their stale review January 24, 2017 11:41

Let's review all password temp files later.

@tiran tiran added the ack Pull Request approved, can be merged label Jan 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ack Pull Request approved, can be merged pushed Pull Request has already been pushed
Projects
None yet
3 participants