New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3: WIP IO/Strings/Text Model/… #16919

Closed
wants to merge 34 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@xmo-odoo
Collaborator

xmo-odoo commented May 10, 2017

  • base64 is a bytes->bytes encoding
  • csv works on native strings (so bytes in P2 and text in P3), inconvenient when using io.*IO
  • werkzeug's Response can take bytes or text, text will be encoded with the response's charset (default UTF-8)
  • subprocess is bytes-based: its result(s) are bytestrings, p.stdout and p.stderr (if PIPE) are binary (byte-based) streams
  • pickle.dumps(object) -> bytes, pickle.loads(bytes) -> object
  • -> can/should (ir.values).value become a binary field? yeeees
  • are view archs text or bytes? Seem mostly manipulated as text…
    decided they're text, we'll see how that works out
  • convert_xml is whack still whack but removed the random encode/decode calls

@xmo-odoo xmo-odoo requested a review from rco-odoo May 10, 2017

@C3POdoo C3POdoo added the RD label May 11, 2017

@xmo-odoo xmo-odoo changed the title from Python 3: fridge horror (IO/Strings/Text Model/…) to Python 3: WIP IO/Strings/Text Model/… May 17, 2017

@xmo-odoo xmo-odoo requested review from aab-odoo and tde-banana-odoo May 17, 2017

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo May 17, 2017

Collaborator

Tiens ya des merdes dans les mails alors je tag @tde-banana-odoo

Collaborator

xmo-odoo commented May 17, 2017

Tiens ya des merdes dans les mails alors je tag @tde-banana-odoo

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo May 17, 2017

Collaborator

Nota: as of the round() compatibility shim, base installation and tests work, though it's not yet known whether the result works

Collaborator

xmo-odoo commented May 17, 2017

Nota: as of the round() compatibility shim, base installation and tests work, though it's not yet known whether the result works

@dreispt

This comment has been minimized.

Show comment
Hide comment
@dreispt

dreispt May 18, 2017

Contributor

@xmo-odoo I'm curious: are all Odoo dependencies py3 compatible?

Contributor

dreispt commented May 18, 2017

@xmo-odoo I'm curious: are all Odoo dependencies py3 compatible?

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo May 19, 2017

Collaborator

@dreispt Following previous fixes, yes, so far as I could find: we replaced pypdf by pypdf2, pychart was removed with the old reports and we switched out the vendored html2text for an external one, the rest seems to work, at this point I can access the web client and website running on Python 3.5 (and 3.6 with some additional fixes as there have been bytecode changes).

I think there's still jcconv which is not P3-compatible, but it's an optional dependency for the POS hardware which IIRC currently runs only on 8 anyway.

Collaborator

xmo-odoo commented May 19, 2017

@dreispt Following previous fixes, yes, so far as I could find: we replaced pypdf by pypdf2, pychart was removed with the old reports and we switched out the vendored html2text for an external one, the rest seems to work, at this point I can access the web client and website running on Python 3.5 (and 3.6 with some additional fixes as there have been bytecode changes).

I think there's still jcconv which is not P3-compatible, but it's an optional dependency for the POS hardware which IIRC currently runs only on 8 anyway.

@odony

This comment has been minimized.

Show comment
Hide comment
@odony

odony May 19, 2017

Contributor

@xmo-odoo regarding the safe_eval opcodes, I think POP_EXCEPT does belong to _SAFE_OPCODES. It's a counterpart to SETUP_EXCEPT, which only makes sense in statement contexts.
By the way, it seems POP_BLOCK belongs to _SAFE_OPCODES too, for the same reason. People have been adding opcodes in the wrong sets.

It may be a good time to cleanup our current opcodes sets:

_CONST_OPCODES should only include opcodes used to declare literals: LOAD_CONST, RETURN_VALUE and BUILD_* (of which we currently seem to be missing BUILD_SET).

_EXPR_OPCODES includes all opcodes that can be used to operate on literal values: all unary and binary operators, and slicing opcodes: (UNARY_*, BINARY_*, INPLACE_*, SLICE+* (py2) and BUILD_SLICE(py3), STORE_MAP (py2).

_SAFE_OPCODES includes the rest of the opcodes that we currently consider safe.

Not sure if you want to do this in this PR, or if we make a separate task for it...

Contributor

odony commented May 19, 2017

@xmo-odoo regarding the safe_eval opcodes, I think POP_EXCEPT does belong to _SAFE_OPCODES. It's a counterpart to SETUP_EXCEPT, which only makes sense in statement contexts.
By the way, it seems POP_BLOCK belongs to _SAFE_OPCODES too, for the same reason. People have been adding opcodes in the wrong sets.

It may be a good time to cleanup our current opcodes sets:

_CONST_OPCODES should only include opcodes used to declare literals: LOAD_CONST, RETURN_VALUE and BUILD_* (of which we currently seem to be missing BUILD_SET).

_EXPR_OPCODES includes all opcodes that can be used to operate on literal values: all unary and binary operators, and slicing opcodes: (UNARY_*, BINARY_*, INPLACE_*, SLICE+* (py2) and BUILD_SLICE(py3), STORE_MAP (py2).

_SAFE_OPCODES includes the rest of the opcodes that we currently consider safe.

Not sure if you want to do this in this PR, or if we make a separate task for it...

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo May 19, 2017

Collaborator

@odony

Not sure if you want to do this in this PR, or if we make a separate task for it...

Yeah there's a lot of stuff to split out of this one, right now it's only possible to run on P3 using this branch so most fixes were added to it, but there's lots of stuff which is peripheral or unrelated to the text model and I need to move to separate branches and PR, the opcodes changes is one of them.

Collaborator

xmo-odoo commented May 19, 2017

@odony

Not sure if you want to do this in this PR, or if we make a separate task for it...

Yeah there's a lot of stuff to split out of this one, right now it's only possible to run on P3 using this branch so most fixes were added to it, but there's lots of stuff which is peripheral or unrelated to the text model and I need to move to separate branches and PR, the opcodes changes is one of them.

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo May 19, 2017

Collaborator

Nota: split the non-text-model stuff out of this PR, created 4 other PR for broad areas of changes (possibly overly broad but 20 different PRs seems a bit much).

This means as of that update the branch doesn't run on Python 3 anymore, I haven't kept track but I think 5f43625 was the last commit of the pre-cleaning version which should work on 3.5 and 3.6 (modulo extant bugs in the text model & al)

Collaborator

xmo-odoo commented May 19, 2017

Nota: split the non-text-model stuff out of this PR, created 4 other PR for broad areas of changes (possibly overly broad but 20 different PRs seems a bit much).

This means as of that update the branch doesn't run on Python 3 anymore, I haven't kept track but I think 5f43625 was the last commit of the pre-cleaning version which should work on 3.5 and 3.6 (modulo extant bugs in the text model & al)

@xmo-odoo xmo-odoo referenced this pull request May 19, 2017

Merged

Python 3 API changes #17110

@rco-odoo

This comment has been minimized.

Show comment
Hide comment
@rco-odoo

rco-odoo May 19, 2017

Member

@xmo-odoo in my own branch, I adapted jcconv to Python 3 at odoo-dev@60bb796.

Member

rco-odoo commented May 19, 2017

@xmo-odoo in my own branch, I adapted jcconv to Python 3 at odoo-dev@60bb796.

@xmo-odoo xmo-odoo added the Framework label Jun 6, 2017

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo Aug 10, 2017

Collaborator

🔔

Collaborator

xmo-odoo commented Aug 10, 2017

🔔

@sbidoul

This comment has been minimized.

Show comment
Hide comment
@sbidoul

sbidoul Aug 17, 2017

Contributor

@xmo-odoo I did some (admittedly superficial) tests and things are looking good.

The only issue I noticed so far is that the first ctrl-c does not stop the server (need to do it twice for a forced shutdown). Also requirements.txt needs updating.

What is the status of this PR? Is it going to be merged for v11? What will be the python version(s) officially supported by v11?

Contributor

sbidoul commented Aug 17, 2017

@xmo-odoo I did some (admittedly superficial) tests and things are looking good.

The only issue I noticed so far is that the first ctrl-c does not stop the server (need to do it twice for a forced shutdown). Also requirements.txt needs updating.

What is the status of this PR? Is it going to be merged for v11? What will be the python version(s) officially supported by v11?

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo Aug 17, 2017

Collaborator

The only issue I noticed so far is that the first ctrl-c does not stop the server (need to do it twice for a forced shutdown).

Under Python 2, Python 3 or both?

Also requirements.txt needs updating.

Which dependencies does this branch require updating?

What is the status of this PR? Is it going to be merged for v11? What will be the python version(s) officially supported by v11?

Ask @antonylesuisse or @odony, not me.

Collaborator

xmo-odoo commented Aug 17, 2017

The only issue I noticed so far is that the first ctrl-c does not stop the server (need to do it twice for a forced shutdown).

Under Python 2, Python 3 or both?

Also requirements.txt needs updating.

Which dependencies does this branch require updating?

What is the status of this PR? Is it going to be merged for v11? What will be the python version(s) officially supported by v11?

Ask @antonylesuisse or @odony, not me.

@sbidoul

This comment has been minimized.

Show comment
Hide comment
@sbidoul

sbidoul Aug 17, 2017

Contributor

@xmo-odoo the CTRL-C issue is with Python 3.5 on Ubuntu 16.04. It works normally with Python 2.

Regarding requirements.txt, besides jcconv, there is wsgiref which does not pip install under python 3. pip install -e . works fine. Anyway I guess it's a good moment to update requirements.txt with the latest stable versions of all dependencies?

Contributor

sbidoul commented Aug 17, 2017

@xmo-odoo the CTRL-C issue is with Python 3.5 on Ubuntu 16.04. It works normally with Python 2.

Regarding requirements.txt, besides jcconv, there is wsgiref which does not pip install under python 3. pip install -e . works fine. Anyway I guess it's a good moment to update requirements.txt with the latest stable versions of all dependencies?

@xmo-odoo

This comment has been minimized.

Show comment
Hide comment
@xmo-odoo

xmo-odoo Aug 17, 2017

Collaborator

the CTRL-C issue is with Python 3.5 on Ubuntu 16.04. It works normally with Python 2.

OK I'll check it out, that might actually be more of a semantics change (e.g. reshuffle of the exceptions hierarchy and we're catching the wrong one).

Regarding requirements.txt, besides jcconv, there is wsgiref which does not pip install under python 3.

I'm guessing we'll remove both from requirements.

Both of these issues are more general Python 3 compatibility concerns not specifically related to the IO model, so I'll fix them separately (probably directly in master).

Collaborator

xmo-odoo commented Aug 17, 2017

the CTRL-C issue is with Python 3.5 on Ubuntu 16.04. It works normally with Python 2.

OK I'll check it out, that might actually be more of a semantics change (e.g. reshuffle of the exceptions hierarchy and we're catching the wrong one).

Regarding requirements.txt, besides jcconv, there is wsgiref which does not pip install under python 3.

I'm guessing we'll remove both from requirements.

Both of these issues are more general Python 3 compatibility concerns not specifically related to the IO model, so I'll fix them separately (probably directly in master).

xmo-odoo added some commits Apr 28, 2017

[FIX] P3: fix base64 and StringIO uses
* StringIO removed from stdlib, replace with io
* try to correctly handle BytesIO/StringIO (one is for bytes the other
  is for text)
* fix base64: Python 3 removed bytes-encoding and bytes-bytes
  codecs (via #encode) so replace all calls to str.encode('base64'),
  also b64encode is a bytes->bytes conversion so attempt to properly
  handle that

issue #8530
[FIX] P3: text model types
* remove references to basestring & unicode (use relevant pycompat
  helpers)
* remove some str calls (either entirely or replaced by relevant
  helper, either text or native)
* use better API to avoid unnecessary conversions
* remove some XML declarations in views
[FIX] P3: text/binary handling
* unichr -> pycompat (builtin removed from Python 3, ``chr`` has
  become unicode-aware)
* Psycopg2 bytea values (binary fields) are returned as memoryviews in
  P3 but buffers in P2
* Both bytes and (text) strings are conventionally iterable in
  P3 (they have a __iter__ method), so fix up the exclusion pattern
  for flatten
[FIX] P3: encoding & decoding mess while manipulating XML
* IrValues.value should be a binary field, since pickle encodes
  objects to bytes
* Which means the "action" string must be encoded before storage and
  decoded before use
* Remove complete garbage mess of random encodings during XML
  conversion, this probably still requires serious thinking and fixing
  but it seems to run for now (yay)
[FIX] P3: "mogrify" returns bytes, decode it
Otherwise when the subquery is somehow injected in the parent
query (didn't find where but...) it's b-prefixed and treated as a
binary rather than a sub-query somehow, which fails.

xmo-odoo added some commits May 17, 2017

[FIX] P3: WSGI applications must return *bytes*
XMLRPC dumps returns text rather than bytes (for reasons unknown),
which does not suit Werkzeug's WSGI server, we need to encode the
body.

For simplicity and future-proofiness, return Werkzeug responses
instead, Werkzeug takes care of encoding text and passing bytes
through.
[FIX] P3: (ir.values).value is bytes
or buffer/memoryview when coming directly from XML
[FIX] P3: assets bundle text model
* remains is a list of text
* sep is text
* json.dumps generates text (surprise!) so beware with that
[FIX] P3: text model in various mail bits
No formal decision there, and the stdlib API seems like a PITA as it's
based around native strings, not bytes and not text.

* try to use cross-model API instead of random encode/decode
* unicode all the mail_template things
* formataddr *requires* a text tuple now

  In P2, formataddr((False, False)) "worked" (generated "False
  <False>"), in P3 it will fail and blow up.
* attachments are quite definitely bytes
* remove try_coerce_ascii because it's garbage
[FIX] P3: (ir.attachment)._index works on binary data
Thus needs a binary pattern and to decode it into ascii text.

Also improve the regex.
[FIX] P3: JSON is an object<->text encoding
It doesn't load bytes, and it doesn't dump to bytes.
[FIX] P3: file modes
In P3, files *must* be opened in binary mode when reading or writing
binary data, text mode (default) generates text and thus decodes on
the fly.
[FIX] P3: url_for text model
Have url_for work entirely on text internally, especially since
Request.path is text and so are lang names.
[FIX] Response wrapping in http
Explicitly check for both bytes and text results to wrap in a Response
object, as bytes is a string_type in P2 but not P3.
[FIX] P3: phantomjs & text model
In P3, can only append integers to bytearray, not bytes, also fix some
byte/string comparisons & encoding crap

odony added a commit to odoo-dev/odoo that referenced this pull request Aug 19, 2017

odony added a commit to odoo-dev/odoo that referenced this pull request Aug 20, 2017

@odony

This comment has been minimized.

Show comment
Hide comment
@odony

odony Aug 20, 2017

Contributor

@xmo-odoo thanks for the heavy lifting! 👍

I've rebased and merged this on saas-17 (as of 1492670) after a series of extra fixes: some typos, some new places that needed to be fixed, and a few other P3-preparation requirements (including removing the pycompat helpers for dict.keys/values/items, not useful anymore)

We can consider the huge work by @xmo-odoo completed 🎉
Saas-17 should be the first pseudo-stable branch that fully runs on Python 3.5, and @KangOl should be able to forward-port this to master next week (good luck with that!).

Now there are still quite a few leaks at the seams...
@xmo-odoo @antonylesuisse @julienlegros I expect we'll notice leftover conversion errors in the coming weeks, and the setup of a P3-enabled runbot will be essential to track them down.

In addition, here are things I noticed are still flaky with Python 3.5:

  • As @sbidoul reported, shutting down the server with Ctrl-C takes longer than with P2 (it eventually finishes, if you don't force it). I think this comes from the 3.5 changes for PEP 475, which means that workers interrupted by SIGINT silently retry their syscalls (mainly select.select for WorkerHTTP or time.sleep for WorkerCron), instead of stopping. PEP475 highlights strategies for our use case. I tried naively raising a KeyboardInterrupt or socket.error(errno.EINTR, 'retry syscall') in the signal handler, but this requires catching the exception in various places, not just those 2, and may not be the best option. Any ideas?
  • Preloading a database with -d <db> in multithreaded mode fails unless you're in shell mode, at least if you have more than base installed. Looks like the registry gets loaded before all classes are imported or something, so registry construction bails with a missing parent model 🤔 No problem if you don't preload, or in multi-process (even w/ preload).
  • I still have doubts about parts of bc7dce2.
    The first part of this patch initially crashed trying to mix text (base_url) and bytes (query_string). Using URL.replace() only delayed the problem until the reconstruction of the url string. I've tried using iri_to_uri() to convert the query_string to text, and it seems to works well in P2 and P3. Maybe we could use that systematically convert request.query_string when we need to work with URLs?

FWIW, a lame non-scientific mini-benchmark on my laptop, for a cold start install of new database w/ CRM, best of 3:

  • Python 2.7.12: 65.154s
  • Python 3.5.2: 66.270s
Contributor

odony commented Aug 20, 2017

@xmo-odoo thanks for the heavy lifting! 👍

I've rebased and merged this on saas-17 (as of 1492670) after a series of extra fixes: some typos, some new places that needed to be fixed, and a few other P3-preparation requirements (including removing the pycompat helpers for dict.keys/values/items, not useful anymore)

We can consider the huge work by @xmo-odoo completed 🎉
Saas-17 should be the first pseudo-stable branch that fully runs on Python 3.5, and @KangOl should be able to forward-port this to master next week (good luck with that!).

Now there are still quite a few leaks at the seams...
@xmo-odoo @antonylesuisse @julienlegros I expect we'll notice leftover conversion errors in the coming weeks, and the setup of a P3-enabled runbot will be essential to track them down.

In addition, here are things I noticed are still flaky with Python 3.5:

  • As @sbidoul reported, shutting down the server with Ctrl-C takes longer than with P2 (it eventually finishes, if you don't force it). I think this comes from the 3.5 changes for PEP 475, which means that workers interrupted by SIGINT silently retry their syscalls (mainly select.select for WorkerHTTP or time.sleep for WorkerCron), instead of stopping. PEP475 highlights strategies for our use case. I tried naively raising a KeyboardInterrupt or socket.error(errno.EINTR, 'retry syscall') in the signal handler, but this requires catching the exception in various places, not just those 2, and may not be the best option. Any ideas?
  • Preloading a database with -d <db> in multithreaded mode fails unless you're in shell mode, at least if you have more than base installed. Looks like the registry gets loaded before all classes are imported or something, so registry construction bails with a missing parent model 🤔 No problem if you don't preload, or in multi-process (even w/ preload).
  • I still have doubts about parts of bc7dce2.
    The first part of this patch initially crashed trying to mix text (base_url) and bytes (query_string). Using URL.replace() only delayed the problem until the reconstruction of the url string. I've tried using iri_to_uri() to convert the query_string to text, and it seems to works well in P2 and P3. Maybe we could use that systematically convert request.query_string when we need to work with URLs?

FWIW, a lame non-scientific mini-benchmark on my laptop, for a cold start install of new database w/ CRM, best of 3:

  • Python 2.7.12: 65.154s
  • Python 3.5.2: 66.270s

@odony odony closed this Aug 20, 2017

@xmo-odoo xmo-odoo deleted the odoo-dev:master-p3-io-xmo branch Aug 21, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment