Fix #18: encode() only if type is unicode (only for Py2) #23

Closed
wants to merge 26 commits into from

Conversation

bernhardkaindl
Collaborator

@bernhardkaindl bernhardkaindl commented Apr 24, 2023

Fix #18:

Rationale: https://github.com/ydirson/xenserver-python-libs/blob/master/xcp/xmlunwrap.py extracts XML elements from XML, and changing the XML test to bytes is wrong because XML text elements are defined to be encoded text, with UTF-8 as the standard encoding.

Binary data is not legal XML content: https://stackoverflow.com/questions/17301940/encoding-binary-data-within-xml-are-there-better-alternatives-than-base64
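
A minimal sketch of the idea in the PR title (encode only when the value is Python 2 `unicode`). The helper name `get_text_sketch` and the sample document are hypothetical; this is not the code in `xcp/xmlunwrap.py`, only an illustration of the approach under those assumptions:

```python
import sys
from xml.dom import minidom


def get_text_sketch(element):
    """Hypothetical helper: join the text child nodes of a minidom element."""
    text = "".join(
        node.data
        for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )
    # Python 2: minidom returns `unicode`, so encode it to the native `str`.
    # Python 3: `str` is already text, so no encode() call is made.
    if sys.version_info[0] == 2 and not isinstance(text, str):
        text = text.encode("utf-8")
    return text.strip()


# Made-up sample document, used only for illustration.
doc = minidom.parseString("<root><name> xcp </name></root>")
name_element = doc.getElementsByTagName("name")[0]
result = get_text_sketch(name_element)
```

With this shape, callers get the native `str` type on both interpreters and never have to compare against `b"..."` literals on Python 3.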

Changes:

Optional: Also pushed 3 small cleanup commits to fix warnings of code checking tools

ydirson and others added 22 commits January 20, 2023 17:45
Use of `unicode` needed to be immediately handled, but a few checks
relying on `str` could become insufficient in python2 with the larger
usage of unicode strings.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
…conversion

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
…s to

open() as this is considered best practice.

(cherry picked from cpython commit 6cef076ba5edbfa42239924951d8acbb087b3b19)

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
…fication

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
…ated

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Running tests on python3 did reveal some of them.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
There is no guarantee about the ordering of dict elements, and tests compare
results derived from enumerating a dict element.  We could have used an
OrderedDict to store the formulae and get a predictable output order, but
just considering the output as a set seems better.

Only applying this to rules expected to hold more than one element.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
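
The order-insensitive comparison described in the commit message above can be sketched as follows. The function `enumerate_rules` and its data are hypothetical stand-ins, not names from the test suite:

```python
def enumerate_rules(formulae):
    """Hypothetical stand-in for code that derives results from a dict."""
    return ["%s=%s" % (name, expr) for name, expr in formulae.items()]


results = enumerate_rules({"cpu": "a+b", "mem": "c*d"})

# The expected values are listed in a different order on purpose.
expected = ["mem=c*d", "cpu=a+b"]

# Comparing as sets makes the check independent of dict iteration order.
same_elements = set(results) == set(expected)
```

Note that a set comparison also ignores duplicates, so it only fits results where repeated elements carry no meaning.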
Caught by extended test.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
This goes away in python3.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
FIXME: I'm quite unsure why xcp.xmlunwrap would want to use bytes and not
unicode strings, but the encode/decode calls make it quite clear it wants
to work with bytes.  That makes the API painful to use in python3.
hashlib came with python 2.5, and old md5 module disappears in 3.0

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
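
For illustration, a short sketch of the `hashlib` replacement for the old `md5` module. The wrapper name `md5_hexdigest` is an assumption made for this example:

```python
import hashlib


def md5_hexdigest(data):
    """Return the hex MD5 digest of a bytes object.

    hashlib.md5() replaces the Python 2-only md5.new(); it is available
    from Python 2.5 onwards and on Python 3.
    """
    return hashlib.md5(data).hexdigest()


digest = md5_hexdigest(b"")
```

On Python 3 the input has to be bytes, so text must be encoded before hashing.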
This is supposed to be just a module renaming to conform to PEP8, see
https://docs.python.org/3/whatsnew/3.0.html#library-changes

The SafeConfigParser class has been renamed to ConfigParser in Python
3.2, and backported as addon package.  The `readfp` method now
triggers a deprecation warning to replace it with `read_file`.

FIXME: With python3 some Accessor implementations (e.g. FileAccessor)
provide a text stream for repository config (and with python2 all
implementations), while others (e.g. HTTPAccessor) provide a binary
stream.  But on python3 ConfigParser will bomb out if given a binary
stream, so use a TextIOWrapper to access the config.  This is a hack,
which cannot be used when it is binary data which has to be read (see
later commits), so I don't consider this commit to be correct in that
respect.
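
A rough Python 3 sketch of the `TextIOWrapper` workaround described above. The `io.BytesIO` object stands in for a binary stream such as the one an HTTP-based accessor might return, and the section and option names are made up:

```python
import configparser
import io

# Stand-in for a binary stream; the content is invented for illustration.
binary_stream = io.BytesIO(b"[repo]\nname = example\n")

parser = configparser.ConfigParser()

# ConfigParser.read_file() expects text lines, so wrap the binary stream
# in a TextIOWrapper that decodes it as UTF-8 on the fly.
parser.read_file(io.TextIOWrapper(binary_stream, encoding="utf-8"))

repo_name = parser.get("repo", "name")
```

As the commit message notes, this only helps when the stream really holds text; it does not apply when binary data has to be read from the same kind of handle.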
Testing several accessor classes causes code duplication, which can be
avoided with help from the `parameterized` package (unfortunately, `pytest`
support cannot be used together with `unittest`).

Not a big deal right now, but starts becoming painful when adding new tests
or testing other Accessor classes.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
This test uses the same kind of I/O (file copy) that prepare_host_upgrade.py
does.

FIXME: the copy cannot proceed this way in python3
This works properly for the http case, but FileAccessor provides us with
a text fileobj handle, and `read()` gets a UTF-8 decoding error.

FIXME: Accessor ctor requires a `mode` argument
Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
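
The decoding problem described above can be reproduced on Python 3 with plain files, without any Accessor class. The file names and payload are invented for this sketch:

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "payload.bin")
dst = os.path.join(tmpdir, "copy.bin")

# 0xff is never valid in UTF-8, so this payload cannot be decoded as text.
payload = b"\xff\xfe\x00binary payload"
with open(src, "wb") as f:
    f.write(payload)

# A text-mode handle fails as soon as read() tries to decode the bytes.
try:
    with open(src, "r", encoding="utf-8") as f:
        f.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary-mode handles copy the data unchanged.
with open(src, "rb") as fin, open(dst, "wb") as fout:
    shutil.copyfileobj(fin, fout)

with open(dst, "rb") as f:
    copied = f.read()

shutil.rmtree(tmpdir)
```

This is why a file copy needs a way to ask for a binary handle, which is presumably what the `mode` argument mentioned in the FIXME is about.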
Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Reported under python3 for members created on-the-fly in `setUp()`

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
With python3, pylint complains about `else: raise()` constructs.
This rework avoids them and reduces cyclomatic complexity by using
the error-out-first idiom.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
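
To illustrate the error-out-first idiom mentioned above, here is a before/after sketch. Both functions are hypothetical and not taken from the code base:

```python
def parse_port_nested(value):
    """Nested form with `else: raise` branches."""
    if value.isdigit():
        port = int(value)
        if port <= 65535:
            return port
        else:
            raise ValueError("port out of range: %s" % value)
    else:
        raise ValueError("not a number: %s" % value)


def parse_port(value):
    """Error-out-first form: reject bad input early, then continue flat."""
    if not value.isdigit():
        raise ValueError("not a number: %s" % value)
    port = int(value)
    if port > 65535:
        raise ValueError("port out of range: %s" % value)
    return port
```

Both versions behave the same; the second one keeps the normal path unindented and has no `else` branches left for pylint to flag.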
diff-cover defaults to origin/main in new version, it seems.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
@bernhardkaindl bernhardkaindl changed the title from "xcp/xmlunwrap.py:getText() Fix #18: encode only if not str(means: unicode) for Py2" to "Fix #18: encode only if not str(means: unicode) for Py2" Apr 24, 2023
@bernhardkaindl bernhardkaindl changed the title from "Fix #18: encode only if not str(means: unicode) for Py2" to "Fix #18: encode() only if type is unicode (only for Py2)" Apr 24, 2023
@bernhardkaindl bernhardkaindl force-pushed the testsuite-driven-py3-xml-getText-encode branch from 21e5cfd to bc879ea Compare April 24, 2023 10:52
@bernhardkaindl
Collaborator Author

Closing as obsoleted by commit ecc8c1f of #27, which:

  • Removes the change from "string" to b"string" in the test case to make it pass
  • Applies the same fix as the 1st commit, and also fixes the same issue at a 2nd location.

Only the 1st commit of this PR was significant. The remaining commits were just suggestions by code checkers; they are independent of this fix and are better resubmitted in a new PR.

bernhardkaindl added a commit to rosslagerwall/python-libs that referenced this pull request May 8, 2024
…long-input-six

Private/bernhardk/py3 long input six
Development

Successfully merging this pull request may close these issues.

Changing xcp.xmlunwrap API to use unicode (py3:str) instead of str (py3:bytes)
4 participants