Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PUBDEV-8460 fix _upload_python_object method #5999

Merged
merged 1 commit into from
Jan 14, 2022

Conversation

koniecsveta
Copy link
Contributor

@koniecsveta koniecsveta commented Jan 10, 2022

for py 3.5 and higher, the issue is fixed by adding parameter encoding='utf-8' to os.fdopen() call
for py 2.7, os.fdopen() doesn't support parameter encoding, but the issue is anyway not reproducible, by the same test that it is for py 3.8

@koniecsveta koniecsveta force-pushed the zuzana/PUBDEV-8460/fix_encoding_error branch 3 times, most recently from 7653e9d to 878d837 Compare January 10, 2022 20:02
@koniecsveta koniecsveta changed the title fix _upload_python_object method PUBDEV-8460 fix _upload_python_object method Jan 11, 2022
@koniecsveta koniecsveta requested review from sebhrusen, maurever and michalkurka and removed request for sebhrusen and maurever January 11, 2022 16:38
h2o-py/h2o/frame.py Outdated Show resolved Hide resolved
h2o-py/h2o/frame.py Outdated Show resolved Hide resolved
Comment on lines 94 to 95
fpublic = open(publicFileName, 'w', encoding='utf-8')
fprivate = open(privateFileName, 'w', encoding='utf-8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably better, but not sure it belongs here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -72,7 +72,7 @@ def main():
# magic_files = {}
for filename in locate_files(ROOT_DIR):
print("Processing %s" % filename)
with open(filename, "rt") as f:
with open(filename, "rt", encoding='utf-8') as f:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather delete that entire file: unused, unclear about what its original intent (read some magoc comments and then what?), and no example of "magic comment" in h2o-py or h2o-bindings.
Thanks for the discovery :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I haven't found any example of usage of this.. @michalkurka do you know what this could be useful for?

h2o-py/h2o/h2o.py Outdated Show resolved Hide resolved
@@ -139,7 +140,7 @@ def _upload_python_object(self, python_obj, destination_frame=None, header=0, se

# create a temporary file that will be written to
tmp_handle, tmp_path = tempfile.mkstemp(suffix=".csv")
tmp_file = os.fdopen(tmp_handle, 'w')
tmp_file = os.fdopen(tmp_handle, 'w', encoding='utf-8') if sys.version_info >= (3, 5) else os.fdopen(tmp_handle, 'w')
Copy link
Contributor

@sebhrusen sebhrusen Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we apparently have to stick with os.fdopen here due to Py2 encoding inconsistencies, please use PY2 constant instead of sys.version as we should always do: much easier to search for usages then, we don't support Py3 < 3.5 anyway.
maybe a simpler way to avoid redundancies:

# as a global var
__fdopen_kwargs = {} if PY2 else {'encoding': 'utf-8'}

# each time we need fdopen:
tmp_file = os.fdopen(tmp_handle, 'w', **__fdopen_kwargs)

Copy link
Contributor

@sebhrusen sebhrusen Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully we don't write much on the filesystem, as the write support for strings in Py2 is a total disaster as it doesn't automatically convert str to proper strings (ie. unicode), hence this need to use fdopen apparently here as it seems to write without any check/transformation (returned file object is not a TextIOWrapper in Py2).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

improved in 25e4ce9

@koniecsveta
Copy link
Contributor Author

koniecsveta commented Jan 13, 2022

the test is still failing because, in short,

>>> locale.setlocale(locale.LC_ALL, locale.getlocale())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/locale.py", line 608, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting

in dev-python-3.7

also

>>> locale.resetlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/locale.py", line 618, in resetlocale
    _setlocale(category, _build_localename(getdefaultlocale()))
locale.Error: unsupported locale setting

There are these locales:

(h2o_env_python3.7) root@h2o:/h2o-3# locale -a
C
C.UTF-8
POSIX

so in h2o_env_python3.7 I will 'reset' manualy to C.UTF-8

Copy link
Contributor

@sebhrusen sebhrusen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @koniecsveta

@koniecsveta koniecsveta force-pushed the zuzana/PUBDEV-8460/fix_encoding_error branch from a7e518e to ec14aff Compare January 13, 2022 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants