Unicode support in Python 2? #14446

Closed
Tagar opened this Issue Feb 16, 2018 · 4 comments

Tagar commented Feb 16, 2018

Below is a debug log from the Apache Zeppelin Python "interpreter" when the code contains a Unicode character [1].
It seems grpc doesn't support Unicode data in Python 2?
Related Zeppelin JIRA to investigate this on the Zeppelin side:
https://issues.apache.org/jira/browse/ZEPPELIN-3239

Apache Zeppelin recently switched to using grpc -> ipython, so something may be wrong there?

The exception happens in grpc/_server.py's _take_response_from_response_iterator():

def _take_response_from_response_iterator(rpc_event, state, response_iterator):
    try:
        return next(response_iterator), True
    except StopIteration:
        return None, True
    except Exception as exception:  # pylint: disable=broad-except
        with state.condition:
            if exception is state.abortion:
                _abort(state, rpc_event.call, cygrpc.StatusCode.unknown,
                       b'RPC Aborted')
            elif exception not in state.rpc_errors:
                details = 'Exception iterating responses: {}'.format(exception)
                logging.exception(details)
                _abort(state, rpc_event.call, cygrpc.StatusCode.unknown,
                       _common.encode(details))
        return None, False

[1]

DEBUG [2018-02-15 00:39:45,628] ({pool-2-thread-2} IPythonClient.java[stream_execute]:87) - stream_execute code:
One following unicide character makes ipythonInterpreter not responding to Cancel commands –
DEBUG [2018-02-15 00:39:45,632] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: ERROR:root:Exception iterating responses: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)
DEBUG [2018-02-15 00:39:45,632] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: Traceback (most recent call last):
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/grpc/_server.py", line 401, in _take_response_from_response_iterator
ERROR [2018-02-15 00:39:45,633] ({grpc-default-executor-0} IPythonClient.java[onError]:138) - Fail to call IPython grpc
io.grpc.StatusRuntimeException: UNKNOWN: Exception iterating responses: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)
at io.grpc.Status.asRuntimeException(Status.java:543)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:395)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:426)
at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:512)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:429)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:544)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:117)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: return next(response_iterator), True
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: File "/tmp/zeppelin_ipython1942535087961089556/ipython_server.py", line 54, in execute
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: print(request.code)
DEBUG [2018-02-15 00:39:45,634] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)
INFO [2018-02-15 00:39:58,894] ({dispatcher-event-loop-23} Logging.scala[logInfo]:54) - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.20.33.75:40434) with ID 2
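
The traceback above places the failure at print(request.code) inside the streaming handler, and gRPC only reports it when next(response_iterator) is called. The sketch below is a rough illustration of that path, not Zeppelin's or gRPC's actual code: `execute` and `take_response` are hypothetical stand-ins, and the explicit `.encode("ascii")` is an assumption that emulates what Python 2's print most likely did here (an implicit ASCII encode when stdout has no encoding, for example when it is a pipe).

```python
# -*- coding: utf-8 -*-

def execute(code):
    """Hypothetical stand-in for the streaming handler in ipython_server.py."""
    # Assumption: emulates Python 2's print of a unicode object when
    # stdout has no encoding, which falls back to the ASCII codec.
    code.encode("ascii")
    yield "ok"


def take_response(response_iterator):
    """Simplified version of the error path quoted from grpc/_server.py."""
    try:
        return next(response_iterator), True
    except StopIteration:
        return None, True
    except Exception as exception:
        # The real function logs these details and aborts the RPC with them.
        details = 'Exception iterating responses: {}'.format(exception)
        return details, False


# U+2013 (en dash) is the character reported in the log.
details, ok = take_response(execute(u"a \u2013 b"))
print(ok)
print(details)
```

The generator body does not run until the first next() call, which is why the UnicodeEncodeError raised in the handler appears under gRPC's "Exception iterating responses" message.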

kpayson64 commented Feb 19, 2018

I can confirm that Python 2 gRPC doesn't accept unicode characters. The Python gRPC API accepts the string type, which in Python 2 is equivalent to the byte type. Applications should do any encoding if they are using unicode characters.
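
As a minimal sketch of the application-side encoding suggested above, the helper below converts text to UTF-8 bytes before it is printed or handed to an API that expects byte strings. The name `to_utf8_bytes` is hypothetical, and UTF-8 is an assumed choice of encoding; neither comes from gRPC or Zeppelin.

```python
# -*- coding: utf-8 -*-

def to_utf8_bytes(text):
    """Return UTF-8 bytes for text; byte strings pass through unchanged."""
    # In Python 2, bytes is an alias for str, so plain str values are
    # returned as-is and only unicode objects are encoded.
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")


# U+2013 (en dash) is the character from the reported error.
encoded = to_utf8_bytes(u"range 1\u20135")
print(repr(encoded))
```

Encoding explicitly avoids the implicit ASCII conversion that Python 2 applies when a unicode object reaches a byte-oriented sink.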

nathanielmanistaatgoogle commented Feb 21, 2018

I'm not yet clear on what is the story of this issue. What system or subsystem is it that is directly using gRPC Python, what specifically is meant by "unicode characters", and how are these unicode characters being passed to gRPC Python? The "Exception iterating responses: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)" part of your report looks really, really odd to me. Not necessarily wrong, but very strange.

Tagar commented Feb 21, 2018

@kpayson64 kpayson64 closed this Feb 21, 2018

nathanielmanistaatgoogle commented Feb 21, 2018

@Tagar: thank you for following up!

@lock lock bot locked as resolved and limited conversation to collaborators Sep 30, 2018
