Unicode support in Python 2? #14446

Closed
Tagar opened this issue Feb 16, 2018 · 4 comments

@Tagar
Tagar commented Feb 16, 2018

Below is a debug log from the Apache Zeppelin Python interpreter, produced when the submitted code contains a Unicode character [1].
It seems gRPC doesn't support Unicode data in Python 2?
A related Zeppelin JIRA issue tracks the investigation on the Zeppelin side:
https://issues.apache.org/jira/browse/ZEPPELIN-3239

Apache Zeppelin recently switched to talking to IPython over gRPC, so the problem might be somewhere in that path?

The exception surfaces in grpc/_server.py's _take_response_from_response_iterator():

def _take_response_from_response_iterator(rpc_event, state, response_iterator):
    try:
        return next(response_iterator), True
    except StopIteration:
        return None, True
    except Exception as exception:  # pylint: disable=broad-except
        with state.condition:
            if exception is state.abortion:
                _abort(state, rpc_event.call, cygrpc.StatusCode.unknown,
                       b'RPC Aborted')
            elif exception not in state.rpc_errors:
                details = 'Exception iterating responses: {}'.format(exception)
                logging.exception(details)
                _abort(state, rpc_event.call, cygrpc.StatusCode.unknown,
                       _common.encode(details))
        return None, False
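
To see why the error is reported as "Exception iterating responses", here is a minimal sketch with no gRPC dependency. `take_response` is a simplified stand-in for the function above (the abort and state handling is dropped), and `execute` is a hypothetical streaming handler whose ASCII encode step stands in for whatever fails inside the real handler. Because the handler is a generator, its body only runs when next() is called on it, so an encoding error in application code is raised at that point rather than inside gRPC's own logic.

```python
def execute(code):
    """Hypothetical streaming handler; its body runs lazily, on next()."""
    # Stand-in for a step that implicitly encodes text as ASCII.
    code.encode("ascii")
    yield "ok"


def take_response(response_iterator):
    """Simplified stand-in for _take_response_from_response_iterator."""
    try:
        return next(response_iterator), True
    except StopIteration:
        return None, True
    except Exception as exception:  # broad on purpose, as in the original
        return "Exception iterating responses: {}".format(exception), False


plain_result = take_response(execute(u"1 + 1"))
dash_result = take_response(execute(u"1 \u2013 1"))  # contains EN DASH U+2013
```

Under these assumptions, `plain_result` carries the handler's response, while `dash_result` carries the wrapped error message and a False flag, which matches the shape of the log in [1].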

[1]

DEBUG [2018-02-15 00:39:45,628] ({pool-2-thread-2} IPythonClient.java[stream_execute]:87) - stream_execute code:
One following unicide character makes ipythonInterpreter not responding to Cancel commands –
DEBUG [2018-02-15 00:39:45,632] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: ERROR:root:Exception iterating responses: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)
DEBUG [2018-02-15 00:39:45,632] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: Traceback (most recent call last):
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/grpc/_server.py", line 401, in _take_response_from_response_iterator
ERROR [2018-02-15 00:39:45,633] ({grpc-default-executor-0} IPythonClient.java[onError]:138) - Fail to call IPython grpc
io.grpc.StatusRuntimeException: UNKNOWN: Exception iterating responses: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)
at io.grpc.Status.asRuntimeException(Status.java:543)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:395)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:426)
at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:512)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:429)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:544)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:117)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: return next(response_iterator), True
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: File "/tmp/zeppelin_ipython1942535087961089556/ipython_server.py", line 54, in execute
DEBUG [2018-02-15 00:39:45,633] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: print(request.code)
DEBUG [2018-02-15 00:39:45,634] ({Exec Stream Pumper} IPythonInterpreter.java[processLine]:388) - Process Output: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)
INFO [2018-02-15 00:39:58,894] ({dispatcher-event-loop-23} Logging.scala[logInfo]:54) - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.20.33.75:40434) with ID 2

@kpayson64
Contributor

I can confirm that gRPC Python on Python 2 doesn't accept unicode strings. The gRPC Python API accepts the str type, which in Python 2 is a byte string. Applications that use Unicode characters should do the encoding themselves.
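
As a rough illustration of "applications should do any encoding", the sketch below encodes text explicitly as UTF-8 before it would be handed to a byte-oriented API and decodes it again on the receiving side. The variable names are made up for the example, and which values actually need this treatment depends on the API being called.

```python
# -*- coding: utf-8 -*-
text = u"range 1 \u2013 5"  # unicode text containing EN DASH U+2013

# Encoding with the ASCII codec fails, as in the reported log.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Explicit UTF-8 encoding yields a byte string that is safe to pass along.
payload = text.encode("utf-8")

# The receiver decodes with the same codec to get the original text back.
restored = payload.decode("utf-8")
```

The same code behaves identically on Python 2 and Python 3 because the text literal is marked with the `u` prefix and the codec is always named explicitly.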

@nathanielmanistaatgoogle
Member

I'm not yet clear on the story of this issue. Which system or subsystem is directly using gRPC Python, what specifically is meant by "unicode characters", and how are these characters being passed to gRPC Python? The "Exception iterating responses: 'ascii' codec can't encode character u'\u2013' in position 91: ordinal not in range(128)" part of your report looks really, really odd to me. Not necessarily wrong, but very strange.

@Tagar
Author

Tagar commented Feb 21, 2018

It was fixed on the client side in PR apache/zeppelin#2810
with one simple change to this line:

https://github.com/apache/zeppelin/blob/a791fad5970905edd07bdb8afcc00497fb3540f5/python/src/main/resources/grpc/python/ipython_server.py#L53

by adding .encode('utf-8')

Thanks.
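
The exact change in the PR is not reproduced here. The sketch below assumes the failing call was the print(request.code) shown in the traceback, and imitates a Python 2 stdout that uses the 'ascii' codec with an io.TextIOWrapper. The helper names are hypothetical. It shows that writing the text directly fails, while encoding it with .encode('utf-8') first and writing the bytes succeeds.

```python
import io


def make_ascii_stream():
    """Return (raw, text): a byte buffer wrapped by an ASCII-only text stream."""
    raw = io.BytesIO()
    return raw, io.TextIOWrapper(raw, encoding="ascii", errors="strict")


def echo_text(code, text_stream):
    """Write text directly; the stream's ASCII codec has to encode it."""
    text_stream.write(code)
    text_stream.flush()


def echo_encoded(code, raw_stream):
    """Encode explicitly as UTF-8, then write the resulting bytes."""
    raw_stream.write(code.encode("utf-8"))


code = u"x \u2013 y"  # contains EN DASH U+2013

# Direct write through the ASCII codec raises UnicodeEncodeError.
raw, text_stream = make_ascii_stream()
try:
    echo_text(code, text_stream)
    direct_write_failed = False
except UnicodeEncodeError:
    direct_write_failed = True

# Explicit encoding bypasses the implicit ASCII step.
byte_sink = io.BytesIO()
echo_encoded(code, byte_sink)
written = byte_sink.getvalue()
```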

@nathanielmanistaatgoogle
Member

@Tagar: thank you for following up!

lock bot locked this issue as resolved and limited conversation to collaborators on Sep 30, 2018