Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #9 Decode later, to avoid decoding in the middle of utf8 sequence #10

Merged
merged 1 commit into from Jun 11, 2017
Merged

Issue #9 Decode later, to avoid decoding in the middle of utf8 sequence #10

merged 1 commit into from Jun 11, 2017

Conversation

davedoesdev
Copy link
Contributor

And to make sure we only split on \r and \n (Unicode splitlines splits on a
number of other characters which is not in the SSE spec)

@mpetazzoni I tested on my own private project with Python 2.7 and 3.5.

This project could do with its own unit tests.

And to make sure we only split on \r and \n (Unicode splitlines splits on a
number of other characters which is not in the SSE spec)
@mpetazzoni
Copy link
Owner

Thanks! Could the same problem happen with chunk.splitlines(True) when reading chunks from the event source in _read() ?

@davedoesdev
Copy link
Contributor Author

No because in _read, the data is a byte string and according to https://docs.python.org/3/library/stdtypes.html#bytes.splitlines and https://docs.python.org/2/library/stdtypes.html#str.splitlines it breaks "at ASCII line boundaries".

Further, I think I'm right in saying UTF-8 of >=2 bytes can't include ASCII newline or carriage return (the MSB of all bytes will be 1).

@davedoesdev
Copy link
Contributor Author

@mpetazzoni what do you think?

@davedoesdev
Copy link
Contributor Author

OK to merge?

@mpetazzoni
Copy link
Owner

Sorry for the delay. I don't think you understood my previous comment. Lines 47-48:

for chunk in self._event_source:
  for line in chunk.splitlines(True):

I think it's possible here for chunk to be of type unicode depending on the event_source that was provided to the SSEClient. In this case, splitlines() would exhibit the same incorrect behavior.

@davedoesdev
Copy link
Contributor Author

I guess it depends what stream means in this comment:

        The event source is expected to provide a stream() generator method and
        a close() method.

Maybe rewrite the comment?

        The event source is expected to provide a binary stream generator method and
        a close() method.

(using terminology from https://docs.python.org/3/library/io.html#binary-i-o)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants