Handling of binary data fails sometimes #37

Closed
Schnodderbalken opened this issue Jul 31, 2016 · 20 comments

@Schnodderbalken

Schnodderbalken commented Jul 31, 2016

After updating flask-socketio from 1.2 to 2.6 I have a strange error that comes up occasionally when there are more than a few threads that handle a binary message from a socket.io client. I get the following error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/maRcbook/Library/Python/3.5/lib/python/site-packages/socketio/server.py", line 442, in _handle_eio_message
    self._binary_packet.reconstruct_binary(self._attachments)
AttributeError: 'NoneType' object has no attribute 'reconstruct_binary'

I just noticed it has nothing to do with the upgrade of flask-socketio. I upgraded Python from 3.4 to 3.5 as well, and this is probably the cause. Downgrading flask-socketio did not make it work, so it probably has to do with python-socketio on Python 3.5.

@miguelgrinberg miguelgrinberg self-assigned this Jul 31, 2016
@Schnodderbalken
Author

If I change line 437 of server.py to

if self._attachment_count > 0 and self._binary_packet != None:

it crashes at a later stage instead:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/maRcbook/Library/Python/3.5/lib/python/site-packages/socketio/server.py", line 454, in _handle_eio_message
    pkt = packet.Packet(encoded_packet=data)
  File "/Users/maRcbook/Library/Python/3.5/lib/python/site-packages/socketio/packet.py", line 32, in __init__
    self.attachment_count = self.decode(encoded_packet)
  File "/Users/maRcbook/Library/Python/3.5/lib/python/site-packages/socketio/packet.py", line 73, in decode
    self.packet_type = int(ep[0:1])
ValueError: invalid literal for int() with base 10: b'%'

I wonder why this has only happened since I upgraded Python to 3.5.
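The ValueError in the second traceback can be reproduced in isolation. The sketch below only mirrors the failing expression int(ep[0:1]) from packet.py; the helper name packet_type and the sample text packet are assumptions made for illustration, not part of python-socketio.

```python
def packet_type(data):
    # Same expression as the failing line in the traceback: int(ep[0:1]).
    return int(data[0:1])

# A text packet starts with a digit, so the first character parses cleanly.
# The event name "ping" is a made-up example.
ok = packet_type('2["ping"]')

# Raw PDF bytes start with b'%', which is not a digit.
try:
    packet_type(b"%PDF-1.3\n")
except ValueError as exc:
    message = str(exc)

print(ok)       # 2
print(message)  # invalid literal for int() with base 10: b'%'
```

This suggests the guard on line 437 only moves the failure: the PDF bytes are still being handled as if they were a new packet.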

@miguelgrinberg
Owner

miguelgrinberg commented Aug 2, 2016

Could you add a print statement for the variable ep right before line 73 in packet.py? Not sure if this is important or not, but that might provide a clue to this problem. And leave the modified if statement in server.py, of course, so that it crashes like in your second stack trace.

@Schnodderbalken
Author

Schnodderbalken commented Aug 2, 2016

Sure! The content of ep is the byte string of a PDF file. It looks like this:

b'%PDF-1.3\n%\xc4\xe5\xf2\xe5\xeb\xa7\xf3\xa0\xd0\xc4\xc6\n4 0 obj\n<< /Length 5 0 R /Filter /FlateDecode >>\nstream\nx\x01+T\x08T(T\xd0\x0fH-JN-()M\xccQ(\xca\x04\n\x98Z\x1a*\x18\x00\xa1\x85\x891\x98N\xceU\xd0\xf7\xcc5Tp\xc9\x07\xaa\x0f\x04\x00\x95\xb4\r\xfd\nendstream\nendobj\n5 0 obj\n54\nendobj\n2 0 obj\n<< /Type /Page /Parent 3 0 R /Resources 6 0 R /Contents 4 0 R /MediaBox [0 0 591 843]\n>>\nendobj\n6 0 obj\n<< /ProcSet [ /PDF /ImageB /ImageC /ImageI ] /XObject << /Im1 7 0 R >> >>\nendobj\n7 0 obj\n<< /Length 8 0 R /Type /XObject /Subtype /Image /Width 3284 /Height 4688 /ColorSpace\n9 0 R /BitsPerComponent 1 /Filter /FlateDecode >>\nstream\nx\x01\xec\xbd\xcd\x8e\xdbH\xda\xb6I\x95\x80f-\x1a\xc5m/<E\

...and so on (it becomes unreadable from there on).

The end looks like this, in case it is of any interest:

>>\nendobj\n13 0 obj\n(Mac OS X 10.11.3 Quartz PDFContext)\nendobj\n14 0 obj\n(ScanSnap Manager #iX500)\nendobj\n15 0 obj\n(D:20160217193438Z00\'00\')\nendobj\n1 0 obj\n<< /Producer 13 0 R /Creator 14 0 R /CreationDate 15 0 R /ModDate 15 0 R >>\nendobj\nxref\n0 16\n0000000000 65535 f \n0000256908 00000 n \n0000000167 00000 n \n0000256638 00000 n \n0000000022 00000 n \n0000000149 00000 n \n0000000271 00000 n \n0000000360 00000 n \n0000253844 00000 n \n0000256602 00000 n \n0000253866 00000 n \n0000256581 00000 n \n0000256721 00000 n \n0000256771 00000 n \n0000256824 00000 n \n0000256866 00000 n \ntrailer\n<< /Size 16 /Root 12 0 R /Info 1 0 R /ID [ <67a72ea616418cd07a2b16c999f63464>\n<67a72ea616418cd07a2b16c999f63464> ] >>\nstartxref\n256999\n%%EOF\n

@miguelgrinberg
Owner

Are you doing anything with PDF files?

@Schnodderbalken
Author

Yes, the client sends them via websocket in binary format to the server.

@miguelgrinberg
Owner

miguelgrinberg commented Aug 2, 2016

Are you sending the PDF as a standalone payload, or is it part of a larger data structure, maybe something like {'foo': 'bar', 'pdf': <pdf-binary-data>}?

Actually, can you show me the portion of the client code that emits these PDFs?

@Schnodderbalken
Author

Schnodderbalken commented Aug 2, 2016

This is the signature of the function that handles the client message.

def _declareUserCommandRoute(self, controller):
    @self._socketio.on(controller.client_command_payload)
    def response(json_data, type=None, nonce=None):
        [...]

If I print the content of the first parameter (json_data) it says:
{'data': <pdf-binary-data>, 'method': 'post'}

So to answer your question: as a part of a larger data structure.

And again, it only happens when I have multiple threads running that try to accept the client messages. If I send only one request at a time it works, and a few more are also OK. With 10 or more, however, I get these errors.

@miguelgrinberg
Owner

Yeah, it sounds like when several of these PDF attachments are in flight, the server sometimes loses one and treats it as a standalone message. This is useful; I think I have enough to reproduce the problem. Thanks.
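A toy model can illustrate this hypothesis. The ToyReceiver class below is hypothetical: it is not python-socketio's code and not the fix that was applied on master. It assumes the Socket.IO convention that a binary event arrives as a text header (packet type 5, followed by the attachment count and a dash) and that the raw attachment follows as a separate message. The event name "upload" is made up.

```python
class ToyReceiver:
    """Hypothetical stand-in for a server's receive path (illustration only)."""

    def __init__(self):
        self.pending = None    # header packet waiting for its attachments
        self.remaining = 0     # attachments still expected
        self.attachments = []

    def handle(self, data):
        if self.remaining > 0:
            # An attachment is expected, so the message is stored verbatim.
            self.attachments.append(data)
            self.remaining -= 1
            if self.remaining > 0:
                return None
            event = (self.pending, self.attachments)
            self.pending, self.attachments = None, []
            return event
        # Nothing is pending, so the message is parsed as a new packet.
        packet_type = int(data[0:1])
        if packet_type == 5:
            # Binary event header: the count sits between the type and "-".
            self.pending = data
            self.remaining = int(data[1:data.index("-")])
            return None
        return (data, [])


header = '51-["upload",{"_placeholder":true,"num":0}]'
pdf = b"%PDF-1.3\n"

# Header first, attachment second: the event is reassembled.
in_order = ToyReceiver()
in_order.handle(header)
event = in_order.handle(pdf)

# Attachment seen while no header is pending: it is parsed as a packet.
out_of_order = ToyReceiver()
try:
    out_of_order.handle(pdf)
    error = None
except ValueError as exc:
    error = exc
```

In this model, any reordering or loss of the pending state between a header and its attachment ends in the same ValueError as the second traceback.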

@Schnodderbalken
Author

Schnodderbalken commented Aug 2, 2016

Okay cool! If you need further information like a concrete payload that causes the problem let me know. Thanks in advance.

Oh and do you have any clue why this only happens since I updated Python to 3.5?

@miguelgrinberg
Owner

> And then again it only happens when I have multiple threads running that try to accept the client messages.

Can you explain this statement in detail? Are you talking about server-side threads or client-side?

@Schnodderbalken
Author

Yes, it's about server-side threads. I handle every request in a separate thread because I have resource-hungry calculations (OCR) running that should not block further incoming requests.

@miguelgrinberg
Owner

Not sure I understand how your use of threads affects the internals of the socket.io server. You are instantiating only one server, correct? Can you show me the code that spawns your threads?

@Schnodderbalken
Author

Schnodderbalken commented Aug 4, 2016

If it helps, I could send you a script that reproduces the problem (that will take a few days, though, because I am currently not at home).
Yes, I am only instantiating one server. The "on" method that reacts to client requests spawns new threads.
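The pattern described here, a handler that hands expensive work to a new thread so it can return immediately, might look roughly like the sketch below. It uses only the standard library. The names on_upload and run_ocr are hypothetical, and run_ocr is a placeholder that measures the payload instead of performing OCR. Only the payload shape {'data': ..., 'method': 'post'} comes from this thread.

```python
import threading

results = []
results_lock = threading.Lock()


def run_ocr(pdf_bytes):
    # Placeholder for the expensive OCR step (assumption: real work goes here).
    size = len(pdf_bytes)
    with results_lock:
        results.append(size)


def on_upload(json_data):
    # Hypothetical event handler: start the work and return without blocking.
    worker = threading.Thread(target=run_ocr, args=(json_data["data"],))
    worker.start()
    return worker


# Simulate three incoming messages with payloads of different sizes.
workers = [
    on_upload({"data": b"%PDF-1.3\n" * n, "method": "post"})
    for n in range(1, 4)
]
for worker in workers:
    worker.join()
```

The lock matters because the worker threads append to a shared list; the order of the entries depends on thread scheduling.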

@miguelgrinberg
Owner

> The "on"-Method that reacts on client requests spawns new threads.

Okay, this should be fine. In fact, the functions attached to "on" decorators are executed in independent threads anyway.

If you can send me an example that reproduces the problem that would be most helpful. Thanks!

@Schnodderbalken
Author

Schnodderbalken commented Aug 4, 2016

It was a hard task, but I managed to boil everything down to a few lines of code (in both JS and Python). Here is a zip containing the JS file, the Python file and two client-side dependencies (jQuery and socket.io): http://s000.tinyupload.com/index.php?file_id=69252726138999738284. If you have any problems launching it, let me know. This is my first time using tinyupload, so I hope the file stays there long enough.

To reproduce the problem, start main.py with python3. On the client side, click the input element and choose 15-20 files at once. I tested it with PDFs; I don't know whether other file types trigger it as well, but I would guess so.

@miguelgrinberg
Owner

Thanks, got the file. How do I trigger the error? Select a file and keep doing it until it fails?

@Schnodderbalken
Author

On my machine (MacBook Pro 2013) I just had to choose many files (about 15, each at least 1 MB, I think) in the file dialog. That triggers socket.io to send them to the server, which immediately led to the errors described in my first post.

@Schnodderbalken
Author

I selected all the files at once in the dialog.

@miguelgrinberg
Owner

Can I ask you to retest with the master branch? I think the fix that I just made will address the problem with the binary attachments. Thanks a lot, the example you provided was immensely helpful.

@Schnodderbalken
Author

Awesome! It works. Thanks for the uncomplicated way of communicating and the tremendous speed! 👍
