Python Interpreter shutdown when using multithreading #148
Comments
Experiencing the same issue here.
OS and version used: UbuntuMate 16.04.02
Python runtime used: Python 2.7
Platform used: Raspberry Pi B+
The script uses PyQt4 and QThreads. I am spawning 3 QThreads in total: 1 handles Serial Comms, 1 handles a Phidgets USB device and 1 handles the Azure connection. I am also using the HTTP protocol, and my Azure thread class looks similar to the one posted.
Without spawning the Azure thread, the script has been running for over a week. As soon as I introduce the Azure thread I get:
Fatal Python error: GC object already tracked
When running through gdb, the error message is followed by:
Thread 1 "python" received signal SIGABRT, Aborted.
I can also confirm that I can reproduce it 100% of the time, but the time it takes to happen varies from a couple of minutes to an hour.
I agree that the issue seems to lie somewhere in the confirmation callback of the send_event_async method. |
I'm also experiencing this issue or something similar. Are there any updates to this? Is there a fix in the works? Or has anyone found any workarounds? |
same issue here: is there any solution or workaround? |
Just a little update on this: since I didn't want to spend a lot of time digging around, I resorted to switching to multiprocessing. I suppose I would recommend doing the same until someone finds the actual issue. |
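For anyone wanting to try the multiprocessing route, here is a minimal sketch of one way to arrange it, assuming Python 3: a single child process owns the client and everything else only puts messages on a queue. The names `sender_loop`, `start_sender_process` and the `send` callable are hypothetical stand-ins, not part of the SDK; in real use `send` would wrap a client created inside the child process.

```python
import multiprocessing as mp


def sender_loop(inbox, send):
    """Pass each message from `inbox` to `send` until a None sentinel arrives.

    Intended to run in exactly one process, so the SDK client is only
    ever used from a single thread of control. Returns the number of
    messages handed to `send`.
    """
    sent = 0
    while True:
        message = inbox.get()
        if message is None:  # sentinel: the producer asked us to stop
            return sent
        send(message)  # hypothetical stand-in for the real SDK send call
        sent += 1


def start_sender_process(send):
    """Start `sender_loop` in a child process and return (process, queue).

    `send` should be a module-level function so that it can be pickled
    on platforms that use the "spawn" start method.
    """
    inbox = mp.Queue()
    proc = mp.Process(target=sender_loop, args=(inbox, send), daemon=True)
    proc.start()
    return proc, inbox
```

Producers then call `inbox.put(message)` from any thread and `inbox.put(None)` to shut the sender down. This does not fix the underlying crash; it only keeps the SDK out of the multithreaded process.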
Context: I might actually have had a slightly different issue than the one documented here. As a workaround I have a C# HTTP server in another module, to which my Python module forwards its requests. It works fine, but obviously it would be nice to be able to use the IoTHubModuleClient class without reliability issues. |
It is funny that you say that because at some point I had exactly the same setup (a C# TCP server which acted as a broker) |
This is happening to me as well. I found that using "send_event_async" and waiting for the tx confirmation before calling "send_event_async" again helps a bit, but it still crashes every couple of hours. It is actually pathetic that after more than a year we still have no official response on this from the devs. What more is needed? Test code exhibiting the issue was provided. Do the devs think we play with the code for "hobby" projects? Maybe it's time to consider another provider for our services... |
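The "wait for the confirmation before sending again" approach can be sketched roughly as follows, assuming Python 3. `SerializedSender` and `send_async(message, callback)` are hypothetical names standing in for the SDK call; the only assumption is that the callback is invoked once per send with some result value. As noted above, this reduced the crash frequency for one user but did not eliminate it.

```python
import threading


class SerializedSender:
    """Allow only one unconfirmed message in flight at a time."""

    def __init__(self, send_async, timeout=30.0):
        # `send_async(message, callback)` is a hypothetical stand-in for the
        # SDK call; it must invoke `callback(result)` when the send completes.
        self._send_async = send_async
        self._timeout = timeout
        self._lock = threading.Lock()

    def send(self, message):
        """Send `message` and block until its confirmation callback fires.

        Returns the confirmation result, or raises TimeoutError if no
        confirmation arrives within the configured timeout.
        """
        confirmed = threading.Event()
        outcome = []

        def on_confirm(result):
            outcome.append(result)
            confirmed.set()

        with self._lock:  # callers on other threads queue up here
            self._send_async(message, on_confirm)
            if not confirmed.wait(self._timeout):
                raise TimeoutError("no confirmation within %.1f s" % self._timeout)
        return outcome[0]
```

Every thread shares one `SerializedSender` instance and calls `send()`, so a new send never starts while a previous confirmation is still pending.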
@LouanDuToitS3 @Cnidarias @uitdam @bulletlink @jackt-moran While it may be "too little too late" at this point, I'm also hoping we're still "better late than never", at least for folks who've experienced this issue recently. We've reached a point where we feel confident pushing v2 of the SDK into master and previewing it with customers. The APIs are a lot more pythonic, and if bugs are found they will be fixed much faster, because the environment is much friendlier for pythonistas and we don't depend on another team to build the product. I hope that what you'll find in master today (and in the azure-iot-device package on PyPI) will suit you a lot better than what we had before. Please do let us know. |
@pierreca |
@jackt-moran Correct. At this point, even though it's still in preview, we feel it's such an improvement over v1 that it's worth pushing it to master already. The -preview repository is going to be archived after we're done moving all the builds, issues, etc. |
@pierreca I'm unable to log issues against the V2 SDK, but here is a condensed list of the current issues preventing me from using V2 in production:
sys.exit(1) # TODO: raise an error instead Also there seems to be no way for me to query the connection state. I hope that I'm overlooking something. A comment in the code suggests that a connection will be established if needed, which does not happen, therefore leading me to believe that the code itself at the point of sending data is unaware of the connection state.
I would love to switch over to V2, but unfortunately, for me, it's not production ready. |
@LouanDuToitS3 thanks for the very valuable feedback. As to when we expect v2 to be production ready: we're incredibly close to feature complete but also very aware that we can't call it production ready until we squash bugs that haven't been found by end-to-end and unit tests. I'm guessing some time in November. |
@pierreca Thanks. |
Hi @LouanDuToitS3, Here is some specific feedback following the same numbering you used above.
|
Hi @BertKleewein , I logged issue: #294 for no 4. |
With the Python v2 SDK reaching General Availability, we are recommending that users move away from the v1 SDK. We will be closing the v1 SDK issues whose features are covered by the v2 release. Feel free to open a new issue if you find a problem with the v2 SDK. Thank you for your patronage. |
OS and version used: Ubuntu 16.04.3
Python runtime used: Python 3.5.2
SDK version used: azure-iothub-device-client==1.3.5
Description of the issue:
When sending data to an IoT Hub using the azure-iothub-device-client lib, the Python interpreter crashes after an unpredictable period of time when multithreading is used.
This behaviour is 100% reproducible, but the time it takes to happen varies. With more threads and more frequent sends it seems to happen faster. With 50 threads, each sending every second, it crashed after about 30 minutes.
The error one gets is simply:
Fatal Python error: GC object already tracked
Sometimes this comes with a SIGABRT but not always.
Since I ran Python under gdb, I have attached a backtrace with all the info.
I think the most important parts are at the top. I also included a py-bt of all threads, most of which are just sleeping.
gdb.txt
Code sample exhibiting the issue:
The code is pretty simple - just sets up the connection and sends random data.
The protocol we are using is HTTP. The rest of the config file is just the connection strings and shouldn't matter for the issue.
I think the issue lies with the callback handling in the send_event_async method, as I am not sure whether the GIL is handled correctly there, but I am more or less just blindly guessing.
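For readers without the attached sample, the shape of the reproduction described above (many threads, each sending random data at a fixed interval) can be sketched like this. It is not the original code: `run_senders` and the `send` callable are hypothetical, and `send` stands in for whatever wraps the SDK's send_event_async call.

```python
import random
import threading
import time


def run_senders(send, thread_count=50, interval=1.0, duration=60.0):
    """Run `thread_count` threads that each call `send` every `interval` seconds.

    `send` is a stand-in for the SDK call under test and receives a random
    float as payload. Returns the total number of calls made.
    """
    deadline = time.monotonic() + duration
    counts = [0] * thread_count  # one slot per thread, so no lock is needed

    def worker(index):
        while time.monotonic() < deadline:
            send(random.random())
            counts[index] += 1
            time.sleep(interval)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts)
```

The defaults mirror the 50 threads and one-second interval mentioned in the description; `duration` would need to be raised to 30 minutes or more to match the reported time to failure.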