add support to run streams in a subprocess #126
Comments
If the streams become subprocesses, wouldn't that hinder the main process's ability to terminate the stream? How would you kill the stream?
I need to test it out, I have no experience with that, but I have read some articles: when creating a new process, you get the PID as a result, and it should be easy to kill/stop a process. And the processes can communicate with each other over a pipe: https://www.cloudcity.io/blog/2019/02/27/things-i-wish-they-told-me-about-multiprocessing-in-python/
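To illustrate the idea from the comment above, here is a minimal stdlib sketch (not this library's API; `stream_worker` is a hypothetical stand-in for a websocket receive loop): the parent knows the child's PID, receives data over a `Pipe`, and can kill the stream with `terminate()`.

```python
import multiprocessing as mp
import time

def stream_worker(conn):
    # Hypothetical stand-in for a websocket receive loop:
    # sends messages to the parent through one end of a Pipe.
    for i in range(1000):
        conn.send(i)
        time.sleep(0.01)

def run_demo():
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=stream_worker, args=(child_conn,))
    proc.start()
    pid = proc.pid              # the parent knows the child's PID
    first = parent_conn.recv()  # receive one message over the pipe
    proc.terminate()            # kill the stream process via its handle
    proc.join(timeout=5)
    return pid, first, proc.is_alive()
```

So terminating a stream running in a subprocess is straightforward via the `Process` handle (or the PID with `os.kill`); the open question is shutting it down gracefully rather than with a signal.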
Interesting. I wonder if the benefit of multiprocessing is significant enough. I think parallel processing is mostly useful for computational scripts; in our case, it's network I/O.
If you don't run it in parallel with multiprocessing, then you are limited to one CPU core! Even with 8 cores, your script can use only 12.5% of the system's CPU power.
You're 100% right. But does it need more than one core? It's network I/O, and most of the time the process is not working. Threading may be sufficient (unless there are hundreds of connections).
Try this: it's just throwing away the received data, not saving it to a database or anything else, and I think asyncio is very CPU-intensive.
But you are right, it's more a theoretical problem; 99.99% of all users will not be affected. It's very uncommon to stream everything, which doesn't make sense except for testing this lib :) But if I find the time, I think I will try it.
Actually it makes sense, because in periods of high volatility network traffic increases significantly, along with the CPU the app logic also needs. In the case of streaming multiple timeframes on Binance futures, for instance.
As far as I know, due to the GIL there is only one thread actually running in Python at any moment, even when you're multithreaded (https://hackernoon.com/concurrent-programming-in-python-is-not-what-you-think-it-is-b6439c3f3e6a and https://tenthousandmeters.com/blog/python-behind-the-scenes-13-the-gil-and-its-effects-on-python-multithreading/), unlike other languages that have true multithreading, unfortunately. Sadly, right now it's pretty hard to use the library to subscribe to orderbook updates every 500ms with "depth5", for example: when you have around 30 symbol subscriptions, the data received from the library lags by a minimum of 10 seconds or so (and this figure keeps growing). Still thinking of how to handle it in my project; maybe I'll spawn one subprocess and one manager per channel and pool the messages using PyZMQ (since that uses Cython internally, it should be fast and beat the built-in multiprocessing pipe and queue).
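The one-process-per-channel fan-in idea from the comment above can be sketched as follows. PyZMQ is the commenter's suggestion for the transport; this sketch substitutes a stdlib `multiprocessing.Queue` purely to show the structure, and `channel_worker` is a hypothetical stand-in for one subscription.

```python
import multiprocessing as mp

def channel_worker(symbol, out_q, n_messages):
    # Hypothetical stand-in for one "depth5@500ms" subscription:
    # each channel runs in its own process and pushes into a shared queue.
    for i in range(n_messages):
        out_q.put((symbol, i))

def pool_messages(symbols, n_messages=3):
    out_q = mp.Queue()
    procs = [mp.Process(target=channel_worker, args=(s, out_q, n_messages))
             for s in symbols]
    for p in procs:
        p.start()
    # The parent pools messages from all channels through one queue.
    received = [out_q.get() for _ in range(len(symbols) * n_messages)]
    for p in procs:
        p.join()
    return received
```

With PyZMQ the queue would be replaced by PUSH/PULL sockets, but the topology (N producers, one consumer pooling) is the same.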
Hi! Indeed, when starting to stream anything, it would be absolutely great to be able to choose between threads or processes. For some reason, right now... Is it even OK to try to pass the
The problem with subprocesses (and in your script) is that no memory objects can be shared between them. I'm in a bit of a hurry right now and I'm not quite sure if this is really a solution, but I think this is the direction it should go.
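The no-shared-memory problem mentioned above can be demonstrated with a short stdlib sketch: a plain list passed to a child process is copied, so the parent never sees the child's mutation; a `multiprocessing.Manager` proxy is one way around that (assumed as an illustration, not as this library's approach).

```python
import multiprocessing as mp

def append_plain(lst):
    lst.append(1)   # mutates the child's private copy only

def append_managed(lst):
    lst.append(1)   # proxy forwards the mutation to the manager process

def demo():
    plain = []
    p = mp.Process(target=append_plain, args=(plain,))
    p.start(); p.join()
    with mp.Manager() as mgr:
        shared = mgr.list()
        q = mp.Process(target=append_managed, args=(shared,))
        q.start(); q.join()
        # parent's plain list is untouched; the managed list is not
        return len(plain), len(shared)
```

Manager proxies are convenient but slow (every access is an IPC round trip), which is why the earlier comments lean toward pipes, queues, or PyZMQ for high-throughput stream data.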
This no longer fits directly into the concept: we work with AsyncIO and Kubernetes and no longer need subprocesses. If someone really urgently needs it, please contact us: https://www.lucit.tech/get-support.html
Is your feature request related to a problem? Please describe.
This lib is using multi-threading, which is not really parallel in Python: because of the GIL, all threads are executed sequentially in a cycle.
Describe the solution you'd like
Wrap streams into processes instead of threads to bypass the GIL.
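A minimal sketch of what the requested solution could look like, assuming a hypothetical `stream_loop` worker (not this library's API): the stream runs in its own process, bypassing the GIL, and the parent can stop it cooperatively via an `Event` instead of killing it.

```python
import multiprocessing as mp

def stream_loop(stop_event, out_q):
    # Hypothetical stand-in for a websocket receive loop;
    # checks the stop flag each cycle for a graceful shutdown.
    i = 0
    while not stop_event.is_set():
        out_q.put(i)
        i += 1
        if i >= 100:    # safety bound for this sketch
            break

def run_stream():
    stop = mp.Event()
    q = mp.Queue()
    proc = mp.Process(target=stream_loop, args=(stop, q))
    proc.start()
    first = q.get()     # block until the stream has produced something
    stop.set()          # request a graceful shutdown, no terminate() needed
    proc.join(timeout=5)
    return first, proc.is_alive()
```

This addresses the termination concern raised in the first comment: with a shared `Event`, the stream process can shut down cleanly rather than being signalled.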