
Differences between quamash and asyncio event loop + processEvents #66

Insoleet opened this issue Sep 4, 2016 · 21 comments

Insoleet (Contributor) commented Sep 4, 2016

Hello,

I was wondering what the implications would be of using the asyncio event loop for a Qt application instead of the quamash event loop.

We can use asyncio with Qt with the following simple code. The app must be killed to be stopped, but you get the point:

    import asyncio
    import sys

    from PyQt5.QtWidgets import QApplication

    async def process_events(qapp):
        # Give asyncio a chance to run pending callbacks, then drain Qt's event queue.
        while True:
            await asyncio.sleep(0)
            qapp.processEvents()

    qapp = QApplication(sys.argv)
    loop = asyncio.get_event_loop()
    loop.run_until_complete(process_events(qapp))

So what does Quamash bring in terms of compatibility and stability with the Qt system compared to the asyncio event loop? Or is it only a proof of concept?

The first thing I can see is that asyncio + modal dialogs does not work easily. But I wonder if there are more things to know that I'm not aware of?

Thanks

Insoleet changed the title from "Differences between quamash and standard event loop + processEvents" to "Differences between quamash and asyncio event loop + processEvents" on Sep 4, 2016
harvimt (Owner) commented Sep 4, 2016

Primarily performance, but since you're in Python, YMMV a lot with regard to how much that matters.

With quamash you're only running one event loop and plugging the Python callbacks into it; with this, you're essentially running two event loops. With quamash you get to use Qt's event loop for socket handling, which is potentially faster. (There's a re-implementation of the base asyncio event loop in C that's faster; I assume using Qt's file-descriptor/socket handling is similar.)
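
For reference, the single-loop setup with quamash looks roughly like this (a minimal sketch following the quamash README):

    import asyncio
    import sys

    from PyQt5.QtWidgets import QApplication
    from quamash import QEventLoop

    app = QApplication(sys.argv)
    loop = QEventLoop(app)        # one loop drives both Qt and asyncio
    asyncio.set_event_loop(loop)  # make it the default asyncio loop

    with loop:
        loop.run_forever()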

There are also potential compatibility concerns if some parts of your code do things the Qt way and some use built-in Python functions. Then again, if everything in your code is done the "Python way" instead of the "Qt way", you probably wouldn't have any problems.

I suppose performance would be a stronger motivator if I could show charts of what was faster.

Insoleet (Contributor, Author) commented Sep 9, 2016

So I looked at the performance impact of Quamash compared to asyncio. To do this, I used the aiohttp benchmarks (https://github.com/KeepSafe/aiohttp/tree/master/benchmark) and added a test using the Quamash event loop.

The results do not look good at all. Maybe the overhead of PyQt5 is causing this big a performance problem?

Results for run_aiohttp
RPS: 1116,  mean: 434.351 ms,   standard deviation 42.503 ms    median 435.099 ms
Results for run_aiohttp_quamash
RPS: 410,   mean: 1184.216 ms,  standard deviation 320.029 ms   median 1164.301 ms
Results for run_tornado
RPS: 1023,  mean: 479.672 ms,   standard deviation 99.634 ms    median 448.451 ms
Results for run_twisted
RPS: 1096,  mean: 445.074 ms,   standard deviation 61.861 ms    median 435.580 ms

harvimt (Owner) commented Sep 9, 2016

Yeah that's not great. It'd be good to get profile information out, to see why it's running slowly.

Insoleet (Contributor, Author) commented Sep 9, 2016

Here are some dumps that are pretty interesting, comparing the standard asyncio loop and quamash:

Standard asyncio loop:

[callgraph image]

Quamash loop:

[callgraph image]

harvimt (Owner) commented Sep 10, 2016

OK, so most time is spent here: https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L187 and here: https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L332-L342

Somewhat unsurprising.

possible improvements:

  • make self.__timers a set (may make perf worse since self.__timers is usually small)
  • change this elif block to an else (https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L191) since the assert already checks that the timerId is correct.
  • get rid of the assert, since the timerId will always be correct unless there's a bug in Qt.
  • add __slots__ to _SimpleTimer
  • might be able to get rid of self.__timers completely and just use the instance created; then we could eliminate upon_timeout
  • instead of creating/destroying _SimpleTimers a lot, have a pool that gets reused.
  • use a single simple timer and key off the timerId more (see the sketch below)
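
To make the last idea concrete, here is a rough sketch; the names mirror quamash's _SimpleTimer, but this is illustrative, not the actual implementation:

    from PyQt5.QtCore import QObject

    class _SingleTimer(QObject):
        """One QObject owning many Qt timers, keyed by timerId."""

        def __init__(self):
            super().__init__()
            self.__callbacks = {}  # timerId -> asyncio.Handle

        def add_callback(self, handle, delay=0):
            timerid = self.startTimer(int(delay * 1000))
            self.__callbacks[timerid] = handle
            return handle

        def timerEvent(self, event):
            timerid = event.timerId()
            self.killTimer(timerid)
            handle = self.__callbacks.pop(timerid, None)
            # Handle._run() is what asyncio uses internally to invoke the callback.
            if handle is not None and not handle._cancelled:
                handle._run()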

Insoleet (Contributor, Author) commented Sep 10, 2016

I did most of the easy fixes without much change:

  • make self.__timers a set (may make perf worse since self.__timers is usually small)
  • change this elif block to an else (https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L191) since the assert already checks that the timerId is correct.
  • get rid of the assert, since the timerId will always be correct unless there's a bug in Qt.
  • add __slots__ to _SimpleTimer
  • might be able to get rid of self.__timers completely and just use the instance created; then we could eliminate upon_timeout
  • instead of creating/destroying _SimpleTimers a lot, have a pool that gets reused.
  • use a single simple timer and key off the timerId more

If we compare the two results, it seems that the overhead is mostly caused by upon_timeout.
I'm trying to replace the list of timers with one _SimpleTimer instance for all our callbacks, using a simple timerId -> callback mapping.

Why does _add_callback return a _SimpleTimer? It is returned by call_soon, so I'm wondering why an asyncio.Handle isn't returned instead (https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.AbstractEventLoop.call_soon)?

Insoleet (Contributor, Author) commented Sep 10, 2016

OK, so I did a quick implementation of this option (https://github.com/Insoleet/quamash/tree/perfs):

  • use a single simple timer and key off the timerId more

This implementation breaks the Executor tests, notably the ones about subprocesses, but it was enough to run the benchmark.

It looks a bit better, but it's still not as good as standard asyncio:

Results for run_aiohttp
RPS: 1017,  mean: 477.999 ms,   standard deviation 70.280 ms    median 468.330 ms
Results for run_aiohttp_quamash
RPS: 699,   mean: 694.200 ms,   standard deviation 105.264 ms   median 689.685 ms

[callgraph image]

Any ideas, maybe?

harvimt (Owner) commented Sep 10, 2016

Maybe Qt's IO subsystem just isn't that fast? Or context switches between Qt and Python are so slow that it doesn't matter? I think that's about as performant as we can get without using numba or cython.

As for the question about handles: I think that code changed quite a bit over the course of fixing bugs. I wanted to just return the handle, but I needed to wrap the handle in another handle. I made _SimpleTimer a subclass of asyncio.Handle at one point, but it couldn't be a subclass of both QObject and Handle. I think the PEP only requires that the returned object support a .cancel() method.
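
To illustrate that last point, something as minimal as the hypothetical wrapper below would satisfy the ".cancel()" requirement without subclassing asyncio.Handle (all names here are made up for the sketch):

    class _TimerHandleWrapper:
        """Minimal object for call_soon/call_later to return: only .cancel() is required."""

        def __init__(self, handle, qt_timer):
            self._handle = handle      # the underlying asyncio.Handle
            self._qt_timer = qt_timer  # whatever Qt-side resource backs it

        def cancel(self):
            self._handle.cancel()
            self._qt_timer.stop()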

Insoleet (Contributor, Author) commented Sep 10, 2016

OK, I understand. I think a 30% performance gain is still really interesting. I'd love to fix the errors with the subprocess stuff.

I need some help because I don't understand some things. For example, why does this test end even though there is a loop.run_forever inside it and nothing that looks like it could stop the loop?
https://github.com/harvimt/quamash/blob/master/tests/test_qeventloop.py#L149

harvimt (Owner) commented Sep 10, 2016

Ugh, that's the low-level test. I only wrote the low-level tests because the high-level tests failed and I didn't know why.

This code is what stops the loop when run_forever is called:
https://github.com/harvimt/quamash/blob/master/tests/test_qeventloop.py#L30

kamikaze commented Nov 4, 2017

@Insoleet please provide example code where you don't need to kill the app, but it quits when closing the main window. Thanks.

Insoleet mentioned this issue Nov 16, 2017 (merged)
Insoleet (Contributor, Author) commented

So, after reading my code again, I found out that the mistake was rather basic: I needed to implement a method called _add_callback.

Well, I suggest my pull request, which still gives a 33% speed gain.

@kamikaze: see #65 (comment)
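
For the record, here is a minimal sketch of one way to stop the polling approach from the top of this thread when the main window closes (this is not necessarily what the linked comment in #65 does):

    import asyncio
    import sys

    from PyQt5.QtWidgets import QApplication, QWidget

    qapp = QApplication(sys.argv)
    window = QWidget()
    window.show()

    async def process_events_until_closed(qapp, window):
        # Same polling loop as before, but it exits once the main window has been
        # closed instead of requiring the process to be killed.
        while window.isVisible():
            await asyncio.sleep(0.01)
            qapp.processEvents()

    loop = asyncio.get_event_loop()
    loop.run_until_complete(process_events_until_closed(qapp, window))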

cr1901 commented Jun 17, 2018

@harvimt I also have the same question as indicated in this issue (I admit I am not so great with asyncio either):

There are also potential compatibility concerns if some parts of your code do things the Qt way and some use built-in Python functions. Then again, if everything in your code is done the "Python way" instead of the "Qt way", you probably wouldn't have any problems.

Can you elaborate on what you mean by "the differences between the Python way and the Qt way" (possibly with a code snippet)? Is this referring to how to send events back to the Qt event loop (and define callbacks) versus how to do the same with Python's async/await?

harvimt (Owner) commented Jun 17, 2018

No. Basically I mean that if there's a Qt library for something IO-related (like, say, QNetworkRequest) vs. a Python-native module that doesn't depend on Qt (like, say, aiohttp), you should use the Python-native one.
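
As an illustration of the "Python-native" route, a coroutine like this runs unchanged on the quamash loop (a minimal sketch assuming aiohttp is installed; the URL is arbitrary):

    import aiohttp

    async def fetch(url="https://example.org"):
        # Plain asyncio-native IO: works on the default loop or on quamash's QEventLoop.
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                return await response.text()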

harvimt (Owner) commented Jun 17, 2018

That would extend to Python built-in functions as well, like local file access and direct network sockets.

enkore commented Aug 20, 2018

A major difference I didn't see mentioned here is that quamash only works for the main application event loop, i.e. QApplication's main/UI thread. Meanwhile, the "let's put an event loop in an event loop so you can loop while you loop" approach actually works for arbitrary QThreads. Sample code:

class ...(QObject):
    @pyqtSlot()
    def run(self):
        # Connected to QThread.started; thus will be called in the new thread.
        asyncio.set_event_loop(asyncio.new_event_loop())
        self.evloop = asyncio.get_event_loop()
        self.evloop.run_until_complete(self.run_event_loop())

    async def run_event_loop(self):
        """
        Runs the event loop. Will only complete if the thread receives an interruption request
        (QThread.requestInterruption()).

        Since we are using asyncio only for coordination, the loop will block for Qt events.
        """
        thread = self.thread()
        while not thread.isInterruptionRequested():
            await asyncio.sleep(0)
            thread.eventDispatcher().processEvents(QEventLoop.AllEvents|QEventLoop.WaitForMoreEvents)
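
(For completeness, a typical setup for such a worker might look like the sketch below; the class name Worker and the variable names are assumed, not taken from the code above.)

    thread = QThread()
    worker = Worker()                   # the QObject subclass sketched above
    worker.moveToThread(thread)
    thread.started.connect(worker.run)  # run() then executes in the new thread
    thread.start()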

The sequence to get such a thread to join is thus as follows:

thread.requestInterruption()       # Set the interruption request flag
thread.eventDispatcher().wakeUp()  # Causes processEvents() to return even if no real events arrive
thread.quit()                      # Causes the event loop to quit (the one running in QThread.run, not run_event_loop())
thread.wait()                      # QThread's equivalent of join(); the thread will exit now

I am using this in the context of QNetworkAccessManager, where I have the ability to retry requests and such things, but I also wanted to avoid the signal/callback-oriented approach, since that tends to get messy fast.

Thus, the actual events driving this and completing futures are the QNetworkReply status slots explicitly setting results on futures created earlier. The asyncio event loop is only around to resume coroutines as the futures they're waiting on become done.
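
A hedged sketch of that bridging pattern (not the actual code; fetch and its error handling are illustrative):

    import asyncio

    from PyQt5.QtCore import QUrl
    from PyQt5.QtNetwork import QNetworkAccessManager, QNetworkRequest

    def fetch(manager: QNetworkAccessManager, url: str) -> asyncio.Future:
        """Start a GET request and return a Future that a coroutine can await."""
        loop = asyncio.get_event_loop()
        future = loop.create_future()
        reply = manager.get(QNetworkRequest(QUrl(url)))

        def finished():
            # Runs as a Qt slot; resolving the future resumes the awaiting coroutine.
            if future.cancelled():
                pass
            elif reply.error() == reply.NoError:
                future.set_result(bytes(reply.readAll()))
            else:
                future.set_exception(RuntimeError(reply.errorString()))
            reply.deleteLater()

        reply.finished.connect(finished)
        return future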

erdewit (Contributor) commented Sep 5, 2018

The process_events coroutine at the start of this thread has the slight drawback that it doesn't handle DeferredDelete events, as mentioned in the Qt docs. I noticed this when using QtCharts, with the charts not updating properly.

It can be remedied by using

async def process_events():
    # "qt" here stands for the Qt module in use, e.g. PyQt5.QtCore.
    qloop = qt.QEventLoop()
    timer = qt.QTimer()
    timer.timeout.connect(qloop.quit)
    while True:
        # Running a short-lived nested QEventLoop is what lets DeferredDelete
        # events be handled (see above); the zero-timeout timer quits it right away.
        timer.start(0)
        qloop.exec_()
        timer.stop()
        await asyncio.sleep(0.001)

enkore commented Sep 7, 2018

The same applies to QAbstractEventDispatcher::processEvents as well.

Meanwhile, I wrote Asynker, which for my use case completely removes the asyncio dependency (and thus its concurrent.futures and multiprocessing dependencies and some other modules, all of which add up to 1+ MB of bloat after compression if you are building a standalone app). It includes some examples showing how to mix Qt and async; they should apply, for the most part, to asyncio as well.

erdewit (Contributor) commented Sep 9, 2018

It turns out that the loop in my previous post has the drawback of not firing QApplication events, such as aboutToQuit and lastWindowClosed. What works very well for me as an alternative to quamash is:

def run():
    def onTimer():
        # Run the asyncio callbacks that are currently ready, then
        # return control to the Qt event loop.
        loop.call_soon(loop.stop)
        loop.run_forever()

    loop = asyncio.get_event_loop()
    timer = qt.QTimer()
    timer.start(10)  # poll the asyncio loop every 10 ms
    timer.timeout.connect(onTimer)
    qt.QApplication.instance().exec_()

Since this is in essence a brain-dead polling loop, there is a trade-off between idle efficiency and latency.

The reason for not using quamash is that on slow CPUs (cheap Celeron notebooks) there appears to exist a point where it generates more events than can be handled. When the application ventures beyond this event horizon, it freezes with 100% CPU utilization.

According to the docs it should be possible to use QAbstractEventDispatcher for a more efficient event loop integration but it's completely unclear to me how. There's a new gevent project that uses this approach.

harvimt (Owner) commented Sep 10, 2018 via email

enkore commented Sep 10, 2018

Subprocess handling should be fairly straightforward using QProcess (which is already asynchronous, so only a convenience adapter would be needed[1])... the asyncio.subprocess_exec API looks quite complicated to me.

[1] Care should be taken to do error handling properly. I'm thinking along the lines of e.g.

def start(self) -> Future:
    # "Future" here is an asyncio.Future; disconnect_both_signals stands in for
    # disconnecting the two slots again once the future is done.
    def started(): future.set_result(None)
    def errored(err): future.set_exception(SomeError(err))
    self.process.started.connect(started)
    self.process.errorOccurred.connect(errored)  # QProcess's error signal is errorOccurred
    future = Future()
    future.add_done_callback(disconnect_both_signals)
    return future

Then

proc = Process(['ls', '-l', '/'])
await proc.start()
output = ''
while proc.running():
    output += await proc.read_stdout()
exit_code = await proc.finish()  # <-- would require a bit more finesse than shown above for start()

It occurs to me here that there is a term needed to describe concurrency safety aspects of asynchronous programs that I don't know / haven't heard of. With threads, we have thread safety as the notion that we can interact with some object from multiple threads concurrently. A similar notion likely applies to complex asynchronous objects like the Process API I've shown above; it would likely be unsafe to manipulate a Process from multiple "lines" of execution concurrently.
Does somebody know the correct term for this property?

Edit: I wrote a first implementation of this idea. See https://github.com/enkore/asynker/blob/master/examples/subprocess.py
