New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

100% CPU usage with idle websocket #1955

Closed
naggie opened this Issue Jun 5, 2017 · 13 comments

Comments

Projects
None yet
7 participants
@naggie

naggie commented Jun 5, 2017

Long story short

Fairly standard websocket server/client setup, but with session, results in 100% CPU usage.

  1. Generic websocket server
  2. Generic websocket client, but with a session that does a get request (to the same endpoint) before a new request to start a websocket
  3. In a few minutes, CPU usage increases to 100%

Note that removing the get request seems to avoid the problem.

Expected behaviour

Low CPU usage when idle

Actual behaviour

Worker process suddenly goes to 100% CPU usage, confirmed with htop.

Steps to reproduce

Server

import asyncio
import aiohttp
from aiohttp import web

async def worker_websocket(request):
    ws = aiohttp.web.WebSocketResponse()
    await ws.prepare(request)

    async for msg in ws:
        pass

    return ws

app = aiohttp.web.Application()
app.router.add_get('/worker-websocket',worker_websocket)
web.run_app( app, port=int(8080))

worker

import aiohttp
from aiohttp import web
import asyncio

async def websocket_client(loop):
    url = 'http://localhost:8080/worker-websocket'

    session = aiohttp.ClientSession(loop=loop)

    # removing this prevents the problem!
    async with session.get(url) as resp:
        pass

    async with session.ws_connect(url=url) as ws:
        async for msg in ws:
            pass

loop = asyncio.get_event_loop()
client = websocket_client(loop)
loop.run_until_complete(client)

Run server + worker and wait a few minutes. Eventually the server process will eat one core. Remove the session.get and it does not happen.

Your environment

I've reproduced this, so far, on:

  • MacOS 10.12 (Sierra) + Python 3.6.1 with aiohttp 2.1.0
  • Ubuntu server 16.04 LTS + Python 3.5.2 with aiohttp 2.1.0

Using strace, all I can see is highly frequent epoll_wait calls returning 0. My attempts with pyrasite and gdb-python3 yielded nothing useful.

I'm still yet to try ipdb and try to look through the source.

The only thing I can find that is even remotely similar is https://groups.google.com/forum/#!topic/libuv/opFk74lww7k .

Otherwise my experience with aiohttp has been fantastic. Thanks for a great library.

@frillip

This comment has been minimized.

frillip commented Jun 6, 2017

Confirmed on Windows too.

Windows 7 Enterprise 64bit + Python 3.5.1 amd64 + aiohttp 2.1.0

Usage is show as 25% as it's running on a quad-core machine:
image

Oddly, usage spikes suddenly?

image

@naggie naggie changed the title from 100% CPU usage when otherwise idle to 100% CPU usage with idle websocket Jun 6, 2017

@fafhrd91

This comment has been minimized.

Member

fafhrd91 commented Jun 8, 2017

@naggie @frillip could you test master branch

@naggie

This comment has been minimized.

naggie commented Jun 8, 2017

I can confirm it's still an issue on master, commit 6b85bcd

@pjknkda

This comment has been minimized.

Contributor

pjknkda commented Jun 10, 2017

It seems like a problem related to keep-alive. When I tested the code above, web_protocol._process_keepalive is called infinitely after 75 seconds later (which is default value of keepalive_timeout) from calling session.get.

    def _process_keepalive(self):
        if self._force_close:
            return

        # 1. "self._keepalive_time" is not updated during waiting "async for msg in ws:"
        # therefore the value of "next" is unchanged.
        next = self._keepalive_time + self._keepalive_timeout

        if len(self._request_handlers) == len(self._waiters):  # <-- 2. no waiters during "async for msg in ws:"
            now = self._time_service.loop_time
            if now + 1.0 > next:
                self.force_close(send_last_heartbeat=True)
                return

        # 3. Therefore, the value of "next" is current time, which calls "self._process_keepalive" immediately.
        self._keepalive_handle = self._loop.call_at(
            next, self._process_keepalive)

fafhrd91 added a commit that referenced this issue Jun 18, 2017

@fafhrd91

This comment has been minimized.

Member

fafhrd91 commented Jun 18, 2017

should be fixed in master

@fafhrd91 fafhrd91 closed this Jun 18, 2017

@naggie

This comment has been minimized.

naggie commented Jun 18, 2017

Fantastic! Thank you all for your time!

@ericdevries

This comment has been minimized.

ericdevries commented Oct 27, 2017

I have found that this still results in a 100% cpu usage situation after the default timeout of 75 seconds, using it in combination with python-socketio. If needed, I can provide new evidence.

aiohttp version 2.3.1

@zTrix

This comment has been minimized.

zTrix commented Nov 18, 2017

Same issue happened to me, any hint on how to debug this issue?

@fafhrd91

This comment has been minimized.

Member

fafhrd91 commented Nov 19, 2017

@ericdevries @zTrix could you provide reproducible example?

@zTrix do you use python-socketio as well?

@ericdevries

This comment has been minimized.

ericdevries commented Nov 19, 2017

@fafhrd91 sure, I will create one tomorrow and attach it to this ticket

@fafhrd91 fafhrd91 reopened this Nov 20, 2017

@fafhrd91 fafhrd91 added the bug label Nov 20, 2017

@fafhrd91 fafhrd91 assigned fafhrd91 and unassigned fafhrd91 Nov 20, 2017

@fafhrd91 fafhrd91 added this to the 3.0 milestone Nov 20, 2017

@zTrix

This comment has been minimized.

zTrix commented Nov 20, 2017

@fafhrd91 No I didn't use python-socketio.

In my case I just started an server with websocket, and diving into _process_keepalive function shows that next actually is less than _loop.time(), but len(self._request_handlers) == len(self._waiters) is False, so it's calling itself just like an infinite loop.

@asvetlov

This comment has been minimized.

Member

asvetlov commented Nov 20, 2017

Wild idea: disable HTTP keepalives after websocket connection establishment. They are useless, isn't it?

@asvetlov

This comment has been minimized.

Member

asvetlov commented Nov 20, 2017

Also looks like RequestHandler._process_keepalive misses self._keepalive_time = self._loop.time() line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment