"channels over capacity in group xxx" flood #181

Closed
dperetti opened this issue Feb 27, 2020 · 10 comments · Fixed by #199

Comments

@dperetti

dperetti commented Feb 27, 2020

Since 2.4.2, we're getting tons of "channels over capacity in group xxx" messages, even though we make little use of Django Channels (just a couple of admin users).
I still haven't found out exactly why, but the fact is that logging this message on every RedisChannelLayer.group_send() call can be extremely noisy.
It would be nice to find a way to throttle this!
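One way to throttle the message without touching channels_redis is a standard-library logging filter. This is a minimal sketch, not part of channels_redis: the `ThrottleFilter` class and its `interval` and `clock` parameters are hypothetical names, and only the `channels_redis.core` logger name comes from this thread.

```python
import logging
import time


class ThrottleFilter(logging.Filter):
    """Let a given message through at most once per `interval` seconds."""

    def __init__(self, interval=60.0, clock=time.monotonic):
        super().__init__()
        self.interval = interval
        self.clock = clock  # injectable so the filter can be tested
        self._last_emitted = {}  # formatted message -> time it last passed

    def filter(self, record):
        key = record.getMessage()
        now = self.clock()
        last = self._last_emitted.get(key)
        if last is not None and now - last < self.interval:
            return False  # same message seen too recently: drop it
        self._last_emitted[key] = now
        return True


# Attach to the logger that emits the over-capacity message.
logging.getLogger("channels_redis.core").addFilter(ThrottleFilter(interval=60.0))
```

The filter keys on the fully formatted message, so each group name is throttled separately. The dictionary grows with the number of distinct messages, which should stay small when only group names vary.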

@tarikki
Contributor

tarikki commented Feb 27, 2020

Perhaps a good hotfix would be to tone down the logger from exception to warning.

Is it logging the exception on every call to group_send() even though no channels are over capacity?

@carltongibson
Member

Yes, this was added in #172. You can adjust the settings for the channels_redis.core logger.
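A sketch of what adjusting that logger could look like, assuming the message is logged via logger.exception() (that is, at ERROR level) as described in this thread. In a Django project only the `LOGGING` dict would go in settings; the `dictConfig` call is included here so the snippet runs standalone.

```python
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "loggers": {
        # Only records at CRITICAL pass, so the ERROR-level
        # over-capacity message is suppressed for this logger.
        "channels_redis.core": {
            "level": "CRITICAL",
        },
    },
}

# Django applies settings.LOGGING itself; this call is for standalone use.
logging.config.dictConfig(LOGGING)
```

Note that this hides every ERROR and lower record from `channels_redis.core`, not just the over-capacity one.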

> Is it logging the exception on every call to group_send() even though no channels are over capacity?

Yes, this too: is there an issue here?

@dperetti
Author

@tarikki it seems so, but I haven't set any channel_capacity in CHANNEL_LAYERS. What's the default then?

@tarikki
Contributor

tarikki commented Feb 28, 2020

The default capacity is 100 messages and the default message expiration time is 60 seconds. So if the channel is never read within these capacity / time constraints, it will fill up.

I made the PR to log the channels-over-capacity event because I ran into an issue of messages not being delivered in production, and it was only after logging Redis queue lengths that I realised this was the root cause.

One reason why a channel might fill up is a client on spotty wifi disconnecting without the connection ever being properly closed. In this case the channel remains in the group and eventually fills up once enough messages are pushed to it through group_send().
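The failure mode described above can be illustrated with a toy model. This is not how channels_redis is implemented: `ToyLayer` and its methods are hypothetical, queues live in memory rather than in Redis, and message expiry is ignored entirely. It only shows how one channel that is never read reaches capacity while a healthy one stays empty.

```python
from collections import deque

CAPACITY = 100  # default capacity cited in this thread


class ToyLayer:
    """In-memory stand-in for a channel layer with a single group."""

    def __init__(self, capacity=CAPACITY):
        self.capacity = capacity
        self.queues = {}  # channel name -> deque of pending messages

    def add(self, channel):
        self.queues.setdefault(channel, deque())

    def group_send(self, message):
        """Queue the message where there is room; return the number of full channels."""
        over_capacity = 0
        for queue in self.queues.values():
            if len(queue) >= self.capacity:
                over_capacity += 1  # message is dropped for this channel
            else:
                queue.append(message)
        return over_capacity

    def receive(self, channel):
        queue = self.queues[channel]
        return queue.popleft() if queue else None


layer = ToyLayer()
layer.add("healthy")
layer.add("stale")  # client vanished, nothing ever reads this channel

warnings = 0
for i in range(150):
    if layer.group_send(i):
        warnings += 1  # one "over capacity" report per group_send call
    layer.receive("healthy")  # the healthy consumer keeps up

print(warnings, len(layer.queues["stale"]), len(layer.queues["healthy"]))
```

Once the stale channel holds 100 messages, every later send reports it again, which matches the flood of log lines described in this issue.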

One way to mitigate this is to have enough capacity and a short timeout. This config in django settings solved the issue for me:

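import os

CHANNEL_LAYERS = {
    'default': {
        'BACKEND': 'channels_redis.core.RedisChannelLayer',
        'CONFIG': {
            # Redis host comes from the environment, falling back to localhost.
            'hosts': [(os.environ.get('REDIS_LOCATION', 'localhost'), 6379)],
            'capacity': 1500,  # messages per channel (default 100)
            'expiry': 10,  # seconds until an unread message expires (default 60)
        },
    },
}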

Every channel has a capacity of 1500 and every message expires after 10 seconds if it's not read. If you only have a couple of users on your system, this will not even make a dent in Redis's memory usage.

Hindsight being 20/20, a log level of exception might be a bit harsh here. The only reason to log it at all is so that, if you start wondering why some messages are not being delivered, you can find the cause by digging through the logs.

@carltongibson: I could make a PR to drop the log level to warning or even info, which would be much more appropriate.

@carltongibson
Member

Hi @tarikki. Yes, info seems appropriate.

Maybe a note in README re expiry would be worthwhile too.

@astutejoe
Contributor

@tarikki If you check #182 and Line 306, messages are actually not being discarded after the expiry time if more messages keep coming in for that group, which can lead to the explosions in memory usage we've been seeing.

@tarikki
Contributor

tarikki commented Mar 6, 2020

Opened PR #183 to lower log level.

EDIT: @carltongibson Oh I missed your comment on the README. What did you have in mind?

@jberends

jberends commented Apr 15, 2020

@tarikki maybe you can add the following to the README (from your explanation):

The default capacity is 100 messages and the default message expiration time is
60 seconds. So if the channel is never read within these capacity / time
constraints, it will fill up.

One reason why a channel might fill up is when the connection is never properly 
closed. In this case the channel will remain in the group and eventually fill up.

One way to mitigate this is to have enough capacity and a short timeout. You 
can alter the configuration in the Django settings in the following way:
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            # REDIS_URL is a placeholder for wherever your Redis server lives;
            # "hosts" takes a list even when there is only one server.
            "hosts": [REDIS_URL],
            "capacity": 1500,  # default 100
            "expiry": 10,  # default 60
        },
    },
}

@fogmoon

fogmoon commented May 18, 2020

Hi @dperetti,
I just used your method to fix this issue, but we are still getting many "channels over capacity in group" error logs.
We have almost 500 active channels now, and each client needs to broadcast messages to the other clients very frequently, about 1~2 times per second.
So is there a way to simply discard the channel from the group when the over-capacity check fails (maybe with a smarter mechanism, such as only after a channel has been over capacity for some time), or do I have to expand the capacity and/or reduce the expiry to fit our scenario, such as:

'capacity': 5000,
'expiry': 5,

Thanks for your time.
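A rough back-of-the-envelope check of the numbers in this scenario. It assumes all 500 clients are in a single group, every broadcast reaches every channel, and one channel is never read; `seconds_until_full` is a hypothetical helper, not a channels_redis function.

```python
def seconds_until_full(capacity, clients, sends_per_client_per_second):
    """Time for an unread channel to reach capacity under a steady broadcast load."""
    inbound_rate = clients * sends_per_client_per_second  # messages per second
    return capacity / inbound_rate


for capacity in (1500, 5000):
    slow = seconds_until_full(capacity, clients=500, sends_per_client_per_second=1)
    fast = seconds_until_full(capacity, clients=500, sends_per_client_per_second=2)
    print(f"capacity={capacity}: full after {fast:.1f} to {slow:.1f} s")
```

Under these assumptions a stalled channel fills within a few seconds at either capacity, so raising the capacity alone only delays the over-capacity reports rather than preventing them.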

@tarikki
Contributor

tarikki commented Jun 28, 2020

@jberends: good idea! I'll try to make a PR for that at some point; unfortunately I'm too busy IRL right now.

6 participants