
Subscriber receives all cached messages upon initial connection on Redis-backed channel regardless of buffer length #445

Closed
danjbh opened this issue Feb 26, 2018 · 16 comments

Comments

danjbh commented Feb 26, 2018

I'm currently testing nchan w/ a Redis back-end and noticed that my websocket clients will receive all messages cached by the app instead of the number I have set in nchan_message_buffer_length (with nchan_subscriber_first_message set to oldest).

Basically, I need clients who request a specific location to get just the 5 messages in the channel buffer. I'm not sure if it makes a difference, but I am running this inside of Docker (nchan version 1.1.14). If I use the backup storage mode, this problem goes away, but then I can't scale this app horizontally.
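
For reference, switching to backup mode presumably amounts to something like the following (a sketch; the nchan_redis_storage_mode directive and its "backup"/"distributed" values are taken from the nchan README, so verify against the version in use):

    nchan_use_redis on;
    nchan_redis_url "redis://redis:6379";
    # "backup" keeps the channel buffer in local nginx memory and uses Redis only as a
    # backup store, which avoids the symptom described here but, as noted above, prevents
    # sharing messages across multiple nginx instances ("distributed" is the default mode)
    nchan_redis_storage_mode backup;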

Also, I noticed that if I run a flushall on Redis after hitting the buffer length, it seems to un-stick the messages somehow (perhaps that's a clue).

Let me know if there's anything I can do to help track this one down. This is a great piece of software, and I'd be willing to send over a healthy donation if we could get this fixed and into production!

Here is my config and steps to reproduce...

server {
    nchan_use_redis on;
    nchan_redis_url "redis://redis:6379";
    nchan_message_buffer_length 5;
    nchan_message_timeout 0;

    listen       80;
    server_name  localhost;

    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }

    location = /oldest {
      nchan_subscriber websocket;
      nchan_channel_id "foo";
      nchan_subscriber_first_message oldest;
    }

    location = /pub {
      nchan_publisher;
      nchan_channel_id "foo";
    }
}

Steps to reproduce

  1. Use provided config above
  2. Write 5 messages to the foo channel using the pub endpoint (via curl or preferred method)
  3. Subscribe to the channel via a WebSocket connection so that it gets cached
  4. Write 5 additional messages to the foo channel
  5. Subscribe to the channel via WebSocket again and you get all 10 cached messages from steps 2 and 4, instead of only the 5 allowed by nchan_message_buffer_length

danjbh commented Feb 26, 2018

Also, I do realize that I could just set nchan_subscriber_first_message to -5 and call it a day; however, I'm actually looking to set the buffer length to 100, which is beyond the maximum allowed value of -32 for that setting.

I've essentially reduced the example to use a buffer length of 5 to help simplify and speed up the steps to reproduce.
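
For anyone else hitting this, that stopgap would look roughly like the following in the subscriber location (a sketch based on the config from the original report; a negative nchan_subscriber_first_message delivers that many of the newest buffered messages):

    location = /oldest {
      nchan_subscriber websocket;
      nchan_channel_id "foo";
      # deliver only the 5 newest buffered messages to each new subscriber;
      # the value is capped at -32, so this does not cover a buffer length of 100
      nchan_subscriber_first_message -5;
    }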

@concreted

I'm also running into this issue. I was able to alleviate it for a while with nchan_subscriber_first_message -<num_messages>, but am still seeing it occasionally. From what I've seen, it happens more often on resource-constrained deployments: on AWS, I almost never see it with 4 t2.small nodes, but see it fairly consistently with 1 t2.small.

@concreted

Restarting the Nchan servers also clears out the old messages. So it seems like they are cached to memory and getting 'stuck'.

danjbh commented Mar 23, 2018

Any thoughts on this or anything we can do to help move it along?

slact (Owner) commented Mar 24, 2018

I already have a pretty good idea of what's happening here; the fix is just a bit tricky. If you're interested, I wouldn't mind if you did some regression testing to see if an earlier version worked as expected.

danjbh commented Mar 24, 2018

Definitely. Just let me know which version and I'll give it a shot. Thanks!

@concreted

I'm also willing to do some testing. Do you have an idea of which range of older versions may have worked as expected?

danjbh commented Mar 29, 2018

*nudge* :)

slact (Owner) commented Jun 27, 2018

@danjbh, @concreted: This issue should be fixed with d3a6557. Please rebuild Nchan from latest master and let me know if the problem persists, as I have trouble replicating it with any consistency.

slact added the testing label Jun 27, 2018

slact (Owner) commented Jun 27, 2018

Err, nope, I just reproduced it. Not fixed yet...

slact (Owner) commented Jun 27, 2018

Aaand fixed in a778114. Please rebuild from master and give it a try.

slact changed the title from "Websocket client receives all cached messages upon initial connection regardless of buffer length" to "Subscriber receives all cached messages upon initial connection on Redis-backed channel regardless of buffer length" Jun 28, 2018

danjbh commented Jun 28, 2018

I'm heading out of town for the next week but I'll give it a shot and see if I can reproduce once I return. Thanks again for looking into this @slact !

danjbh commented Jul 9, 2018

Alright, I'm back in action and was able to perform some limited testing this morning. So far I have been unable to reproduce and will do some extensive ("production style") testing later this afternoon/evening.

Anyhow, everything looks promising so far -- thanks again and stay tuned!

danjbh commented Jul 10, 2018

Just finished rolling these changes into production and everything is looking great so far. I'll monitor closely over the next few days and will report any issues, but typically I would have seen a problem by now. I'm able to successfully scale the app horizontally into multiple containers w/ a shared redis back-end. Good stuff!

Thanks a bunch for getting this working -- this software has been extremely valuable and even more so now that we can scale it properly!

slact (Owner) commented Jul 10, 2018

Great, this code will be part of the upcoming release. Redis connection management got a big rewrite too: you can now load-balance message delivery onto slaves and optimize for Redis CPU or bandwidth, and there's auto-failover for clusters and master-slave setups, plus a bunch of other stuff. I'll be putting out an official release with all of this in a week or two.

slact closed this as completed Jul 10, 2018
slact removed the testing label Jul 10, 2018

ivanovv commented Jun 19, 2020

@danjbh It would be awesome if you could share the horizontal scaling setup that you use.

From this thread and the nginx.conf 2016 slides by Leo, I have the impression that your setup looks something like this:

  • several t2.small instances in AWS EC2 use the same nginx/nchan config
  • ELB accepts traffic and passes connections to a random instance

Am I right?
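
For concreteness, here is a minimal per-node sketch of that presumed setup, pieced together from the config earlier in this thread; the Redis hostname and endpoint paths are placeholders:

    server {
        listen 80;

        # every node points at the same shared Redis, so the load balancer can hand
        # a connection to any node and it will see the same channels and messages
        nchan_use_redis on;
        nchan_redis_url "redis://shared-redis.internal:6379";
        nchan_message_buffer_length 5;

        location = /sub {
            nchan_subscriber websocket;
            nchan_channel_id "foo";
            nchan_subscriber_first_message oldest;
        }

        location = /pub {
            nchan_publisher;
            nchan_channel_id "foo";
        }
    }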

Also, I have a couple of questions:

  1. Why did you decide to go with several smaller instances instead of one big one? Failover? If so, do you have any failover procedures/scripts?
  2. What is your overall impression of running nchan in such a setup? Any issues or problems?

Thanks in advance!
