
Video Freezing in Chrome #156

Open
stongo opened this issue Mar 1, 2016 · 39 comments

Comments
@stongo

stongo commented Mar 1, 2016

Using newer versions of the bridge, we have been experiencing random freezing of the video channel. It doesn't occur for every participant.
I've included a webrtc-internals graph showing it happening just before 7:55pm. There is nothing abnormal in the bridge logs or in our client logs (we are not using Jitsi Meet). We ended up rolling back to version 564, where this does not happen.
[image: talky-jvb-video-freeze] https://cloud.githubusercontent.com/assets/1449748/13441660/683e8f46-dfc6-11e5-823e-4d6514f96681.png

@jitsi-developers

Just before 7:55 NACKs and PLIs increase and the received frame rate drops to 0, but I notice some peculiar behavior starting at 7:54. You don't seem to be using simulcast. Are you using RTCP termination? Could you please enable fine logging at the bridge and logging at the client, and share the log files with us?


@stongo
Author

stongo commented Mar 1, 2016

We aren't using simulcast, it's true.

I'm glad you actually bring up RTCP termination. The documentation is outdated, and we aren't sure how to make it work.

We'd like to enable what was org.jitsi.impl.neomedia.rtcp.termination.strategies.HighestQualityRTCPTerminationStrategy, but setting it according to the docs fails.

Maybe setting it correctly would fix the issue?

@jitsi-developers

HQRTS no longer exists; I would suggest either enabling BasicRTCPTerminationStrategy or disabling RTCP termination completely (just don't set anything). The correct way to set the BRTS is this:

org.jitsi.videobridge.rtcp.strategy=org.jitsi.impl.neomedia.rtcp.termination.strategies.BasicRTCPTerminationStrategy


@stongo
Author

stongo commented Mar 1, 2016

We tried with BasicRTCPTerminationStrategy, but it was throwing an index-out-of-bounds exception.
One change I did make, which seems to help, was removing org.jitsi.impl.neomedia.transform.srtp.SRTPCryptoContext.checkReplay=false.
Testing in our staging environment, the issue seems to have gone away, but staging isn't always reliable for reproducing the bug. I'll plan a production deploy tomorrow, turning off RTCP termination and removing the checkReplay setting, and will update this issue again.
Thanks for the help!

@stongo
Author

stongo commented Mar 3, 2016

Video is still freezing. It's pretty easy to reproduce right now on https://talky.io with 3+ callers, as I haven't rolled back yet.
I also have a webrtc-internals dump, if that would help.

@jitsi-developers

Hi Marcus, in order to help us understand the situation we need logs from the bridge, the sip-communicator.properties file you use to configure the bridge, and screenshots from the webrtc-internals page (because that's much quicker than having to graph the raw data).

Best,
George


@jitsi-developers

P.S. Before you get any logs from the bridge, it would be helpful to set the global log level to FINE by editing the logging.properties file.
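For example, a minimal logging.properties change would be the two lines below (the exact handler names depend on how your bridge is launched; this assumes the stock java.util.logging ConsoleHandler):

.level=FINE
java.util.logging.ConsoleHandler.level=FINE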


@stongo
Author

stongo commented Mar 3, 2016

George, thanks for the help! I think the graph above should suffice, then.
Here's the sip-communicator.properties:

org.jitsi.videobridge.TCP_HARVESTER_MAPPED_PORT=443
org.jitsi.videobridge.TCP_HARVESTER_PORT=4443
org.jitsi.videobridge.STATISTICS_TRANSPORT=pubsub
org.jitsi.videobridge.ENABLE_STATISTICS=true
org.jitsi.videobridge.STATISTICS_INTERVAL=15000
org.jitsi.videobridge.PUBSUB_SERVICE=pubsub.foo.bar
org.jitsi.videobridge.PUBSUB_NODE=videobridge
org.jitsi.videobridge.SINGLE_PORT_HARVESTER_PORT=-1
org.ice4j.ice.harvest.ALLOWED_INTERFACES=bond0

Here's a sampling of the logs: https://ghostbin.com/paste/adfg9

@fippo
Member

fippo commented Mar 3, 2016

george: http://fippo.github.io/webrtc-dump-importer/ gives you nice graphs in a matter of seconds. Zoomable even.

@damencho
Member

damencho commented Mar 3, 2016

Hey @fippo, is there a way to make those dumps from JS code? I'm asking whether it is possible to do the dumps while testing with Selenium. Thanks.

@fippo
Member

fippo commented Mar 3, 2016

@damencho do you know your own code? :-p
traceablepeerconnection was built exactly for this. I suppose you could also open webrtc-internals in an extra tab in Selenium, but I've never tried it.
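For what it's worth, here's a rough sketch of collecting dumps straight from the page with the standard RTCPeerConnection.getStats API, in case you want something a Selenium harness can pull out without traceablepeerconnection (the polling interval and the shape of the dump are arbitrary choices here):

// Snapshot getStats() periodically so a Selenium harness can read
// the accumulated dump out of the page at the end of a run.
type StatsSnapshot = { t: number; reports: Record<string, unknown> };

const dump: StatsSnapshot[] = [];

function startStatsPolling(pc: RTCPeerConnection, intervalMs = 1000): number {
  return window.setInterval(async () => {
    const stats = await pc.getStats();
    const reports: Record<string, unknown> = {};
    stats.forEach((report) => { reports[report.id] = report; });
    dump.push({ t: Date.now(), reports });
  }, intervalMs);
}

// The harness can then fetch it with something like
// driver.execute_script('return JSON.stringify(dump)').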

@jitsi-developers

Thanks @fippo!

@marcus The log snapshot that you shared is filled with XMPP ping timeouts and retransmission requests from the clients. There could be something wrong with our NACK termination implementation (which is enabled by default in recent versions of the bridge), or it could be something wrong with the network.

You can add the following two lines to the sip-communicator.properties file to disable NACK termination:

org.jitsi.service.neomedia.VideoMediaStream.REQUEST_RETRANSMISSIONS=false
org.jitsi.videobridge.DISABLE_NACK_TERMINATION=true

Try that and let us know how it goes.

Best,
George


@bgrozev
Member

bgrozev commented Mar 3, 2016

Here's a sampling of the logs: https://ghostbin.com/paste/adfg9

2016-03-03 11:46:34.065 WARNING: [90782] org.jitsi.videobridge.transform.RtxTransformer.warn() Cannot find SSRC for RTX, retransmitting plain.

This could well indicate a problem with packet retransmissions, which could explain the freeze. We don't run into it because we haven't yet enabled RTX.

One reason for the bridge not finding the SSRC could be that it wasn't signaled to it. Can you include more of the logs, specifically the RECV/SENT lines? Also make sure you are using a recent bridge version (one which includes Lance's fix).
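(For context on the warning above: an RTX stream is normally tied to its primary stream in the signaling via RFC 4588's "apt" (associated payload type) parameter and an ssrc-group:FID line; if either is missing, the bridge can't map the RTX SSRC back to a primary SSRC. Schematically, with illustrative payload types and SSRCs:

a=rtpmap:100 VP8/90000
a=rtpmap:96 rtx/90000
a=fmtp:96 apt=100
a=ssrc-group:FID 1234 5678

Here payload type 96 carries retransmissions of payload type 100, and 5678 is the RTX SSRC paired with the primary SSRC 1234.)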

@bgrozev
Member

bgrozev commented Mar 3, 2016

Just found a little bug, preparing a fix. You may want to delay your testing a bit.

@stongo
Author

stongo commented Mar 3, 2016

Okay, great. I'll test on our staging site as soon as you let me know.
Just so you know, we had been using one of the latest versions with Lance's fix.

@bgrozev
Member

bgrozev commented Mar 3, 2016

Videobridge 672 includes the fix.

@stongo
Author

stongo commented Mar 4, 2016

Deployed 672 with and without the suggested NACK settings, and unfortunately it doesn't work at all now:

org.jitsi.impl.osgi.framework.launch.FrameworkImpl.startLevelChanged() Error changing start level
org.osgi.framework.BundleException: BundleActivator.start
    at org.jitsi.impl.osgi.framework.BundleImpl.start(BundleImpl.java:313)
    at org.jitsi.impl.osgi.framework.launch.FrameworkImpl.startLevelChanged(FrameworkImpl.java:460)
    at org.jitsi.impl.osgi.framework.startlevel.FrameworkStartLevelImpl$Command.run(FrameworkStartLevelImpl.java:126)
    at org.jitsi.impl.osgi.framework.AsyncExecutor.runInThread(AsyncExecutor.java:111)
    at org.jitsi.impl.osgi.framework.AsyncExecutor.access$000(AsyncExecutor.java:17)
    at org.jitsi.impl.osgi.framework.AsyncExecutor$1.run(AsyncExecutor.java:220)
Caused by: java.lang.NoClassDefFoundError: net/java/sip/communicator/impl/protocol/jabber/extensions/colibri/HealthCheckIQ
    at org.jitsi.videobridge.VideobridgeBundleActivator.start(VideobridgeBundleActivator.java:59)
    at org.jitsi.impl.osgi.framework.BundleImpl.start(BundleImpl.java:293)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: net.java.sip.communicator.impl.protocol.jabber.extensions.colibri.HealthCheckIQ
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
2016-03-03 23:00:00.766 SEVERE: [17] org.jitsi.videobridge.stats.PubSubStatsTransport.publishStatistics().282 Failed to publish to PubSub node: videobridge - it does not exist yet

Going to roll back one version and test with the NACK settings as well.

@stongo
Author

stongo commented Mar 4, 2016

670 with the suggested NACK settings passes staging tests.
Will deploy to production tomorrow morning and report back.

@bgrozev
Member

bgrozev commented Mar 4, 2016

Not sure what the problem with 672 is; possibly the package was just not built properly. In any case, if you get a chance to test this on 672+ without disabling NACK termination, please let us know.

@stongo
Author

stongo commented Mar 4, 2016

Still freezing in production on 670 with the NACK changes. Will give a release > 672 a try.

@jitsi-developers

Hi Marcus, did you have any better luck with jvb > 672? I've had a discussion with Boris, and please note that it is NOT a good idea to disable NACK termination as I suggested initially. So if you still have problems, please remove the two NACK termination related configuration options from the sip-communicator.properties file and try again.


@stongo
Author

stongo commented Mar 11, 2016

Still freezing in 681.

@stongo
Author

stongo commented Mar 11, 2016

Disabling RTX seems like a promising fix. I wasn't able to reproduce the freezing on staging. Will confirm for sure with a production deploy on Monday.

@stongo
Author

stongo commented Mar 18, 2016

Been running 681 for most of the week with RTX disabled in production.
Our feedback form didn't have a single report of freezing, and our Friday update conference also went well.
Seems to be fixed!

@bgrozev
Member

bgrozev commented Mar 19, 2016

Thanks for the feedback, @stongo! We should be looking into enabling RTX in jitsi-meet in the next couple of weeks; we'll let you know if we find any issues.

@davidertel
Contributor

@stongo how did you disable RTX as you mentioned above?

@bgrozev
Member

bgrozev commented Apr 14, 2016

An update on this: we've been working on RTX for the last couple of weeks. We fixed multiple issues, and as far as we know, current videobridge versions work correctly with RTX. So I think this is ready for testing.

We are not yet enabling it in jitsi-meet, because we are running into some problems managing the SDP when muting/unmuting (these are jitsi-meet-specific issues).

@stongo
Author

stongo commented Apr 14, 2016

@bgrozev awesome, I'll give it a try on staging again and let you know

@davidertel are you using Jitsi Meet or something else?

@jitsi-developers

We've been looking at retransmission in general (not necessarily out-of-band RTX, but in-band via NACK as well) and have noticed that, when we limit the bandwidth on clients, Chrome seems to do a poor job of obeying the detected bandwidth. Because of this, when loss occurs (due to Chrome sending more bits than it should), lots of NACKs start up, and Chrome can refuse to retransmit the lost packets because it detects that it's sending too much data via retransmissions. We thought maybe this was H.264-specific, but we were able to repro with VP8 as well. From looking at the bweforvideo graphs, Chrome seems to properly detect the correct amount of bandwidth, but regularly sends more. The way this plays out is lots of periods of frozen video on the receiver.

Just a heads up on something we've seen... we want to gather some more data and get a bug filed on Chrome.


@xdumaine
Contributor

@stongo We (Dave and I and company) are using a web client with jingle.js via a focus controller à la Talky (but it's our own Node.js focus controller).

@bgrozev
Member

bgrozev commented Apr 14, 2016

You need to remove the "a=rtpmap:XXX RTX/90000" lines from the SDP you pass to your clients.
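As a rough sketch, a JS/TS focus controller could do that filtering like this (the helper name is mine; a fuller version would also drop the RTX payload types from the m=video line and remove any a=ssrc lines belonging to the RTX SSRCs):

// Strip RTX from an SDP blob before handing it to a client.
// Assumes CRLF line endings, as in the offer dumps above.
function stripRtx(sdp: string): string {
  const lines = sdp.split('\r\n');

  // Collect the payload types declared as rtx.
  const rtxPts = new Set<string>();
  for (const line of lines) {
    const m = line.match(/^a=rtpmap:(\d+) rtx\/90000/i);
    if (m) rtxPts.add(m[1]);
  }

  return lines
    .filter((line) => {
      // Drop each rtx rtpmap and its "apt=" fmtp line.
      for (const pt of rtxPts) {
        if (line.startsWith(`a=rtpmap:${pt} `)) return false;
        if (line.startsWith(`a=fmtp:${pt} `)) return false;
      }
      // Drop FID groupings pairing a primary SSRC with its RTX SSRC.
      return !line.startsWith('a=ssrc-group:FID');
    })
    .join('\r\n');
}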

@xdumaine
Contributor

I've found that more often than not, filing the bug early leads to quicker results. The Chrome team is helpful in identifying workarounds and fixes. We can nudge them for feedback. If you have a dump showing that behavior, let's get it filed with all the info we have. I'll try to get one as well.

@fippo
Member

fippo commented Apr 15, 2016

no chrome bug here...

@xdumaine
Contributor

xdumaine commented Apr 15, 2016

We're getting freezing without including RTX in the SDP.

type: offer, sdp:
v=0
o=- 1460724079835 1460724079848 IN IP4 0.0.0.0
s=-
t=0 0
a=group:BUNDLE video audio data
m=video 1 UDP/TLS/RTP/SAVPF 100 116 117
c=IN IP4 0.0.0.0
a=rtcp:1 IN IP4 0.0.0.0
a=ice-ufrag:apt0u1agcv17b5
a=ice-pwd:1tpk3nc33btjm76p0h6mmk0pt7
a=fingerprint:sha-1 86:1A:F4:87:D2:95:59:62:38:6C:54:E0:A1:0C:C1:AA:08:B4:DF:0F
a=setup:actpass
a=sendrecv
a=mid:video
a=rtcp-mux
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb
a=rtpmap:116 red/90000
a=rtpmap:117 ulpfec/90000
a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=candidate:1 1 SSLTCP 2130706431 172.18.27.111 4443 typ host generation 0
a=candidate:3 1 UDP 2130706431 172.18.27.111 10000 typ host generation 0
a=candidate:2 1 SSLTCP 1694498815 54.165.101.244 4443 typ srflx raddr 172.18.27.111 rport 4443 generation 0
a=candidate:4 1 UDP 1677724415 54.165.101.244 10000 typ srflx raddr 172.18.27.111 rport 10000 generation 0
m=audio 1 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8
c=IN IP4 0.0.0.0
a=rtcp:1 IN IP4 0.0.0.0
a=ice-ufrag:apt0u1agcv17b5
a=ice-pwd:1tpk3nc33btjm76p0h6mmk0pt7
a=fingerprint:sha-1 86:1A:F4:87:D2:95:59:62:38:6C:54:E0:A1:0C:C1:AA:08:B4:DF:0F
a=setup:actpass
a=sendrecv
a=mid:audio
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=candidate:1 1 SSLTCP 2130706431 172.18.27.111 4443 typ host generation 0
a=candidate:3 1 UDP 2130706431 172.18.27.111 10000 typ host generation 0
a=candidate:2 1 SSLTCP 1694498815 54.165.101.244 4443 typ srflx raddr 172.18.27.111 rport 4443 generation 0
a=candidate:4 1 UDP 1677724415 54.165.101.244 10000 typ srflx raddr 172.18.27.111 rport 10000 generation 0
m=application 1 DTLS/SCTP 5000
c=IN IP4 0.0.0.0
a=ice-ufrag:apt0u1agcv17b5
a=ice-pwd:1tpk3nc33btjm76p0h6mmk0pt7
a=fingerprint:sha-1 86:1A:F4:87:D2:95:59:62:38:6C:54:E0:A1:0C:C1:AA:08:B4:DF:0F
a=setup:actpass
a=sctpmap:5000 webrtc-datachannel 1024
a=mid:data
a=candidate:1 1 SSLTCP 2130706431 172.18.27.111 4443 typ host generation 0
a=candidate:3 1 UDP 2130706431 172.18.27.111 10000 typ host generation 0
a=candidate:2 1 SSLTCP 1694498815 54.165.101.244 4443 typ srflx raddr 172.18.27.111 rport 4443 generation 0
a=candidate:4 1 UDP 1677724415 54.165.101.244 10000 typ srflx raddr 172.18.27.111 rport 10000 generation 0

@brianh5
Contributor

brianh5 commented Apr 15, 2016

We just repro'd the freezes on AppRTC by limiting the uplink bandwidth on one sender to 1.5 Mbps. We see it with both H.264 and VP8: Chrome detects the available send bandwidth correctly, but with RTX it regularly goes over it, which causes more loss and more freezes (Chrome will also refuse to send RTX if its bandwidth usage is too high, so I think there's a bad feedback cycle here). We just got some screenshots from webrtc-internals and are going to file something today.

@brianh5
Contributor

brianh5 commented Apr 15, 2016

Filed against Chrome here: https://bugs.chromium.org/p/webrtc/issues/detail?id=5797

@bradrlaw

Where do we stand on this issue? The linked Chrome issue appears to have been closed without anything being resolved. We are running into this issue constantly, making Jitsi unusable for any production-type use. This is happening with our own installs, regardless of patch level, as well as with the demo at http://meet.jit.si.

A symptom is extremely high packet loss once an endpoint has less than 1.5 Mbps available. The video will intermittently freeze for upwards of 5 to 15 (or more) seconds.

@joelbrewer

Any update on this? We are considering a switch to jitsi-videobridge; however, random freezing on Chrome could be a non-starter.

@bbaldino
Member

Regarding my previous comment about the Chrome issue (posted as "brianh5" above): we found that the way we were simulating low-bandwidth links was inaccurate. The network simulator on Mac, for example, introduces loss to emulate a lower-bandwidth link but does not add any delay, and Chrome keys quite a bit on delay to lower its bandwidth estimate. Once we simulated things properly, we no longer saw that issue, so we told them they could close that bug.

I also found a bug a couple of months ago in the port of the WebRTC bandwidth estimation logic on the bridge, which was failing to take delay into account (jitsi/libjitsi#212). Fixing that resulted in much better performance on links with high delay (common for poor links that also have low bandwidth). Other than those two scenarios, I wasn't aware of any other freezing issues with Chrome.
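For reference, a sketch of one way to emulate a constrained link with both rate and delay on Linux, using netem's built-in rate option (the interface name and numbers are illustrative):

tc qdisc add dev eth0 root netem delay 100ms rate 1.5mbit

Shaping the rate without also adding delay is exactly the trap described above: a delay-based estimator never sees the queue build up, so it doesn't back off.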

bbaldino added a commit to bbaldino/jitsi-videobridge that referenced this issue Jan 22, 2020
* add ability to parse bandwidth from a string

* allow space between amount and unit

* tweak test

* change units string to lower case

* add case test
bbaldino added a commit to bbaldino/jitsi-videobridge that referenced this issue Sep 24, 2020
JonathanLennox pushed a commit to JonathanLennox/jitsi-videobridge that referenced this issue Jun 1, 2022