
LiquidSoap Segfaulting #876

Closed
multi023 opened this issue Oct 20, 2018 · 55 comments
Labels: error (An error encountered when running the software.), upstream (Involves software that is "upstream" from AzuraCast, i.e. broadcasting software or OS software.)

@multi023

Installation method
I am using "Docker"

Every day, without uploading new files, disk space decreases by approximately 2 GB.

I've attached screenshots from one day to the next:
[Screenshots: c4_espacio_disco_2018-10-20_01, c4_espacio_disco_2018-10-19_03]

Why does this happen? How can I solve it?

Thank you. Greetings,

@BusterNeece
Member

@multi023 Seems like something is flooding your logs. Try running docker-compose logs -f stations and see whether it's just spitting out error messages all the time. If so, an easy solution is to either update your installation or visit the station causing the errors, click "Utilities", then "Restart Broadcasting".
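
For reference, a rough way to see what is actually eating the space, run from the directory containing your docker-compose.yml (the /var/azuracast path inside the container is an assumption; adjust it to your install):

# Show the 20 largest files and directories inside the stations container.
# /var/azuracast is an assumed data path -- adjust for your setup.
docker-compose exec stations sh -c 'du -ah /var/azuracast | sort -rh | head -n 20'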

@BusterNeece BusterNeece added the error An error encountered when running the software. label Oct 20, 2018
@multi023
Author

@SlvrEagle23 We have looked, and they are the .coreXXXX files.

This happens with each radio station:

[Screenshot: .coreXXXX files]

Apparently it happens with the latest Docker version on Ubuntu 18.x. With 16.x it did not happen, nor did it happen with the previous Docker version before updating.

Thank you. Greetings,

@CodeSteele
Contributor

Hmm, those look like core dump files. My first guess is that something is up with our fork of Icecast, but:

  1. Music would stop as Icecast crashes and core dumps.
  2. We haven't seen core dumps in any other Docker instances.
  3. I'm not entirely sure, but Icecast probably won't restart on its own.

It only seems to be happening a couple of times a day, though...

Can you check the output of the command @SlvrEagle23 gave above (docker-compose logs -f stations)? I'm interested in whether anything useful is being printed at the timestamps for those files (I can't see the times on them, so if you could expand that column and get us those too, that would be best).
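
If expanding that column is awkward, one hedged way to pull the timestamps from a shell instead (the directory is a placeholder; the .coreXXXX naming comes from the comment above):

# List core dump files newest-first with full timestamps, so they can
# be matched against the stations log. The path is an assumed example.
ls -lt --time-style=full-iso /path/to/station/.core*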

@multi023
Author

We only use SHOUTcast.

I'm going to post screenshots with the times.

@CodeSteele
Contributor

Hmm, yeah, I should have noticed that from those log files. Well, that rules out our fork of Icecast as the problem.

We'll need the output of docker-compose logs -f stations to hopefully determine what process is core dumping.

@multi023
Author

Hi, do you have an email address where I can send all of this?

@BusterNeece BusterNeece added the upstream Involves software that is "upstream" from AzuraCast, i.e. broadcasting software or OS software. label Oct 24, 2018
@CodeSteele
Contributor

2018-10-21 23:16:28,166 INFO exited: station_7_backend (terminated by SIGSEGV (core dumped); not expected)

OK, it appears that Liquidsoap is segfaulting. We've found some recent upstream issues detailing segfaults:

savonet/liquidsoap#640
savonet/liquidsoap#635


So what you can do for now:

  • Get the core dump files to the Liquidsoap developers
  • Regularly delete the core dump files (see the cleanup sketch below)
  • Expect the station to go silent when Liquidsoap crashes, though it recovers quickly
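
A minimal cleanup sketch for the second point, assuming the dumps follow the .coreXXXX naming seen above and live under the stations data path (both path and pattern are assumptions; verify before wiring this into a daily cron job):

# Delete core dumps older than one day. Path and filename pattern are
# assumptions based on this thread -- confirm them before running.
find /path/to/stations -name '.core*' -mtime +1 -delete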

My best guess is to look at your memory usage over time; I know some configurations will cause a segfault instead of a SIGKILL, and I'm not sure how Docker behaves in OOM situations.
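
One low-effort way to watch for that is Docker's built-in stats stream; memory climbing right before each segfault would point toward OOM behavior rather than a Liquidsoap bug (the container name is taken from the docker-compose.yml shown later in this thread):

# Live CPU and memory usage for the stations container.
docker stats azuracast_stations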

I noticed that things basically run fine, then ~4 stations will segfault within 30 minutes, then be fine for another ~4 hours. Kind of odd.

@CodeSteele CodeSteele changed the title Every day disk space decreases 2 GB LiquidSoap Segfaulting Nov 2, 2018
@CodeSteele
Contributor

CodeSteele commented Nov 2, 2018

#884 also has to do with LiquidSoap segfaulting.

Possibly due to very large playlists? I'll see if I can't build a very large playlist (1800+ songs).

Edit: my entire music library is only 1,534 songs... it'll have to do.

@CodeSteele CodeSteele self-assigned this Nov 2, 2018
@vdeville

vdeville commented Nov 2, 2018

@CodeSteele If you need a large library, I can give you mine. But I'm not sure it was LiquidSoap. Did you see this? #884 (comment)

@CodeSteele
Contributor

@MyTheValentinus Hmm, 14 hours in, no crash/disconnect for me. LiquidSoap seems to be hovering around 90 MB of memory usage.

I did have disconnects when connecting over ngrok, but no crashes (so either ngrok isn't kind to long-held connections, or we're sensitive to internet quality issues).

How many stations are you running? Do they all have 1700+ songs?

@vdeville

vdeville commented Nov 3, 2018

Hello,
I'm running 2 stations on my instance, one for testing only and one for production. Only one station has more than 1,500 songs. For my test, to isolate network issues, I stream directly from the Icecast IP without any reverse proxy.
Screenshot of my RAM usage attached:
[Screenshot: RAM usage, 2018-11-03 15:47]
Thanks

@vdeville

vdeville commented Nov 8, 2018

Hello,
Any news on this issue, @CodeSteele?
Thanks

@CodeSteele
Contributor

CodeSteele commented Nov 8, 2018

@MyTheValentinus No luck replicating the segfaults... I've been running an entire week with some 1,324 tracks with no problem.

@CodeSteele
Contributor

#946 reported this:

All the listeners were disconnected from the stream when the live DJ accidentally pulled his UTP cable out of his laptop.

Interruption of the stream caused LiquidSoap to segfault? That's an interesting one...

@RemBdev

RemBdev commented Nov 13, 2018

@CodeSteele The update was on 6-11-2018.

@CodeSteele
Contributor

CodeSteele commented Nov 13, 2018

We're waiting for ocaml-duppy 0.7.4 to be published to opam; once that happens, we'll push a patch that updates to the latest duppy, which should hopefully resolve this issue.

@vdeville

Hello,
For your information, I don't use the live DJ system.

@CodeSteele
Contributor

@MyTheValentinus I'm hoping that the segfault being reported is causing both problems (entirely possible) and that we're not chasing multiple segfaults.

@vdeville

Okay, I'll wait for the update ;) Can you send me a message on Telegram when you're ready to test? I can test on a production server.

@CodeSteele
Contributor

I've got two PRs ready for when 0.7.4 becomes available here: https://opam.ocaml.org/packages/duppy/

@vdeville

0.8.0 is available x) I don't understand the duppy project's architecture.

@CodeSteele
Contributor

Yep, just saw that pop up... hmm, OK, going to update to that then.

@vdeville

I've run docker.sh update. Is it done? Is that all that's needed? Where does the duppy library get downloaded from? Via Composer?

@toots

toots commented Nov 13, 2018

The 0.7.4 release had some issues in opam with the camlp4 pre-processing code. I recently removed this functionality in liquidsoap, and I don't think anybody else ever used the camlp4 pre-processor, so I just went ahead and released an updated ocaml-duppy stripped of that support. I had to bump the API version for it.

@toots

toots commented Nov 14, 2018

If the update takes too long to come in, it should be possible to provide your own package definition; opam accepts local repositories and pinned packages.
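
As a minimal sketch of the pinning route (the repository URL is an assumption, and opam will rebuild dependents after the pin):

# Pin duppy to its upstream git repository instead of waiting for the
# opam release, then rebuild liquidsoap against it. URL is assumed.
opam pin add duppy https://github.com/savonet/ocaml-duppy.git
opam reinstall liquidsoap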

@toots

toots commented Nov 15, 2018

The PR for 1.2 has been merged; hopefully it should be coming up soon. Long term, though, it looks like new updates, especially those that aren't just bugfixes, will have to go through opam 2.0.

@toots

toots commented Nov 15, 2018

Yeah, it's in now :-)

@CodeSteele
Contributor

Oh, doh, that makes sense; we're on opam 1.2 (sorry, this is all new to me). Considering we'll have to look at opam 2.0 in the future, I'll see what we have to do to upgrade, both for our Docker installs (shouldn't be a big deal) and our traditional installs (we only really support Ubuntu 16.04 and 18.04 there, so it shouldn't be too bad).

May do that on another pass, though. Thanks for getting that backported to opam 1.2. :D

@BusterNeece
Member

@multi023 and others who may have been experiencing this: I've tested the changes made upstream to the duppy library and Liquidsoap and verified that everything works fine, so I've merged the pull requests from @CodeSteele on both the Docker container and Traditional installations.

Note: Make sure that your docker-compose.yml is using the newer azuracast/azuracast_radio:latest image for the stations container, instead of the older azuracast/azuracast_stations:latest. Recent fixes have been applied exclusively to the radio container, as this is the version distributed without SHOUTcast; the older stations container is frozen in time due to SHOUTcast removing the older version of their binary from their servers. If you always update your docker-compose.yml when running ./docker.sh update then you're already on the newest version.

Thanks to @CodeSteele for helping investigate this elusive issue, and a big thanks to @toots for working with us and OPAM to make sure the fixes made it out (and in general for Liquidsoap <3). We'll be updating to OPAM 2.0 shortly, which should make future fixes of this nature faster.

@vdeville

vdeville commented Nov 16, 2018

Hello,
I've updated my instance; now I'm testing the latest update ;)

Thanks for the work

@multi023
Author

Hi,

Updated and testing.

Thank you very much for the work :)

@vdeville

vdeville commented Nov 16, 2018

Hello,
First crash at 9 PM today :/
2018-11-16 20:03:13,086 INFO exited: station_1_backend (terminated by SIGSEGV (core dumped); not expected)
...
2018/11/16 20:03:14 [dummy:3] Source failed (no more tracks) stopping output...

Crashes on all stations.

@multi023
Author

I also have two crashes on all the radios.

Now they occur at roughly six-hour intervals.

@toots

toots commented Nov 16, 2018

Sorry to hear that. If y'all have logs, I'm available to look at them.

@vdeville

Hello @toots
Which logs do you need to analyze?
The AzuraCast stations logs or a specific file?
Thanks

@toots

toots commented Nov 16, 2018

Hey @MyTheValentinus! Liquidsoap logs, yeah, for sure, although it looks like your logs don't have much, apparently. The usual info would be the liquidsoap version and, here, making sure that you're using duppy version 0.8.0. Then, for a segfault, a gdb trace is usually great. I'm not familiar with how AzuraCast works, but the way to do it from the command line is:

gdb /path/to/liquidsoap
(gdb) run <options>
... (liquidsoap crashes) ...
(gdb) thread apply all bt
That should give you the stack trace for all threads at crash time.
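
Since core files are already being written in this case, the existing dumps can also be loaded directly rather than reproducing the crash under gdb (both paths below are placeholders):

# Open an existing core dump against the liquidsoap binary, then print
# every thread's backtrace. Both paths are assumed examples.
gdb /path/to/liquidsoap /path/to/.coreXXXX
(gdb) thread apply all bt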

@vdeville

When exporting liquidsoap.log:
Today's crash didn't output anything, but here's one from 3 days ago:

2018/11/13 21:00:51 [dummy:3] Source failed (no more tracks) stopping output...
2018/11/13 21:00:51 [switch_6026:3] Switch to random_6016.
2018/11/13 21:00:51 [random_6016:3] Switch to audio_to_stereo_6014.
2018/11/13 21:00:51 [lang:3] AzuraCast Raw Response: 
2018/11/13 21:00:51 [lang:3] AzuraCast Error: Delaying subsequent requests...
2018/11/13 21:00:55 [main:3] Shutdown started!
2018/11/13 21:00:55 [main:3] Waiting for threads to terminate...
2018/11/13 21:00:55 [threads:3] Shuting down thread wallclock_main
2018/11/13 21:00:55 [radio_5050_local_1:3] Closing connection...
2018/11/13 21:01:07 >>> LOG START
2018/11/13 21:01:08 >>> LOG START
2018/11/13 21:01:10 >>> LOG START
2018/11/13 21:01:13 >>> LOG START
2018/11/13 21:02:17 >>> LOG START
2018/11/13 21:02:17 [main:3] Liquidsoap 1.3.4
2018/11/13 21:02:17 [main:3] Using: bytes=[distributed with OCaml 4.02 or above] pcre=7.3.4 dtools=0.4.1 duppy=0.7.3 duppy.syntax=0.7.3 cry=0.6.0 mm=0.4.0 ogg=0.5.2 vorbis=0.7.0 opus=0.1.2 mad=0.4.5 flac=0.1.2 flac.ogg=0.1.2 dynlink=[distributed with Ocaml] lame=0.3.3 fdkaac=0.2.1 taglib=0.3.3 camomile=1.0.1 faad=0.4.0
2018/11/13 21:02:17 [frame:3] Using 44100Hz audio, 25Hz video, 44100Hz master.
2018/11/13 21:02:17 [frame:3] Frame size must be a multiple of 1764 ticks = 1764 audio samples = 1 video samples.
2018/11/13 21:02:17 [frame:3] Targetting 'frame.duration': 0.04s = 1764 audio samples = 1764 ticks.
2018/11/13 21:02:17 [frame:3] Frames last 0.04s = 1764 audio samples = 1 video samples = 1764 ticks.
2018/11/13 21:02:17 [threads:3] Created thread "generic queue #1".
2018/11/13 21:02:17 [threads:3] Created thread "generic queue #2".
2018/11/13 21:02:17 [threads:3] Created thread "non-blocking queue #1".
2018/11/13 21:02:17 [threads:3] Created thread "non-blocking queue #2".
2018/11/13 21:02:17 [harbor:3] Adding mountpoint '/' on port 8005
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] Loading playlist...
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] No mime type specified, trying autodetection.
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] Playlist treated as format application/x-mpegURL
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] Successfully loaded a playlist of 1547 tracks.
2018/11/13 21:02:17 [radio_5050_local_1:3] Connecting mount /radio.mp3 for source@127.0.0.1...

@vdeville

More AzuraCast-oriented, I think. From icecast_error.log, @CodeSteele:

[2018-11-16  09:32:15] WARN slave/slave_startup process has 1048576 max file descriptor limit
[2018-11-16  09:32:17] WARN source/source_set_intro Cannot open intro for /radio.mp3 "/usr/local/share/icecast/web//radio.mp3": No such file or directory
[2018-11-16  20:03:15] WARN source/source_set_intro Cannot open intro for /radio.mp3 "/usr/local/share/icecast/web//radio.mp3": No such file or directory

@toots

toots commented Nov 16, 2018

OK, so the liquidsoap logs indicate that it's still using duppy version 0.7.3. First, make sure that you are running a liquidsoap built with duppy 0.8.0, which is the version with the recent fix. Then, if it still crashes, we can look further. Thanks for your patience!
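
A quick way to check is that Liquidsoap prints its library versions in the "Using:" line at startup, as in the logs quoted above (the log path is a placeholder; use wherever AzuraCast writes the station's Liquidsoap log):

# Print the first "Using:" startup line, which lists duppy=... among
# the library versions. The log path is an assumed example.
grep -m1 'duppy=' /path/to/liquidsoap.log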

@vdeville

Oh yeah, I didn't see that. Sorry for the false report.
I'll test with a real 0.8.0 build.

@vdeville

OK, this time it was the right version:

2018/11/16 23:15:41 [main:3] Using: bytes=[distributed with OCaml 4.02 or above] pcre=7.3.5 dtools=0.4.1 duppy=0.8.0 cry=0.6.0 mm=0.4.0 ogg=0.5.2 vorbis=0.7.1 opus=0.1.2 mad=0.4.5 flac=0.1.3 flac.ogg=0.1.3 dynlink=[distributed with Ocaml] lame=0.3.3 fdkaac=0.2.1 taglib=0.3.3 camomile=1.0.1 faad=0.4.0

Test in progress...

@CodeSteele
Contributor

@multi023 Can you confirm that you're on 0.8.0 too? Curious whether both of you ended up on duppy 0.7.3 somehow.

@BusterNeece
Member

@multi023 @MyTheValentinus Just as a reminder, you should check your docker-compose.yml files and make sure they're up-to-date on the following lines.

Your compose file SHOULD say this:

services:
  web:
  # ...many lines...

  stations:
    container_name: azuracast_stations
    image: azuracast/azuracast_radio:latest

And not this:

services:
  web:
  # ...many lines...

  stations:
    container_name: azuracast_stations
    image: azuracast/azuracast_stations:latest

@CodeSteele CodeSteele reopened this Nov 17, 2018
@vdeville

vdeville commented Nov 17, 2018

Yeah, sure, @SlvrEagle23.
More than 12 hours without a crash...

@vdeville

24 hours without a crash :D
[Screenshot]
Continuing to test...

@vdeville

vdeville commented Nov 19, 2018

60 hours after the last update... no crash!

Amazing! I think it is fixed for real!

Thanks @CodeSteele @SlvrEagle23 @toots

A little pic for fun:
[Screenshot]

@BusterNeece
Member

Excellent! Closing this issue as resolved. Thanks to all involved for the excellent collaborative effort.

@multi023
Author

Now everything is perfect!

Thank youuu @SlvrEagle23 @CodeSteele @toots

@github-actions

This issue has not been updated in over a year, so it is being closed for further discussion. If you are experiencing a similar issue, please create a new issue. Thank you!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 13, 2022