
LiquidSoap Segfaulting #876

Closed
multi023 opened this issue Oct 20, 2018 · 55 comments
Labels: error (An error encountered when running the software.), upstream (Involves software that is "upstream" from AzuraCast, i.e. broadcasting software or OS software.)

@multi023

Installation method
I am using "Docker"

Every day, without uploading new files, disk space decreases by approximately 2 GB.

I've attached screenshots from one day to the next:
[Screenshots: c4_espacio_disco_2018-10-20_01, c4_espacio_disco_2018-10-19_03]

Why does this happen? How can I solve it?

Thank you. Greetings,

@BusterNeece
Member

@multi023 Seems like something is flooding your logs. Try running docker-compose logs -f stations and see whether it's just spitting out error messages all the time. If so, an easy solution is to either update your installation or visit the station causing the errors, click "Utilities", then "Restart Broadcasting".
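
For reference, a rough way to see what is actually eating the space, run from the directory containing your docker-compose.yml (the /var/azuracast path inside the container is an assumption; adjust it to your install):

# Show the 20 largest files and directories inside the stations container.
# /var/azuracast is an assumed data path -- adjust for your setup.
docker-compose exec stations sh -c 'du -ah /var/azuracast | sort -rh | head -n 20'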

@BusterNeece BusterNeece added the error An error encountered when running the software. label Oct 20, 2018
@multi023
Author

@SlvrEagle23 We have looked, and they are the .coreXXXX files.

This happens with each radio station:

[Screenshot: .coreXXXX files]

Apparently it happens with the latest Docker version on Ubuntu 18.x. With 16.x it did not happen, nor did it happen with the previous Docker version before updating.

Thank you. Greetings,

@CodeSteele
Contributor

Hmm, those look like core dump files. My first guess is that something is up with our fork of Icecast, but:

  1. Music would stop as Icecast crashes and core dumps.
  2. We haven't seen core dumps in any other Docker instances.
  3. I'm not entirely sure, but Icecast probably won't restart on its own.

It only seems to be happening a couple of times a day, though...

Can you check the output of the command @SlvrEagle23 gave above (docker-compose logs -f stations)? I'm interested in whether anything useful is being printed at the timestamps for those files (I can't see the times on them, so if you could expand that column and get us those too, that would be best).
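
If expanding that column is awkward, one hedged way to pull the timestamps from a shell instead (the directory is a placeholder; the .coreXXXX naming comes from the comment above):

# List core dump files newest-first with full timestamps, so they can
# be matched against the stations log. The path is an assumed example.
ls -lt --time-style=full-iso /path/to/station/.core*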

@multi023
Author

We only use SHOUTcast.

I'm going to post screenshots with the times.

@CodeSteele
Contributor

Hmm, yeah, I should have noticed that from those log files. Well, that rules out our fork of Icecast as the problem.

We'll need the output of docker-compose logs -f stations to hopefully determine what process is core dumping.

@multi023
Author

Hi, do you have an email address where I can send all of this?

@BusterNeece BusterNeece added the upstream Involves software that is "upstream" from AzuraCast, i.e. broadcasting software or OS software. label Oct 24, 2018
@CodeSteele
Contributor

2018-10-21 23:16:28,166 INFO exited: station_7_backend (terminated by SIGSEGV (core dumped); not expected)

OK, it appears that Liquidsoap is segfaulting. We've found some recent upstream issues detailing segfaults:

savonet/liquidsoap#640
savonet/liquidsoap#635


So what you can do for now:

  • Get the core dump files to the Liquidsoap developers
  • Regularly delete the core dump files (see the cleanup sketch below)
  • Expect the station to go silent when Liquidsoap crashes, though it recovers quickly
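
A minimal cleanup sketch for the second point, assuming the dumps follow the .coreXXXX naming seen above and live under the stations data path (both path and pattern are assumptions; verify before wiring this into a daily cron job):

# Delete core dumps older than one day. Path and filename pattern are
# assumptions based on this thread -- confirm them before running.
find /path/to/stations -name '.core*' -mtime +1 -delete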

My best guess is to look at your memory usage over time; I know some configurations will cause a segfault instead of a SIGKILL, and I'm not sure how Docker behaves in OOM situations.
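
One low-effort way to watch for that is Docker's built-in stats stream; memory climbing right before each segfault would point toward OOM behavior rather than a Liquidsoap bug (the container name is taken from the docker-compose.yml shown later in this thread):

# Live CPU and memory usage for the stations container.
docker stats azuracast_stations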

I noticed that things basically run fine, then ~4 stations will segfault within 30 minutes, then be fine for another ~4 hours. Kind of odd.

@CodeSteele CodeSteele changed the title Every day disk space decreases 2 GB LiquidSoap Segfaulting Nov 2, 2018
@CodeSteele
Contributor

CodeSteele commented Nov 2, 2018

#884 also has to do with LiquidSoap segfaulting.

Possibly due to very large playlists? I'll see if I can't build a very large playlist (1800+ songs).

Edit: my entire music library is only 1,534 songs... it'll have to do.

@CodeSteele CodeSteele self-assigned this Nov 2, 2018
@vdeville

vdeville commented Nov 2, 2018

@CodeSteele If you need a large library, I can give you mine. But I'm not sure it was LiquidSoap. Did you see this? #884 (comment)

@CodeSteele
Contributor

@MyTheValentinus Hmm, 14 hours in, no crash/disconnect for me. LiquidSoap seems to be hovering around 90 MB of memory usage.

I did have disconnects when connecting over ngrok, but no crashes (so either ngrok isn't kind to long-held connections, or we're sensitive to internet quality issues).

How many stations are you running? Do they all have 1700+ songs?

@vdeville

vdeville commented Nov 3, 2018

Hello,
I'm running 2 stations on my instance, one for testing only and one for production. Only one station has more than 1,500 songs. For my test, to isolate network issues, I stream directly from the Icecast IP without any reverse proxy.
Screenshot of my RAM usage attached:
[Screenshot: RAM usage, 2018-11-03 15:47]
Thanks

@vdeville

vdeville commented Nov 8, 2018

Hello,
Any news on this issue, @CodeSteele?
Thanks

@CodeSteele
Contributor

CodeSteele commented Nov 8, 2018

@MyTheValentinus No luck replicating the segfaults... I've been running an entire week with some 1,324 tracks with no problem.

@CodeSteele
Contributor

#946 reported this:

All the listeners were disconnected from the stream when the live DJ accidentally pulled his UTP cable out of his laptop.

Interruption of the stream caused LiquidSoap to segfault? That's an interesting one...

@RemBdev

RemBdev commented Nov 13, 2018

@CodeSteele The update was on 6-11-2018.

@CodeSteele
Contributor

CodeSteele commented Nov 13, 2018

We're waiting for ocaml-duppy 0.7.4 to be published to opam; once that happens, we'll push a patch that updates to the latest duppy, which should hopefully resolve this issue.

@vdeville

Hello,
For your information, I don't use the live DJ system.

@CodeSteele
Contributor

@MyTheValentinus I'm hoping that the segfault being reported is causing both problems (entirely possible) and that we're not chasing multiple segfaults.

@vdeville

Okay, I'll wait for the update ;) Can you send me a message on Telegram when you're ready to test? I can test on a production server.

@CodeSteele
Contributor

I've got two PRs ready for when 0.7.4 becomes available here: https://opam.ocaml.org/packages/duppy/

@vdeville

0.8.0 is available x) I don't understand the duppy project's architecture.

@CodeSteele
Contributor

Yep, just saw that pop up... hmm, OK, going to update to that then.

@vdeville

I've run docker.sh update. Is it done? Is that all that's needed? Where does the duppy library get downloaded from? Via Composer?

@toots

toots commented Nov 13, 2018

The 0.7.4 release had some issues in opam with the camlp4 pre-processing code. I recently removed this functionality in liquidsoap, and I don't think anybody else ever used the camlp4 pre-processor, so I just went ahead and released an updated ocaml-duppy stripped of that support. I had to bump the API version for it.

@toots

toots commented Nov 14, 2018

If the update takes too long to come in, it should be possible to provide your own package definition; opam accepts local repositories and pinned packages.
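
As a minimal sketch of the pinning route (the repository URL is an assumption, and opam will rebuild dependents after the pin):

# Pin duppy to its upstream git repository instead of waiting for the
# opam release, then rebuild liquidsoap against it. URL is assumed.
opam pin add duppy https://github.com/savonet/ocaml-duppy.git
opam reinstall liquidsoap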

@toots

toots commented Nov 15, 2018

The PR for 1.2 has been merged; hopefully it should be coming up soon. Long term, though, it looks like new updates, especially those that aren't just bugfixes, will have to go through opam 2.0.

@toots

toots commented Nov 15, 2018

Yeah, it's in now :-)

@CodeSteele
Contributor

Oh, doh, that makes sense; we're on opam 1.2 (sorry, this is all new to me). Considering we'll have to look at opam 2.0 in the future, I'll see what we have to do to upgrade, both for our Docker installs (shouldn't be a big deal) and our traditional installs (we only really support Ubuntu 16.04 and 18.04 there, so it shouldn't be too bad).

May do that on another pass, though. Thanks for getting that backported to opam 1.2. :D

@BusterNeece
Member

@multi023 and others who may have been experiencing this: I've tested the changes made upstream to the duppy library and Liquidsoap and verified that everything works fine, so I've merged the pull requests from @CodeSteele on both the Docker container and Traditional installations.

Note: Make sure that your docker-compose.yml is using the newer azuracast/azuracast_radio:latest image for the stations container, instead of the older azuracast/azuracast_stations:latest. Recent fixes have been applied exclusively to the radio container, as this is the version distributed without SHOUTcast; the older stations container is frozen in time due to SHOUTcast removing the older version of their binary from their servers. If you always update your docker-compose.yml when running ./docker.sh update then you're already on the newest version.

Thanks to @CodeSteele for helping investigate this elusive issue, and a big thanks to @toots for working with us and OPAM to make sure the fixes made it out (and in general for Liquidsoap <3). We'll be updating to OPAM 2.0 shortly, which should make future fixes of this nature faster.

@vdeville

vdeville commented Nov 16, 2018

Hello,
I've updated my instance; now I'm testing the latest update ;)

Thanks for the work

@multi023
Author

Hi,

Updated and testing.

Thank you very much for the work :)

@vdeville

vdeville commented Nov 16, 2018

Hello,
First crash at 9 PM today :/
2018-11-16 20:03:13,086 INFO exited: station_1_backend (terminated by SIGSEGV (core dumped); not expected)
...
2018/11/16 20:03:14 [dummy:3] Source failed (no more tracks) stopping output...

Crashes on all stations.

@multi023
Author

I also have two crashes on all the radios.

Now they occur at roughly six-hour intervals.

@toots

toots commented Nov 16, 2018

Sorry to hear that. If y'all have logs, I'm available to look at them.

@vdeville

Hello @toots
Which logs do you need to analyze?
The AzuraCast stations logs or a specific file?
Thanks

@toots

toots commented Nov 16, 2018

Hey @MyTheValentinus! Liquidsoap logs, yeah, for sure, although it looks like your logs don't have much, apparently. The usual info would be the liquidsoap version and, here, making sure that you're using duppy version 0.8.0. Then, for a segfault, a gdb trace is usually great. I'm not familiar with how AzuraCast works, but the way to do it from the command line is:

gdb /path/to/liquidsoap
(gdb) run <options>
... (liquidsoap crashes) ...
(gdb) thread apply all bt
That should give you the stack trace for all threads at crash time.
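
Since core files are already being written in this case, the existing dumps can also be loaded directly rather than reproducing the crash under gdb (both paths below are placeholders):

# Open an existing core dump against the liquidsoap binary, then print
# every thread's backtrace. Both paths are assumed examples.
gdb /path/to/liquidsoap /path/to/.coreXXXX
(gdb) thread apply all bt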

@vdeville

When exporting liquidsoap.log:
Today's crash didn't output anything, but here's one from 3 days ago:

2018/11/13 21:00:51 [dummy:3] Source failed (no more tracks) stopping output...
2018/11/13 21:00:51 [switch_6026:3] Switch to random_6016.
2018/11/13 21:00:51 [random_6016:3] Switch to audio_to_stereo_6014.
2018/11/13 21:00:51 [lang:3] AzuraCast Raw Response: 
2018/11/13 21:00:51 [lang:3] AzuraCast Error: Delaying subsequent requests...
2018/11/13 21:00:55 [main:3] Shutdown started!
2018/11/13 21:00:55 [main:3] Waiting for threads to terminate...
2018/11/13 21:00:55 [threads:3] Shuting down thread wallclock_main
2018/11/13 21:00:55 [radio_5050_local_1:3] Closing connection...
2018/11/13 21:01:07 >>> LOG START
2018/11/13 21:01:08 >>> LOG START
2018/11/13 21:01:10 >>> LOG START
2018/11/13 21:01:13 >>> LOG START
2018/11/13 21:02:17 >>> LOG START
2018/11/13 21:02:17 [main:3] Liquidsoap 1.3.4
2018/11/13 21:02:17 [main:3] Using: bytes=[distributed with OCaml 4.02 or above] pcre=7.3.4 dtools=0.4.1 duppy=0.7.3 duppy.syntax=0.7.3 cry=0.6.0 mm=0.4.0 ogg=0.5.2 vorbis=0.7.0 opus=0.1.2 mad=0.4.5 flac=0.1.2 flac.ogg=0.1.2 dynlink=[distributed with Ocaml] lame=0.3.3 fdkaac=0.2.1 taglib=0.3.3 camomile=1.0.1 faad=0.4.0
2018/11/13 21:02:17 [frame:3] Using 44100Hz audio, 25Hz video, 44100Hz master.
2018/11/13 21:02:17 [frame:3] Frame size must be a multiple of 1764 ticks = 1764 audio samples = 1 video samples.
2018/11/13 21:02:17 [frame:3] Targetting 'frame.duration': 0.04s = 1764 audio samples = 1764 ticks.
2018/11/13 21:02:17 [frame:3] Frames last 0.04s = 1764 audio samples = 1 video samples = 1764 ticks.
2018/11/13 21:02:17 [threads:3] Created thread "generic queue #1".
2018/11/13 21:02:17 [threads:3] Created thread "generic queue #2".
2018/11/13 21:02:17 [threads:3] Created thread "non-blocking queue #1".
2018/11/13 21:02:17 [threads:3] Created thread "non-blocking queue #2".
2018/11/13 21:02:17 [harbor:3] Adding mountpoint '/' on port 8005
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] Loading playlist...
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] No mime type specified, trying autodetection.
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] Playlist treated as format application/x-mpegURL
2018/11/13 21:02:17 [playlist_default(dot)m3u:3] Successfully loaded a playlist of 1547 tracks.
2018/11/13 21:02:17 [radio_5050_local_1:3] Connecting mount /radio.mp3 for source@127.0.0.1...

@vdeville

More AzuraCast-oriented, I think. From icecast_error.log, @CodeSteele:

[2018-11-16  09:32:15] WARN slave/slave_startup process has 1048576 max file descriptor limit
[2018-11-16  09:32:17] WARN source/source_set_intro Cannot open intro for /radio.mp3 "/usr/local/share/icecast/web//radio.mp3": No such file or directory
[2018-11-16  20:03:15] WARN source/source_set_intro Cannot open intro for /radio.mp3 "/usr/local/share/icecast/web//radio.mp3": No such file or directory

@toots

toots commented Nov 16, 2018

OK, so the liquidsoap logs indicate that it's still using duppy version 0.7.3. First, make sure that you are running a liquidsoap built with duppy 0.8.0, which is the version with the recent fix. Then, if it still crashes, we can look further. Thanks for your patience!
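
A quick way to check is that Liquidsoap prints its library versions in the "Using:" line at startup, as in the logs quoted above (the log path is a placeholder; use wherever AzuraCast writes the station's Liquidsoap log):

# Print the first "Using:" startup line, which lists duppy=... among
# the library versions. The log path is an assumed example.
grep -m1 'duppy=' /path/to/liquidsoap.log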

@vdeville

Oh yeah, I didn't see that. Sorry for the false report.
I'll test with a real 0.8.0 build.

@vdeville

OK, this time it was the right version:

2018/11/16 23:15:41 [main:3] Using: bytes=[distributed with OCaml 4.02 or above] pcre=7.3.5 dtools=0.4.1 duppy=0.8.0 cry=0.6.0 mm=0.4.0 ogg=0.5.2 vorbis=0.7.1 opus=0.1.2 mad=0.4.5 flac=0.1.3 flac.ogg=0.1.3 dynlink=[distributed with Ocaml] lame=0.3.3 fdkaac=0.2.1 taglib=0.3.3 camomile=1.0.1 faad=0.4.0

Test in progress...

@CodeSteele
Contributor

@multi023 Can you confirm that you're on 0.8.0 too? Curious whether both of you ended up on duppy 0.7.3 somehow.

@BusterNeece
Member

@multi023 @MyTheValentinus Just as a reminder, you should check your docker-compose.yml files and make sure they're up-to-date on the following lines.

Your compose file SHOULD say this:

services:
  web:
  # ...many lines...

  stations:
    container_name: azuracast_stations
    image: azuracast/azuracast_radio:latest

And not this:

services:
  web:
  # ...many lines...

  stations:
    container_name: azuracast_stations
    image: azuracast/azuracast_stations:latest

@CodeSteele CodeSteele reopened this Nov 17, 2018
@vdeville

vdeville commented Nov 17, 2018

Yeah, sure, @SlvrEagle23.
More than 12 hours without a crash...

@vdeville

24 hours without a crash :D
[Screenshot]
Continuing to test...

@vdeville

vdeville commented Nov 19, 2018

60 hours after the last update... no crash!

Amazing! I think it is fixed for real!

Thanks @CodeSteele @SlvrEagle23 @toots

A little pic for fun:
[Screenshot]

@BusterNeece
Member

Excellent! Closing this issue as resolved. Thanks to all involved for the excellent collaborative effort.

@multi023
Author

Now everything is perfect!

Thank youuu @SlvrEagle23 @CodeSteele @toots

@github-actions

This issue has not been updated in over a year, so it is being closed for further discussion. If you are experiencing a similar issue, please create a new issue. Thank you!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 13, 2022