Downloaded m4a podcast's failing to import with "can't sync to an MPEG frame" error #519

Closed
ned-kelly opened this Issue Sep 18, 2018 · 12 comments

@ned-kelly
Member

ned-kelly commented Sep 18, 2018

As the subject suggests, I'm having issues importing podcasts in M4A format into LibreTime.

Summary:

  • When a podcast in m4a format is added, its files are created in /tmp and downloaded there
  • The file appears to download completely, but then fails to import into the library.
  • Files in /tmp/ are owned by "celery:celery"
  • Example Podcast: http://podcast.djhardwell.com/podcast.xml
  • I'm seeing the following in one of the celery worker logs:
[2018-09-18 11:41:03,657: INFO/Worker-1] podcast-download[php_5ba0e4352a6c74.76761676]: Error during file download: can't sync to an MPEG frame
[2018-09-18 11:41:03,718: INFO/MainProcess] Task podcast-download[php_5ba0e4352a6c74.76761676] succeeded in 26.512139958s: '{"status": 0, "episodeid": 223, "error": "can\'t sync to an MPEG frame"}'
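Note that Celery logs the task as "succeeded" even though the returned payload carries an error, so the payload itself has to be inspected. A minimal sketch of pulling it out of a worker log line follows; the regex and the `parse_task_result` helper are illustrative and not part of LibreTime, and treating a non-empty `error` field as a failure is an assumption based only on the log lines above.

```python
import ast
import json
import re

# The second log line from above, copied verbatim.
LOG_LINE = r"""[2018-09-18 11:41:03,718: INFO/MainProcess] Task podcast-download[php_5ba0e4352a6c74.76761676] succeeded in 26.512139958s: '{"status": 0, "episodeid": 223, "error": "can\'t sync to an MPEG frame"}'"""

# Celery prints the task's return value as a Python string literal
# after the "succeeded in <seconds>s: " prefix.
RESULT_RE = re.compile(r"succeeded in [\d.]+s: (.+)$")


def parse_task_result(line):
    """Return the task's JSON payload as a dict, or None if the line has none."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    # Undo the Python string quoting first, then decode the JSON inside it.
    payload = ast.literal_eval(match.group(1))
    return json.loads(payload)


result = parse_task_result(LOG_LINE)
if result and result.get("error"):
    print("episode %d failed: %s" % (result["episodeid"], result["error"]))
```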

I've read that the python mutagen package required by airtime_analyzer-0.1-py2.7.egg could be a cause (it's currently mutagen==1.31) - I've tried upgrading it to the latest (mutagen 1.41.1) and it's no help unfortunately.

System details:

  • Running Ubuntu Xenial
  • silan 0.3.2-2build3
  • Python 2.7.12
  • Celery args are: --time-limit=1200 --concurrency=4 --config=celeryconfig -l INFO

Would welcome suggestions on where to look / potential fixes - Anyone know if merging in #464 will help here?

Regular MP3s are fine. As a stopgap, could we pipe any other format through something like ffmpeg before the files are imported? That should fix this for now (I'm not sure where to look, or I'd already be trying it).

I believe the cause of the problem may be that the podcast's "album art" changes multiple times throughout the mix (as the track names change, the album art updates to reflect them). This is pretty common with EDM podcasts from big-name artists/shows (ASOT, Tiesto etc.) when you look at the podcasts they have on iTunes.

@ned-kelly ned-kelly changed the title Downloaded m4a podcast's failing to import Downloaded m4a podcast's failing to import with "can't sync to an MPEG frame" error Sep 18, 2018

@Robbt

Member

Robbt commented Sep 18, 2018

Ahh yeah this is definitely an issue with the podcast processing sending a non-audio track. I'd say test JohnnyC's code and see if it works. I tried it and it didn't work for some of the test cases I threw at it.

@ned-kelly

Member

ned-kelly commented Sep 18, 2018

@Robbt no it's not working - I've ended up just adding in ffmpeg to the equation to convert ANY file that comes in that's not mp3 - it's quick and dirty but does the trick. Can submit a PR if you're happy to put it in the main codebase.

airtime-celery/tasks.py:

import os
import subprocess
from subprocess import Popen, PIPE


# Add before line 165, i.e. before: m = MP3(audiofile.name, ID3=EasyID3)

# Ask ffprobe for the codec name of the first audio stream (e.g. "mp3" or "aac").
# The codec is read from stdout; p.returncode would only hold the exit status.
p = Popen(['ffprobe', '-v', 'error', '-select_streams', 'a:0', '-show_entries', 'stream=codec_name', '-of', 'default=nokey=1:noprint_wrappers=1', audiofile.name], stdout=PIPE, stderr=PIPE)
codec = p.communicate()[0].decode('utf-8').strip()

# Transcode anything that is not already MP3, writing back to the original path.
if codec != 'mp3':
    old_name = audiofile.name + '.old'
    os.rename(audiofile.name, old_name)
    subprocess.call(['ffmpeg', '-hide_banner', '-loglevel', 'panic', '-i', old_name, '-f', 'mp3', audiofile.name])
    os.remove(old_name)

@Robbt

Member

Robbt commented Sep 20, 2018

So I saw this code is being added to your previous PR because you opened that PR from your master branch instead of creating a separate branch. I'm not sure this will fix all of the problems. For instance, what happens if ffmpeg attempts to convert a PNG file into an MP3? We don't want to import that MP3. I think we need to iterate over the items and only try to import the audio files. But I can see how this might be useful for people who want all of their files to be MP3s.
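One way to read "iterate over the items and only try to import the audio files" is to filter feed enclosures by their declared MIME type before downloading anything. The following is a minimal sketch under that assumption; the feed content, URLs, and the `audio_enclosures` helper are made up for illustration, and this is not how LibreTime's ingest is actually written.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed with two audio enclosures and one image enclosure.
FEED = """<rss version="2.0"><channel>
<item><title>Episode 1</title>
  <enclosure url="http://example.com/ep1.m4a" type="audio/x-m4a" length="1"/></item>
<item><title>Cover</title>
  <enclosure url="http://example.com/cover.png" type="image/png" length="1"/></item>
<item><title>Episode 2</title>
  <enclosure url="http://example.com/ep2.mp3" type="audio/mpeg" length="1"/></item>
</channel></rss>"""


def audio_enclosures(feed_xml):
    """Return the URLs of all enclosures whose MIME type starts with audio/."""
    root = ET.fromstring(feed_xml)
    urls = []
    for enclosure in root.iter("enclosure"):
        mime = enclosure.get("type", "")
        if mime.lower().startswith("audio/"):
            urls.append(enclosure.get("url"))
    return urls


print(audio_enclosures(FEED))
```

A feed can declare the wrong MIME type, so this would only be a first filter; the downloaded file would still need to be checked.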

@ned-kelly

Member

ned-kelly commented Sep 20, 2018

> So I saw this code is being added to your previous PR because you did a PR of your master vs. creating a branch. I'm not sure if this will fix all of the problems for instance what happens if ffmpeg attempts to convert a png file into a mp3, we don't want to import the mp3. I think that we need to iterate over the items and only try to import the audio files. But I can see how this might be useful for people who want all of their files to be mp3 files.

Yep, sorry, it's a quick fix. In the example you gave, ffmpeg just won't produce anything, because there's no audio stream in an image file. Really there should be some more smarts around it so users can choose to convert all files or only some formats. However, from what I can tell podcasts currently only work with MP3 files, which is why it's just converting everything to MP3.
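A rough sketch of what those "smarts" could look like, kept separate from the ffprobe/ffmpeg calls so it is easy to test. It assumes that ffprobe prints an empty codec name when the file has no audio stream; `plan_import` and its `passthrough` parameter are hypothetical names, not existing LibreTime settings.

```python
def plan_import(codec, passthrough=("mp3",)):
    """Decide what to do with a download, given ffprobe's codec name.

    codec       -- codec of the first audio stream, '' if there is none
    passthrough -- codecs to import as-is instead of transcoding
    """
    codec = codec.strip().lower()
    if not codec:
        return "skip"      # no audio stream, e.g. an image file
    if codec in passthrough:
        return "import"    # already in an accepted format
    return "convert"       # audio, but needs transcoding first


print(plan_import("aac"))                              # convert
print(plan_import("aac", passthrough=("mp3", "aac")))  # import
print(plan_import(""))                                 # skip
```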

@Robbt

Member

Robbt commented Sep 20, 2018

Cool, yeah if you could try to submit your PRs as separate branches that would be helpful for merging them, because otherwise we'll get whatever hot fixes you are doing to your master branch which will make it impossible to merge them. I can't tell you exactly what steps to do but it's probably a good idea to do it sooner than later. You can still merge the PRs into your master and thus have a complete master.

Here's a link to the C4 - https://rfc.zeromq.org/spec:22/C4/ which has the basic framework for how to contribute - the fork + pull method etc where each PR is tied to an issue.

@frecuencialibre

Contributor

frecuencialibre commented Oct 26, 2018

I'm seeing standard mp3 podcasts fail in my docker-multicontainer-libretime instance. Using a feed that @Robbt has confirmed to work, I'm seeing the following in the celery logs:

[2018-10-26 19:11:59,374: INFO/Worker-4] podcast-download[php_5bd36450672200.31686480]: Error during file download: [Errno 2] No such file or directory
[2018-10-26 19:11:59,391: INFO/MainProcess] Task podcast-download[php_5bd36450672200.31686480] succeeded in 686.961189598s: '{"status": 0, "episodeid": 50, "error": ""}'

also identical:

  • File appears to be fully downloaded ok, but then failing to import to library.
  • Files in /tmp/ are owned by "celery:celery"
@Robbt

Member

Robbt commented Dec 13, 2018

So this is an issue with our podcast ingest and not with liquidsoap or the rest of libretime. I was able to download an m4a file and import it via the web upload process, and it imports and plays just fine. So I'm going to try to figure out a way of diagnosing what is causing it to fail.
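As a diagnostic aid, the container can be guessed from the first few bytes of the downloaded file. An MP4/M4A file starts with a box whose type field is `ftyp`, while an MP3 starts with an `ID3` tag or an MPEG frame sync, which would be consistent with an MP3-only parser reporting that it cannot sync to an MPEG frame. This is a heuristic sketch only; `sniff_container` is a hypothetical helper and it covers just these two formats.

```python
def sniff_container(header):
    """Guess the container from the first bytes of a file (heuristic)."""
    # MP4-family files (including .m4a): 4-byte box size, then b"ftyp".
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4"
    # MP3 with an ID3v2 tag in front of the audio frames.
    if header[:3] == b"ID3":
        return "mp3"
    # Bare MPEG audio frame: 11 sync bits set and a non-reserved layer.
    if (len(header) >= 2 and header[0] == 0xFF
            and (header[1] & 0xE0) == 0xE0
            and (header[1] & 0x06) != 0):
        return "mp3"
    return "unknown"


def sniff_file(path):
    """Read the first bytes of the file at path and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(16))


print(sniff_container(b"\x00\x00\x00\x20ftypM4A "))
```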

@JohnnyC1951

JohnnyC1951 commented Dec 13, 2018

Oddly, I had an issue with that, this week. They used to import perfectly until last week. Now, new ones do not import. They play perfectly externally. These files originate from anchor.fm which is a podcast service and were uploaded to anchor as mp3. I worked around it by converting them to mp3 in my podcast-proxy (and added youtube for good measure). Is there an updated m4a version, maybe?

@Robbt

Member

Robbt commented Dec 13, 2018

Just pushed a fix for this issue that doesn't require converting m4a to mp3. When I wrote the podcast album rename code I only supported mp3 files, for no good reason. Using the mutagen library as a whole, rather than only its MP3 support, seems to fix it, and m4a podcasts are now downloadable without any interception via ffmpeg.
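A minimal sketch of the idea, not the actual commit: `mutagen.File(path, easy=True)` picks a parser based on the file's contents and returns `None` when it cannot identify the format, so the same tagging code can handle both mp3 and m4a. The `set_album` helper and its `opener` parameter are hypothetical; `opener` exists only so the logic can be exercised without mutagen installed.

```python
def set_album(path, album, opener=None):
    """Set the album tag on any audio file that mutagen recognises."""
    if opener is None:
        import mutagen  # third-party; imported lazily, only when needed

        def opener(p):
            # easy=True exposes a common key set such as "album".
            return mutagen.File(p, easy=True)

    audio = opener(path)
    if audio is None:
        # mutagen could not identify the file as a supported audio format.
        raise ValueError("unsupported or non-audio file: %s" % path)
    audio["album"] = [album]
    audio.save()
    return audio
```

With mutagen installed, `set_album("/tmp/episode.m4a", "My Podcast")` would go through the MP4 tag handling rather than the ID3 one; the path here is only an example.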

@JohnnyC1951

This comment has been minimized.

@Robbt

Member

Robbt commented Dec 13, 2018

@JohnnyC1951 I just tried that file directly with the newest version of mutagen via web upload and it imported fine on my ubuntu-xenial vagrant box.

@JohnnyC1951

JohnnyC1951 commented Dec 13, 2018

Excellent :)
