Output info_dict reports incorrect extension when formats are merged into a mkv video #8349

Closed
zhuyifei1999 opened this issue Jan 28, 2016 · 2 comments

Comments


@zhuyifei1999 zhuyifei1999 commented Jan 28, 2016

I am developing a tool called video2commons, and its backend has to download a video programmatically and determine the name of the downloaded file. However, when the formats are merged into an mkv file, nothing in the returned dict mentions mkv at all:

>>> params = {
...     'format': 'bestvideo+bestaudio/best',
...     'outtmpl': '/srv/v2coutput/7c47160c0da8e30c/dl.%(ext)s',
...     'writedescription': True,
...     'writeinfojson': True,
...     'writesubtitles': True,
...     'writeautomaticsub': False,
...     'allsubtitles': True,
...     'subtitlesformat': 'srt/ass/vtt/best',
...     'cachedir': '/tmp/',
...     'noplaylist': True, # not implemented in video2commons
...     'postprocessors': [{
...         'key': 'FFmpegSubtitlesConvertor',
...         'format': 'srt',
...     }],
...     'max_filesize': 5 * (1 << 30),
...     'prefer_ffmpeg': True, # avconv do not have srt encoder
...     'prefer_free_formats': True,
...     'verbose': True
... }
>>> 
>>> import youtube_dl
>>> dl = youtube_dl.YoutubeDL(params)
WARNING: Parameter outtmpl is bytes, but should be a unicode string. Put  from __future__ import unicode_literals  at the top of your code file or consider switching to Python 3.x.
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.01.23
[debug] Python version 2.7.9 - Linux-3.16.0-4-amd64-x86_64-with-debian-8.1
[debug] exe versions: ffmpeg 2.8.4-1, ffprobe 2.8.4-1
[debug] Proxy map: {}
>>> dl.extract_info('https://www.youtube.com/watch?v=J2736UvG4-M', download=True)
[youtube] J2736UvG4-M: Downloading webpage
[youtube] J2736UvG4-M: Downloading video info webpage
[youtube] J2736UvG4-M: Extracting video information
WARNING: video doesn't have subtitles
[youtube] J2736UvG4-M: Downloading DASH manifest
[youtube] J2736UvG4-M: Downloading DASH manifest
[info] Writing video description to: /srv/v2coutput/7c47160c0da8e30c/dl.description
[info] Writing video description metadata as JSON to: /srv/v2coutput/7c47160c0da8e30c/dl.info.json
WARNING: Requested formats are incompatible for merge and will be merged into mkv.
[download] /srv/v2coutput/7c47160c0da8e30c/dl.mkv has already been downloaded and merged
[ffmpeg] There aren't any subtitles to convert
{u'upload_date': u'20130916', u'creator': None, u'height': 272, u'like_count': 454, u'duration': 7790, u'id': 'J2736UvG4-M', u'requested_formats': ({u'asr': None, u'tbr': 472, u'protocol': u'https', u'format': u'243 - 484x272 (DASH video)', u'url': u'https://r6---sn-p5qlsnsd.googlevideo.com/videoplayback?id=276ef7e94bc6e3e3&itag=243&source=youtube&requiressl=yes&mm=31&mv=u&ms=au&pl=22&nh=IgpwcjAxLmlhZDI2KgkxMjcuMC4wLjE&mn=sn-p5qlsnsd&ratebypass=yes&mime=video/webm&gir=yes&clen=261286057&lmt=1397994903624422&dur=7790.273&fexp=9416126,9418203,9420452,9421665,9421977,9422596,9423292,9423662,9424979,9426059,9426717,9427840&mt=1453981088&upn=GWvGNe4edAg&key=dg_yt0&signature=44736DC98CBA5D5EA1579E33503B91570B6837C7.1A76CC52EAB73094052823DEFFF7962052DEB072&sver=3&ip=208.80.155.255&ipbits=0&expire=1454003098&sparams=ip,ipbits,expire,id,itag,source,requiressl,mm,mv,ms,pl,nh,mn,ratebypass,mime,gir,clen,lmt,dur', u'filesize': 261286057, u'vcodec': u'vp9', u'format_note': u'DASH video', u'height': 272, u'width': 484, u'ext': u'webm', u'preference': -40, u'fps': 1, u'format_id': u'243', u'http_headers': {u'Accept-Charset': u'ISO-8859-1,utf-8;q=0.7,*;q=0.7', u'Accept-Language': u'en-us,en;q=0.5', u'Accept-Encoding': u'gzip, deflate', u'Accept': u'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', u'User-Agent': u'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)'}, u'acodec': u'none'}, {u'asr': 44100, u'tbr': 263, u'container': u'm4a_dash', u'format': u'141 - audio only (DASH audio)', u'url': 
u'https://r6---sn-p5qlsnsd.googlevideo.com/videoplayback?id=276ef7e94bc6e3e3&itag=141&source=youtube&requiressl=yes&mm=31&mv=u&ms=au&pl=22&nh=IgpwcjAxLmlhZDI2KgkxMjcuMC4wLjE&mn=sn-p5qlsnsd&ratebypass=yes&mime=audio/mp4&gir=yes&clen=248375357&lmt=1390139825440411&dur=7790.433&fexp=9416126,9418203,9420452,9421665,9421977,9422596,9423292,9423662,9424979,9426059,9426717,9427840&mt=1453981088&upn=GWvGNe4edAg&key=dg_yt0&signature=4ED7707C7E48B91B47E46EBE8DFD877E4FAD7537.35B671522DF17858CE34FE9FAA1F6D4E440EB0F9&sver=3&ip=208.80.155.255&ipbits=0&expire=1454003098&sparams=ip,ipbits,expire,id,itag,source,requiressl,mm,mv,ms,pl,nh,mn,ratebypass,mime,gir,clen,lmt,dur', u'filesize': 248375357, u'vcodec': u'none', u'format_note': u'DASH audio', u'abr': 256, u'height': None, u'width': None, u'ext': u'm4a', u'preference': -50, u'fps': None, u'protocol': u'https', u'format_id': u'141', u'http_headers': {u'Accept-Charset': u'ISO-8859-1,utf-8;q=0.7,*;q=0.7', u'Accept-Language': u'en-us,en;q=0.5', u'Accept-Encoding': u'gzip, deflate', u'Accept': u'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', u'User-Agent': u'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)'}, u'acodec': u'aac'}), u'view_count': 136558, u'playlist': None, u'title': u'Sant Dnyaneshwar 1940) Full Movie', u'format': u'243 - 484x272 (DASH video)+141 - audio only (DASH audio)', u'ext': u'webm', u'playlist_index': None, u'dislike_count': 51, u'average_rating': 4.59603977203, u'abr': 256, u'categories': [u'Film & Animation'], u'fps': 1, u'stretched_ratio': None, u'age_limit': 0, u'annotations': None, u'webpage_url_basename': u'watch', u'acodec': u'aac', u'display_id': 'J2736UvG4-M', u'automatic_captions': {}, u'description': u'', u'tags': [], u'requested_subtitles': None, u'start_time': None, u'uploader': u'Sundari', u'format_id': u'243+141', u'uploader_id': u'sufiSutra', u'subtitles': {}, u'thumbnails': [{u'url': u'https://i.ytimg.com/vi/J2736UvG4-M/hqdefault.jpg', u'id': 
u'0'}], u'alt_title': None, u'extractor_key': u'Youtube', u'vcodec': u'vp9', u'thumbnail': u'https://i.ytimg.com/vi/J2736UvG4-M/hqdefault.jpg', u'vbr': None, u'is_live': None, u'extractor': u'youtube', u'end_time': None, u'webpage_url': u'https://www.youtube.com/watch?v=J2736UvG4-M', u'formats': ['<A TON OF IRRELEVANT DATA REMOVED>'], u'resolution': None, u'width': 484}
>>>

The reported extension is webm (u'ext': u'webm'), so building the filename with self.outtmpl % {'ext':self.info['ext']} gives /srv/v2coutput/7c47160c0da8e30c/dl.webm instead of the expected /srv/v2coutput/7c47160c0da8e30c/dl.mkv

Looking into the source, there is, in fact, an attempt to set the extension in info_dict to "mkv": (YoutubeDL.py#L1601):
info_dict['ext'] = 'mkv'

However, that code is working on a copy of the original info_dict, not on the dict that is returned (YoutubeDL.py#L1361):

new_info = dict(info_dict)
new_info.update(format)
self.process_info(new_info)

So the change made to the copy has no effect on the returned dict, and the correct extension cannot be determined programmatically.
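To illustrate the copy behaviour in isolation, here is a minimal sketch that does not use youtube_dl at all. The two small dicts are stand-ins built from values in the output above, and the assignment to 'mkv' only imitates what the merge step does; it is not the library's actual code path.

```python
# Stand-in for the dict handed back to the caller (values taken from the report).
info_dict = {'id': 'J2736UvG4-M', 'ext': 'webm'}
fmt = {'format_id': '243+141'}

# Same pattern as the quoted lines: a shallow copy is made and only the copy is used.
new_info = dict(info_dict)
new_info.update(fmt)

# Imitates the assignment made during the merge step (assumption for illustration).
new_info['ext'] = 'mkv'

outtmpl = '/srv/v2coutput/7c47160c0da8e30c/dl.%(ext)s'
print(outtmpl % new_info)   # path built from the copy
print(outtmpl % info_dict)  # path built from the dict the caller actually gets
```

Because dict(info_dict) creates a new top-level mapping, rebinding a key on new_info leaves info_dict untouched, which matches the behaviour described here.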

Author

@zhuyifei1999 zhuyifei1999 commented Jan 28, 2016

If there's no easy way to fix this, are there some possible temporary workarounds instead of a "for file in os.listdir"?
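For reference, one stopgap I can think of is to probe a short list of candidate extensions instead of listing the directory. This is only a sketch under assumptions: find_downloaded_file is a hypothetical helper of my own, not part of youtube_dl, and the 'mkv' fallback is based solely on the "will be merged into mkv" warning shown above.

```python
import os


def find_downloaded_file(outtmpl, info, fallback_exts=('mkv',)):
    """Return the first existing path among the candidate extensions, else None.

    Hypothetical helper: tries the extension reported in the info dict first,
    then the fallbacks (mkv, per the merge warning in the log).
    """
    candidates = [info.get('ext')] + list(fallback_exts)
    for ext in candidates:
        if not ext:
            continue
        # Fill the template with the info dict, overriding only the extension.
        path = outtmpl % dict(info, ext=ext)
        if os.path.exists(path):
            return path
    return None
```

It would be called as find_downloaded_file(params['outtmpl'], result), where result is whatever extract_info returned. It can still pick the wrong file if a stale dl.webm is left over from an earlier run, so it is not a real fix.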

Collaborator

@yan12125 yan12125 commented Jan 28, 2016

Thanks for the detailed report. This problem has the same cause as #5710, where there is already some discussion.

@yan12125 yan12125 closed this Jan 28, 2016