[Facebook] title, uploader and view_count are not being extracted properly #23180

Open
5 tasks done
mdawar opened this issue Nov 23, 2019 · 0 comments · May be fixed by #30700

Comments

Contributor

mdawar commented Nov 23, 2019

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2019.11.22
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-j', '-v', 'https://www.facebook.com/cnn/videos/10155529876156509/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.11.22
[debug] Python version 3.6.8 (CPython) - Linux-4.9.0-9-amd64-x86_64-with-debian-9.11
[debug] exe versions: ffmpeg 3.2.14-1, ffprobe 3.2.14-1, rtmpdump 2.4
[debug] Proxy map: {}
[debug] Default format spec: bestvideo+bestaudio/best
{"id": "10155529876156509", "title": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...", "formats": [{"format_id": "10155529878241509ad", "manifest_url": null, "ext": "m4a", "width": null, "height": null, "tbr": 49.237, "asr": 48000, "fps": null, "language": "eng", "format_note": "DASH audio", "filesize": null, "container": "m4a_dash", "vcodec": "none", "acodec": "mp4a.40.29", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14926769_10155529878251509_3232337965438992384_n.mp4?_nc_cat=111&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNDI2X2NyZl8yM19tYWluXzMuMF9mcmFnXzJfYXVkaW8ifQ==&_nc_ohc=DjO0kBpr8AsAQnnbuCWqptsgaJ5_FSpnlE-ji6WgZENfsBjWrSwismkqw&_nc_ht=video-atl3-1.xx&oh=b1a654d901276177cdfe1069ce1dca77&oe=5DD96178", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529878241509ad - audio only (DASH audio)", "protocol": "https"}, {"format_id": "10155529878246509vd", "manifest_url": null, "ext": "mp4", "width": 426, "height": 426, "tbr": 496.321, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.4d401e", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14933438_10155529878256509_9213679665762271232_n.mp4?_nc_cat=100&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNDI2X2NyZl8yM19tYWluXzMuMF9mcmFnXzJfdmlkZW8ifQ==&_nc_ohc=n7C00zjFxqoAQkkjSmkoRuaU0p-cC1oVXWvw-578C8pjc8-TfCQDp9b5Q&_nc_ht=video-atl3-1.xx&oh=024bb033e37bac74e32e0753d4c56840&oe=5DD96145", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, 
"format": "10155529878246509vd - 426x426 (DASH video)", "protocol": "https"}, {"format_id": "10155529878366509v", "manifest_url": null, "ext": "mp4", "width": 640, "height": 640, "tbr": 912.315, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.4d401e", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14926773_10155529878371509_2262825188007608320_n.mp4?_nc_cat=110&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNjQwX2NyZl8yM19tYWluXzMuMF9mcmFnXzJfdmlkZW8ifQ==&_nc_ohc=6ifuQaGFPfgAQlFtYeuLLG-gajIBxXRqqe801_CnHBH8Mn50edMb091Ug&_nc_ht=video-atl3-1.xx&oh=3bd926d78efa542b76810ef7f54b2aba&oe=5DD9499E", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529878366509v - 640x640 (DASH video)", "protocol": "https"}, {"format_id": "10155529880386509v", "manifest_url": null, "ext": "mp4", "width": 1080, "height": 1080, "tbr": 2305.949, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.64001f", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14928551_10155529880391509_5432109631927222272_n.mp4?_nc_cat=104&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfMTI4MF9jcmZfMjNfaGlnaF8zLjFfZnJhZ18yX3ZpZGVvIn0=&_nc_ohc=7oTUQ5ydn_kAQmo9qfX8xaa9_V0ToGp5-4L80m1QL9GMxWpI3MrPbXQYg&_nc_ht=video-atl3-1.xx&oh=65be49789f56138a05e20bbdaaf25168&oe=5DD94E61", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529880386509v - 1080x1080 (DASH video)", "protocol": 
"https"}, {"format_id": "dash_sd_src", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14927236_1106979459423130_6536409035841732608_n.mp4?_nc_cat=109&efg=eyJybHIiOjU5MCwicmxhIjo1MTIsInZlbmNvZGVfdGFnIjoic3ZlX3NkIn0%3D&_nc_ohc=xiEKG4HsaxQAQlcH9w3T6FZyakep5wdhwEyYbCfEfk_XTDWM3UVSf-WlA&rl=590&vabr=328&_nc_ht=video-atl3-1.xx&oh=cf8d47e18c65ba4e0539f32533a560b7&oe=5DD94F13", "preference": 0, "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "ext": "mp4", "format": "dash_sd_src - unknown", "protocol": "https"}, {"format_id": "dash_sd_src_no_ratelimit", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14927236_1106979459423130_6536409035841732608_n.mp4?_nc_cat=109&efg=eyJ2ZW5jb2RlX3RhZyI6InN2ZV9zZCJ9&_nc_ohc=xiEKG4HsaxQAQlcH9w3T6FZyakep5wdhwEyYbCfEfk_XTDWM3UVSf-WlA&_nc_ht=video-atl3-1.xx&oh=cf8d47e18c65ba4e0539f32533a560b7&oe=5DD94F13", "preference": 0, "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "ext": "mp4", "format": "dash_sd_src_no_ratelimit - unknown", "protocol": "https"}, {"format_id": "dash_hd_src", "url": "https://video-atl3-1.xx.fbcdn.net/v/t39.24130-2/10000000_2112313042406194_4422723318924873276_n.mp4?_nc_cat=108&efg=eyJ2ZW5jb2RlX3RhZyI6Im9lcF9oZCJ9&_nc_ohc=2u6aBt4QCbYAQlqFC0jBRvAHsDc5EjE8li9gteVlj3U72PR6BYX7MTA3g&_nc_ht=video-atl3-1.xx&oh=09f76ad07b27718575a8a01531a58f49&oe=5E8A4732", "preference": 5, "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", 
"Accept-Language": "en-us,en;q=0.5"}, "ext": "mp4", "format": "dash_hd_src - unknown", "protocol": "https"}], "uploader": "Holocaust survivor becomes US citizen", "timestamp": 1523161890, "thumbnail": "https://scontent-atl3-1.xx.fbcdn.net/v/t15.5256-10/p200x200/14900093_10155529877461509_2120259349254242304_n.jpg?_nc_cat=105&_nc_ohc=fMSUdUTdVRsAQmiXmHod1USxJzB6Bc1FWEJEkeCg7x3CUcrJQY2Q3-kiA&_nc_ht=scontent-atl3-1.xx&oh=57eadb5be355d5c576417fafd46f0bd8&oe=5E8971EC", "view_count": null, "subtitles": {}, "extractor": "facebook", "webpage_url": "https://www.facebook.com/cnn/videos/10155529876156509/", "webpage_url_basename": "10155529876156509", "extractor_key": "Facebook", "playlist": null, "playlist_index": null, "thumbnails": [{"url": "https://scontent-atl3-1.xx.fbcdn.net/v/t15.5256-10/p200x200/14900093_10155529877461509_2120259349254242304_n.jpg?_nc_cat=105&_nc_ohc=fMSUdUTdVRsAQmiXmHod1USxJzB6Bc1FWEJEkeCg7x3CUcrJQY2Q3-kiA&_nc_ht=scontent-atl3-1.xx&oh=57eadb5be355d5c576417fafd46f0bd8&oe=5E8971EC", "id": "0"}], "display_id": "10155529876156509", "upload_date": "20180408", "requested_subtitles": null, "requested_formats": [{"format_id": "10155529880386509v", "manifest_url": null, "ext": "mp4", "width": 1080, "height": 1080, "tbr": 2305.949, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.64001f", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14928551_10155529880391509_5432109631927222272_n.mp4?_nc_cat=104&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfMTI4MF9jcmZfMjNfaGlnaF8zLjFfZnJhZ18yX3ZpZGVvIn0=&_nc_ohc=7oTUQ5ydn_kAQmo9qfX8xaa9_V0ToGp5-4L80m1QL9GMxWpI3MrPbXQYg&_nc_ht=video-atl3-1.xx&oh=65be49789f56138a05e20bbdaaf25168&oe=5DD94E61", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", 
"Accept-Language": "en-us,en;q=0.5"}, "format": "10155529880386509v - 1080x1080 (DASH video)", "protocol": "https"}, {"format_id": "10155529878241509ad", "manifest_url": null, "ext": "m4a", "width": null, "height": null, "tbr": 49.237, "asr": 48000, "fps": null, "language": "eng", "format_note": "DASH audio", "filesize": null, "container": "m4a_dash", "vcodec": "none", "acodec": "mp4a.40.29", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14926769_10155529878251509_3232337965438992384_n.mp4?_nc_cat=111&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNDI2X2NyZl8yM19tYWluXzMuMF9mcmFnXzJfYXVkaW8ifQ==&_nc_ohc=DjO0kBpr8AsAQnnbuCWqptsgaJ5_FSpnlE-ji6WgZENfsBjWrSwismkqw&_nc_ht=video-atl3-1.xx&oh=b1a654d901276177cdfe1069ce1dca77&oe=5DD96178", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529878241509ad - audio only (DASH audio)", "protocol": "https"}], "format": "10155529880386509v - 1080x1080 (DASH video)+10155529878241509ad - audio only (DASH audio)", "format_id": "10155529880386509v+10155529878241509ad", "width": 1080, "height": 1080, "resolution": null, "fps": null, "vcodec": "avc1.64001f", "vbr": null, "stretched_ratio": null, "acodec": "mp4a.40.29", "abr": null, "ext": "mp4", "fulltitle": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...", "_filename": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...-10155529876156509.mp4"}
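Most of the JSON line above is format data. As a rough sketch, the handful of metadata fields this report is about can be pulled out of the `-j` output with the standard `json` module. The `summarize` helper and the `FIELDS` tuple are hypothetical names used only for illustration, and the sample string is an abbreviated copy of the log, not the full output.

```python
import json

# Fields this report is concerned with; everything else in the -j output
# is format and header data. (FIELDS and summarize are illustrative names.)
FIELDS = ("id", "title", "uploader", "view_count", "timestamp", "upload_date")


def summarize(json_line):
    """Return only the metadata fields of interest from one line of -j output."""
    info = json.loads(json_line)
    return {key: info.get(key) for key in FIELDS}


# Abbreviated sample carrying the values from the log above (formats omitted).
sample = (
    '{"id": "10155529876156509", '
    r'"title": "She survived the holocaust \u2014 and years later, '
    r'she\u2019s getting her citizenship s...", '
    '"uploader": "Holocaust survivor becomes US citizen", '
    '"timestamp": 1523161890, "view_count": null, "upload_date": "20180408"}'
)

for key, value in summarize(sample).items():
    print(f"{key}: {value!r}")
```

In practice the input would be the single JSON line printed by `youtube-dl -j <url>` rather than a hard-coded sample.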

Description

While testing some Facebook videos, I noticed that the title, uploader and view_count are not being extracted properly. I checked the Facebook extractor tests and tried one of them:

{
  # have 1080P, but only up to 720p in swf params
  'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
  'md5': '9571fae53d4165bbbadb17a94651dcdc',
  'info_dict': {
      'id': '10155529876156509',
      'ext': 'mp4',
      'title': 'She survived the holocaust — and years later, she’s getting her citizenship s...',
      'timestamp': 1477818095,
      'upload_date': '20161030',
      'uploader': 'CNN',
      'thumbnail': r're:^https?://.*',
      'view_count': int,
  },
}

The info extraction using the latest version 2019.11.22 returns these results:

{
  "id": "10155529876156509",
  "title": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...",
  "uploader": "Holocaust survivor becomes US citizen",
  "view_count": null,
   ...
}

Another example from the tests:

{
  'url': 'https://www.facebook.com/video.php?v=274175099429670',
  'info_dict': {
      'id': '274175099429670',
      'ext': 'mp4',
      'title': 're:^Asif Nawab Butt posted a video',
      'uploader': 'Asif Nawab Butt',
      'upload_date': '20140506',
      'timestamp': 1399398998,
      'thumbnail': r're:^https?://.*',
  },
  'expected_warnings': [
      'title'
  ]
}

It returns these results:

{
  "id": "274175099429670",
  "title": "Facebook video #274175099429670",
  "uploader": null,
  "view_count": null,
  ...
}
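
The differences between the test expectations and the extracted values can be listed mechanically. The following is a minimal sketch, not the project's own test helper: `find_mismatches` is a hypothetical function that only handles the three kinds of expectation seen in the tests quoted above (a type such as `int`, a string with a `re:` prefix, and a plain value). The dictionaries are subsets of the two examples, with the first example's title left out for brevity.

```python
import re


def find_mismatches(expected, actual):
    """Return {field: (expected, actual)} for every field that does not match.

    Simplified stand-in for the checks a test run would perform:
    - a type (e.g. int) means the value must be an instance of that type
    - a string starting with "re:" is a regex matched from the start
    - anything else is compared for equality
    """
    mismatches = {}
    for field, want in expected.items():
        got = actual.get(field)
        if isinstance(want, type):
            ok = isinstance(got, want)
        elif isinstance(want, str) and want.startswith("re:"):
            ok = isinstance(got, str) and re.match(want[3:], got) is not None
        else:
            ok = got == want
        if not ok:
            mismatches[field] = (want, got)
    return mismatches


# First example (CNN video): expected values from the test, actual from 2019.11.22.
expected_cnn = {
    "id": "10155529876156509",
    "timestamp": 1477818095,
    "upload_date": "20161030",
    "uploader": "CNN",
    "view_count": int,
}
actual_cnn = {
    "id": "10155529876156509",
    "timestamp": 1523161890,
    "upload_date": "20180408",
    "uploader": "Holocaust survivor becomes US citizen",
    "view_count": None,
}

# Second example (video.php?v=274175099429670).
expected_butt = {
    "id": "274175099429670",
    "title": "re:^Asif Nawab Butt posted a video",
    "uploader": "Asif Nawab Butt",
}
actual_butt = {
    "id": "274175099429670",
    "title": "Facebook video #274175099429670",
    "uploader": None,
}

for name, exp, act in (("cnn", expected_cnn, actual_cnn),
                       ("video.php", expected_butt, actual_butt)):
    for field, (want, got) in find_mismatches(exp, act).items():
        print(f"[{name}] {field}: expected {want!r}, got {got!r}")
```

For the first example this also flags `timestamp` and `upload_date`, which differ between the test's expectations and the log above.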
Lesmiscore added a commit to ytdl-patched/ytdl-patched that referenced this issue Aug 23, 2021
* 'master' of https://github.com/yt-dlp/yt-dlp:
  [facebook] Fix metadata extraction Original PR: ytdl-org/youtube-dl#29796 Closes #453, ytdl-org/youtube-dl#29421, ytdl-org/youtube-dl#23627, ytdl-org/youtube-dl#23180, ytdl-org/youtube-dl#14156
  [TV2] Fix extractor (#766)
  [GabTV] Add extractor (#768)
  [tiktok] Add TikTokUserIE (#756)
  [TikTok] Fix metadata extraction
@dirkf dirkf linked a pull request Feb 28, 2022 that will close this issue
11 tasks