[Facebook] title, uploader and view_count are not being extracted properly #23180

Open
mdawar opened this issue Nov 23, 2019 · 0 comments
@mdawar mdawar commented Nov 23, 2019

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2019.11.22
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-j', '-v', 'https://www.facebook.com/cnn/videos/10155529876156509/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.11.22
[debug] Python version 3.6.8 (CPython) - Linux-4.9.0-9-amd64-x86_64-with-debian-9.11
[debug] exe versions: ffmpeg 3.2.14-1, ffprobe 3.2.14-1, rtmpdump 2.4
[debug] Proxy map: {}
[debug] Default format spec: bestvideo+bestaudio/best
{"id": "10155529876156509", "title": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...", "formats": [{"format_id": "10155529878241509ad", "manifest_url": null, "ext": "m4a", "width": null, "height": null, "tbr": 49.237, "asr": 48000, "fps": null, "language": "eng", "format_note": "DASH audio", "filesize": null, "container": "m4a_dash", "vcodec": "none", "acodec": "mp4a.40.29", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14926769_10155529878251509_3232337965438992384_n.mp4?_nc_cat=111&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNDI2X2NyZl8yM19tYWluXzMuMF9mcmFnXzJfYXVkaW8ifQ==&_nc_ohc=DjO0kBpr8AsAQnnbuCWqptsgaJ5_FSpnlE-ji6WgZENfsBjWrSwismkqw&_nc_ht=video-atl3-1.xx&oh=b1a654d901276177cdfe1069ce1dca77&oe=5DD96178", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529878241509ad - audio only (DASH audio)", "protocol": "https"}, {"format_id": "10155529878246509vd", "manifest_url": null, "ext": "mp4", "width": 426, "height": 426, "tbr": 496.321, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.4d401e", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14933438_10155529878256509_9213679665762271232_n.mp4?_nc_cat=100&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNDI2X2NyZl8yM19tYWluXzMuMF9mcmFnXzJfdmlkZW8ifQ==&_nc_ohc=n7C00zjFxqoAQkkjSmkoRuaU0p-cC1oVXWvw-578C8pjc8-TfCQDp9b5Q&_nc_ht=video-atl3-1.xx&oh=024bb033e37bac74e32e0753d4c56840&oe=5DD96145", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, 
"format": "10155529878246509vd - 426x426 (DASH video)", "protocol": "https"}, {"format_id": "10155529878366509v", "manifest_url": null, "ext": "mp4", "width": 640, "height": 640, "tbr": 912.315, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.4d401e", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14926773_10155529878371509_2262825188007608320_n.mp4?_nc_cat=110&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNjQwX2NyZl8yM19tYWluXzMuMF9mcmFnXzJfdmlkZW8ifQ==&_nc_ohc=6ifuQaGFPfgAQlFtYeuLLG-gajIBxXRqqe801_CnHBH8Mn50edMb091Ug&_nc_ht=video-atl3-1.xx&oh=3bd926d78efa542b76810ef7f54b2aba&oe=5DD9499E", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529878366509v - 640x640 (DASH video)", "protocol": "https"}, {"format_id": "10155529880386509v", "manifest_url": null, "ext": "mp4", "width": 1080, "height": 1080, "tbr": 2305.949, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.64001f", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14928551_10155529880391509_5432109631927222272_n.mp4?_nc_cat=104&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfMTI4MF9jcmZfMjNfaGlnaF8zLjFfZnJhZ18yX3ZpZGVvIn0=&_nc_ohc=7oTUQ5ydn_kAQmo9qfX8xaa9_V0ToGp5-4L80m1QL9GMxWpI3MrPbXQYg&_nc_ht=video-atl3-1.xx&oh=65be49789f56138a05e20bbdaaf25168&oe=5DD94E61", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529880386509v - 1080x1080 (DASH video)", "protocol": 
"https"}, {"format_id": "dash_sd_src", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14927236_1106979459423130_6536409035841732608_n.mp4?_nc_cat=109&efg=eyJybHIiOjU5MCwicmxhIjo1MTIsInZlbmNvZGVfdGFnIjoic3ZlX3NkIn0%3D&_nc_ohc=xiEKG4HsaxQAQlcH9w3T6FZyakep5wdhwEyYbCfEfk_XTDWM3UVSf-WlA&rl=590&vabr=328&_nc_ht=video-atl3-1.xx&oh=cf8d47e18c65ba4e0539f32533a560b7&oe=5DD94F13", "preference": 0, "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "ext": "mp4", "format": "dash_sd_src - unknown", "protocol": "https"}, {"format_id": "dash_sd_src_no_ratelimit", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14927236_1106979459423130_6536409035841732608_n.mp4?_nc_cat=109&efg=eyJ2ZW5jb2RlX3RhZyI6InN2ZV9zZCJ9&_nc_ohc=xiEKG4HsaxQAQlcH9w3T6FZyakep5wdhwEyYbCfEfk_XTDWM3UVSf-WlA&_nc_ht=video-atl3-1.xx&oh=cf8d47e18c65ba4e0539f32533a560b7&oe=5DD94F13", "preference": 0, "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "ext": "mp4", "format": "dash_sd_src_no_ratelimit - unknown", "protocol": "https"}, {"format_id": "dash_hd_src", "url": "https://video-atl3-1.xx.fbcdn.net/v/t39.24130-2/10000000_2112313042406194_4422723318924873276_n.mp4?_nc_cat=108&efg=eyJ2ZW5jb2RlX3RhZyI6Im9lcF9oZCJ9&_nc_ohc=2u6aBt4QCbYAQlqFC0jBRvAHsDc5EjE8li9gteVlj3U72PR6BYX7MTA3g&_nc_ht=video-atl3-1.xx&oh=09f76ad07b27718575a8a01531a58f49&oe=5E8A4732", "preference": 5, "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", 
"Accept-Language": "en-us,en;q=0.5"}, "ext": "mp4", "format": "dash_hd_src - unknown", "protocol": "https"}], "uploader": "Holocaust survivor becomes US citizen", "timestamp": 1523161890, "thumbnail": "https://scontent-atl3-1.xx.fbcdn.net/v/t15.5256-10/p200x200/14900093_10155529877461509_2120259349254242304_n.jpg?_nc_cat=105&_nc_ohc=fMSUdUTdVRsAQmiXmHod1USxJzB6Bc1FWEJEkeCg7x3CUcrJQY2Q3-kiA&_nc_ht=scontent-atl3-1.xx&oh=57eadb5be355d5c576417fafd46f0bd8&oe=5E8971EC", "view_count": null, "subtitles": {}, "extractor": "facebook", "webpage_url": "https://www.facebook.com/cnn/videos/10155529876156509/", "webpage_url_basename": "10155529876156509", "extractor_key": "Facebook", "playlist": null, "playlist_index": null, "thumbnails": [{"url": "https://scontent-atl3-1.xx.fbcdn.net/v/t15.5256-10/p200x200/14900093_10155529877461509_2120259349254242304_n.jpg?_nc_cat=105&_nc_ohc=fMSUdUTdVRsAQmiXmHod1USxJzB6Bc1FWEJEkeCg7x3CUcrJQY2Q3-kiA&_nc_ht=scontent-atl3-1.xx&oh=57eadb5be355d5c576417fafd46f0bd8&oe=5E8971EC", "id": "0"}], "display_id": "10155529876156509", "upload_date": "20180408", "requested_subtitles": null, "requested_formats": [{"format_id": "10155529880386509v", "manifest_url": null, "ext": "mp4", "width": 1080, "height": 1080, "tbr": 2305.949, "asr": null, "fps": null, "language": "eng", "format_note": "DASH video", "filesize": null, "container": "mp4_dash", "vcodec": "avc1.64001f", "acodec": "none", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14928551_10155529880391509_5432109631927222272_n.mp4?_nc_cat=104&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfMTI4MF9jcmZfMjNfaGlnaF8zLjFfZnJhZ18yX3ZpZGVvIn0=&_nc_ohc=7oTUQ5ydn_kAQmo9qfX8xaa9_V0ToGp5-4L80m1QL9GMxWpI3MrPbXQYg&_nc_ht=video-atl3-1.xx&oh=65be49789f56138a05e20bbdaaf25168&oe=5DD94E61", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", 
"Accept-Language": "en-us,en;q=0.5"}, "format": "10155529880386509v - 1080x1080 (DASH video)", "protocol": "https"}, {"format_id": "10155529878241509ad", "manifest_url": null, "ext": "m4a", "width": null, "height": null, "tbr": 49.237, "asr": 48000, "fps": null, "language": "eng", "format_note": "DASH audio", "filesize": null, "container": "m4a_dash", "vcodec": "none", "acodec": "mp4a.40.29", "url": "https://video-atl3-1.xx.fbcdn.net/v/t42.1790-2/14926769_10155529878251509_3232337965438992384_n.mp4?_nc_cat=111&efg=eyJ2ZW5jb2RlX3RhZyI6ImRhc2hfdjNfNDI2X2NyZl8yM19tYWluXzMuMF9mcmFnXzJfYXVkaW8ifQ==&_nc_ohc=DjO0kBpr8AsAQnnbuCWqptsgaJ5_FSpnlE-ji6WgZENfsBjWrSwismkqw&_nc_ht=video-atl3-1.xx&oh=b1a654d901276177cdfe1069ce1dca77&oe=5DD96178", "http_headers": {"User-Agent": "facebookexternalhit/1.1", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Encoding": "gzip, deflate", "Accept-Language": "en-us,en;q=0.5"}, "format": "10155529878241509ad - audio only (DASH audio)", "protocol": "https"}], "format": "10155529880386509v - 1080x1080 (DASH video)+10155529878241509ad - audio only (DASH audio)", "format_id": "10155529880386509v+10155529878241509ad", "width": 1080, "height": 1080, "resolution": null, "fps": null, "vcodec": "avc1.64001f", "vbr": null, "stretched_ratio": null, "acodec": "mp4a.40.29", "abr": null, "ext": "mp4", "fulltitle": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...", "_filename": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...-10155529876156509.mp4"}

Description

While testing some Facebook videos, I noticed that the title, uploader and view_count are not being extracted properly. I checked the Facebook extractor tests and tried one of them:

{
  # have 1080P, but only up to 720p in swf params
  'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
  'md5': '9571fae53d4165bbbadb17a94651dcdc',
  'info_dict': {
      'id': '10155529876156509',
      'ext': 'mp4',
      'title': 'She survived the holocaust — and years later, she’s getting her citizenship s...',
      'timestamp': 1477818095,
      'upload_date': '20161030',
      'uploader': 'CNN',
      'thumbnail': r're:^https?://.*',
      'view_count': int,
  },
}

The info extraction using the latest version 2019.11.22 returns these results:

{
  "id": "10155529876156509",
  "title": "She survived the holocaust \u2014 and years later, she\u2019s getting her citizenship s...",
  "uploader": "Holocaust survivor becomes US citizen",
  "view_count": null,
  ...
}

Another example from the tests:

{
  'url': 'https://www.facebook.com/video.php?v=274175099429670',
  'info_dict': {
      'id': '274175099429670',
      'ext': 'mp4',
      'title': 're:^Asif Nawab Butt posted a video',
      'uploader': 'Asif Nawab Butt',
      'upload_date': '20140506',
      'timestamp': 1399398998,
      'thumbnail': r're:^https?://.*',
  },
  'expected_warnings': [
      'title'
  ]
}

This returns the following results:

{
  "id": "274175099429670",
  "title": "Facebook video #274175099429670",
  "uploader": null,
  "view_count": null,
  ...
}
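
To make the discrepancies easier to see, here is a minimal sketch that compares the expected `info_dict` values from the two tests against the values reported above. The helper `find_mismatches` is hypothetical and written only for this illustration; it is not part of youtube-dl. It assumes the two conventions visible in the test entries: a bare type such as `int` means "any value of that type", and a string starting with `re:` is a regular expression matched against the start of the value.

```python
import re


def find_mismatches(expected, actual):
    """Return {field: (expected, actual)} for each field that does not match.

    Hypothetical helper for illustration only, not youtube-dl code.
    """
    mismatches = {}
    for field, want in expected.items():
        got = actual.get(field)
        if isinstance(want, type):
            # A bare type (e.g. int) only requires a value of that type.
            ok = isinstance(got, want)
        elif isinstance(want, str) and want.startswith('re:'):
            # 're:' prefix: treat the rest as a regex anchored at the start.
            ok = isinstance(got, str) and re.match(want[3:], got) is not None
        else:
            ok = got == want
        if not ok:
            mismatches[field] = (want, got)
    return mismatches


# Title as it appears in both the test and the JSON output (escaped form).
CNN_TITLE = ('She survived the holocaust \u2014 and years later, '
             'she\u2019s getting her citizenship s...')

# Expected values copied from the first test entry (subset of fields).
cnn_expected = {
    'id': '10155529876156509',
    'title': CNN_TITLE,
    'uploader': 'CNN',
    'view_count': int,
}
# Values reported by youtube-dl 2019.11.22 for the same URL.
cnn_actual = {
    'id': '10155529876156509',
    'title': CNN_TITLE,
    'uploader': 'Holocaust survivor becomes US citizen',
    'view_count': None,
}

# Expected values copied from the second test entry (subset of fields).
video_php_expected = {
    'id': '274175099429670',
    'title': 're:^Asif Nawab Butt posted a video',
    'uploader': 'Asif Nawab Butt',
}
video_php_actual = {
    'id': '274175099429670',
    'title': 'Facebook video #274175099429670',
    'uploader': None,
    'view_count': None,
}

if __name__ == '__main__':
    for name, exp, act in [
        ('cnn', cnn_expected, cnn_actual),
        ('video.php', video_php_expected, video_php_actual),
    ]:
        for field, (want, got) in sorted(find_mismatches(exp, act).items()):
            print('%s: %s expected %r, got %r' % (name, field, want, got))
```

With the values above, the first URL differs on `uploader` and `view_count`, and the second differs on `title` and `uploader`. Only the fields shown in the excerpts are compared.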