Samsung Newsroom (mb.miniAudioPlayer) #4994

Closed
10 tasks done
Jadoo4QFan opened this issue Sep 22, 2022 · 2 comments · Fixed by #5087
Labels
site-request Request to support a new website

Comments

Jadoo4QFan commented Sep 22, 2022

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I remove or skip any mandatory* field

Checklist

  • I'm reporting a new site support request
  • I've verified that I'm running yt-dlp version 2022.09.01 (update instructions) or later (specify commit)
  • I've checked that all provided URLs are playable in a browser with the same IP and same login details
  • I've checked that none of provided URLs violate any copyrights or contain any DRM to the best of my knowledge
  • I've searched the bugtracker for similar issues including closed ones. DO NOT post duplicates
  • I've read the guidelines for opening an issue
  • I've read about sharing account credentials and am willing to share it if required

Region

California, USA

Example URLs

https://news.samsung.com/global/over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound
https://news.samsung.com/global/design-story-joy-of-listening-sound-design-at-samsung
https://news.samsung.com/global/from-rock-to-classical-pop-crossovers-new-ocean-inspired-over-the-horizon-celebrates-10th-anniversary-of-galaxy-series (over the horizon 2019 history only)

Provide a description that is worded well enough to be understood

Please add an extractor for Samsung Newsroom, including its metadata. What yt-dlp downloaded was the voiceover narration of the blog post, not the embedded audio itself.

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

yt-dlp is up to date (2022.09.01)
[Documents]$ yt-dlp -vU https://news.samsung.com/global/over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound
[debug] Command-line config: ['-vU', 'https://news.samsung.com/global/over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.09.01 [5d7c7d6] (pip)
[debug] Python 3.9.10+ (CPython 64bit) - macOS-16.0-iPhone14,5-arm-64bit
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg n5.0.1-4-ga5ebb3d25e (setts), ffprobe n5.0.1-4-ga5ebb3d25e, phantomjs present, rtmpdump present
[debug] Optional libraries: Crypto-3.15.0, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Loaded 1670 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
[debug] [generic] Extracting URL: https://news.samsung.com/global/over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound
[generic] over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound: Extracting information
[debug] Looking for Brightcove embeds
[debug] Looking for embeds
ERROR: Unsupported URL: https://news.samsung.com/global/over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound
Traceback (most recent call last):
  File "/var/mobile/Containers/Data/Application/87BE3170-A2E1-4D77-98F2-3C7FE2FA6E42/Library/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 1459, in wrapper
    return func(self, *args, **kwargs)
  File "/var/mobile/Containers/Data/Application/87BE3170-A2E1-4D77-98F2-3C7FE2FA6E42/Library/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py", line 1535, in __extract_info
    ie_result = ie.extract(url)
  File "/var/mobile/Containers/Data/Application/87BE3170-A2E1-4D77-98F2-3C7FE2FA6E42/Library/lib/python3.9/site-packages/yt_dlp/extractor/common.py", line 670, in extract
    ie_result = self._real_extract(url)
  File "/var/mobile/Containers/Data/Application/87BE3170-A2E1-4D77-98F2-3C7FE2FA6E42/Library/lib/python3.9/site-packages/yt_dlp/extractor/generic.py", line 3078, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://news.samsung.com/global/over-the-horizon-the-evolution-of-the-samsung-galaxy-brand-sound

[Documents]$ yt-dlp https://news.samsung.com/global/design-story-joy-of-listening-sound-design-at-samsung -vU
[debug] Command-line config: ['https://news.samsung.com/global/design-story-joy-of-listening-sound-design-at-samsung', '-vU']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.09.01 [5d7c7d6] (pip)
[debug] Python 3.9.10+ (CPython 64bit) - macOS-16.0-iPhone14,5-arm-64bit 
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg n5.0.1-4-ga5ebb3d25e (setts), ffprobe n5.0.1-4-ga5ebb3d25e, phantomjs present, rtmpdump present
[debug] Optional libraries: Crypto-3.15.0, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Loaded 1670 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
[debug] [generic] Extracting URL: https://news.samsung.com/global/design-story-joy-of-listening-sound-design-at-samsung
[generic] design-story-joy-of-listening-sound-design-at-samsung: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] design-story-joy-of-listening-sound-design-at-samsung: Extracting information
[debug] Looking for Brightcove embeds
[debug] Looking for embeds
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Identified a html5 embed
[debug] Default format spec: bestvideo*+bestaudio/best
[info] design-story-joy-of-listening-sound-design-at-samsung-1: Downloading 1 format(s): 0
[debug] Invoking http downloader on "https://s3.us-west-2.amazonaws.com/audio-for-wordpress-95118628321a0e5c8cae3668e9448f3804d8fa0c/2016/11/amazon_polly_79894.mp3?version=1527066889"
[download] design-story-joy-of-listening-sound-design-at-samsung (1) [design-story-joy-of-listening-sound-design-at-samsung-1].mp3 has already been downloaded
[download] 100% of 2.09MiB
[Documents]$

Another link: https://news.samsung.com/global/samsungs-over-the-horizon-gets-digitally-re-worked-by-elementary-school-students-in-the-netherlands

Jadoo4QFan added the site-request and triage labels Sep 22, 2022

coletdjnz (Member) commented

For the following url

All the audio is already being extracted as expected. If you want only the YouTube embed, you can use --use-extractors youtube,generic or other methods of filtering. (The HTML5 embed detection is its own extractor so you can filter it out by that)


For the following urls:

The audio seems to use the mb.miniAudioPlayer player:

A generic embed extractor could possibly be added for this, at a minimum.

Demo: https://pupunzi.com/mb.components/mb.miniAudioPlayer/demo/demo.html

Snippet from newsroom:

<p><a id="mbmaplayer_1429253973287" class="mb_map {skin:'black', animate:false, width:'400', volume:0.7, autoplay:false, loop:false, showVolumeLevel:true, showTime:true, showRew:true, addGradientOverlay:false, downloadable:false, downloadablesecurity:false, id3: false}" href="http://news.samsung.com/global/wp-content/uploads/ringtones/over_the_horizon_2013.mp3">Over the Horizon 2013</a></p>
<p>&nbsp;</p>

Note that the id may not be generic, so we might need to find some other way to detect the player uniquely, without false positives (perhaps by reading the player's usage docs and source code).
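To make the detection idea concrete, here is a minimal Python sketch that keys on the `mb_map` class token from the snippet above instead of the id. It is only an illustration, not yt-dlp code: `MbMapAnchorFinder` and `find_mb_map_audio` are hypothetical names, and relying on `mb_map` being present is an assumption that may not hold on every site using the player.

```python
from html.parser import HTMLParser


class MbMapAnchorFinder(HTMLParser):
    """Collect href and link text of <a> tags whose class list contains mb_map."""

    def __init__(self):
        super().__init__()
        self.results = []       # list of [href, text] pairs
        self._in_match = False  # True while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        attr = dict(attrs)
        # The class attribute also carries the inline "{skin:..., ...}" options,
        # so split on whitespace and look for the bare mb_map token.
        classes = (attr.get('class') or '').split()
        if 'mb_map' in classes and attr.get('href'):
            self.results.append([attr['href'], ''])
            self._in_match = True

    def handle_data(self, data):
        if self._in_match:
            self.results[-1][1] += data

    def handle_endtag(self, tag):
        if tag == 'a':
            self._in_match = False


def find_mb_map_audio(html):
    """Return (url, title) tuples for every mb_map anchor found in html."""
    finder = MbMapAnchorFinder()
    finder.feed(html)
    return [(href, text.strip()) for href, text in finder.results]
```

Matching on the class token rather than the `mbmaplayer_...` id avoids depending on an id that looks generated, at the cost of missing pages where the class is not set.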

coletdjnz removed the triage label Sep 23, 2022
coletdjnz changed the title from "Samsung Newsroom" to "Samsung Newsroom (mb.miniAudioPlayer)" Sep 23, 2022

coletdjnz (Member) commented

The version on this site appears to be the WordPress plugin version (by "version" I mean how the player is initialized). For this we may be able to write a generic extractor.

The initialization that we want to extract looks roughly like one of the following:

function initializeMiniAudioPlayer(){
         jQuery(".mejs-container a").addClass(miniAudioPlayer_excluded);
         jQuery("a[href *= '.mp3']").not(".map_excuded").not(".wp-playlist-caption").not("[download]").mb_miniPlayer(miniAudioPlayer_defaults);
    }
 
    function initializeMiniAudioPlayer(){
         jQuery(".mejs-container a").addClass(miniAudioPlayer_excluded);
         jQuery("a[href*='.mp3'] ,a[href*='.m4a']").not(".map_excluded").not(".wp-playlist-caption").mb_miniPlayer(miniAudioPlayer_defaults);
    }
   
    function initializeMiniAudioPlayer(){
         jQuery("a[href*='.mp3'] ,a[href*='.m4a']").not(".map_excluded").mb_miniPlayer({ <cut>
    

From this we can probably write a very basic selector parser.
The plugin source suggests the selector is mostly fixed: https://plugins.svn.wordpress.org/wp-miniaudioplayer/trunk/miniAudioPlayer.php

function initializeMiniAudioPlayer(){
         jQuery(".mejs-container a").addClass(miniAudioPlayer_excluded);
         jQuery("a' . ($miniAudioPlayer_active_all != 'true' ? '.mb_map' : '') . '[href *= \'.mp3\']' . ($miniAudioPlayer_active_all != 'true' ? '.mb_map' : '') . '")' . miniAudioPlayer_getExcluded() . 'mb_miniPlayer(miniAudioPlayer_defaults);
    }

The miniAudioPlayer_getExcluded function shows how the exclusion list is generated.
According to the docs, the custom excludes are just CSS class names to ignore: https://wordpress.org/plugins/wp-miniaudioplayer/#description
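A rough sketch of such a parser in Python, assuming the selector and the `.not(...)` arguments are always double-quoted as in the variants quoted in this comment. `parse_mini_player_init` and the regexes are hypothetical illustrations, not part of yt-dlp or the plugin:

```python
import re

# Assumption: jQuery("<selector>") followed by zero or more .not("<x>") calls
# and then .mb_miniPlayer( ... ), all with double-quoted arguments.
_CHAIN_RE = re.compile(
    r'jQuery\(\s*"(?P<selector>[^"]+)"\s*\)'
    r'(?P<nots>(?:\s*\.not\(\s*"[^"]+"\s*\))*)'
    r'\s*\.mb_miniPlayer\(')
_NOT_RE = re.compile(r'\.not\(\s*"([^"]+)"\s*\)')
_HREF_RE = re.compile(r"href\s*\*=\s*'([^']+)'")


def parse_mini_player_init(script):
    """Return href substrings and excludes from the init script, or None."""
    match = _CHAIN_RE.search(script)
    if not match:
        return None
    return {
        # e.g. ['.mp3', '.m4a'], taken from the a[href*='...'] parts
        'href_substrings': _HREF_RE.findall(match.group('selector')),
        # raw arguments of each .not(...), e.g. '.map_excluded'
        'excludes': _NOT_RE.findall(match.group('nots')),
    }
```

The first `jQuery(".mejs-container a")` call is skipped because it is not followed by `.mb_miniPlayer(`, so only the chain that actually initializes the player is parsed.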

Note that these may have changed across plugin versions, hence the slightly different variations.
(https://github.com/wp-plugins/wp-miniaudioplayer/blob/master/wp-miniaudioplayer/miniAudioPlayer.php#L283-L286 is another version from 2015 that contains the m4a)

We could possibly get away with parsing only the CSS excludes, since a[href*='.mp3'] ,a[href*='.m4a'] or a[href *= '.mp3'] is most likely hardcoded in the WordPress plugin itself.
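As an illustration of that simpler approach, the sketch below treats the `.mp3`/`.m4a` substrings as fixed and only applies a list of excluded class names to the page's anchors. The names `AUDIO_SUBSTRINGS` and `candidate_audio_links` are made up for this example, and only class-name excludes are handled (an attribute exclude such as `[download]` is ignored here):

```python
from html.parser import HTMLParser

# Assumption: these substrings are hardcoded by the plugin.
AUDIO_SUBSTRINGS = ('.mp3', '.m4a')


class _AnchorCollector(HTMLParser):
    """Record the href and class set of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.anchors = []  # list of (href, set_of_classes)

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            attr = dict(attrs)
            href = attr.get('href') or ''
            classes = set((attr.get('class') or '').split())
            self.anchors.append((href, classes))


def candidate_audio_links(html, excluded_classes):
    """Return hrefs containing an audio substring, minus excluded classes."""
    # Accept both '.map_excluded' and 'map_excluded' spellings.
    excluded = {name.lstrip('.') for name in excluded_classes}
    collector = _AnchorCollector()
    collector.feed(html)
    return [
        href for href, classes in collector.anchors
        if any(sub in href for sub in AUDIO_SUBSTRINGS)
        and not (classes & excluded)
    ]
```

The substring test mirrors the `*=` attribute selector, so a query string after the extension still matches.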
