Alternate Audiomack URL scheme #29800

Closed
abdullah-if opened this issue Aug 15, 2021 · 12 comments · Fixed by #29810

Comments

@abdullah-if

abdullah-if commented Aug 15, 2021

Here is the current Audiomack regex:

https?://(?:www\.)?audiomack\.com/song/(?P<id>[\w/-]+)

But Audiomack also has this kind of URL:
https://audiomack.com/<uploader name>/song/<song name>

Using this type of URL loads the generic extractor and ultimately fails.
The pattern needs to be updated.

@dirkf
Contributor

dirkf commented Aug 16, 2021

Please suggest actual URL examples, both successful and failing with yt-dl, that you can play in your browser.

@abdullah-if
Author

abdullah-if commented Aug 16, 2021

@dirkf

> Please suggest actual URL examples, both successful and failing with yt-dl, that you can play in your browser.

The successful URL redirects to the failing URL when opened in a browser.
Failing URL:
https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha
Debug log:

$ youtube-dl https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha -v  
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.9.6 (CPython) - Linux-5.13.10-zen1-1-zen-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4
[debug] Proxy map: {}
[generic] abdul-rahman-al-sudais-sura-1-al-fatiha: Requesting header
WARNING: Falling back on generic information extractor.
[generic] abdul-rahman-al-sudais-sura-1-al-fatiha: Downloading webpage
[generic] abdul-rahman-al-sudais-sura-1-al-fatiha: Extracting information
ERROR: Unsupported URL: https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/generic.py", line 3520, in _real_extract
    raise UnsupportedError(url)
youtube_dl.utils.UnsupportedError: Unsupported URL: https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha

Supported URL:
https://audiomack.com/song/islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha
Debug log:

$ youtube-dl https://audiomack.com/song/islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha -v 
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://audiomack.com/song/islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.9.6 (CPython) - Linux-5.13.10-zen1-1-zen-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4
[debug] Proxy map: {}
[audiomack] islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://music.audiomack.com/albums/islamiclibrary/complete-quran/abdul-rahman-al-sudais-sura-1-al-fatiha.mp3?Expires=1629124522&Signature=F1Rqc3AEFcuwkeleTy2mGGnNZDrKwrN2mPJD08rqgEkw2xycAaTIbsRSeqT4bbu-rUTID15oPnUaOLrgR8b4Y0nkAEMB6IIqGDlBYMzfAjy0JFGduxzqay49w8sQniqqL494D3gqLlVmZMd6dnSrZz6jN95klI5mrwnWxY3byX4_&Key-Pair-Id=APKAIKAIRXBA2H7FXITA'
[download] Destination: Abdul Rahman Al Sudais_Sura  1_ Al Fatiha-13599433.mp3
[download] 100% of 287.96KiB in 00:06
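
The difference between the two logs above can be reproduced with the regex alone. This is a minimal sketch that applies the pattern quoted at the top of the thread to both URLs; the helper name `song_id` is made up for illustration and is not part of youtube-dl.

```python
import re

# The song pattern quoted in the first comment of this thread.
CURRENT_SONG_RE = r'https?://(?:www\.)?audiomack\.com/song/(?P<id>[\w/-]+)'

OLD_URL = 'https://audiomack.com/song/islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha'
NEW_URL = 'https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha'


def song_id(url):
    """Hypothetical helper: return the captured id, or None if there is no match."""
    match = re.match(CURRENT_SONG_RE, url)
    return match.group('id') if match else None


print(song_id(OLD_URL))  # islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha
print(song_id(NEW_URL))  # None, so the URL falls through to the generic extractor
```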

@dirkf
Contributor

dirkf commented Aug 16, 2021

So we can make the pattern in extractor/audiomack.py (line 18)
_VALID_URL = r'https?://(?:www\.)?audiomack\.com/(?:song/)?(?P<id>[\w/-]+)'
and remove any '/song/' component from the resulting ID (line 51)
album_url_tag = self._match_id(url).replace('/song/', '/')

Then:

# youtube-dl -v -F 'https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha'
[debug] System config: [u'--restrict-filenames', u'--prefer-ffmpeg', u'-f', u'best[height<=?1080][fps<=?60]', u'-o', u'/media/drive1/Video/%(title)s.%(ext)s']
[debug] User config: [u'-f', u'(best/bestvideo+bestaudio)[height<=?1080][fps<=?60][tbr<=?1900]']
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha']
[debug] Encodings: locale ASCII, fs ASCII, out ASCII, pref ASCII
[debug] youtube-dl version 2021.06.06.1
[debug] Python version 2.7.1 (CPython) - Linux-2.6.18-7.1-7405b0-smp-with-libc0
[debug] exe versions: ffmpeg 4.1, ffprobe 4.1
[debug] Proxy map: {}
[audiomack] islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha: Downloading JSON metadata
[info] Available formats for 13599433:
format code  extension  resolution note
0            mp3        unknown    
#

@abdullah-if
Author

I found the problem extends to albums, too. The new album URL scheme is https://audiomack.com/<uploader name>/album/<album name>.
Failing URL:
https://audiomack.com/islamiclibrary/album/complete-quran-part-1-suras-1-70-89
Debug log:

youtube-dl https://audiomack.com/islamiclibrary/album/complete-quran-part-1-suras-1-70-89 -F -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://audiomack.com/islamiclibrary/album/complete-quran-part-1-suras-1-70-89', '-F', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.9.6 (CPython) - Linux-5.13.10-zen1-1-zen-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4
[debug] Proxy map: {}
[generic] complete-quran-part-1-suras-1-70-89: Requesting header
WARNING: Falling back on generic information extractor.
[generic] complete-quran-part-1-suras-1-70-89: Downloading webpage
[generic] complete-quran-part-1-suras-1-70-89: Extracting information
ERROR: Unsupported URL: https://audiomack.com/islamiclibrary/album/complete-quran-part-1-suras-1-70-89
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/generic.py", line 3520, in _real_extract
    raise UnsupportedError(url)
youtube_dl.utils.UnsupportedError: Unsupported URL: https://audiomack.com/islamiclibrary/album/complete-quran-part-1-suras-1-70-89

Successful URL:
https://audiomack.com/album/islamiclibrary/complete-quran-part-1-suras-1-70-89
Debug log:

youtube-dl https://audiomack.com/album/muslimummah/muslim-ummah-6 -F -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://audiomack.com/album/muslimummah/muslim-ummah-6', '-F', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.9.6 (CPython) - Linux-5.13.10-zen1-1-zen-x86_64-with-glibc2.33
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4
[debug] Proxy map: {}
[audiomack:album] muslimummah/muslim-ummah-6: Querying song information (1)
...............................................................................................................................................................................
[audiomack:album] muslimummah/muslim-ummah-6: Querying song information (38)
[download] Downloading playlist: muslim ummah
[audiomack:album] playlist muslim ummah: Collected 37 video ids (downloading 37 of them)
[download] Downloading video 1 of 37
[info] Available formats for 6402958:
format code  extension  resolution note
0            mp3        unknown    
..................................................................................
[download] Downloading video 37 of 37
[info] Available formats for 6403033:
format code  extension  resolution note
0            mp3        unknown    
[download] Finished downloading playlist: muslim ummah

@dirkf
Contributor

dirkf commented Aug 17, 2021

Clearly a bit more sophistication is needed to avoid finding songs as albums or vice versa.

In the pattern in extractor/audiomack.py (line 18) the id must either follow 'song/...' or contain '.../song/...':

    _VALID_URL = r'https?://(?:www\.)?audiomack\.com/(?:song/|(?=.+/song/))(?P<id>[\w/-]+)'

and remove any '/song/' component from the resulting ID (line 51) as before:

        album_url_tag = self._match_id(url).replace('/song/', '/')

Similarly for the album extractor, at line 77, the id must either follow 'album/...' or contain '.../album/...':

    _VALID_URL = r'https?://(?:www\.)?audiomack\.com/(?:album/|(?=.+/album/))(?P<id>[\w/-]+)'

and remove any '/album/' component from the resulting ID (line 116):

        album_url_tag = self._match_id(url).replace('/album/', '/')
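
As a quick check of the two lookahead patterns above, here is a small standalone sketch that runs them against the URLs posted in this thread. The helper `extract_tag` is hypothetical; it only imitates the `_match_id(url).replace(...)` step and is not the extractor code itself.

```python
import re

# Patterns as proposed above, written as raw strings.
SONG_RE = r'https?://(?:www\.)?audiomack\.com/(?:song/|(?=.+/song/))(?P<id>[\w/-]+)'
ALBUM_RE = r'https?://(?:www\.)?audiomack\.com/(?:album/|(?=.+/album/))(?P<id>[\w/-]+)'


def extract_tag(pattern, kind, url):
    """Hypothetical helper: match the URL and drop the '/song/' or '/album/' part."""
    match = re.match(pattern, url)
    if match is None:
        return None
    return match.group('id').replace('/%s/' % kind, '/')


OLD_SONG = 'https://audiomack.com/song/islamiclibrary/abdul-rahman-al-sudais-sura-1-al-fatiha'
NEW_SONG = 'https://audiomack.com/islamiclibrary/song/abdul-rahman-al-sudais-sura-1-al-fatiha'
OLD_ALBUM = 'https://audiomack.com/album/islamiclibrary/complete-quran-part-1-suras-1-70-89'
NEW_ALBUM = 'https://audiomack.com/islamiclibrary/album/complete-quran-part-1-suras-1-70-89'

for url in (OLD_SONG, NEW_SONG, OLD_ALBUM, NEW_ALBUM):
    # Each URL should yield a tag from exactly one of the two patterns.
    print(extract_tag(SONG_RE, 'song', url), extract_tag(ALBUM_RE, 'album', url))
```

Both URL schemes reduce to the same `<uploader>/<name>` tag, and neither pattern accepts the other kind of URL.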

@abdullah-if
Author

abdullah-if commented Aug 17, 2021

As I was about to open a PR, it turned out that python test/test_download.py TestDownload.test_Audiomack_1 throws errors (without any modification), because the song no longer exists. Is it just me, or is this another real issue? Here is the URL: http://www.audiomack.com/song/hip-hop-daily/black-mamba-freestyle

@abdullah-if
Author

abdullah-if commented Aug 17, 2021

So do python test/test_download.py TestDownload.test_AudiomackAlbum and python test/test_download.py TestDownload.test_AudiomackAlbum_1; they fail when comparing a string with an int.

$ python test/test_download.py TestDownload.test_AudiomackAlbum
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (1)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (2)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (3)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (4)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (5)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (6)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (7)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (8)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (9)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (10)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (11)
[audiomack:album] flytunezcom/tha-tour-part-2-mixtape: Querying song information (12)
[download] Downloading playlist: Tha Tour: Part 2 (Official Mixtape)
[audiomack:album] playlist Tha Tour: Part 2 (Official Mixtape): Collected 11 video ids (downloading 11 of them)
[download] Downloading video 1 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812265.info.json
[download] Downloading video 2 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812262.info.json
[download] Downloading video 3 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812266.info.json
[download] Downloading video 4 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812261.info.json
[download] Downloading video 5 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812256.info.json
[download] Downloading video 6 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812263.info.json
[download] Downloading video 7 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812259.info.json
[download] Downloading video 8 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812257.info.json
[download] Downloading video 9 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812253.info.json
[download] Downloading video 10 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812254.info.json
[download] Downloading video 11 of 11
[info] Writing video description metadata as JSON to: test_AudiomackAlbum_812260.info.json
[download] Finished downloading playlist: Tha Tour: Part 2 (Official Mixtape)
F
======================================================================
FAIL: test_AudiomackAlbum (__main__.TestDownload):
----------------------------------------------------------------------
Traceback (most recent call last):
  File "#/youtube-dl/test/test_download.py", line 178, in test_template
    expect_info_dict(self, res_dict, test_case.get('info_dict', {}))
  File "#/youtube-dl/test/helper.py", line 190, in expect_info_dict
    expect_dict(self, got_dict, expected_dict)
  File "#/youtube-dl/test/helper.py", line 186, in expect_dict
    expect_value(self, got, expected, info_field)
  File "#/youtube-dl/test/helper.py", line 178, in expect_value
    self.assertEqual(
AssertionError: '812251' != 812251 : Invalid value for field id, expected '812251', got 812251

----------------------------------------------------------------------
Ran 1 test in 134.827s

FAILED (failures=1)

@abdullah-if
Author

Now what? Should I wait for these issues to be resolved, or make a PR with the new regex? @dirkf

@dirkf
Contributor

dirkf commented Aug 17, 2021

> As I was about to open a PR, it turned out that python test/test_download.py TestDownload.test_Audiomack_1 throws errors (without any modification), because the song no longer exists. Is it just me, or is this another real issue? Here is the URL: http://www.audiomack.com/song/hip-hop-daily/black-mamba-freestyle

Same for me and the site says "This song cannot be found or has been removed." The failing test can be disabled with 'only_matching': True, instead of the info_dict block (which could be commented out with a note that the test song is no longer live). Ideally, a new URL would be found for this test case.

Apparently the playlist ID for an album can be retrieved as a number but should be a string, which can be fixed with this revised line 138:

                       result[resultkey] = compat_str(api_response[apikey])    

The project admins (when they're about) like a PR to address one specific set of changes, so the dead URL can be ignored for your PR. You could justify including the line 138 change, which matches what's done for 'video' IDs.
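
To illustrate why the `compat_str` change fixes the `'812251' != 812251` failure, here is a self-contained sketch. The `compat_str` fallback imitates youtube-dl's compatibility alias, and the function name, the API key names and the sample response are assumptions made up for this example.

```python
try:
    compat_str = unicode  # Python 2
except NameError:
    compat_str = str  # Python 3


def copy_album_fields(api_response, result, fields):
    """Hypothetical helper: copy fields from the API response, coerced to strings."""
    for resultkey, apikey in fields:
        if apikey in api_response and resultkey not in result:
            result[resultkey] = compat_str(api_response[apikey])
    return result


# Made-up response in which the album ID arrives as a number.
api_response = {'album_id': 812251, 'album_title': 'Tha Tour: Part 2 (Official Mixtape)'}
result = copy_album_fields(
    api_response, {}, (('id', 'album_id'), ('title', 'album_title')))

print(repr(result['id']))  # '812251'
```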

@abdullah-if
Author

Now I have to correct a whole lot of other stuff. It turns out the album in the test case has 11 songs, but the test expects 15 URLs.

@abdullah-if
Author

Everything is done, except that python test/test_download.py TestDownload.test_AudiomackAlbum_1 throws an error: it is trying to index into an empty list. The empty list followed by 0 in the output below come from good ol' print statements for the erring values.

$ python test/test_download.py TestDownload.test_AudiomackAlbum_1
[audiomack:album] fakeshoredrive/ppp-pistol-p-project: Querying song information (1)
[audiomack:album] fakeshoredrive/ppp-pistol-p-project: Querying song information (2)
[audiomack:album] fakeshoredrive/ppp-pistol-p-project: Querying song information (3)
[download] Downloading playlist: PPP (Pistol P Project)
[audiomack:album] playlist PPP (Pistol P Project): Collected 2 video ids (downloading 0 of them)
[download] Finished downloading playlist: PPP (Pistol P Project)
[]
0
E
======================================================================
ERROR: test_AudiomackAlbum_1 (__main__.TestDownload):
----------------------------------------------------------------------
Traceback (most recent call last):
  File "#/youtube-dl/test/test_download.py", line 210, in test_template
    tc_res_dict = res_dict['entries'][tc_num]
IndexError: list index out of range

----------------------------------------------------------------------
Ran 1 test in 34.541s

FAILED (errors=1)

@dirkf
Contributor

dirkf commented Aug 17, 2021

The test is trying to match the 9th playlist item but there are only 2. I formatted the JSON output below.

# youtube-dl -J 'http://www.audiomack.com/album/fakeshoredrive/ppp-pistol-p-project'
{
  'extractor': 'audiomack:album',
  '_type': 'playlist',
  'title': 'PPP (Pistol P Project)',
  'extractor_key': 'AudiomackAlbum',
  'webpage_url': 'http://www.audiomack.com/album/fakeshoredrive/ppp-pistol-p-project',
  'entries': [
    {
      'extractor': 'audiomack:album',
      'protocol': 'https',
      'playlist_index': 1,
      'playlist': 'PPP (Pistol P Project)',
      'title': 'PPP (Pistol P Project) - 8. Real (prod by SYK SENSE  )',
      'id': '837576',
      'playlist_id': '837572',
      'webpage_url_basename': 'ppp-pistol-p-project',
      'display_id': '837576',
      'format': '0 - unknown',
      'requested_subtitles': null,
      'playlist_uploader': null,
      'uploader': 'Lil Herb a.k.a. G Herbo',
      'format_id': '0',
      'playlist_title': 'PPP (Pistol P Project)',
      'url': 'https://music.audiomack.com/albums/fakeshoredrive/ppp-pistol-p-project/8.-real-prod-by-syk-sense-.mp3?Expires=1629185917&Signature=S~z7zK991mqIBqC~mmgkV447ZhZpLbyKFiUw9SjKfsu9Q1VTr5iZnrxepehQRH8sPBj2KmRbKnJYyeJnPcKp0wl3irbjsDvh-Zr~~1J0KqjHEtmGkwZdaBzvc1GSSrFwc1I1XE9ogRqLkz-ZeRrUfFNCQ9WsmIw4GrKh6bg4vY8_&Key-Pair-Id=APKAIKAIRXBA2H7FXITA',
      'extractor_key': 'AudiomackAlbum',
      'http_headers': {
        'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
        'Accept-Language': 'en-us,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.115 Safari/537.36'
      },
      'ext': 'mp3',
      'webpage_url': 'http://www.audiomack.com/album/fakeshoredrive/ppp-pistol-p-project',
      'playlist_uploader_id': null,
      'n_entries': 2
    },
    {
      'extractor': 'audiomack:album',
      'protocol': 'https',
      'playlist_index': 2,
      'playlist': 'PPP (Pistol P Project)',
      'title': 'PPP (Pistol P Project) - 10. 4 Minutes Of Hell Part 4 (prod by DY OF 808 MAFIA)',
      'id': '837580',
      'playlist_id': '837572',
      'webpage_url_basename': 'ppp-pistol-p-project',
      'display_id': '837580',
      'format': '0 - unknown',
      'requested_subtitles': null,
      'playlist_uploader': null,
      'uploader': 'Lil Herb a.k.a. G Herbo',
      'format_id': '0',
      'playlist_title': 'PPP (Pistol P Project)',
      'url': 'https://music.audiomack.com/albums/fakeshoredrive/ppp-pistol-p-project/10.-4-minutes-of-hell-part-4-prod-by-dy-of-808-mafia.mp3?Expires=1629185918&Signature=OT717WdFq0v4O6ZAxl5jN8Lim8QG-VM~AOqEDGj89EON53cKAt5g6yRCbh71briDgK8-dmdTsATEerUhE2wQr2oCSGsb2skfmM9B5bAWUWTXxbPyIEIh31oeM0~LhQOEiOm4XRihOzVqgZdyOpLFPSjkWyrV-vAJ1z0zu9mhP2w_&Key-Pair-Id=APKAIKAIRXBA2H7FXITA',
      'extractor_key': 'AudiomackAlbum',
      'http_headers': {
        'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
        'Accept-Language': 'en-us,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.115 Safari/537.36'
      },
      'ext': 'mp3',
      'webpage_url': 'http://www.audiomack.com/album/fakeshoredrive/ppp-pistol-p-project',
      'playlist_uploader_id': null,
      'n_entries': 2
    }
  ],
  'id': '837572',
  'webpage_url_basename': 'ppp-pistol-p-project'
}
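
The `IndexError` can be reproduced in isolation. This sketch assumes the test selects playlist item 9 with a 1-based inclusive range; `select_items` is a hypothetical stand-in, not youtube-dl's actual selection code.

```python
def select_items(entries, start, end):
    """Hypothetical stand-in: return the 1-based inclusive range [start, end]."""
    return entries[start - 1:end]


# The two entry IDs from the JSON output above.
entries = ['837576', '837580']

selected = select_items(entries, 9, 9)
print(selected)  # [] because there is no 9th item

try:
    selected[0]  # mirrors res_dict['entries'][tc_num] with tc_num == 0
except IndexError as exc:
    error_message = str(exc)
    print(error_message)  # list index out of range
```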
