Support for Audm feeds? #28043

Open
bat999 opened this issue Feb 1, 2021 · 4 comments
Labels
site-support-request Add extractor(s) for a new domain


bat999 commented Feb 1, 2021

Checklist

  • [x] I'm reporting a new site support request
  • [x] I've verified that I'm running youtube-dl version 2021.01.24.1
  • [x] I've checked that all provided URLs are alive and playable in a browser
  • [x] I've checked that none of provided URLs violate any copyrights
  • [x] I've searched the bugtracker for similar site support requests including closed ones

Example URLs

https://www.newyorker.com/magazine/2021/02/08/the-sputnik-v-vaccine-and-russias-race-to-immunity

Description

Hi
Some magazine websites let you listen to articles in the browser.
For smartphones, they also offer the Audm app.

Here is an example page...
https://www.newyorker.com/magazine/2021/02/08/the-sputnik-v-vaccine-and-russias-race-to-immunity

Any chance that support for this type of page can be added to youtube-dl?

@mint ~ $ youtube-dl --update
youtube-dl is up-to-date (2021.01.24.1)
@mint ~ $ youtube-dl --verbose https://www.newyorker.com/magazine/2021/02/08/the-sputnik-v-vaccine-and-russias-race-to-immunity
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--verbose', u'https://www.newyorker.com/magazine/2021/02/08/the-sputnik-v-vaccine-and-russias-race-to-immunity']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.01.24.1
[debug] Python version 2.7.12 (CPython) - Linux-4.4.0-21-generic-x86_64-with-LinuxMint-18-sarah
[debug] exe versions: ffmpeg N-100405-gbf4b9e9, ffprobe N-100405-gbf4b9e9, rtmpdump 2.4
[debug] Proxy map: {}
[generic] the-sputnik-v-vaccine-and-russias-race-to-immunity: Requesting header
WARNING: Falling back on generic information extractor.
[generic] the-sputnik-v-vaccine-and-russias-race-to-immunity: Downloading webpage
[generic] the-sputnik-v-vaccine-and-russias-race-to-immunity: Extracting information
ERROR: Unsupported URL: https://www.newyorker.com/magazine/2021/02/08/the-sputnik-v-vaccine-and-russias-race-to-immunity
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2469, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2562, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2551, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1653, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1517, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 224, column 14
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 806, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 827, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 532, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 3467, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://www.newyorker.com/magazine/2021/02/08/the-sputnik-v-vaccine-and-russias-race-to-immunity
@bat999 bat999 added the site-support-request Add extractor(s) for a new domain label Feb 1, 2021

philshem commented Feb 4, 2021

Here's a description of how to manually extract the audio .m4a file. (Sorry if audio formats are out of scope of youtube-dl.)

Background: Many articles on newyorker.com have pleasant human-narrated audio for listening. To be able to listen offline, it would be really cool to extend the condenast.py extractor to be able to download these .m4a files. The actual audio file is hosted at https://api.audm.com.


Example process to manually get the audio file

https://www.newyorker.com/magazine/2021/01/18/whats-wrong-with-the-way-we-work
(archive.org link)

At the top of the page, under the header image, many (but not all) articles have this audio box:

[image: audio box on New Yorker articles]

  • Open "Web Developer Console" --> "Network Inspector"

  • Hit "Play" on the audio box

  • Search for filetype .m4a

  • Right-click on entry and select "Copy URL"

  • Use wget to download .m4a from URL

wget "https://api.audm.com/v6/signed-article-m4a?article-slug=we-work-lepore" -O we-work-lepore.m4a
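For anyone who would rather script this than use wget, here is a minimal Python sketch of the same download. The endpoint and the `article-slug` parameter are taken from the wget command above; this is not a documented API and may change. The function names `audm_m4a_url` and `download_audm_m4a` are made up for illustration. You still need to find the slug yourself with the Network Inspector.

```python
import shutil
import urllib.request
from urllib.parse import urlencode

# Endpoint as seen in the wget command above (assumption: it keeps working this way).
AUDM_M4A_ENDPOINT = "https://api.audm.com/v6/signed-article-m4a"


def audm_m4a_url(article_slug):
    """Build the request URL for a given article slug (hypothetical helper)."""
    return AUDM_M4A_ENDPOINT + "?" + urlencode({"article-slug": article_slug})


def download_audm_m4a(article_slug, out_path=None):
    """Save the narration to out_path, defaulting to '<slug>.m4a' (hypothetical helper)."""
    out_path = out_path or article_slug + ".m4a"
    # urlopen follows any HTTP redirects automatically.
    with urllib.request.urlopen(audm_m4a_url(article_slug)) as resp:
        with open(out_path, "wb") as fh:
            shutil.copyfileobj(resp, fh)
    return out_path
```

Calling `download_audm_m4a("we-work-lepore")` should be roughly equivalent to the wget line above, assuming the server treats Python's default User-Agent the same way it treats wget's.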

Thanks for your consideration!

bat999 commented Feb 4, 2021

@philshem
Thanks
I've downloaded the Sputnik V audio track with wget.
That's good to know.
It works fine with the Chromium browser on my Linux desktop PC.

But it will be difficult with the Chromium browser on my Android phone.
I don't think there is a 'Web Developer Console' in the mobile Chrome/Chromium app.

I use youtube-dl on my Linux desktop and also in the Termux console on my Android phone.

It would be useful if the Audm feeds were supported by youtube-dl so that they could be downloaded on mobile devices too.

Any chance we could leave this 'site-support-request' open?

This is the Sputnik V audio track downloaded with wget on Linux Desktop PC.

mint ~ $ wget https://api.audm.com/v6/signed-article-m4a?article-slug=five-month-plan-yaffa -O five-month-plan-yaff.m4a
--2021-02-04 13:01:14--  https://api.audm.com/v6/signed-article-m4a?article-slug=five-month-plan-yaffa
Resolving api.audm.com... 104.237.148.42
Connecting to api.audm.com|104.237.148.42|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://dch3cp3um5qn7.cloudfront.net/five-month-plan-yaffa-singlefile.m4a?Expires=1615122083&Signature=BKvvNvbgIoIXvQcXL2NenSrQYGBLxUSrfVWwU10reSqx-aQGRN~ai6VcbKalGg~oKYUXMVupyv4UZJD0YeqggsM18DA7VXntb-1uJhN4u9FPJW42svK3dPDc~fr1OxYS738UJ5iNgkrJqeQ4leQpZvdIybhtJr7ZQpXSCu7HwGQMBQ7qMpVzJirabIGdoMzp~uBOT7QIbYNUNpxruMBxMLovkRZskz5~oLwNUwGIKCWorgH1e3syveKxguF3NYxAtPu~Uesu-qvKMo7Ikjs06MXhODUjZXQ5LqAviP16Cfff3rGfA~66jgrjr9XooOhhaGJKSm1SGvS0kbf5VU08lA__&Key-Pair-Id=APKAJFXO72WYUPEGEZ6Q [following]
--2021-02-04 13:01:23--  https://dch3cp3um5qn7.cloudfront.net/five-month-plan-yaffa-singlefile.m4a?Expires=1615122083&Signature=BKvvNvbgIoIXvQcXL2NenSrQYGBLxUSrfVWwU10reSqx-aQGRN~ai6VcbKalGg~oKYUXMVupyv4UZJD0YeqggsM18DA7VXntb-1uJhN4u9FPJW42svK3dPDc~fr1OxYS738UJ5iNgkrJqeQ4leQpZvdIybhtJr7ZQpXSCu7HwGQMBQ7qMpVzJirabIGdoMzp~uBOT7QIbYNUNpxruMBxMLovkRZskz5~oLwNUwGIKCWorgH1e3syveKxguF3NYxAtPu~Uesu-qvKMo7Ikjs06MXhODUjZXQ5LqAviP16Cfff3rGfA~66jgrjr9XooOhhaGJKSm1SGvS0kbf5VU08lA__&Key-Pair-Id=APKAJFXO72WYUPEGEZ6Q
Resolving dch3cp3um5qn7.cloudfront.net... 13.224.233.55, 13.224.233.105, 13.224.233.75, ...
Connecting to dch3cp3um5qn7.cloudfront.net|13.224.233.55|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18754215 (18M) [application/octet-stream]
Saving to: ‘five-month-plan-yaff.m4a’
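
The log shows the API answering with a 302 redirect to a CloudFront URL that carries `Expires`, `Signature` and `Key-Pair-Id` parameters, so the final link is temporary. A small sketch for checking when such a link stops working, assuming `Expires` is a Unix timestamp in seconds (the function name `signed_url_expiry` is made up for illustration):

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def signed_url_expiry(url):
    """Return the Expires time of a signed URL as a UTC datetime, or None if absent."""
    params = parse_qs(urlparse(url).query)
    if "Expires" not in params:
        return None
    # Assumption: the value is seconds since the Unix epoch.
    return datetime.fromtimestamp(int(params["Expires"][0]), tz=timezone.utc)
```

For the Location URL in the log, `Expires=1615122083` works out to 2021-03-07 13:01:23 UTC, about a month after the download. That suggests any tool should request a fresh link from api.audm.com each time rather than store the CloudFront URL.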

bat999 commented Feb 5, 2021

Hi
I looked further into Audm.
It is owned by the New York Times.
The website is here -> https://audm.com/

Audio tracks can be listened to in the browser for free.

For smartphones there is also an app.
The app is free but there is a monthly subscription...
"After your free trial, you will be charged $6.99/month for access to the entire Audm catalogue."

Apparently those Audm feeds are available in lots of magazines now...

Listen to hours' worth of new stories every week, from publications including:


* The New Yorker

* The Atlantic

* WIRED

* Rolling Stone

* New York Magazine

* BuzzFeed News

* Vanity Fair

* POLITICO Magazine

* Esquire

* The Daily Beast

* The New York Review of Books

* Outside Magazine

* Backchannel

* ProPublica

* London Review of Books

* The Atavist

* Texas Monthly

* Epic Magazine

* Foreign Policy

* The Texas Observer

* The Times Literary Supplement

* Harper's Bazaar

* Marie Claire

* Road & Track

* Popular Mechanics

* First Things

* Tablet Magazine

* Pacific Standard

* Guernica

* World Policy Journal

* The Bitter Southerner

* The Marshall Project

* The American Scholar

* Places Journal

* Coda Story

* The Morning News

bat999 commented Feb 23, 2021

Hi
Some of these Audm pages seem to use mp3 instead of m4a.

Overall bit rate        : 160 kb/s
Track name/Position     : 1
Encoded by              : Fraunhofer IIS MP3 v04.01.02 (fast)
Recorded date           : 2021

This page for example...
https://www.nytimes.com/2021/02/18/magazine/amazon-workers-employees-covid-19.html

Find the link using "Web Developer Console" --> "Network Inspector" as per philshem's instructions above.

Then download it...

mint ~$ wget --verbose https://static.nytimes.com/podcasts/2021/02/18/magazine/18audm-amazon-awakening-hayasaki/210218-amazon-awakening-hayasaki-nytmag-audm.mp3
--2021-02-23 11:49:00--  https://static.nytimes.com/podcasts/2021/02/18/magazine/18audm-amazon-awakening-hayasaki/210218-amazon-awakening-hayasaki-nytmag-audm.mp3
Resolving static.nytimes.com... 151.101.1.164, 151.101.65.164, 151.101.129.164, ...
Connecting to static.nytimes.com|151.101.1.164|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 53511347 (51M) [audio/mpeg]
Saving to: ‘210218-amazon-awakening-hayasaki-nytmag-audm.mp3’

210218-amazon-awakening-hayasaki-nytmag 100%[=============================================================================>]  51.03M  1.38MB/s    in 67s     
2021-02-23 11:50:07 (780 KB/s) - ‘210218-amazon-awakening-hayasaki-nytmag-audm.mp3’ saved [53511347/53511347]

EDIT
The download links are very different.
The m4a links from The New Yorker were "https://api.audm.com/v6/..."
The mp3 link from the New York Times is "https://static.nytimes.com/podcasts/..."
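
Since the two kinds of links look so different, a downloader would need some way to decide which file extension to use. A rough sketch based only on the URLs seen in this thread: the function name `guess_audio_ext` is made up, and the "-m4a" rule is just a guess from the api.audm.com endpoint name.

```python
import posixpath
from urllib.parse import urlparse


def guess_audio_ext(url, default=None):
    """Guess 'mp3' or 'm4a' from a URL, or return default if neither fits."""
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    if ext in ("mp3", "m4a"):
        return ext
    # The api.audm.com endpoint has no file extension, but its name ends in "-m4a".
    if path.endswith("-m4a"):
        return "m4a"
    return default
```

Checking the Content-Type header would not help for the Audm files, because the CloudFront response in the log above reports application/octet-stream.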
