[ArteTV] Unable to download JSON metadata - HTTP Error 404: Not Found #3622

Closed
7 tasks done
hlekin opened this issue May 2, 2022 · 39 comments · Fixed by #3302
Labels
site-bug Issue with a specific website

Comments

@hlekin

hlekin commented May 2, 2022

Checklist

Region

Germany

Description

Command lines below should be self-explanatory.

By the way, many thanks for this really useful program and the effort you put into it.

Verbose log

$ mpv https://www.arte.tv/de/videos/041596-000-A/das-leben-ist-seltsam/
[ytdl_hook] ERROR: [ArteTV] 041596-000-A: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U 
[ytdl_hook] youtube-dl failed: unexpected error occurred 
Failed to recognize file format.

$ yt-dlp -U
Latest version: 2022.04.08, Current version: 2022.04.08
yt-dlp is up to date (2022.04.08)

$ yt-dlp --ignore-config -F https://www.arte.tv/de/videos/041596-000-A/das-leben-ist-seltsam/
[ArteTV] 041596-000-A: Downloading JSON metadata
ERROR: [ArteTV] 041596-000-A: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U

$ yt-dlp --ignore-config -vU https://www.arte.tv/de/videos/041596-000-A/das-leben-ist-seltsam/
[debug] Command-line config: ['--ignore-config', '-vU', 'https://www.arte.tv/de/videos/041596-000-A/das-leben-ist-seltsam/']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, err utf-8, pref UTF-8
[debug] yt-dlp version 2022.04.08 [7884ade65]
[debug] Python version 3.10.4 (CPython 64bit) - Linux-5.17.5-arch1-1-x86_64-with-glibc2.35
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 5.0 (setts), ffprobe 5.0, rtmpdump 2.4
[debug] Optional libraries: certifi, Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
Latest version: 2022.04.08, Current version: 2022.04.08
yt-dlp is up to date (2022.04.08)
[debug] [ArteTV] Extracting URL: https://www.arte.tv/de/videos/041596-000-A/das-leben-ist-seltsam/
[ArteTV] 041596-000-A: Downloading JSON metadata
ERROR: [ArteTV] 041596-000-A: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 641, in extract
    ie_result = self._real_extract(url)
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/arte.py", line 57, in _real_extract
    info = self._download_json(
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 1029, in _download_json
    res = self._download_json_handle(
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 1008, in _download_json_handle
    res = self._download_webpage_handle(
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 800, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 785, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/usr/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 767, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 3601, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
@hlekin hlekin added site-bug Issue with a specific website triage Untriaged issue labels May 2, 2022
@hlekin
Author

hlekin commented May 2, 2022

All www.arte.tv video URLs are broken. Did not check this earlier, as it was working a couple of days ago.

@pukkandan
Member

Possibly fixed by #3302

@Trit34

Trit34 commented May 3, 2022

Possibly fixed by #3302

Do we just have to replace the arte.py and hls.py files with these ones?

EDIT: it does not work. I replaced those two files with the “fixed” versions above, and it crashed:

$ yt-dlp -F https://www.arte.tv/fr/videos/087404-000-A/1944-il-faut-bombarder-auschwitz/
Traceback (most recent call last):
  File "/usr/bin/yt-dlp", line 5, in <module>
    from yt_dlp import main
  File "/usr/lib/python3.10/site-packages/yt_dlp/__init__.py", line 16, in <module>
    from .options import parseOpts
  File "/usr/lib/python3.10/site-packages/yt_dlp/options.py", line 27, in <module>
    from .downloader.external import list_external_downloaders
  File "/usr/lib/python3.10/site-packages/yt_dlp/downloader/__init__.py", line 34, in <module>
    from .hls import HlsFD
  File "/usr/lib/python3.10/site-packages/yt_dlp/downloader/hls.py", line 9, in <module>
    from ..dependencies import Cryptodome_AES
ModuleNotFoundError: No module named 'yt_dlp.dependencies'

In youtube-dl, we could just swap in a fixed extractor file to make it work again. But yt-dlp does not like that and becomes unusable. Is it because I use my distro's packaged version?

@pukkandan
Member

Replacing files may or may not work, depending on how far the branches have diverged. Either pull the branch as-is, or git merge it with master.

@Trit34

Trit34 commented May 3, 2022

Replacing files may or may not work, depending on how far the branches have diverged. Either pull the branch as-is, or git merge it with master.

Okay, it worked with just replacing the arte.py file, not hls.py too.
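
Since swapping files in a packaged install can leave yt-dlp unusable (as in the traceback above), it may help to keep a backup of the original before overwriting it. A minimal sketch, assuming you know where the installed arte.py lives (the traceback in this thread shows it under site-packages/yt_dlp/extractor/) and where you saved the patched copy. The helper names `replace_with_backup` and `restore_backup` are made up for illustration and are not part of yt-dlp:

```python
import shutil
from pathlib import Path


def replace_with_backup(installed: Path, patched: Path) -> Path:
    """Copy `patched` over `installed`, keeping the original as `<name>.orig`.

    Returns the path of the backup file.
    """
    backup = installed.with_name(installed.name + ".orig")
    # Only create the backup once, so a second run does not overwrite
    # the pristine file with an already patched one.
    if not backup.exists():
        shutil.copy2(installed, backup)
    shutil.copy2(patched, installed)
    return backup


def restore_backup(installed: Path) -> None:
    """Put the `<name>.orig` backup back in place of `installed`."""
    backup = installed.with_name(installed.name + ".orig")
    shutil.copy2(backup, installed)
```

For a system-wide install, writing to the package directory usually needs root, and a package update will overwrite the patched file again.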

@Trit34

Trit34 commented May 3, 2022

@fstirlitz I think there is still a bug with #3302: it produces files that are too big.
For example, when I downloaded the latest Karambolage last Sunday (1st May), I got a 162.7 MiB file.

MediaInfo data:

Complete name                            : /.../Karambolage.mp4
Format                                   : MPEG-4
Format profile                           : Base Media / Version 2
Codec ID                                 : mp42 (isom/mp42)
File size                                : 163 MiB
Duration                                 : 11 min 7 s
Overall bit rate mode                    : Variable
Overall bit rate                         : 2 045 kb/s
Encoded date                             : UTC 2022-04-29 03:15:17
Tagged date                              : UTC 2022-04-29 03:15:17

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Main@L3.1
Format settings                          : CABAC / 3 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 3 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 11 min 7 s
Bit rate mode                            : Variable
Bit rate                                 : 1 915 kb/s
Maximum bit rate                         : 2 200 kb/s
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 25.000 FPS
Standard                                 : PAL
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.083
Stream size                              : 152 MiB (94%)
Encoded date                             : UTC 2022-04-29 03:15:19
Tagged date                              : UTC 2022-04-29 03:15:19
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 11 min 7 s
Bit rate mode                            : Variable
Bit rate                                 : 125 kb/s
Maximum bit rate                         : 425 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 9.97 MiB (6%)
Encoded date                             : UTC 2022-04-29 03:15:19
Tagged date                              : UTC 2022-04-29 03:15:19

But with your fixed arte.py extractor and yt-dlp -f VOF-1971+VOF-program_audio_0-VOF https://www.arte.tv/fr/videos/103994-016-A/karambolage/, I get a 313.7 MiB file:

General
Complete name                            : /.../Karambolage V2.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/avc1/mp41)
File size                                : 314 MiB
Duration                                 : 11 min 8 s
Overall bit rate mode                    : Variable
Overall bit rate                         : 3 935 kb/s
Writing application                      : Lavf59.16.100

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Main@L3.1
Format settings                          : CABAC / 2 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 2 frames
Format settings, GOP                     : M=4, N=50
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 11 min 8 s
Source duration                          : 11 min 8 s
Bit rate mode                            : Variable
Bit rate                                 : 3 668 kb/s
Maximum bit rate                         : 2 200 kb/s
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Variable
Frame rate                               : 49.903 FPS
Minimum frame rate                       : 25.000 FPS
Maximum frame rate                       : 12 800.000 FPS
Original frame rate                      : 25.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.080
Stream size                              : 292 MiB (93%)
Source stream size                       : 292 MiB (93%)
Language                                 : English
mdhd_Duration                            : 668824
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 11 min 8 s
Bit rate mode                            : Constant
Bit rate                                 : 258 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 44.1 kHz
Frame rate                               : 43.066 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 20.5 MiB (7%)
Language                                 : French
Default                                  : Yes
Alternate group                          : 1

As you can see, the stream bit rates are different. Also, when I start a download, it begins at full speed with an estimated size of 16 GiB for the video file, then it slows down to 4-5 MiB/s while the estimated total size goes down as well, and it downloads segment by segment. Is it downloading the video twice? Is it an API issue, or does your fix need refining?
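
As a rough sanity check, the two file sizes can be recomputed from the overall bit rates and durations in the MediaInfo output above. This is only a sketch: it assumes MediaInfo's kb/s means 1000 bits per second and MiB means 1024² bytes, and the function name `size_mib` is made up for illustration.

```python
def size_mib(overall_kbps: float, duration_s: float) -> float:
    """Estimate a file size in MiB from an overall bit rate and a duration."""
    total_bits = overall_kbps * 1000 * duration_s
    return total_bits / 8 / (1024 ** 2)


# Figures taken from the two MediaInfo reports above.
first = size_mib(2045, 11 * 60 + 7)   # earlier download: 2 045 kb/s, 11 min 7 s
second = size_mib(3935, 11 * 60 + 8)  # patched extractor: 3 935 kb/s, 11 min 8 s

print(f"{first:.1f} MiB vs {second:.1f} MiB, ratio {second / first:.2f}")
```

Both estimates land close to the reported 163 MiB and 314 MiB, so the larger file really does carry nearly twice as much data for the same duration. It is not just container overhead.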

@pukkandan
Member

Probably because you didn't apply e4fa34a

@Trit34

Trit34 commented May 3, 2022

Probably because you didn't apply e4fa34a

Well, yt-dlp breaks when I do that (see above). I’d prefer to wait for an updated version with all the patches properly integrated. Since I just need to download the weekly Karambolage (for my parents) and I can do that with the extractor script alone, I can wait.

Thanks for the answer.

@pukkandan
Member

While the exact errors are different, I'm closing this as a duplicate of #3428 since the fix for both is the same.

@pukkandan pukkandan added duplicate This issue or pull request already exists and removed triage Untriaged issue labels May 3, 2022
@pukkandan pukkandan mentioned this issue May 4, 2022
@pukkandan pukkandan changed the title Unable to download JSON metadata [ArteTV] Unable to download JSON metadata - HTTP Error 404: Not Found May 17, 2022
@pukkandan pukkandan reopened this May 17, 2022
@pukkandan pukkandan removed the duplicate This issue or pull request already exists label May 17, 2022
@pukkandan pukkandan linked a pull request May 17, 2022 that will close this issue
@Trit34

Trit34 commented May 19, 2022

So, v2022.05.18 fixed the HLS issue, but the arte.py extractor was not fixed (no mention of it in the changelog, and the JSON 404 error is back). Replacing it with the fstirlitz version makes it work again.

@rtega

rtega commented May 21, 2022

Indeed, not fixed. Can't download with the v2022.05.18 version.

@Trit34

Trit34 commented May 23, 2022

Indeed, not fixed. Can't download with the v2022.05.18 version.

@rtega If you are on Linux, replace the arte.py file with this one or that more recent one. Both work (if you are in France, at least).

@rtega

rtega commented May 23, 2022

Indeed, that solves it. Thanks!

@king-gizzard

@Trit34 welp - this worked fine for one file last night, but now a 90min video sits at a download in the order of 2.5TiB.
tried different formats as well, and the first one worked just fine, idk what changed :/
(trying from germany, on a packaged linux install with the arte.py file manually replaced with the most recent version you mentioned)

@pukkandan
Member

@tpikonen PR with the patch is welcome in https://github.com/yt-dlp/FFmpeg-Builds

@denezmarchand

OK! With the patched version of ffmpeg, I can get the subtitles embedded in the video file, for example with this command line:
yt-dlp -f VOA-STF-2314+VOA-STF-program_audio_0-VOA --embed-subs --sub-langs fr https://www.arte.tv/fr/videos/105029-001-A/berlin-63-1-6/

The only small problem remaining is that the subtitles are the audio-description ones, for people with hearing disabilities.

@KGOrphanides

@rtega If you are on Linux, replace the arte.py file with this one or that more recent one. Both work (if you are in France, at least).

Either of these versions of arte.py fixes the current issue for me. I'm in Germany. Please merge and release!

Having hit this error with the latest release version, I cloned the repo and made the arte.py patch and I'm still seeing it.
Is this a reversion, or have I missed another fix along the way?

Latest version: 2022.06.22.1, Current version: 2022.06.22.1
yt-dlp is up to date (2022.06.22.1)
yt-dlp -v  https://www.arte.tv/fr/videos/108954-034-A/zeal-ardor/
[debug] Command-line config: ['-v', 'https://www.arte.tv/fr/videos/108954-034-A/zeal-ardor/']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.06.22.1 [a86e01e] (zip)
[debug] Python version 3.10.4 (CPython 64bit) - Linux-5.17.5-76051705-generic-x86_64-with-glibc2.35
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 4.4.2 (setts), ffprobe 4.4.2, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.11.0, brotli-1.0.9, certifi-2020.06.20, mutagen-1.45.1, secretstorage-3.3.1, sqlite3-2.6.0, websockets-10.3, xattr-0.9.7
[debug] Proxy map: {}
[debug] [ArteTV] Extracting URL: https://www.arte.tv/fr/videos/108954-034-A/zeal-ardor/
[ArteTV] 108954-034-A: Downloading JSON metadata
ERROR: [ArteTV] 108954-034-A: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 647, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/arte.py", line 54, in _real_extract
    info = self._download_json(
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 1001, in download_content
    res = getattr(self, download_handle.__name__)(url_or_request, video_id, **kwargs)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 965, in download_handle
    res = self._download_webpage_handle(
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 833, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 790, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 772, in _request_webpage
    return self._downloader.urlopen(self._create_request(url_or_request, data, headers, query))
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 3594, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

@Trit34

Trit34 commented Jun 27, 2022

OK! With the patched version of ffmpeg, I can get the subtitles embedded in the video file, for example with this command line: yt-dlp -f VOA-STF-2314+VOA-STF-program_audio_0-VOA --embed-subs --sub-langs fr https://www.arte.tv/fr/videos/105029-001-A/berlin-63-1-6/

The only small problem remaining is that the subtitles are the audio-description ones, for people with hearing disabilities.

Probably an error on their side: the only subtitles available seem to be the audio-description version in French (yt-dlp --list-subs https://www.arte.tv/fr/videos/105029-001-A/berlin-63-1-6/).

@Trit34

Trit34 commented Jun 27, 2022

@KGOrphanides

Having hit this error with the latest release version, I cloned the repo and made the arte.py patch and I'm still seeing it. Is this a reversion, or have I missed another fix along the way?

I have just overwritten the arte.py file with the patched version and it worked again. yt-dlp is installed from the package in the official Arch Linux repos.

@KGOrphanides

@KGOrphanides
I have just overwritten the arte.py file with the patched version and it worked again. yt-dlp is installed from the package in the official Arch Linux repos.

Confirmed working.
My oversight (I had a version from pop-os repos already installed).

@gjedeer
Contributor

gjedeer commented Jun 28, 2022

This version of the extractor fixes the problem for me too.

@Nicryc

Nicryc commented Jun 28, 2022

Working for me too. Will this soon be merged in master?

@Lesmiscore Lesmiscore mentioned this issue Jul 2, 2022
@ludo77

ludo77 commented Jul 2, 2022

I got this link: https://raw.githubusercontent.com/yt-dlp/yt-dlp/27cadeeaca7ab2398aed0228fac3ca92bdf2de04/yt_dlp/extractor/arte.py.
I did not succeed in building the software with the modification.

I am on Ubuntu. I ran make:
make
COLUMNS=80 /usr/bin/env python3 yt_dlp/__main__.py --ignore-config --help | /usr/bin/env python3 devscripts/make_readme.py
/usr/bin/env python3 devscripts/make_contributing.py README.md CONTRIBUTING.md
/usr/bin/env python3 devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.yml
/usr/bin/env python3 devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.yml
/usr/bin/env python3 devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.yml
/usr/bin/env python3 devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.yml
/usr/bin/env python3 devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.yml
/usr/bin/env python3 devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/6_question.yml
/usr/bin/env python3 devscripts/make_supportedsites.py supportedsites.md
git shortlog -s -e -n | awk '!(out[$NF]++) { $1="";sub(/^[ \t]+/,""); print}' > .mailmap
(read log message from standard input)
Then nothing happens.
Can you help me?


@ludo77

ludo77 commented Jul 3, 2022

I can't compile yt-dlp with the command: make
Please help me.

@Nicryc

Nicryc commented Jul 3, 2022

I can't compile yt-dlp with the command: make

Just clone the repo, replace the file yt_dlp/extractor/arte.py with the fixed one, go back to the root of the repo and run make. You'll get a new yt-dlp executable that you can use normally.

@ludo77

ludo77 commented Jul 4, 2022

The first time I ran make, it didn't work.
I ran the command again, and it worked.
Thanks

@someziggyman

Based on this comment #3622 (comment), and since official ffmpeg has not applied the patch for ages: is there a way to add a step inside the --convert-subs function that removes all the "style" and "region" junk from the VTT before conversion takes place? I guess this is a bit of overkill, but it may solve the problem for regular folk until the patch https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296353.html is applied. Just saying.
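
The idea of stripping the "style" and "region" parts before conversion could look roughly like the sketch below. It is not yt-dlp's implementation and not part of --convert-subs; the function name `strip_vtt_style_and_region` is made up. It assumes a WebVTT file whose blocks are separated by blank lines, with STYLE and REGION definition blocks starting with that keyword on a line of its own. It also drops `region:...` settings on cue timing lines, since they would otherwise point at a region that no longer exists.

```python
import re

# A definition block starts with a line that is exactly STYLE or REGION.
_DEFINITION_BLOCK = re.compile(r"(STYLE|REGION)")
# A cue setting such as " region:r1" on a timing line.
_REGION_SETTING = re.compile(r"[ \t]+region:\S+")


def strip_vtt_style_and_region(vtt_text: str) -> str:
    """Return WebVTT text without STYLE/REGION blocks or region cue settings."""
    # Normalize line endings so blocks can be split on blank lines.
    normalized = vtt_text.replace("\r\n", "\n").replace("\r", "\n")
    kept = []
    for block in re.split(r"\n{2,}", normalized.strip("\n")):
        lines = block.split("\n")
        if _DEFINITION_BLOCK.fullmatch(lines[0].strip()):
            continue  # drop the whole STYLE or REGION block
        # Only timing lines (those containing "-->") carry cue settings.
        lines = [
            _REGION_SETTING.sub("", line) if "-->" in line else line
            for line in lines
        ]
        kept.append("\n".join(lines))
    return "\n\n".join(kept) + "\n"
```

Running this on a downloaded .vtt file before handing it to ffmpeg keeps the header, cue timings and cue text, and removes only the parts named in the comment above.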

@jlemonde

jlemonde commented Jul 13, 2022

I managed to build yt-dlp as well, using the patched arte.py file. Thank you so much!! :)

Now I'm still waiting on it to be fixed in an official release because my relatives are relying on it and surely won't try to build it themselves. Do you know when this will be done?

@Trit34

Trit34 commented Jul 19, 2022

New version of yt-dlp, Arte extractor still not fixed, we still have to fix it ourselves with the patched files by fstirlitz… Well, as long as this trick still works, at least…

@jlemonde

Yes indeed, that's sad. Can someone please put the fix into the next version?
