[generic] Detect hls livestreams #6705

Closed
9 of 10 tasks
chrizilla opened this issue Apr 3, 2023 · 11 comments · Fixed by #6775
Labels
enhancement New feature or request

Comments

@chrizilla

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

  • I'm reporting a bug unrelated to a specific site
  • I've verified that I'm running yt-dlp version 2023.03.04 (update instructions) or later (specify commit)
  • I've checked that all provided URLs are playable in a browser with the same IP and same login details
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched known issues and the bugtracker for similar issues including closed ones. DO NOT post duplicates
  • I've read the guidelines for opening an issue

Provide a description that is worded well enough to be understood

I downloaded a part of the (endless) CNN live stream from https://www.newslive.com/featured/cnn-online.html

Half an hour later, I downloaded another part of this livestream. Unfortunately, it overwrote my first download.

I had already closed the window, so I cannot post the log from that specific incident.
I tried to reproduce the problem, but this time the second download did NOT overwrite the first one; it created a separate file instead. I don't know what caused the problem before or why it has not recurred, but I thought I'd let you know (because overwriting previous downloads is a dangerous bug IMO).

Do you have any idea why this could have happened?

The only thing I noticed is that this time the 1st download had the format code "790" in the filename and the 2nd download had the format code "850" in the filename. Maybe sometimes the format codes of the first and second download coincide and that's when the 2nd download overwrites the 1st?

Other than that, I have no idea.

Below I attach the output log of my unsuccessful attempt to recreate the "overwrite bug".
If I encounter the bug again, I'll try to catch the log.

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

[debug] Command-line config: ['--downloader', 'ffmpeg', '--hls-use-mpegts', 'https://www.newslive.com/featured/cnn-online.html']
[debug] Portable config "c:\hp\-\!portableapps\yt-dlp\yt-dlp.conf": ['--verbose', '--format-sort', 'hasvid,ie_pref,quality,res,hdr,codec,size,br,asr,ext,fps,hasaud,lang,source,proto,id', '-f', 'bestvideo*+bestaudio/best', '--paths', 'c:\\hp\\-\\#YT', '-o', '.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'thumbnail:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\thumbs\\thumb.%(ext)s', '-o', 'subtitle:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\subs\\%(alt_title,title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'infojson:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(alt_title,title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'link:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'description:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'annotation:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = 
|)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'pl_infojson:.\\%(playlist|)s\\playlist.%(ext)s', '-o', 'pl_thumbnail:.\\%(playlist|)s\\folder.%(ext)s', '-o', 'pl_description:.\\%(playlist|)s\\playlist description.%(ext)s', '--output-na-placeholder', '~~~', '--write-info-json', '--write-description', '--write-annotations', '--no-overwrites', '--no-post-overwrites', '--merge-output-format', 'mkv', '--remux-video', 'mkv', '--write-subs', '--write-auto-subs', '--all-subs', '--sub-langs', 'all', '--sub-format', 'all/ttml/vtt/best', '--write-thumbnail', '--write-all-thumbnails', '--add-metadata', '--embed-chapters', '--add-chapters', '--write-link', '--write-url-link', '--write-desktop-link', '--no-check-certificate', '--retries', 'infinite', '--geo-bypass', '--geo-bypass-country', 'AT', '--no-clean-infojson', '--write-comments', '--audio-multistreams', '--no-playlist']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.19045-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg N-105208-gb24f0c82b3-20220108 (setts), ffprobe N-105208-gb24f0c82b3-20220108
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1786 extractors
[debug] Using fake IP 77.117.144.147 (AT) as X-Forwarded-For
[generic] Extracting URL: https://www.newslive.com/featured/cnn-online.html
[generic] cnn-online: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] cnn-online: Extracting information
[debug] Looking for embeds
[generic] cnn-online: Downloading m3u8 information
WARNING: [generic] Failed to download m3u8 information: HTTP Error 403: Forbidden
[debug] Identified a JW Player JS loader
[generic] index: Downloading m3u8 information
[debug] Sort order given by user: hasvid, ie_pref, quality, res, hdr, codec, size, br, asr, ext, fps, hasaud, lang, source, proto, id
[debug] Formats sorted by: hasvid, ie_pref, quality, res, hdr, vcodec, acodec, filesize, fs_approx, tbr, vbr, abr, asr, vext, aext, fps, hasaud, lang, source, proto, id, channels
[info] index: Downloading 1 format(s): 850
[info] Writing video description to: c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\!nfo\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].description
[info] There's no subtitles for the requested languages
[info] Downloading video thumbnail 0 ...
[info] Writing video thumbnail 0 to: c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].png
[info] Writing video metadata as JSON to: c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\!nfo\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].info.json
WARNING: There are no annotations to write.
[info] Writing internet shortcut (.url) to: c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\!nfo\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].url
[info] Writing internet shortcut (.desktop) to: c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\!nfo\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].desktop
[debug] Invoking ffmpeg downloader on "https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/mono.m3u8"
[download] Destination: c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].mp4
[debug] ffmpeg command line: ffmpeg -y -loglevel verbose -headers "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Sec-Fetch-Mode: navigate
Referer: https://www.newslive.com/featured/cnn-online.html
X-Forwarded-For: 77.117.144.147
" -i "https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/mono.m3u8" -c copy -f mpegts "file:c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].mp4.part"
ffmpeg version N-105208-gb24f0c82b3-20220108 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 11.2.0 (crosstool-NG 1.24.0.498_5075e1f)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librist --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20220108
  libavutil      57. 18.100 / 57. 18.100
  libavcodec     59. 20.100 / 59. 20.100
  libavformat    59. 17.100 / 59. 17.100
  libavdevice    59.  5.100 / 59.  5.100
  libavfilter     8. 25.100 /  8. 25.100
  libswscale      6.  5.100 /  6.  5.100
  libswresample   4.  4.100 /  4.  4.100
  libpostproc    56.  4.100 / 56.  4.100
[tcp @ 000002a42e8b3e80] Starting connection attempt to 156.146.61.140 port 443
[tcp @ 000002a42e8b3e80] Successfully connected to 156.146.61.140 port 443
[hls @ 000002a42e8b29c0] Skip ('#EXT-X-VERSION:3')
[hls @ 000002a42e8b29c0] Skip ('#EXT-X-PROGRAM-DATE-TIME:2023-04-03T21:37:50Z')
[hls @ 000002a42e8b29c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/37/56-05867.ts', offset 0, playlist 0
[hls @ 000002a42e8b29c0] Opening 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/37/56-05867.ts' for reading
[tcp @ 000002a42e8cfe40] Starting connection attempt to 156.146.61.140 port 443
[tcp @ 000002a42e8cfe40] Successfully connected to 156.146.61.140 port 443
[hls @ 000002a42e8b29c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/01-05533.ts', offset 0, playlist 0
[hls @ 000002a42e8b29c0] Opening 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/01-05533.ts' for reading
[tcp @ 000002a42d28ed80] Starting connection attempt to 156.146.61.140 port 443
[tcp @ 000002a42d28ed80] Successfully connected to 156.146.61.140 port 443
[h264 @ 000002a42e8fbd40] Reinit context to 1280x720, pix_fmt: yuv420p
Input #0, hls, from 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/mono.m3u8':
  Duration: N/A, start: 80338.769733, bitrate: N/A
  Program 0
    Metadata:
      variant_bitrate : 0
  Stream #0:0: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 0
  Stream #0:1: Video: h264 (Main), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, left), 1280x720 [SAR 1:1 DAR 16:9], 30 tbr, 90k tbn
    Metadata:
      variant_bitrate : 0
[mpegts @ 000002a42f4c7c00] service 1 using PCR in pid=256, pcr_period=0ms
[mpegts @ 000002a42f4c7c00] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'file:c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].mp4.part':
  Metadata:
    encoder         : Lavf59.17.100
  Stream #0:0: Video: h264 (Main), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, left), 1280x720 (0x0) [SAR 1:1 DAR 16:9], q=2-31, 30 tbr, 90k tbn
    Metadata:
      variant_bitrate : 0
  Stream #0:1: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 0
Stream mapping:
  Stream #0:1 -> #0:0 (copy)
  Stream #0:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
[hls @ 000002a42e8b29c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/07-05300.ts', offset 0, playlist 0
[https @ 000002a42e8ccf40] Opening 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/07-05300.ts' for reading
[tcp @ 000002a42e980100] Starting connection attempt to 156.146.61.140 port 443/s speed=2.13x
[tcp @ 000002a42e980100] Successfully connected to 156.146.61.140 port 443
[hls @ 000002a42e8b29c0] Skip ('#EXT-X-VERSION:3')
[hls @ 000002a42e8b29c0] Skip ('#EXT-X-PROGRAM-DATE-TIME:2023-04-03T21:37:56Z')
[hls @ 000002a42e8b29c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/12-06700.ts', offset 0, playlist 0
[https @ 000002a42e8e2140] Opening 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/12-06700.ts' for reading
[https @ 000002a42f6d8e00] Opening 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/mono.m3u8' for reading
[hls @ 000002a42e8b29c0] Skip ('#EXT-X-VERSION:3')
[hls @ 000002a42e8b29c0] Skip ('#EXT-X-PROGRAM-DATE-TIME:2023-04-03T21:38:01Z')
[hls @ 000002a42e8b29c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/19-06133.ts', offset 0, playlist 0
[https @ 000002a42e8e2140] Opening 'https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/2023/04/03/21/38/19-06133.ts' for reading
frame=  758 fps= 57 q=-1.0 Lsize=    2556kB time=00:00:25.25 bitrate= 828.9kbits/s speed= 1.9x
video:1882kB audio:414kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 11.310313%
Input file #0 (https://1942468522.rsc.cdn77.org/U_xwrRLUQWPzUK6Mq3OV_g==,1680570756/1942468522/tracks-v1a1/mono.m3u8):
  Input stream #0:0 (audio): 1184 packets read (423644 bytes);
  Input stream #0:1 (video): 758 packets read (1927405 bytes);
  Total: 1942 packets (2351049 bytes) demuxed
Output file #0 (file:c:\hp\-\#YT\2023-04-03 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [850].mp4.part):
  Output stream #0:0 (video): 758 packets muxed (1927405 bytes);
  Output stream #0:1 (audio): 1184 packets muxed (423644 bytes);
  Total: 1942 packets (2351049 bytes) muxed
[AVIOContext @ 000002a42e96af40] Statistics: 2616960 bytes written, 0 seeks, 10 writeouts
[AVIOContext @ 000002a42e8d5200] Statistics: 1475178 bytes read, 0 seeks
[AVIOContext @ 000002a42d28e680] Statistics: 1182144 bytes read, 0 seeks
[AVIOContext @ 000002a42e9749c0] Statistics: 602 bytes read, 0 seeks
[AVIOContext @ 000002a42e8b4000] Statistics: 301 bytes read, 0 seeks
Exiting normally, received signal 2.

ERROR: Interrupted by user
chrizilla added the bug (Bug that is not site-specific) and triage (Untriaged issue) labels Apr 3, 2023
@pukkandan
Member

We normally add a timestamp to livestreams, but because of #6704 (comment) we do not do so for the generic extractor. You can add epoch to your output template.
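
To illustrate the suggestion above, here is a minimal sketch of why an epoch field in the output template avoids name clashes. It assumes only that simple `%(name)s` fields in a yt-dlp output template behave like Python's %-style mapping format; the two templates, the `render` helper, and the field values are made up for illustration and are not taken from the reporter's config.

```python
# Sketch: two downloads of the same stream with the same format code.
# Without an epoch field the names coincide; with it they differ.

WITHOUT_EPOCH = "%(title)s [%(id)s] [%(format_id)s].%(ext)s"
WITH_EPOCH = "%(title)s [%(id)s] [%(format_id)s] %(epoch)s.%(ext)s"


def render(template, fields):
    """Expand a simple %(name)s template from a dict of field values."""
    return template % fields


# Hypothetical field values; only the epoch differs between the two runs.
first = {"title": "CNN Live", "id": "index", "format_id": "820",
         "ext": "mp4", "epoch": 1680745080}
second = dict(first, epoch=1680745140)  # started one minute later

same_name = render(WITHOUT_EPOCH, first) == render(WITHOUT_EPOCH, second)
distinct_names = render(WITH_EPOCH, first) != render(WITH_EPOCH, second)
print(same_name, distinct_names)
```

In a real config the equivalent change would be to put `%(epoch)s` somewhere in the `-o` template, so that each run produces its own file name even when title, id, and format code are identical.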

PS: There may be some merit in using our "guess" for livestream to set the field since it works most of the time. But if we do that, there will be no way for the user to handle the cases when it detects incorrectly. Maybe an extractor-arg could be useful?
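
As a rough idea of what such a "guess" could look like for HLS, the sketch below checks a media playlist for the `#EXT-X-ENDLIST` and `#EXT-X-PLAYLIST-TYPE:VOD` tags. This is an assumption about one plausible heuristic, not a description of what yt-dlp actually does; the function name `looks_live` and the sample playlists are hypothetical.

```python
def looks_live(media_playlist_text):
    """Guess whether an HLS media playlist is still live.

    Heuristic only: a playlist without #EXT-X-ENDLIST may still grow,
    while #EXT-X-PLAYLIST-TYPE:VOD marks a finished recording.
    """
    lines = [line.strip() for line in media_playlist_text.splitlines()]
    if "#EXT-X-PLAYLIST-TYPE:VOD" in lines:
        return False
    return "#EXT-X-ENDLIST" not in lines


# Hypothetical playlists for illustration.
LIVE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:1200
#EXTINF:6.000,
segment-1200.ts
#EXTINF:6.000,
segment-1201.ts
"""

VOD_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.000,
segment-0.ts
#EXT-X-ENDLIST
"""

print(looks_live(LIVE_PLAYLIST), looks_live(VOD_PLAYLIST))
```

A guess like this can be wrong (for example, a server may simply omit the end tag), which is why a way for the user to override it is being discussed.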

pukkandan added the question (Question) label and removed the bug (Bug that is not site-specific) and triage (Untriaged issue) labels Apr 4, 2023
@chrizilla
Author

chrizilla commented Apr 4, 2023

@pukkandan : We normally add a timestamp to livestreams, but because of #6704 (comment) we do not do so for the generic extractor. You can add epoch to your output template.

Ok, but why are my downloads being overwritten despite explicit instructions for yt-dlp to not do so ?
This is part of my config file, as can also be seen in the log posted above:

# Do not overwrite any files
--no-overwrites

# Do not overwrite post-processed files
--no-post-overwrites

@chrizilla
Author

chrizilla commented Apr 4, 2023

@pukkandan : There may be some merit in using our "guess" for livestream to set the field since it works most of the time.

I agree.

@pukkandan : But if we do that, there will be no way for the user to handle the cases when it detects incorrectly. Maybe an extractor-arg could be useful?

I assume you don't want to ask the user:
Is this a live stream [Y/N] ?

Other than that, what would be the downside of a parameter like this ?
--is-livestream [NO/YES/AUTO]

  • NO tells yt-dlp that the URL is not a live stream
  • YES tells yt-dlp that the URL is a live stream
  • AUTO (could also be named GUESS) tells yt-dlp to make an educated guess (as discussed above)
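
The three values proposed above could be resolved roughly as follows. This is only a sketch of the proposal: `--is-livestream` is not an existing yt-dlp option, and the function name `resolve_is_live` and its parameters are hypothetical.

```python
def resolve_is_live(setting, guessed):
    """Combine the user's tri-state choice with an automatic guess.

    setting: "YES", "NO", or "AUTO" (alias "GUESS"), case-insensitive.
    guessed: the result of whatever automatic detection is available.
    """
    choice = setting.upper()
    if choice == "YES":
        return True
    if choice == "NO":
        return False
    if choice in ("AUTO", "GUESS"):
        return guessed
    raise ValueError(f"unknown livestream setting: {setting!r}")


# The explicit values override the guess; AUTO defers to it.
print(resolve_is_live("no", guessed=True))
print(resolve_is_live("auto", guessed=True))
```

The point of the explicit YES and NO values is that the user can still correct the result when the automatic guess is wrong.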

If you like the idea, could you split this into a new feature request (just to keep it separate from the "overwrite problem")?

@pukkandan
Member

Other than that, what would be the downside of a parameter like this ? --is-livestream [NO/YES/AUTO]

It doesn't work or even make sense for supported sites. Maybe as an --extractor-arg generic:

I assume you don't want to ask the user:
Is this a live stream [Y/N] ?

No. yt-dlp is never interactive unless the user specifically asks it to be (e.g. -f-)

@pukkandan
Member

Ok, but why are my downloads being overwritten despite explicit instructions for yt-dlp to not do so ?

Sorry, seems I misunderstood your original report entirely. It doesn't seem like the file is being overwritten from the logs. Can you show me the folder contents before the download?

pukkandan changed the title from "livestream download should not overwrite previous downloads" to "Problems with generic livestreams" Apr 5, 2023
@chrizilla
Author

chrizilla commented Apr 6, 2023

Sorry, seems I misunderstood your original report entirely.

There are 2 separate issues at hand (which make following this thread a bit confusing):

  1. bug: files are being overwritten (despite parameters telling yt-dlp not to do so)
  2. feature request: to offer the user an option to automatically detect/handle live streams for the generic extractor

I don't see any direct connection between them, especially since fixing the bug does nothing for the feature request to detect generic live streams.

You are the boss, @pukkandan, but I think it would be easier to follow if those separate issues were split into separate ... well, issues. Maybe keep #6705 as the bug report and move the comments about the feature request to #6704 (and convert it to a feature request)?

@chrizilla
Author

chrizilla commented Apr 6, 2023

With regard to the overwrite bug:

Ok, but why are my downloads being overwritten despite explicit instructions for yt-dlp to not do so ?

It doesn't seem like the file is being overwritten from the logs.

Sure. Because:

Below I attach the output log of my unsuccessful attempt to recreate the "overwrite bug".
If I encounter the bug again, I'll try to catch the log.

I only added the (IMO unhelpful) log, because otherwise I get called out for the missing log ...

Can you show me the folder contents before the download?

I tried again to reproduce the bug. I think I managed to do so:

folder content before being overwritten:
CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part size: 246656 date: 03:38

folder content after being overwritten:
CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part size: 131788 date: 03:39
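
Judging from the logs in this thread, the stream is written to a `.mp4.part` file and ffmpeg is invoked with `-y`, so the file that gets replaced here is the leftover `.part` file of an interrupted run. The sketch below shows one way a wrapper script could guard against that by checking both names before starting a download. It is not what yt-dlp does; the helper names `would_clobber` and `unique_path` and the short file name are hypothetical.

```python
import os
import tempfile


def would_clobber(final_path):
    """Return True if the final file or its '.part' temp file already exists."""
    return os.path.exists(final_path) or os.path.exists(final_path + ".part")


def unique_path(final_path):
    """Append ' (n)' before the extension until nothing would be clobbered."""
    root, ext = os.path.splitext(final_path)
    candidate, n = final_path, 1
    while would_clobber(candidate):
        candidate = f"{root} ({n}){ext}"
        n += 1
    return candidate


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "stream [820].mp4")
    # Simulate the leftover of an interrupted earlier download.
    open(target + ".part", "wb").close()
    clobber = would_clobber(target)
    safe_name = os.path.basename(unique_path(target))

print(clobber, safe_name)
```

Checking only the final `.mp4` name would miss this case, because an interrupted download never gets renamed from `.mp4.part`.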

log:

[debug] Command-line config: ['--downloader', 'ffmpeg', '--hls-use-mpegts', '--live-from-start', 'https://www.newslive.com/featured/cnn-online.html']
[debug] Portable config "c:\hp\-\!portableapps\yt-dlp\yt-dlp.conf": ['--verbose', '--format-sort', 'hasvid,ie_pref,quality,res,hdr,codec,size,br,asr,ext,fps,hasaud,lang,source,proto,id', '-f', 'bestvideo*+bestaudio/best', '--paths', 'c:\\hp\\-\\#YT', '-o', '.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'thumbnail:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\thumbs\\thumb.%(ext)s', '-o', 'subtitle:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\subs\\%(alt_title,title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'infojson:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(alt_title,title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'link:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'description:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = |)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'annotation:.\\%(playlist|)s\\%(playlist_index|)02d%(playlist_index& = 
|)s%(release_date>%Y-%m-%d,release_timestamp>%Y-%m-%d,upload_date>%Y-%m-%d,timestamp>%Y-%m-%d|#nodate#)s %(alt_title,title)s\\!nfo\\%(title)s [%(extractor)s.%(extractor_key)s.%(id)s] [%(format_id)s].%(ext)s', '-o', 'pl_infojson:.\\%(playlist|)s\\playlist.%(ext)s', '-o', 'pl_thumbnail:.\\%(playlist|)s\\folder.%(ext)s', '-o', 'pl_description:.\\%(playlist|)s\\playlist description.%(ext)s', '--output-na-placeholder', '~~~', '--write-info-json', '--write-description', '--write-annotations', '--no-overwrites', '--no-post-overwrites', '--merge-output-format', 'mkv', '--remux-video', 'mkv', '--write-subs', '--write-auto-subs', '--all-subs', '--sub-langs', 'all', '--sub-format', 'all/ttml/vtt/best', '--write-thumbnail', '--write-all-thumbnails', '--embed-metadata', '--embed-chapters', '--no-embed-info-json', '--write-link', '--write-url-link', '--write-desktop-link', '--no-check-certificate', '--retries', 'infinite', '--geo-bypass', '--geo-bypass-country', 'AT', '--no-clean-infojson', '--write-comments', '--audio-multistreams', '--no-playlist']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.19045-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg N-105208-gb24f0c82b3-20220108 (setts), ffprobe N-105208-gb24f0c82b3-20220108
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1786 extractors
[debug] Using fake IP 77.119.42.191 (AT) as X-Forwarded-For
[generic] Extracting URL: https://www.newslive.com/featured/cnn-online.html
[generic] cnn-online: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] cnn-online: Extracting information
[debug] Looking for embeds
[generic] cnn-online: Downloading m3u8 information
WARNING: [generic] Failed to download m3u8 information: HTTP Error 403: Forbidden
[debug] Identified a JW Player JS loader
[generic] index: Downloading m3u8 information
[debug] Sort order given by user: hasvid, ie_pref, quality, res, hdr, codec, size, br, asr, ext, fps, hasaud, lang, source, proto, id
[debug] Formats sorted by: hasvid, ie_pref, quality, res, hdr, vcodec, acodec, filesize, fs_approx, tbr, vbr, abr, asr, vext, aext, fps, hasaud, lang, source, proto, id, channels
[info] index: Downloading 1 format(s): 820
[info] Video description is already present
[info] There's no subtitles for the requested languages
[info] Video thumbnail is already present
[info] Video metadata is already present
WARNING: There are no annotations to write.
[info] Writing internet shortcut (.url) to: c:\hp\-\#YT\2023-04-06 CNN Live Stream Online - Watch CNN News USA Live for free\!nfo\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].url
[info] Writing internet shortcut (.desktop) to: c:\hp\-\#YT\2023-04-06 CNN Live Stream Online - Watch CNN News USA Live for free\!nfo\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].desktop
[debug] Invoking ffmpeg downloader on "https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/mono.m3u8"
[download] Destination: c:\hp\-\#YT\2023-04-06 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4
[debug] ffmpeg command line: ffmpeg -y -loglevel verbose -headers "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Sec-Fetch-Mode: navigate
Referer: https://www.newslive.com/featured/cnn-online.html
X-Forwarded-For: 77.119.42.191
" -i "https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/mono.m3u8" -c copy -f mpegts "file:c:\hp\-\#YT\2023-04-06 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part"
ffmpeg version N-105208-gb24f0c82b3-20220108 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 11.2.0 (crosstool-NG 1.24.0.498_5075e1f)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librist --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20220108
  libavutil      57. 18.100 / 57. 18.100
  libavcodec     59. 20.100 / 59. 20.100
  libavformat    59. 17.100 / 59. 17.100
  libavdevice    59.  5.100 / 59.  5.100
  libavfilter     8. 25.100 /  8. 25.100
  libswscale      6.  5.100 /  6.  5.100
  libswresample   4.  4.100 /  4.  4.100
  libpostproc    56.  4.100 / 56.  4.100
[tcp @ 0000025f78a43e80] Starting connection attempt to 156.146.61.140 port 443
[tcp @ 0000025f78a43e80] Successfully connected to 156.146.61.140 port 443
[hls @ 0000025f78a429c0] Skip ('#EXT-X-VERSION:3')
[hls @ 0000025f78a429c0] Skip ('#EXT-X-PROGRAM-DATE-TIME:2023-04-06T01:38:54Z')
[hls @ 0000025f78a429c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/2023/04/06/01/39/00-06000.ts', offset 0, playlist 0
[hls @ 0000025f78a429c0] Opening 'https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/2023/04/06/01/39/00-06000.ts' for reading
[tcp @ 0000025f78a5fe40] Starting connection attempt to 156.146.61.140 port 443
[tcp @ 0000025f78a5fe40] Successfully connected to 156.146.61.140 port 443
[hls @ 0000025f78a429c0] HLS request for url 'https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/2023/04/06/01/39/06-06000.ts', offset 0, playlist 0
[hls @ 0000025f78a429c0] Opening 'https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/2023/04/06/01/39/06-06000.ts' for reading
[tcp @ 0000025f76eced80] Starting connection attempt to 156.146.61.140 port 443
[tcp @ 0000025f76eced80] Successfully connected to 156.146.61.140 port 443
[h264 @ 0000025f78a87cc0] Reinit context to 1280x720, pix_fmt: yuv420p
Input #0, hls, from 'https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/mono.m3u8':
  Duration: N/A, start: 76716.142689, bitrate: N/A
  Program 0
    Metadata:
      variant_bitrate : 0
  Stream #0:0: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 0
  Stream #0:1: Video: h264 (Main), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, left), 1280x720 [SAR 1:1 DAR 16:9], 30 tbr, 90k tbn
    Metadata:
      variant_bitrate : 0
[mpegts @ 0000025f78b09400] service 1 using PCR in pid=256, pcr_period=0ms
[mpegts @ 0000025f78b09400] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'file:c:\hp\-\#YT\2023-04-06 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part':
  Metadata:
    encoder         : Lavf59.17.100
  Stream #0:0: Video: h264 (Main), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, left), 1280x720 (0x0) [SAR 1:1 DAR 16:9], q=2-31, 30 tbr, 90k tbn
    Metadata:
      variant_bitrate : 0
  Stream #0:1: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp
    Metadata:
      variant_bitrate : 0
Stream mapping:
  Stream #0:1 -> #0:0 (copy)
  Stream #0:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame=   37 fps=0.0 q=-1.0 Lsize=     129kB time=00:00:01.23 bitrate= 852.1kbits/s speed=2.81x
video:96kB audio:20kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 11.035471%
Input file #0 (https://1942468522.rsc.cdn77.org/U5aziT9GwfXwjP_H1641Ig==,1680757960/1942468522/tracks-v1a1/mono.m3u8):
  Input stream #0:0 (audio): 58 packets read (20343 bytes);
  Input stream #0:1 (video): 37 packets read (98347 bytes);
  Total: 95 packets (118690 bytes) demuxed
Output file #0 (file:c:\hp\-\#YT\2023-04-06 CNN Live Stream Online - Watch CNN News USA Live for free\CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part):
  Output stream #0:0 (video): 37 packets muxed (98347 bytes);
  Output stream #0:1 (audio): 58 packets muxed (20343 bytes);
  Total: 95 packets (118690 bytes) muxed
[AVIOContext @ 0000025f78a69f40] Statistics: 131788 bytes written, 0 seeks, 1 writeouts
[AVIOContext @ 0000025f76ece680] Statistics: 179375 bytes read, 0 seeks
[AVIOContext @ 0000025f78a65200] Statistics: 0 bytes read, 0 seeks
[AVIOContext @ 0000025f78a44000] Statistics: 301 bytes read, 0 seeks
Exiting normally, received signal 2.

@chrizilla chrizilla mentioned this issue Apr 6, 2023
8 tasks
@chrizilla
Author

Regarding the feature request:

I think it would be helpful to users if a more user-friendly way could be found than expecting users to memorize or look up the required parameters for a live stream and manually add them each time.

> what would be the downside of a parameter like this? --is-livestream [NO/YES/AUTO]

> It doesn't work or even make sense for supported sites. Maybe as an --extractor-arg generic:

Yes, I was mainly referring to the generic extractor. Or, more generally, to any situation where yt-dlp is not 100% sure but more than 90% sure.

@pukkandan
Member

> There are 2 separate issues at hand (which make following this thread a bit confusing):
>
>   1. bug: files are being overwritten (despite parameters telling yt-dlp not to do so)
>   2. feature request: to offer the user an option to automatically detect/handle live streams for the generic extractor

I merged the issues because I suspected there is no bug in --overwrite but instead something about the generic livestream is causing the problem. And I turned out to be right.

> folder content before being overwritten: CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part size: 246656 date: 03:38
>
> folder content after being overwritten: CNN Live Stream Online - Watch CNN News USA Live for free [generic.Generic.index] [820].mp4.part size: 131788 date: 03:39

It is the part file that is being overwritten, not the finalized download. So the option itself is working as intended. However, I see what is going on here. Since the video is not detected as live, when you ctrl+c, the resulting file will not be finalized and will be left as .part.

So implementing our 2 suggestions should fix all the issues reported here and in #6704

  • In GenericIE, treat "possibly live" video as live.
  • Add something like --extractor-arg generic:live_status=False to reverse this
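A rough sketch of what "possibly live" could mean for a plain HLS media playlist, assuming the usual HLS convention that a playlist without `#EXT-X-ENDLIST` (and not marked `#EXT-X-PLAYLIST-TYPE:VOD`) may still be growing. The function name `classify_hls_playlist` and the `is_live_override` parameter are made up for illustration; this is not the actual GenericIE code.

```python
import re


def classify_hls_playlist(m3u8_text, is_live_override=None):
    """Return (live_status, duration_in_seconds) for an HLS media playlist.

    is_live_override is a hypothetical tri-state switch:
    None = auto-detect, True/False = force the result.
    """
    lines = [line.strip() for line in m3u8_text.splitlines()]

    # A finished playlist normally ends with #EXT-X-ENDLIST.
    has_endlist = '#EXT-X-ENDLIST' in lines
    is_vod = '#EXT-X-PLAYLIST-TYPE:VOD' in lines

    # Sum the segment durations announced by the #EXTINF tags.
    durations = re.findall(r'^#EXTINF:([0-9.]+)', '\n'.join(lines), re.MULTILINE)
    total = sum(float(d) for d in durations)

    if is_live_override is not None:
        live = is_live_override
    else:
        live = not (has_endlist or is_vod)

    # A total duration is only meaningful once the playlist is complete.
    return ('is_live', None) if live else ('not_live', total)
```

With a playlist like the one in the log above (version and program-date-time tags, 6-second segments, no end marker), auto-detection would report it as live, and passing `is_live_override=False` would reverse that.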

@pukkandan pukkandan added enhancement New feature or request and removed question Question labels Apr 6, 2023
@pukkandan pukkandan changed the title Problems with generic livestreams [generic] Detect hls livestreams Apr 6, 2023
@chrizilla
Author

chrizilla commented Apr 6, 2023

> It is the part file that is being overwritten

Yes.

> So the option itself is working as intended.

Why do you say it's "working" when files are unintentionally overwritten?
This should NEVER occur with --no-overwrites --no-post-overwrites, no?
It's still not clear why this is happening.

> However, I see what is going on here. Since the video is not detected as live, when you ctrl+c, the resulting file will not be finalized and will be left as .part.

Yes. In my experiments, the livestream remains a part file until 1 GB is reached, at which point the recording stops automatically (for whatever reason) and the part file is then finalized.

> So implementing our 2 suggestions should fix all the issues reported here and in #6704

That's great! 👍

> * In GenericIE, treat "possibly live" video as live.
> * Add something like `--extractor-arg generic:live_status=False` to reverse this

Oh, that sounds very good!
(I assume the parameter can be also set to TRUE to override false negative detection?)

@pukkandan
Member

> (I assume the parameter can be also set to TRUE to override false negative detection?)

Yes

> > So the option itself is working as intended.
>
> Why do you say it's "working" when files are unintentionally overwritten? This should NEVER occur with --no-overwrites --no-post-overwrites, no?

Temporary files are deleted if they can't be resumed, irrespective of any options.
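To illustrate that rule, here is a minimal sketch of how a downloader might treat a leftover `.part` file. It is only one plausible reading of the statement above, not yt-dlp's code; `open_part_file` and `can_resume` are hypothetical names.

```python
import os


def open_part_file(part_path, can_resume):
    """Open a temporary .part file and return (file_object, resume_offset).

    Hypothetical helper for illustration only.
    """
    if can_resume and os.path.exists(part_path):
        # Resumable: keep the existing bytes and append after them.
        offset = os.path.getsize(part_path)
        return open(part_path, 'ab'), offset

    # Not resumable: 'wb' truncates whatever an earlier run left behind.
    # Options that protect finished files are not consulted here.
    return open(part_path, 'wb'), 0
```

Under this reading, a live recording interrupted with ctrl+c leaves a `.part` file that the next run cannot resume, so the next run starts it from zero. That would match the before/after sizes reported earlier (246656 bytes, then 131788 bytes).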

bashonly added a commit that referenced this issue Apr 13, 2023
* Extract duration for non-live generic HLS videos
* Add extractor-arg `is_live` to bypass live HLS check

Closes #6705
Authored by: bashonly
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024
* Extract duration for non-live generic HLS videos
* Add extractor-arg `is_live` to bypass live HLS check

Closes yt-dlp#6705
Authored by: bashonly
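
For anyone landing here later, a small sketch of how the `is_live` extractor argument from the commit above might be passed. The helper `generic_live_options` is hypothetical, and the `true`/`false` spellings are an assumption; check the yt-dlp README for the values the generic extractor actually accepts.

```python
def generic_live_options(is_live):
    """Build options that set the generic extractor's `is_live` argument.

    Returns (ydl_opts, cli_args): a dict for the Python API and the
    equivalent command-line arguments. Hypothetical helper; the value
    spellings 'true'/'false' are an assumption.
    """
    value = 'true' if is_live else 'false'

    # The Python API takes extractor arguments as
    # {extractor_name: {argument_name: [values]}}.
    ydl_opts = {'extractor_args': {'generic': {'is_live': [value]}}}

    # Command-line form: --extractor-args "generic:is_live=<value>"
    cli_args = ['--extractor-args', f'generic:is_live={value}']
    return ydl_opts, cli_args
```

The returned dict is meant to be passed to `yt_dlp.YoutubeDL(...)`, and the list appended to a `yt-dlp` command line before the URL.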