[Feature request] Handle Long filenames in default template and temporary files #1136

tylerszabo · 2021-10-01T02:20:41Z

Checklist

I'm reporting a bug unrelated to a specific site
I've verified that I'm running yt-dlp version 2021.09.25
I've checked that all provided URLs are alive and playable in a browser
The provided URLs do not contain any DRM to the best of my knowledge
I've checked that all URLs and arguments with special characters are properly quoted or escaped
I've searched the bugtracker for similar bug reports including closed ones
I've read bugs section in FAQ

Verbose log

[debug] Command-line config: ['--verbose', 'https://twitter.com/NASA/status/1443572363757559808', '-o', 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] yt-dlp version 2021.09.25 (source)
[debug] Plugins: ['SamplePluginIE', 'SamplePluginPP']
[debug] Git HEAD: ad095c428
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.19043-SP0
[debug] exe versions: ffmpeg n4.4-80-gbf87bdd3f6-20210811, ffprobe n4.4-80-gbf87bdd3f6-20210811, phantomjs 2.1.1
[debug] Optional libraries: sqlite
[debug] Proxy map: {}
[debug] [twitter] Extracting URL: https://twitter.com/NASA/status/1443572363757559808
[twitter] 1443572363757559808: Downloading guest token
[twitter] 1443572363757559808: Downloading JSON metadata
[twitter] 1443572363757559808: Downloading m3u8 information
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, vcodec:vp9.2(10), acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 1443572363757559808: Downloading 1 format(s): hls-2176
[debug] Invoking downloader on "https://video.twimg.com/amplify_video/1443570535904935945/pl/1280x720/eB7trHC2QS5NrGUL.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 22
[download] Destination: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[download]  40.9% of ~5.05MiB at  9.73MiB/s ETA 00:01 ERROR: unable to open for writing: [Errno 22] Invalid argument: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.part-Frag10.part'
Traceback (most recent call last):
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\http.py", line 262, in download
    ctx.stream, ctx.tmpfilename = sanitize_open(
  File "F:\source\repos\yt-dlp\yt_dlp\utils.py", line 2068, in sanitize_open
    stream = open(encodeFilename(filename), open_mode)
OSError: [Errno 22] Invalid argument: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.part-Frag10.part'

ERROR: unable to download video data: [Errno 2] No such file or directory: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.part-Frag10'
Traceback (most recent call last):
  File "F:\source\repos\yt-dlp\yt_dlp\YoutubeDL.py", line 2758, in process_info
    success, real_download = self.dl(temp_filename, info_dict)
  File "F:\source\repos\yt-dlp\yt_dlp\YoutubeDL.py", line 2475, in dl
    return fd.download(name, new_info, subtitle)
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\common.py", line 408, in download
    return self.real_download(filename, info_dict), True
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\hls.py", line 350, in real_download
    return self.download_and_append_fragments(ctx, fragments, info_dict)
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\fragment.py", line 478, in download_and_append_fragments
    frag_content, frag_index = download_fragment(fragment, ctx)
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\fragment.py", line 418, in download_fragment
    success, frag_content = self._download_fragment(ctx, fragment['url'], info_dict, headers)
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\fragment.py", line 132, in _download_fragment
    return True, self._read_fragment(ctx)
  File "F:\source\repos\yt-dlp\yt_dlp\downloader\fragment.py", line 135, in _read_fragment
    down, frag_sanitized = sanitize_open(ctx['fragment_filename_sanitized'], 'rb')
  File "F:\source\repos\yt-dlp\yt_dlp\utils.py", line 2068, in sanitize_open
    stream = open(encodeFilename(filename), open_mode)
FileNotFoundError: [Errno 2] No such file or directory: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.part-Frag10'

Description

The defaults can result in filenames that are too long and, confusingly, this can also occur in temporary files. The example here shows a 239-character filename that fails once the 17-character .part-Frag10.part suffix pushes the temporary filename to 256 characters, past the 255-character limit.

The workaround is to explicitly specify an output template that will not become too long even when suffixes are added. The default %(title)s [%(id)s].%(ext)s and fallback %(title)s-%(id)s.%(ext)s can both exceed 255 characters, especially with videos in long tweets.

In the example provided, the output template is explicitly set to a filename that first exceeds 255 characters on fragment 10, to better illustrate the issue. However, omitting the template or explicitly using the default %(title)s [%(id)s].%(ext)s will still result in an immediate failure on fragment 1.
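The arithmetic behind this failure can be checked with a short sketch. It assumes the `.part-Frag<N>.part` suffix pattern seen in the log above and a 255-character limit; `first_failing_fragment` is a hypothetical helper for illustration, not part of yt-dlp.

```python
# Sketch: find the first fragment whose temporary filename exceeds the limit.
# Assumptions: the ".part-Frag<N>.part" suffix pattern from the log above and
# a 255-character limit. first_failing_fragment is a made-up helper name.

LIMIT = 255


def first_failing_fragment(base_len, limit=LIMIT, max_fragments=100000):
    """Return the first fragment number whose temp name is too long, or None."""
    for n in range(1, max_fragments + 1):
        suffix = f".part-Frag{n}.part"
        if base_len + len(suffix) > limit:
            return n
    return None


print(first_failing_fragment(239))  # 10: 239 + 17 = 256
print(first_failing_fragment(240))  # 1: fails immediately
print(first_failing_fragment(238))  # 100: fails once the counter has 3 digits
```

A name that is only one character shorter survives until fragment 100, which is why the error can appear partway into a download.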

This is related to #1003 and appears to be a cross-platform issue (it depends on the filesystem for the target files rather than on the OS or runtime).

Work published by NASA is in the public domain so there are no licensing concerns with testing using this URL.

There are various codepaths that expect to be able to add a suffix to a temporary file and the temporary filename is based on the destination filename. By adding an option to specify tempfile format such as %(extractor)s-%(id)s.%(ext)s and altering the default file template to set some high but sane limits this could be mitigated for most users and still allow advanced users to explicitly specify long filenames.

pukkandan · 2021-10-01T09:38:08Z

Are you asking only about the default, or about user-provided templates too? For user-provided templates, you can do something like -o %(title).200B.%(ext)s

pukkandan · 2021-10-01T09:40:45Z

Related: ytdl-org/youtube-dl#29989

tylerszabo · 2021-10-01T09:44:12Z

Improved defaults would be nice, but I thought the effect of temp files (which will always be longer than the main output file) was the more confusing aspect, and there is currently no mechanism for specifying a template for temp files (they just inherit the output template).

I experimented with a number of techniques for deterministically shortening but they were not ideal - any format with length limits would ultimately end up being a guess but some limits as you suggested with maybe ~160 characters for title and ~40 for ID might be sensible. Then if temp files could have a different format maybe as high as ~180 for title would be okay without risking a sudden error at chunk 10000.
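One way to reason about such limits is to work backwards from the filesystem limit. The sketch below assumes a 255-byte limit, the default template shape `%(title)s [%(id)s].%(ext)s`, and the `.part-Frag<N>.part` suffix from the log; `title_budget` and its default values are illustrative assumptions, and any other suffixes that may be added to temporary files are not modeled.

```python
# Sketch: how many bytes remain for the title once everything else is reserved.
# Assumptions: 255-byte limit, name shaped like "<title> [<id>].<ext>", and a
# worst-case temporary suffix ".part-Frag<max_fragment>.part".
# title_budget is a hypothetical helper, not a yt-dlp function.


def title_budget(limit=255, id_len=40, ext_len=4, max_fragment=10000):
    fixed = len(" [") + id_len + len("]") + len(".") + ext_len
    worst_suffix = len(f".part-Frag{max_fragment}.part")
    return limit - fixed - worst_suffix


print(title_budget())                # 187
print(title_budget(max_fragment=9))  # 191
```

Under these assumptions the fragment suffix costs between 16 and 20 bytes, so a title limit chosen without it in mind can still fail late in a download.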

GenericGoose · 2021-10-13T20:20:29Z

Annoyingly, when using yt-dlp I've had an issue where the filename, which was video title + chapter name, was too long, and it only gave a confusing 'no such file or directory' error. I suppose this is a separate issue.

tylerszabo · 2021-10-13T20:43:23Z

@GenericGoose I think that may fall under this issue as I see it. I don't think there's a reliable and deterministic way to prevent invalid names in all cases, since the limitations come down to filesystems (not even just OS differences), but the way I see this problem is in terms of defaults. The current defaults have no built-in size limitations (even for common limitations such as 255 chars), and since they're composed from multiple properties there are many combinations that can exceed limitations at different stages.

In your case the defaults that include chapter names can also encounter this issue. A workaround is to override the default templates unless a more conservative default is introduced.

If we consider this issue to have 2 parts:

1. The default templates have no limits of any kind and can exceed common limits.
2. The temp files append additional suffixes, which exacerbates this limitation (and makes a sane template require even stricter limits).

Then your issue would fall clearly under part 1. While I think part 2 is a bigger concern (because it causes confusing errors partway into a download rather than giving a clearer error and failing fast), part 1 is certainly a component of this issue in my view :)

InconsolableCellist · 2022-04-30T23:32:11Z

Re: cross-platform compatibility, you can allow the user to specify a maximum file length as a flag. Users can then alias yt-dlp to automatically include that flag at their leisure.

Alternatively, you can determine if the attempted path exceeds the maximum file length by checking to see if the file was successfully created or not. File APIs for all major platforms will provide this information. You can then truncate the requested filename and inform the user. This is how things were handled all the way back in FAT16 with the "FILENA~1.EXT" convention.

I run into this bug all the time when archiving tweets.
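The detect-and-truncate approach suggested above could look roughly like the following sketch. It is not how yt-dlp behaves; `open_with_fallback`, its `opener` parameter, and the step size are assumptions made for illustration. It treats both Errno 36 (ENAMETOOLONG on Linux) and Errno 22 (EINVAL, as in the Windows log above) as "name too long", although EINVAL can also have other causes.

```python
import errno

# Error codes treated as "name too long" in this sketch. EINVAL is included
# because the Windows log above reports Errno 22, but it can also signal
# other problems (for example invalid characters).
TOO_LONG_ERRNOS = {errno.ENAMETOOLONG, errno.EINVAL}


def truncate_utf8(text, max_bytes):
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    return text.encode("utf-8")[:max_bytes].decode("utf-8", "ignore")


def open_with_fallback(stem, ext, mode="wb", opener=open, step=10):
    """Try to open stem + ext; on a too-long error, shorten the stem and retry.

    Returns (file_object, name_actually_used). The opener parameter exists
    only so the logic can be exercised without touching a real filesystem.
    """
    while True:
        name = stem + ext
        try:
            return opener(name, mode), name
        except OSError as err:
            if err.errno not in TOO_LONG_ERRNOS or not stem:
                raise  # a different problem, or nothing left to shorten
            new_len = max(len(stem.encode("utf-8")) - step, 0)
            stem = truncate_utf8(stem, new_len)
```

A caller would compare the returned name with the requested one and tell the user when they differ. The extension is kept intact so that any later suffix handling still sees the expected ending.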

chapmanjacobd · 2022-05-01T01:32:10Z

It should also be noted that the limitation is often not 255 characters but 255 bytes so for non-ASCII languages you are limited to a lot fewer than 255 char. "%(uploader)s/%(title).100s [%(id)s].%(ext)s" is pretty safe but I doubt it is a complete solution.
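The gap between characters and bytes is easy to see in Python. This sketch assumes filenames are stored as UTF-8, which is common but not universal; `lengths` is a hypothetical helper.

```python
# Sketch: character count versus UTF-8 byte count for the same number of
# characters. Assumes the filesystem stores names as UTF-8.


def lengths(name):
    """Return (characters, utf8_bytes) for a string."""
    return len(name), len(name.encode("utf-8"))


print(lengths("a" * 100))           # (100, 100)
print(lengths("\u00e9" * 100))      # (100, 200)  e with acute accent
print(lengths("\u65e5" * 100))      # (100, 300)  a CJK character
print(lengths("\U0001F984" * 100))  # (100, 400)  an emoji
```

Under a 255-byte limit that leaves room for at most 85 three-byte characters or 63 four-byte emoji, before any ID, extension, or temporary suffix is added.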

pukkandan · 2022-05-01T14:24:59Z

It should also be noted that the limitation is often not 255 characters but 255 bytes so for non-ASCII languages you are limited to a lot fewer than 255 char.

This is why I added the B formatter. Eg: %(title).200B to limit to 200 bytes. See "output template" section of readme for a list of the custom formatters yt-dlp provides

pukkandan · 2022-05-18T00:11:41Z

@InconsolableCellist Read the comments literally just above your question! #1136 (comment) #1136 (comment)

nick-s-b · 2022-07-30T21:05:41Z

ERROR: unable to open for writing: [Errno 36] File name too long

This is THE MOST ANNOYING bug in yt-dlp I can think of. Almost every time I try to save a Twitter video, yt-dlp fails to save a file. Why can't it just truncate the filename?

rebane2001 · 2022-08-15T09:48:55Z

It should also be noted that the limitation is often not 255 characters but 255 bytes so for non-ASCII languages you are limited to a lot fewer than 255 char.

This is why I added the B formatter. Eg: %(title).200B to limit to 200 bytes. See "output template" section of readme for a list of the custom formatters yt-dlp provides

Thank you for this feature. I figured I'd leave a comment with some things I found unclear myself and had to check so if anybody else comes across this thread they don't need to check it for themselves:

The readme states yt-dlp additionally supports converting to B = Bytes, but the end result is still a filename with correct unescaped unicode in it, not something like \xf0\x9f\xa6\x84. The readme isn't wrong, but the "converts" part can be interpreted in multiple ways.
The bytes are truncated up to the last valid byte, so a string such as 🦄🦄 with a formatting of 6B will only take the first emoji (4 bytes) and result in 🦄 without leaving half of the other one like 🦄\xf0\x9f.
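The truncation behaviour described above can be reproduced in plain Python. This is a sketch of the observed result, not necessarily how yt-dlp implements the B formatter; `truncate_bytes` is a hypothetical name.

```python
# Sketch: cut a string to a byte budget without leaving half a character.
# Encoding to UTF-8, slicing, then decoding with errors="ignore" drops any
# incomplete multi-byte sequence left at the end of the slice.


def truncate_bytes(text, max_bytes):
    return text.encode("utf-8")[:max_bytes].decode("utf-8", "ignore")


two_unicorns = "\U0001F984\U0001F984"      # 2 characters, 8 bytes
print(len(truncate_bytes(two_unicorns, 6)))  # 1: the second emoji is dropped
print(len(truncate_bytes(two_unicorns, 8)))  # 2: both fit
print(len(truncate_bytes(two_unicorns, 3)))  # 0: not even one fits
```

The result is always a valid string whose encoded length is at most the requested number of bytes, and it may be a few bytes shorter than the budget.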

erlenmayr · 2022-08-25T08:31:47Z

This happens with Twitter videos often because it uses the whole tweet as filename. How about just using the Twitter handle plus tweet ID and then crop the Tweet?

rpdelaney · 2023-04-11T18:02:11Z

Thanks to comments in this thread I got this working:

$ yt-dlp -o '%(title).200B.%(ext)s' '<url>'

Hypothetically, if I wanted to be able to tell which files have truncated names, is it possible to add something to indicate that? Could be an … (0x2026) or w/e, doesn't matter.

pukkandan · 2023-04-11T18:30:03Z

Add a %(title.201&…|)s - read as "if title.201 (201'th char) exists, then (&) add …, else (|) add nothing"

rpdelaney · 2023-04-11T18:45:46Z

Awesome. At first I misunderstood you, but this seems to be working great!

yt-dlp --output '%(title).200B%(title.201&…|)s.%(ext)s'

Edit:
This seems better: count bytes not characters when adding the ellipsis too. (See below)

yt-dlp --output '%(title).200B%(title.201B&…|)s.%(ext)s'

chrizilla · 2023-04-26T20:31:11Z

yt-dlp --output '%(title).200B%(title.201&…|)s.%(ext)s'

@pukkandan : isn't this mixing apples and oranges (200 bytes and 201st character) ?

rpdelaney · 2023-04-27T16:03:04Z

@chrizilla possibly. Do you have any suggestions?

in #6882 this was suggested but I haven't tried it myself: -o "%(title).150B [%(id)s].%(ext)s" I suppose it's unlikely that the id or the extension would have many surprises, though.

chrizilla · 2023-04-27T16:16:57Z

@rpdelaney : If your title consists of ASCII characters only, I guess the 201st character is the 201st byte. But for other Unicode characters, we probably need to ask whether the 201st byte is present, not the 201st character. But I haven't looked into whether and how this can be done. Maybe %(title.201B&…|)s ?
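The mismatch between counting bytes and counting characters can be illustrated outside the template engine. The sketch below is a plain-Python approximation of the two checks, not yt-dlp's template syntax or implementation; all function names are hypothetical.

```python
# Sketch: truncate by bytes, then decide whether to append an ellipsis.
# name_char_check decides by character count (the mixed variant);
# name_byte_check decides by whether truncation actually removed anything.

ELLIPSIS = "\u2026"


def truncate_bytes(text, max_bytes):
    return text.encode("utf-8")[:max_bytes].decode("utf-8", "ignore")


def name_char_check(title, max_bytes=200):
    head = truncate_bytes(title, max_bytes)
    return head + (ELLIPSIS if len(title) > max_bytes else "")


def name_byte_check(title, max_bytes=200):
    head = truncate_bytes(title, max_bytes)
    return head + (ELLIPSIS if head != title else "")


title = "\u65e5" * 100  # 100 characters, 300 bytes
print(name_char_check(title).endswith(ELLIPSIS))  # False: truncated, unmarked
print(name_byte_check(title).endswith(ELLIPSIS))  # True
```

For ASCII-only titles the two variants agree. Note that the ellipsis itself takes 3 bytes in UTF-8, so the byte budget needs to leave room for it.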

I suppose it's unlikely that id or ext would have many surprises, though.

Depends. For example, the id for this CNN video is 91 bytes/chars long: 😮
id = world/2023/05/03/russia-attempted-drone-attack-kremlin-putin-video-ukraine-reaction-vpx.cnn

rpdelaney · 2023-04-27T16:32:30Z

At least on Linux, I know that filename length limits are counted in raw bytes, since no specific character encoding is enforced for a filesystem. Counting bytes would be safest.
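On Unix-like systems the per-name byte limit can be queried rather than guessed. This is a minimal sketch using `os.pathconf`, which is not available on Windows; `name_max` and `fits` are hypothetical helpers, and `fits` expects a single filename, not a full path.

```python
import os


def name_max(directory="."):
    """Return the filesystem's limit for one name component, or None if unknown."""
    try:
        return os.pathconf(directory, "PC_NAME_MAX")
    except (AttributeError, OSError, ValueError):
        # AttributeError: os.pathconf is missing on this platform.
        # ValueError: the platform does not know the PC_NAME_MAX name.
        return None


def fits(filename, directory="."):
    """Check the encoded byte length of a single filename against the limit."""
    limit = name_max(directory)
    if limit is None or limit < 0:
        return True  # no limit could be determined, so nothing to compare
    return len(os.fsencode(filename)) <= limit
```

Because the limit belongs to the target filesystem, the check has to be made against the directory the file will be written to.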

rpdelaney · 2023-04-28T15:56:02Z

@chrizilla How about this then?

yt-dlp --output '%(title).200B%(title.201B&…|)s.%(ext)s'

chrizilla · 2023-04-28T15:58:06Z

@rpdelaney : Does it work for you? Yes, this was my suggestion, but upon closer inspection it doesn't work for me. I am not sure this is really on-topic here, so I opened #6983.

kenorb · 2023-12-13T00:55:04Z

Same for https://twitter.com/BrainStorm_Joe/status/1734386440706953260.

ERROR: unable to open for writing: [Errno 36] File name too long

pukkandan · 2024-03-27T00:20:20Z

Everyone agrees this is a good idea. The reason this is not implemented is mostly due to technical reasons and partly due to compatibility reasons. More examples aren't helping.

pukkandan changed the title ~~Long titles and IDs can result in filenames that are too long for the OS~~ [Feature request] Handle Long filenames in default template Oct 1, 2021

pukkandan added the enhancement New feature or request label Oct 1, 2021

pukkandan changed the title ~~[Feature request] Handle Long filenames in default template~~ [Feature request] Handle Long filenames in default template and temporary files Oct 1, 2021

This was referenced Oct 14, 2021:
[Bug] FFmpeg not finding files downloaded by yt-dlp #1273 (Closed)
[Broken] Twitter: File name too long error #1280 (Closed)

hairycactus mentioned this issue Nov 28, 2021:
Portable Yt-dlp Unable to Detect .CONF Config File #1820 (Closed)

pukkandan mentioned this issue Dec 12, 2021:
[Funimation] Error download fragment #1910 (Closed)

pukkandan mentioned this issue Dec 22, 2021:
Download fails when the title of the video is too long (or contains invalid characters) #2088 (Closed)

Wikinaut mentioned this issue Jan 10, 2022:
shorten filenames in the middle if needed (linux only) #2291 (Closed)

pukkandan mentioned this issue Jan 13, 2022:
Unable to download videos due to "File name too long" #2329 (Closed)

pukkandan mentioned this issue Feb 6, 2022:
[twitter] %(title)s contains another %(uploader)s, even template already has a %(uploader)s #2587 (Closed)

pukkandan mentioned this issue Mar 4, 2022:
Reddit: ERROR: unable to download video data #2936 (Closed)

pukkandan mentioned this issue May 25, 2022:
Fails to save JSON file when filename which is retrieved from the source is invalid #3867 (Closed)

This was referenced Aug 7, 2022:
🚨[IMPORTANT]🚨 KNOWN ISSUES/FAQ #3766 (Open)
Split Chapters -- Chapter Title Too Long #4582 (Closed)

This was referenced Nov 29, 2022:
--split-chapters fails when chapter name is very long #5668 (Closed)
--trim-filenames fails to clip file name properly when there is a dot in a video title #2314 (Open)
Twitter video does not work #5695 (Closed)

bashonly mentioned this issue Feb 23, 2023:
Twitter Download does not work for videos with age restriction #6333 (Closed)

pukkandan mentioned this issue Feb 27, 2023:
Facebook reel issue #6366 (Closed)

bashonly mentioned this issue Apr 21, 2023:
ERROR: unable to open for writing: [Errno 36] File name too long xxx #6882 (Closed)

This was referenced Apr 29, 2023:
struggling to find the right stage for --print-to-file comments #6939 (Closed)
intelligent filename trim #6974 (Open)
mark truncated filenames with ellipsis "…" in -o template if length exceeds x bytes #6983 (Open)

pukkandan mentioned this issue Jun 27, 2023:
yt-dlp has troubles downloading YouTube videos on Windows #7435 (Closed)

bashonly mentioned this issue Jul 27, 2023:
Video slug not sanitized resulting in failure to download Twitter video #7709 (Closed)

pukkandan mentioned this issue Jul 29, 2023:
File naming does not support emoji #7721 (Closed)

bashonly mentioned this issue Aug 4, 2023:
Cannot download twitter videos with emojis in the text/filename #7772 (Closed)

bashonly mentioned this issue Dec 6, 2023:
File saving bug in Twitter video (might be a Core bug) #8723 (Closed)

bashonly mentioned this issue Dec 19, 2023:
Filename too long #8789 (Closed)

bashonly mentioned this issue Mar 27, 2024:
[Twitter/X] Fragment downloading fails when tweets are too long #9545 (Closed)
