Problem with long image file name #48

Closed
janniklasrose opened this issue Jul 14, 2021 · 6 comments
Labels: bug - error (Fails with an error), fixed (The issue was fixed.)

Comments

@janniklasrose
Contributor

The following Markdown file

# Test

![OK](https://img.shields.io/badge/-Python-black?style=flat-square&logo=Python)

![NOK](https://img.shields.io/badge/-Python-black?style=flat-square&logo=Python&logocolor=yellow&logowidth=40&link=https://www.python.org/doc/)

fails in WSL with the error:

OSError: [Errno 36] File name too long: 'images/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f2d507974686f6e2d626c61636b3f7374796c653d666c61742d737175617265266c6f676f3d507974686f6e266c6f676f636f6c6f723d79656c6c6f77266c6f676f77696474683d3430266c696e6b3d68747470733a2f2f7777772e707974686f6e2e6f72672f646f632f.svg'

Reason

This can be reproduced for any image with a sufficiently long file name, and in particular for badges from shields.io with a long list of parameters: the resulting file name is the hex encoding of the full URL, parameters included.

The bug appears to be due to the maximum path length limitation of 260 characters in Windows. Presumably this would also happen in plain Windows and not just WSL, but I could not reproduce it there because I hit a different error (UnicodeEncodeError) first.
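
For illustration, the failing name can be rebuilt in a few lines of Python. This is a minimal sketch, not code from gh-md-to-html: it assumes the name is the hex encoding of the URL's bytes plus a `.svg` extension (which matches the name in the error above), and that a single file name is capped at 255 bytes, as on ext4 and most common Linux file systems. `NAME_LIMIT` is an illustrative constant.

```python
# URL of the failing badge from the Markdown file above.
url = (
    "https://img.shields.io/badge/-Python-black?style=flat-square"
    "&logo=Python&logocolor=yellow&logowidth=40"
    "&link=https://www.python.org/doc/"
)

# Assumption: typical per-name limit on Linux file systems such as ext4.
NAME_LIMIT = 255

# Hex encoding doubles the length: every byte becomes two characters.
file_name = url.encode("ascii").hex() + ".svg"

print(file_name[:16])  # hex encoding of "https://"
print(len(url), len(file_name))
print("too long:", len(file_name.encode("ascii")) > NAME_LIMIT)
```

Because the encoding doubles the length, any URL longer than roughly 125 characters would exceed a 255-byte limit under these assumptions.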

Workaround

The problem can be circumvented by running gh_md_to_html test.md -i (deliberately leaving the optional argument to -i empty so that images are not downloaded), but this is not ideal.

Solution

See this commit in httpie for a possible workaround.
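
One general way to work around an over-long name is to truncate it while keeping the extension. The sketch below shows that idea only; it is not a copy of the httpie commit. The helper name `trim_file_name` and the default limit of 255 are assumptions, and lengths are counted in characters, which equals bytes for the ASCII names discussed here.

```python
import os


def trim_file_name(name: str, max_len: int = 255) -> str:
    """Shorten `name` to at most `max_len` characters, keeping its extension."""
    if len(name) <= max_len:
        return name
    stem, ext = os.path.splitext(name)
    # Keep the whole extension when it fits; otherwise cut the name as a whole.
    if len(ext) < max_len:
        return stem[: max_len - len(ext)] + ext
    return name[:max_len]


# A made-up over-long name in the style of the one from the error message.
long_name = "68" * 200 + ".svg"
short_name = trim_file_name(long_name)
print(len(long_name), len(short_name), short_name.endswith(".svg"))
```

Truncation has a cost: two long names that share a prefix can collide after trimming, so the caller would still need a way to tell them apart.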

@phseiff
Owner

phseiff commented Jul 14, 2021

Hello, and thanks for the detailed issue report!

The problem seems to be reproducible on Ubuntu Linux as well, not just on Windows; I'm getting the same error on my machine.

The underlying issue seems to be that gh-md-to-html usually determines the file name from the image URL, and falls back to the hash as a file name under certain circumstances (for example, if it can't determine a good file name from the URL). The hash is used in this case because it is kept around anyway, to ensure that two identical images are always saved under the same name, which reduces the disk space that image caching requires.

I should be able to solve this by determining the file name differently in these cases, though; I'm currently looking deeper into this.
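
As a rough illustration of "determining the file name differently", a name can be built from a short readable part of the URL plus a fixed-length digest, so its length stays bounded no matter how long the URL is. This is a sketch under stated assumptions and not the fix that was actually shipped: `image_file_name` is a hypothetical helper, it hashes the URL rather than the image content, and the 64-character cap is arbitrary.

```python
import hashlib
import os
import re
from urllib.parse import urlsplit


def image_file_name(url: str, max_len: int = 64) -> str:
    """Hypothetical helper: build a short, stable file name for `url`."""
    # Use only the last path segment; the query string is left out of the name.
    base = os.path.basename(urlsplit(url).path)
    stem, ext = os.path.splitext(base)
    # Replace anything outside a conservative character set.
    stem = re.sub(r"[^A-Za-z0-9._-]", "_", stem).strip("-_.")
    ext = re.sub(r"[^A-Za-z0-9.]", "", ext)[:8]
    # A digest of the full URL (query included) keeps distinct URLs apart.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    room = max_len - len(digest) - len(ext) - 1
    stem = stem[:room]
    return f"{stem}-{digest}{ext}" if stem else f"{digest}{ext}"


ok_url = "https://img.shields.io/badge/-Python-black?style=flat-square&logo=Python"
nok_url = ok_url + "&logocolor=yellow&logowidth=40&link=https://www.python.org/doc/"

ok_name = image_file_name(ok_url)
nok_name = image_file_name(nok_url)
print(ok_name)
print(nok_name)
```

Both badge URLs share the same path, so only the digest distinguishes their names. The shields.io URLs carry no file extension, so a real implementation would probably have to take it from the response's content type.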

(On a relatively unrelated side note, I'm glad to see that there are people who are actually using the image caching feature 😀 )

@phseiff
Owner

phseiff commented Jul 14, 2021

Okay, so, I figured out the root of the issue now. It actually had nothing to do with the hashes I mentioned, but rather with the fact that GitHub caches images itself, and that gh-md-to-html's image caching therefore accidentally wraps around GitHub's image caching, copying GitHub's absurdly long file names in the process.

I will fix this tomorrow (in 10-12 hours) and update you once it is done, if that's okay.

@phseiff added the "bug - error" label on Jul 14, 2021
@janniklasrose
Contributor Author

Awesome, thanks for investigating this so quickly!

@phseiff added the "fixed" label on Jul 15, 2021
@phseiff
Owner

phseiff commented Jul 15, 2021

The issue should be fixed now; you can update to v1.17.1 (e.g. via pip3 install --upgrade "gh-md-to-html>=1.17.1") to get the fix.

I also fixed how image names are generated; they should make much more sense now. You will have to delete the already-cached images (probably in the local images folder) to see this change, though. Not deleting them does no harm; already-cached images simply keep their counterintuitive, seemingly random file names.

@janniklasrose
Contributor Author

Great, I have tested it and it indeed solves the problem. Thank you!

The script now prints a bunch of output, though. Perhaps these prints should be inside an if DEBUG statement?

@phseiff
Owner

phseiff commented Jul 15, 2021

Oh yes, thanks for pointing that out!

I simply forgot to remove the print statements after I was done fixing the problem (which, of course, shouldn't happen in a published Python module). Some of them are now removed and the rest sit behind an if DEBUG check like you suggested, so you shouldn't see any unwanted output anymore.

You can update to v1.17.2 (e.g. via pip3 install --upgrade "gh-md-to-html>=1.17.2") to get the fix.
