[BUG] Problem with special characters in file name #663
Comments
I am pretty sure this is already managed on Windows in some way, but I currently don't have access to a Windows machine to check.
We do have logic to screen out invalid characters in file names, though I have no clue what is in that filename that is tripping Windows up.
I didn't notice the first time, but upon rereading the bug report, it seems that the error comes from using Linux to download the files to an exFAT filesystem. Linux is much looser about which characters are allowed in file names, but exFAT is a Microsoft creation that conforms to Windows rules. You're essentially tricking the BDFR into believing that it should follow Linux naming conventions, and then it errors out when the filesystem says no. The erroring out I can address, but the other issue isn't really something that can be easily fixed, and certainly not automatically. I don't know of a way to gather information on the filesystem that is being written to, especially since there are options like Samba, NFS, USB drives, disks, and a whole bunch of other network file-sharing protocols that may or may not expose the underlying filesystem to the user or to our queries. Maybe being able to force the Windows naming conventions in the configuration file would be a suitable workaround? I'm not sure.
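To make the "force Windows naming conventions" idea concrete, here is a minimal sketch of what such a sanitizer could look like. It is not BDFR's actual code: the function name `to_windows_safe_name` and the `replacement` parameter are hypothetical, and the sketch only covers the characters Windows reserves in file names plus trailing dots and spaces. Reserved device names such as `CON` or `NUL` and path length limits are not handled.

```python
import re

# Characters that Windows naming rules (and therefore exFAT/NTFS volumes)
# reject in file names: < > : " / \ | ? * and the ASCII control characters.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def to_windows_safe_name(name: str, replacement: str = "_") -> str:
    """Return `name` with Windows-forbidden characters replaced.

    Hypothetical helper for illustration only.
    """
    cleaned = _WINDOWS_FORBIDDEN.sub(replacement, name)
    # Windows also rejects names that end in a space or a dot.
    cleaned = cleaned.rstrip(" .")
    # Never return an empty file name.
    return cleaned or replacement
```

A configuration switch could then decide whether this function is applied on every platform, regardless of the operating system the tool is running on.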
Whoa, thanks for the heads-up on that, @Serene-Arc. I had been noticing files with incorrect names and was flabbergasted that they were downloading without anything showing up in the BDFR logs. It turns out that smbutil was transparently renaming them to macOS-friendly names, so when I listed the directory in zsh on macOS it showed the wrong names. Tonight I SSHed into the Linux VM that I run BDFR on, listed the directory there, and the names are perfect. Providing an option to force Windows naming in the configuration file would likely rectify the situation for any users accessing BDFR's results over SMB.
Thanks for adding an option to fix this issue, really appreciate your work.
Description
I noticed that bdfr crashes when attempting to download posts with special characters in the title which are not allowed in file names of some file systems (like "*\~). For example this post (ignore the necrophilia joke):
https://www.reddit.com/r/197/comments/xdi48h/
I tried to clone it with this command:
python -m bdfr clone -v -l https://www.reddit.com/r/197/comments/xdi48h/ ""
And got this error:
Note how downloading just fails, but archiving crashes bdfr, which is more annoying when downloading multiple posts.
I then noticed that this only happens when downloading to my USB stick, so it isn't easily reproducible: I work on Linux with old hardware and formatted the USB stick myself (exFAT). Therefore I'd suggest a more general way to deal with this kind of problem.
My suggestion would be one of the following: an option that automatically removes these problematic characters from file names; a catch so that archiving fails for this file but doesn't crash the whole process; or replacing these characters when the operation fails and then trying again. The second one would obviously be the easiest to implement and makes this problem more manageable; the first is more of a feature request.
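The second and third suggestions can be combined, roughly as sketched below. This is an illustration under stated assumptions, not BDFR's implementation: `write_with_fallback` and `_PROBLEM_CHARS` are hypothetical names, and the character set is the one reserved by Windows naming rules. The file is first written under its original name; if the filesystem refuses, the write is retried once with the problematic characters replaced, and if that also fails the error is logged and `None` is returned instead of letting the exception end the whole run.

```python
import logging
import re
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)

# Assumed set of problematic characters (the Windows-reserved ones).
_PROBLEM_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def write_with_fallback(directory: Path, file_name: str, data: bytes) -> Optional[Path]:
    """Write `data` into `directory`, retrying with a sanitized name on failure.

    Hypothetical helper for illustration only. Returns the path that was
    written, or None if every attempt failed.
    """
    candidates = [file_name]
    sanitized = _PROBLEM_CHARS.sub("_", file_name)
    if sanitized != file_name:
        candidates.append(sanitized)

    for candidate in candidates:
        target = directory / candidate
        try:
            target.write_bytes(data)
            return target
        except (OSError, ValueError) as error:
            # OSError: the filesystem rejected the name or path.
            # ValueError: Python rejects names with an embedded null byte.
            logger.warning("Could not write %s: %s", target, error)
    return None
```

A caller can skip the post when `None` comes back and carry on with the next one, which keeps a single bad title from aborting a multi-post download.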