Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_PostPathFormatter breaks Windows-style path #529

Closed
fireattack opened this issue Feb 22, 2020 · 7 comments · Fixed by #631
Closed

_PostPathFormatter breaks Windows-style path #529

fireattack opened this issue Feb 22, 2020 · 7 comments · Fixed by #631
Labels
bug Bug Windows Problem originates from using Microsoft Windows

Comments

@fireattack
Copy link
Collaborator

fireattack commented Feb 22, 2020

Describe the bug

_PostPathFormatter is used too aggressively in multiple places, which results in wrong folder name and structure.

Example no.1:

instaloader --dirname-pattern "D:\test" instagram

in command line on Windows, it will download files to .D꞉﹨test, instead of proper D:\test (because both : and \ are substituted).

Example no.2:

When downloading highlights, it by default download to a subfolder Path(name) / Path(user_highlight.title):

else Path(name) / Path(user_highlight.title))

However, this Path-object later is butchered by _PostPathFormatter at

dirname = _PostPathFormatter(item).format(self.dirname_pattern, target=target)

which broke all the \s on Windows.

image

My suggestion is to re-think where we want to substitute the invalid characters. Maybe only do it against filename/foldername (and should also replace "/" there for any OS, if we want to avoid unexpected subdirectory creation), not fullpath. Probably only against automatically generated fields, too.

Version: 4.3a1

@fireattack fireattack added the bug Bug label Feb 22, 2020
@fireattack fireattack changed the title Support Windows-style pathname in dirname-pattern Support Windows-style absolute path in dirname-pattern Feb 22, 2020
@fireattack fireattack changed the title Support Windows-style absolute path in dirname-pattern Support Windows-style path (backslash) in dirname-pattern Feb 22, 2020
@fireattack fireattack changed the title Support Windows-style path (backslash) in dirname-pattern Support Windows-style path in dirname-pattern Feb 22, 2020
@fireattack
Copy link
Collaborator Author

Also, I feel like full-width colon (U+FF1A) would be a better substitution for ":" than what we were using now (MODIFIER LETTER COLON, "U+A789"). It would be more consistent will others (we use full-width backslash, pipe, etc. already).

@fireattack fireattack changed the title Support Windows-style path in dirname-pattern _PostPathFormatter breaks Windows-style path Feb 22, 2020
@fireattack
Copy link
Collaborator Author

Updated the description and title to reflect some new information.

This problem is more prominent than I think; it happens all the time when you download highlights (which by default saved to a subdir in the main dir).

@blablable123456
Copy link

Theres a trick that I use for quite a while and it works wonders. Simple put a 1 at the end of line 671. Of course this is just a fix for this isolated issue. By the way, this issue of folder formatting doesnt bug on linux, only on windows.

else Path(name) / Path(user_highlight.title + '1'))

@fireattack
Copy link
Collaborator Author

fireattack commented Feb 22, 2020

Theres a trick that I use for quite a while and it works wonders. Simple put a 1 at the end of line 671. Of course this is just a fix for this isolated issue. By the way, this issue of folder formatting doesnt bug on linux, only on windows.

else Path(name) / Path(user_highlight.title + '1'))

This doesn't seem to work. From the look of it, it would just change the folder name to {title}1, has no effect on \ issue.

(And I just tried, it indeed didn't work.:

image

)

@fishermansfriendsachet
Copy link

I really hope that this gets patched soon or somebody finds a workaround. It's stopping me from downloading highlights.

@fireattack
Copy link
Collaborator Author

I really hope that this gets patched soon or somebody finds a workaround. It's stopping me from downloading highlights.

Just remove .replace('\\', '\ufe68') from

ret = ret.replace('\\', '\ufe68').replace('|', '\uff5c').replace('?', '\ufe16').replace('*', '\uff0a')

will fix it.

@Thammus Thammus added the Windows Problem originates from using Microsoft Windows label Mar 10, 2020
fireattack added a commit to fireattack/instaloader that referenced this issue May 22, 2020
…matter

This fixes issues of path problem on Windows such as instaloader#304, instaloader#529, etc.

Also made the following change:

1. Changes substitution of colon to full-width colon (U+FF1A) instead of MODIFIER LETTER COLON (U+A789) to be consistent with others (we use full-width backslash, pipe, etc. already).
1. Also replaces `\r` on Windows with space, just like `\n`.
@aandergr aandergr linked a pull request May 22, 2020 that will close this issue
aandergr pushed a commit that referenced this issue May 22, 2020
Only substitute characters from auto generated fields in _PostPathFormatter

This fixes issues of path problem on Windows such as #304, #529, etc.

Also made the following changes:

1. Changes substitution of colon to full-width colon (U+FF1A) instead of MODIFIER LETTER COLON (U+A789) to be consistent with others (we use full-width backslash, pipe, etc. already).
2. Also replaces `\r` on Windows with space, just like `\n`.
@aandergr
Copy link
Member

Thanks for reporting! This bug has been fixed with Version 4.4.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug Windows Problem originates from using Microsoft Windows
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants