./updateHostsFile.py --auto crashes when run under path contains CJK characters using Python 2.7.12 #316
Comments
@Vdragon : The scripts don't have a help message, which is why the |
I just noticed the Anyway IMO updateHostsFile.py shouldn't crash even when it is called in a wrong way |
@Vdragon : It's not a "wrong" way (the invocation is perfectly fine). However, it's because you have a LOCALE issue that's preventing the script from running when we try to construct the file path. I suspect it's because you are using Python 2.x instead of Python 3.x, which will handle the CJK characters MUCH BETTER than Python 2.x can. |
I've updated the issue, as it can be simplified. Unless we no longer support Python 2, it's still a valid issue. |
It's more that it's an issue with Python 2.x that's making this very difficult to execute. Try running the following commands FROM the directory of
>>> import os
>>> current_dir = os.getcwd()
>>> os.path.join(current_dir, "data")
What happens? |
> python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> current_dir = os.getcwd()
>>> os.path.join(current_dir, "data")
'/home/<username>/\xe5\xb7\xa5\xe4\xbd\x9c\xe7\xa9\xba\xe9\x96\x93/\xe7\xac\xac\xe4\xb8\x89\xe6\x96\xb9\xe5\xb0\x88\xe6\xa1\x88/Unified hosts file with base extensions - Extending and consolidating hosts files from several well-curated sources like adaway.org, mvps.org, malwaredomainlist.com, someonewhocares.org, and potentially others./data' |
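The escaped bytes in that repr are what a Python 2 byte string (`str`) looks like when it holds UTF-8 encoded text. As a small sketch, the first two directory names can be decoded by hand to confirm they are ordinary UTF-8 (the variable names here are illustrative only, and only the bytes copied from the output above are used):

```python
# -*- coding: utf-8 -*-
# Bytes copied from the repr above (first two directory names only).
first_dir = b"\xe5\xb7\xa5\xe4\xbd\x9c\xe7\xa9\xba\xe9\x96\x93"
second_dir = b"\xe7\xac\xac\xe4\xb8\x89\xe6\x96\xb9\xe5\xb0\x88\xe6\xa1\x88"

# Decoding with UTF-8 turns each 3-byte sequence into one CJK character.
decoded_first = first_dir.decode("utf-8")
decoded_second = second_dir.decode("utf-8")

# Report sizes only, so the script stays safe on an ASCII-only console.
print("%d bytes -> %d characters" % (len(first_dir), len(decoded_first)))
print("%d bytes -> %d characters" % (len(second_dir), len(decoded_second)))
```

So `os.getcwd()` itself is returning valid data here; the path is simply a byte string rather than a `unicode` object.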
Okay, can you add the following line of code above this line in the script (line 146 - 147) and run?
print(BASEDIR_PATH)
print(options["outputsubfolder"])
# This line already exists in the code, just wanted to indicate where!
options["outputpath"] = path_join_robust(BASEDIR_PATH, options["outputsubfolder"])
I'd be interested to see what gets printed. |
> python updateHostsFile.py --auto
/home/<username>/工作空間/第三方專案/Unified hosts file with base extensions - Extending and consolidating hosts files from several well-curated sources like adaway.org, mvps.org, malwaredomainlist.com, someonewhocares.org, and potentially others.
Traceback (most recent call last):
  File "updateHostsFile.py", line 791, in <module>
    main()
  File "updateHostsFile.py", line 148, in main
    options["outputsubfolder"])
  File "updateHostsFile.py", line 766, in path_join_robust
    "likely a LOCALE issue:\n\n" + str(e))
locale.Error: Unable to construct path. This is likely a LOCALE issue:
'ascii' codec can't decode byte 0xe5 in position 18: ordinal not in range(128) |
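The `'ascii' codec can't decode byte 0xe5` message is what Python produces when UTF-8 path bytes are decoded as ASCII, which Python 2 does implicitly when a byte string is combined with a `unicode` object. A minimal sketch that triggers the same decode explicitly (the helper name `ascii_decode_error` and the sample path `/home/user/...` are assumptions for illustration, not taken from the script):

```python
# -*- coding: utf-8 -*-

def ascii_decode_error(raw_path):
    """Return the error text from decoding raw_path as ASCII, or None if it decodes."""
    try:
        raw_path.decode("ascii")
    except UnicodeDecodeError as e:
        return str(e)
    return None

# Hypothetical path: an ASCII prefix followed by a UTF-8 encoded CJK directory name.
raw = u"/home/user/\u5de5\u4f5c\u7a7a\u9593".encode("utf-8")

message = ascii_decode_error(raw)
print(message)

# A pure ASCII path decodes without any error.
print(ascii_decode_error(b"/home/user/data"))
```

The reported position is the offset of the first non-ASCII byte, which is why the error only appears when the working directory contains characters outside ASCII.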
Interesting...it's printing the CJK correctly now. Could you add two more lines as follows:
print(type(BASEDIR_PATH))
print(type(options["outputsubfolder"]))
print(BASEDIR_PATH)
print(options["outputsubfolder"])
# This line already exists in the code, just wanted to indicate where!
options["outputpath"] = path_join_robust(BASEDIR_PATH, options["outputsubfolder"]) |
@Vdragon : Any updates on this one? I unfortunately can't debug this locally, so I would need your feedback to diagnose the issue (or figure out a workaround). |
Stuck on the indentation error for a bit, anyway:
|
Okay. I think we can now replicate the
import os
BASEDIR_PATH = os.path.dirname(os.path.realpath(__file__))
print(os.path.join(BASEDIR_PATH, unicode("")))
and then run this with Python 2.x? |
Weird.
|
There is no line 5 even...can you |
|
Hmm...not sure why I didn't catch it earlier...there's a parenthesis missing at the end...oops 😄 Just add a |
|
There we go! That's the error I was looking for. So now that we have isolated the error, can you try this:
import os
BASEDIR_PATH = unicode(os.path.dirname(os.path.realpath(__file__)))
print(os.path.join(BASEDIR_PATH, unicode("")))
If this works, can you find this line in
BASEDIR_PATH = os.path.dirname(os.path.realpath(__file__))
and replace it with
BASEDIR_PATH = unicode(os.path.dirname(os.path.realpath(__file__)))
and try running the script again? |
|
I reported similar issues on another Python application, Git Cola, a while ago; maybe the fixes for those issues can help: |
I'll take a look, but how about trying one more thing:
import os
BASEDIR_PATH = os.path.dirname(os.path.realpath(__file__))
print(os.path.join(BASEDIR_PATH, str(unicode(""))))  # str(unicode(...)) is weird but just try
|
Okay, so what you can try now is changing this line in
options["outputpath"] = path_join_robust(BASEDIR_PATH, options["outputsubfolder"])
to this:
options["outputpath"] = path_join_robust(BASEDIR_PATH, str(options["outputsubfolder"]))
and run the script again. I think it will still crash, but the stacktrace should be different now. |
|
Awesome, so it looks like we need to ensure that the paths we are joining are all strings (they're currently |
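One way to make every path component the same type before joining is to convert byte strings to text first. The following is a minimal sketch under stated assumptions: the helper names `to_text` and `join_as_text` are hypothetical and not part of updateHostsFile.py, UTF-8 is assumed as the path encoding, and it normalizes toward text rather than using the `str(...)` casts tried in this thread.

```python
# -*- coding: utf-8 -*-
import os


def to_text(component, encoding="utf-8"):
    """Decode a byte string to text; leave text untouched.

    On Python 2, `bytes` is an alias for `str`, so plain strings are decoded.
    """
    if isinstance(component, bytes):
        return component.decode(encoding)
    return component


def join_as_text(components, encoding="utf-8"):
    """Join path components after converting all of them to one text type."""
    return os.path.join(*[to_text(c, encoding) for c in components])


# Hypothetical inputs: a byte-string base directory and a text subfolder.
base_dir = u"/home/user/\u5de5\u4f5c\u7a7a\u9593".encode("utf-8")
joined = join_as_text([base_dir, u"data"])
print(type(joined).__name__)
```

Decoding explicitly with a known encoding avoids relying on the implicit ASCII conversion that fails for non-ASCII paths.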
|
Okay! So not entirely sure where that |
@Vdragon : Okay, PR is up. Let me know if the patch works for you (and whether you still get that warning). |
|
I think that warning is because |
Actually, I've changed my mind on that. Can you find this line here:
if stem == "*"
and change it to be:
if stem == str("*")
on top of my changes and see if the warning goes away? |
Yep.
|
Closes StevenBlackgh-316. Former-commit-id: e983cb3 Former-commit-id: af7c10eda1667a10a498998bb48f3fb347b92089
Closes StevenBlackgh-316. Former-commit-id: 265b800 Former-commit-id: df22c169726f960134951f465ec8ca2ca3ffd982
Console Output
Reporter's Environment
Working Directory
Operating System: KDE Neon (based on Ubuntu 16.04 AMD64)
Python: Python 2.7.12
Unified hosts file with base extensions: commit 231dc43
Locale