Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Fixed #19357 -- Allow non-ASCII chars in filesystem paths #571

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
2 participants
Member

claudep commented Dec 2, 2012

No description provided.

@aaugustin aaugustin commented on an outdated diff Dec 7, 2012

django/utils/_os.py
@@ -75,3 +77,7 @@ def rmtree_errorhandler(func, path, exc_info):
# use the original function to repeat the operation
func(path)
+def upath(path):
+ if not six.PY3:
+ return path.decode(sys.getfilesystemencoding())
@aaugustin

aaugustin Dec 7, 2012

Owner

sys.getfilesystemencoding() may return None under Linux — see Python's docs. You need a fallback.

django.template.loaders.app_directories uses fs_encoding = sys.getfilesystemencoding() or sys.getdefaultencoding(). You may also hardcode or 'ascii' and or 'utf-8'.

I don't have any strong arguments one way or another. If I had to choose, I would use or sys.getdefaultencoding() for consistency with existing code.

@aaugustin aaugustin and 1 other commented on an outdated diff Dec 7, 2012

django/core/management/commands/compilemessages.py
@@ -41,8 +42,8 @@ def compile_messages(stderr, locale=None):
# command, so that we can take advantage of shell quoting, to
# quote any malicious characters/escaping.
# See http://cyberelk.net/tim/articles/cmdline/ar01s02.html
- os.environ['djangocompilemo'] = pf + '.mo'
- os.environ['djangocompilepo'] = pf + '.po'
+ os.environ['djangocompilemo'] = force_str(pf + '.mo')
+ os.environ['djangocompilepo'] = force_str(pf + '.po')
@aaugustin

aaugustin Dec 7, 2012

Owner

This code assumes that the filesystem uses 'utf-8', because that's what force_str defaults to.

Furthermore, under Python 2, when force_str (ie. force_bytes) is called on a bytestring with as encoding argument that isn't utf-8, it decodes the string with utf-8 and re-encodes it with the given charset. In this case, that isn't desireable.

I think it'd be safer to use:

os.environ['djangocompilemo'] = pf + str('.mo')
os.environ['djangocompilepo'] = pf + str('.po')

(assuming code works, I didn't test)

IIRC Python 3 automatically encodes environment variables set to unicode values. That techniques offloads the problem to the language :)

@claudep

claudep Dec 7, 2012

Member

pf is already unicode here, so we can safely assume pf + '.mo' is always unicode. Now os.environ does not support setting unicode content with non-ascii. Should we once again retrieve fs_encoding to pass it to force_str (or add a npath (native path) utility in _os)?

@aaugustin

aaugustin Dec 8, 2012

Owner

I see.

npath sounds good.

IMO the alternative is to explicitly encode under Python 2:

if not six.PY3:
    pf = pf.encode(fs_encoding)
os.environ['djangocompilemo'] = pf + str('.mo')
os.environ['djangocompilepo'] = pf + str('.po')

where fs_encoding would be imported from _os.

@aaugustin aaugustin commented on the diff Dec 7, 2012

django/core/urlresolvers.py
@@ -251,9 +251,9 @@ def __repr__(self):
urlconf_repr = '<%s list>' % self.urlconf_name[0].__class__.__name__
else:
urlconf_repr = repr(self.urlconf_name)
- return force_str('<%s %s (%s:%s) %s>' % (
@aaugustin

aaugustin Dec 7, 2012

Owner

I wouldn't swear this is 100% backwards-compatible.

But I'm also in favor of using force_str as sparingly as possible :)

@claudep

claudep Dec 7, 2012

Member

This is explicitely tested in urlpatterns_reverse tests (test_resolver_repr), so hopefully it should not break anything.

@aaugustin aaugustin commented on an outdated diff Dec 7, 2012

tests/modeltests/fixtures/tests.py
management.call_command('loaddata', 'fixture5', verbosity=0, commit=False)
+ self.assertIn("Multiple fixtures named 'fixture5'", cm.exception.args[0])
@aaugustin

aaugustin Dec 7, 2012

Owner

Minor: I would put this line inside the with block. Yes, cm leaks outside the block, but that isn't clean :)

There's a few other instances of this below.

@aaugustin aaugustin and 1 other commented on an outdated diff Dec 7, 2012

tests/regressiontests/admin_scripts/tests.py
@@ -115,7 +117,7 @@ def run_test(self, script, args, settings_file=None, apps=None):
del os.environ['DJANGO_SETTINGS_MODULE']
python_path = [project_dir, base_dir]
python_path.extend(ext_backend_base_dirs)
- os.environ[python_path_var_name] = os.pathsep.join(python_path)
+ os.environ[python_path_var_name] = force_str(os.pathsep.join(python_path))
@aaugustin

aaugustin Dec 7, 2012

Owner

For the reasons explained above, I would avoid force_str if possible. For instance, it's to override test_dir with a native string version at the top of the function:

    test_dir = os.path.dirname(os.path.dirname(__file__))
    project_dir = os.path.dirname(test_dir)
    base_dir = os.path.dirname(project_dir)
@claudep

claudep Dec 7, 2012

Member

Right, we are not joining paths in this part of the code, so we can let native versions and tests still pass. I will remove the upath call here.

@aaugustin aaugustin and 1 other commented on an outdated diff Dec 7, 2012

tests/regressiontests/i18n/tests.py
@@ -666,8 +667,8 @@ def test_get_format_modules_stability(self):
with self.settings(USE_L10N=True,
FORMAT_MODULE_PATH='regressiontests.i18n.other.locale'):
with translation.override('de', deactivate=True):
- old = "%r" % get_format_modules(reverse=True)
- new = "%r" % get_format_modules(reverse=True) # second try
+ old = str(get_format_modules(reverse=True))
@aaugustin

aaugustin Dec 7, 2012

Owner

At first sight, I don't understand why you replaced repr by str, but if you have a good reason go ahead.

@claudep

claudep Dec 8, 2012

Member

get_format_modules returns a native string, as django/utils/formats.py has no unicode_literals import. But I could also have done str("%r")....

Member

claudep commented Dec 8, 2012

Committed.

@claudep claudep closed this Dec 8, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment