Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #19357 -- Allow non-ASCII chars in filesystem paths #571

Closed
wants to merge 2 commits into from

Conversation

Projects
None yet
2 participants
@claudep
Copy link
Member

commented Dec 2, 2012

No description provided.

@@ -75,3 +77,7 @@ def rmtree_errorhandler(func, path, exc_info):
# use the original function to repeat the operation
func(path)

def upath(path):
if not six.PY3:
return path.decode(sys.getfilesystemencoding())

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 7, 2012

Member

sys.getfilesystemencoding() may return None under Linux — see Python's docs. You need a fallback.

django.template.loaders.app_directories uses fs_encoding = sys.getfilesystemencoding() or sys.getdefaultencoding(). You may also hardcode or 'ascii' and or 'utf-8'.

I don't have any strong arguments one way or another. If I had to choose, I would use or sys.getdefaultencoding() for consistency with existing code.

os.environ['djangocompilemo'] = pf + '.mo'
os.environ['djangocompilepo'] = pf + '.po'
os.environ['djangocompilemo'] = force_str(pf + '.mo')
os.environ['djangocompilepo'] = force_str(pf + '.po')

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 7, 2012

Member

This code assumes that the filesystem uses 'utf-8', because that's what force_str defaults to.

Furthermore, under Python 2, when force_str (ie. force_bytes) is called on a bytestring with as encoding argument that isn't utf-8, it decodes the string with utf-8 and re-encodes it with the given charset. In this case, that isn't desireable.

I think it'd be safer to use:

os.environ['djangocompilemo'] = pf + str('.mo')
os.environ['djangocompilepo'] = pf + str('.po')

(assuming code works, I didn't test)

IIRC Python 3 automatically encodes environment variables set to unicode values. That techniques offloads the problem to the language :)

This comment has been minimized.

Copy link
@claudep

claudep Dec 7, 2012

Author Member

pf is already unicode here, so we can safely assume pf + '.mo' is always unicode. Now os.environ does not support setting unicode content with non-ascii. Should we once again retrieve fs_encoding to pass it to force_str (or add a npath (native path) utility in _os)?

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 8, 2012

Member

I see.

npath sounds good.

IMO the alternative is to explicitly encode under Python 2:

if not six.PY3:
    pf = pf.encode(fs_encoding)
os.environ['djangocompilemo'] = pf + str('.mo')
os.environ['djangocompilepo'] = pf + str('.po')

where fs_encoding would be imported from _os.

@@ -251,9 +251,9 @@ def __repr__(self):
urlconf_repr = '<%s list>' % self.urlconf_name[0].__class__.__name__
else:
urlconf_repr = repr(self.urlconf_name)
return force_str('<%s %s (%s:%s) %s>' % (

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 7, 2012

Member

I wouldn't swear this is 100% backwards-compatible.

But I'm also in favor of using force_str as sparingly as possible :)

This comment has been minimized.

Copy link
@claudep

claudep Dec 7, 2012

Author Member

This is explicitely tested in urlpatterns_reverse tests (test_resolver_repr), so hopefully it should not break anything.

management.call_command('loaddata', 'fixture5', verbosity=0, commit=False)
self.assertIn("Multiple fixtures named 'fixture5'", cm.exception.args[0])

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 7, 2012

Member

Minor: I would put this line inside the with block. Yes, cm leaks outside the block, but that isn't clean :)

There's a few other instances of this below.

@@ -115,7 +117,7 @@ def run_test(self, script, args, settings_file=None, apps=None):
del os.environ['DJANGO_SETTINGS_MODULE']
python_path = [project_dir, base_dir]
python_path.extend(ext_backend_base_dirs)
os.environ[python_path_var_name] = os.pathsep.join(python_path)
os.environ[python_path_var_name] = force_str(os.pathsep.join(python_path))

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 7, 2012

Member

For the reasons explained above, I would avoid force_str if possible. For instance, it's to override test_dir with a native string version at the top of the function:

    test_dir = os.path.dirname(os.path.dirname(__file__))
    project_dir = os.path.dirname(test_dir)
    base_dir = os.path.dirname(project_dir)

This comment has been minimized.

Copy link
@claudep

claudep Dec 7, 2012

Author Member

Right, we are not joining paths in this part of the code, so we can let native versions and tests still pass. I will remove the upath call here.

@@ -666,8 +667,8 @@ def test_get_format_modules_stability(self):
with self.settings(USE_L10N=True,
FORMAT_MODULE_PATH='regressiontests.i18n.other.locale'):
with translation.override('de', deactivate=True):
old = "%r" % get_format_modules(reverse=True)
new = "%r" % get_format_modules(reverse=True) # second try
old = str(get_format_modules(reverse=True))

This comment has been minimized.

Copy link
@aaugustin

aaugustin Dec 7, 2012

Member

At first sight, I don't understand why you replaced repr by str, but if you have a good reason go ahead.

This comment has been minimized.

Copy link
@claudep

claudep Dec 8, 2012

Author Member

get_format_modules returns a native string, as django/utils/formats.py has no unicode_literals import. But I could also have done str("%r")....

@claudep

This comment has been minimized.

Copy link
Member Author

commented Dec 8, 2012

Committed.

@claudep claudep closed this Dec 8, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.