config.from_pyfile crashes on Python 3 when source isn't encoded in default encoding #2118

Closed
xinyvz opened this issue Dec 20, 2016 · 24 comments

@xinyvz
commented Dec 20, 2016

When I read my instance config file, I get an error:

exec(compile(config_file.read(), filename, 'exec'), d.__dict__)
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 437: illegal multibyte sequence

Then I modified the code of config.from_pyfile to this:

with open(filename, 'rb') as config_file:

and the problem was resolved.

@wgwz

Contributor

commented Dec 20, 2016

exec(compile(config_file.read(), filename, 'exec'), d.__dict__)
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 437: illegal multibyte sequence
Then I modify the code of config.from_pyfile to this

Is this a traceback from something Flask-related?

@xinyvz

Author

commented Dec 21, 2016

Yes. Sorry, I forgot to say this.
My OS is Windows 10
Python is 3.5.1
Flask is 0.11

@wgwz

Contributor

commented Dec 21, 2016

What does your config file look like, and how are you calling it within your Flask app? If this is a bug in Flask and not just a problem in your own implementation, you should post the minimum needed to reproduce the error. We need a bit more information to judge what's going on.

@xinyvz

Author

commented Dec 21, 2016

This is my code.

app\__init__.py
# -*- coding: utf-8 -*-
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_login import LoginManager

app = Flask(__name__, instance_relative_config=True)
app.config.from_object('config.default')
instance_path = r'D:\Python\flask\brick\instance'
app.config.from_pyfile('setting.py', silent=False)
app.config.from_envvar('APP_CONFIG_FILE')
#
db = SQLAlchemy(app)
lm = LoginManager()
lm.init_app(app)

config\development.py
# -*- coding: utf-8 -*-
# development config
# SQLAlchemy
SQLALCHEMY_TRACK_MODIFICATIONS = True
SQLALCHEMY_ECHO = True
#debug
DEBUG = True

instance\setting.py
# -*- coding: utf-8 -*-
# instance/setting.py
# security
SECRET_KEY = 'xxxxxxx'
STRIPE_API_KEY = 'xxxxxxxxx'
# database
SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://user:pass@localhost/wemedia?charset=utf8'



config\default.py
# -*- coding: utf-8 -*-
# default config
import os

basedir = os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir))

This is the traceback:

D:\Python\flask\env\Scripts\python.exe D:/Python/flask/brick/run.py
Traceback (most recent call last):
  File "D:/Python/flask/brick/run.py", line 3, in <module>
    from wemediaapp import app
  File "D:\Python\flask\brick\wemediaapp\__init__.py", line 20, in <module>
    app.config.from_pyfile('setting.py', silent=False)
  File "D:\Python\flask\env\lib\site-packages\flask\config.py", line 130, in from_pyfile
    exec(compile(config_file.read(), filename, 'exec'), d.__dict__)
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 453: illegal multibyte sequence
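
For readers following along, here is a minimal sketch of what this traceback appears to show, assuming the config file is saved as UTF-8 while text mode falls back to GBK. The file contents and the helper names `read_as_text` and `load_as_bytes` are made up for illustration and are not part of Flask. `encoding="gbk"` is passed explicitly only to simulate the locale default on a Chinese-language Windows install.

```python
import os
import tempfile

# Hypothetical config source. "\u4e2d" is a single Chinese character in a
# comment; one non-ASCII character is enough to trigger the failure.
CONFIG_SOURCE = "# -*- coding: utf-8 -*-\n# \u4e2d\nSECRET_KEY = 'xxxxxxx'\n"


def read_as_text(path, encoding):
    """Read the file in text mode, decoding with the given encoding."""
    with open(path, encoding=encoding) as config_file:
        return config_file.read()


def load_as_bytes(path):
    """Read raw bytes and let compile() work out the source encoding."""
    namespace = {}
    with open(path, "rb") as config_file:
        exec(compile(config_file.read(), path, "exec"), namespace)
    return namespace


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "setting.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write(CONFIG_SOURCE)

    try:
        # Stand-in for the locale default encoding (assumption: GBK).
        read_as_text(path, "gbk")
        text_mode_failed = False
    except UnicodeDecodeError as exc:
        text_mode_failed = True
        print("text mode:", exc)

    settings = load_as_bytes(path)
    print("binary mode:", settings["SECRET_KEY"])
```

The text-mode read fails before `compile()` ever sees the source, which is why the `# -*- coding: utf-8 -*-` line at the top of the file does not help in that case.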

@xinyvz

Author

commented Dec 21, 2016

My Windows 10 is the Chinese-language version.

@wgwz

Contributor

commented Dec 21, 2016

I don't think you need this:

instance_path = r'D:\Python\flask\brick\instance'

but I don't see why that would cause the error.

Also, all config variable names need to be uppercase, so this line should be:

BASEDIR = os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir))
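
A small sketch of why the capitalization matters, assuming `from_object` copies only attributes whose names are all uppercase. The helper `uppercase_settings` and the sample values are hypothetical; the helper only mimics that filtering and is not Flask's own code.

```python
import types


def uppercase_settings(obj):
    """Collect only the all-uppercase attributes of obj."""
    return {key: getattr(obj, key) for key in dir(obj) if key.isupper()}


# Stand-in for a config module such as config/default.py.
module = types.ModuleType("default")
module.basedir = "/srv/app"   # lowercase: skipped
module.BASEDIR = "/srv/app"   # uppercase: kept
module.DEBUG = True

settings = uppercase_settings(module)
print(sorted(settings))  # ['BASEDIR', 'DEBUG']
```

A lowercase name is not an error in itself; under this assumption it is simply left out of the resulting config.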

@untitaker untitaker added the question label Dec 21, 2016

@xinyvz

Author

commented Dec 22, 2016

Yes, you are right. I am learning Flask, so I want to try out as many of its features as possible. I think the cause of this error is that the default encoding on the Chinese version of Windows 10 is GBK, and I don't know how to change it to UTF-8.
Searching online, I found this approach: "filename = open('hello.docx','rb')". I also saw that another Flask function, open_instance_resource(self, resource, mode='rb'), opens files this way.
So I think from_pyfile could use the same method: "open(filename, 'rb') as config_file".
Sorry, my English is not good, so I'm not sure whether I have expressed this correctly.

@wgwz

Contributor

commented Dec 22, 2016

I think we can probably close this for now. It seems to be just a matter of getting the encoding handled properly. Maybe this will help: http://stackoverflow.com/questions/23103485/

@lanfon72


commented Dec 23, 2016

Hi,
I think the issue is in config.from_pyfile: it implicitly uses the default encoding instead of explicitly setting the encoding to UTF-8.
The docstring of open() says:

if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.

I suggest fixing it to always use encoding="utf8", because on Windows the default encoding might be cp950, gbk, cp1252, etc.
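
To illustrate the point about open(), here is a rough sketch that prints the locale default and then reads a file with an explicit encoding. The file name and contents are invented for the example; the printed default will differ from machine to machine.

```python
import locale
import os
import tempfile

# The encoding open() falls back to in text mode when none is given.
# Platform dependent: for example cp936/gbk, cp950, cp1252 or UTF-8.
default_encoding = locale.getpreferredencoding(False)
print("locale default:", default_encoding)

# Hypothetical config text; "\u4e2d" is one Chinese character in a comment.
EXPECTED = "# \u4e2d\nDEBUG = True\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "setting.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write(EXPECTED)

    # Explicit encoding: the result no longer depends on the locale.
    with open(path, encoding="utf-8") as f:
        source = f.read()

print(source.splitlines()[1])
```

Passing `encoding="utf-8"` on both the write and the read is what makes the round trip independent of the machine it runs on.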

@xinyvz

Author

commented Dec 23, 2016

Hi,
http://stackoverflow.com/questions/23103485/
That answer is about string encoding, where the string is already in memory, but my problem happens when the file is opened, so it is not a good solution here.
What lanfon72 said is right.
config.from_pyfile only takes the filename (plus silent); you could add a parameter to specify the encoding.

@wgwz

Contributor

commented Dec 23, 2016

@xinyvz Is the issue resolved? If so, what solved it? It is not entirely clear from what you have said. Please share the solution for others who might find this thread while facing similar issues. And of course, close the issue if it's done.

@xinyvz

Author

commented Dec 24, 2016

I modified the flask.config.from_pyfile function, adding a parameter, mode='rb', and that solved the problem.
I'm sorry, I couldn't think of any other approach.

    def from_pyfile(self, filename, mode='rb', silent=False):
        """Updates the values in the config from a Python file.  This function
        behaves as if the file was imported as module with the
        :meth:`from_object` function.

        :param filename: the filename of the config.  This can either be an
                         absolute filename or a filename relative to the
                         root path.
        :param silent: set to ``True`` if you want silent failure for missing
                       files.

        .. versionadded:: 0.7
           `silent` parameter.
        """
        filename = os.path.join(self.root_path, filename)
        d = types.ModuleType('config')
        d.__file__ = filename
        try:
            with open(filename, mode) as config_file:
                exec(compile(config_file.read(), filename, 'exec'), d.__dict__)
        except IOError as e:
            if silent and e.errno in (errno.ENOENT, errno.EISDIR):
                return False
            e.strerror = 'Unable to load configuration file (%s)' % e.strerror
            raise
        self.from_object(d)
        return True
@wgwz

Contributor

commented Dec 24, 2016

Ok, nice. Let's close this.

@untitaker

Member

commented Dec 24, 2016

Not sure if this is actually resolved.

@davidism

Member

commented Dec 24, 2016

Seems the issue here is that the config file is encoded with utf8 but the machine's default locale is gbk. Either we allow passing an encoding to from_pyfile, or we require the encoding to be utf8. Opening the file in binary mode is not correct.

@ThiefMaster

Member

commented Dec 24, 2016

I think always using UTF-8 would be the cleanest solution.

@untitaker

Member

commented Dec 24, 2016

The encoding should be indicated at the top of the file using a comment # -*- coding: utf-8 -*-. I do think this issue is a Python 3 portability bug and that opening the file in binary mode is correct (as is the default in Python 2). compile() can deal with binary strings and apparently also can detect the encoding using that comment annotation.
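
A short sketch of the behaviour described here, assuming Python 3: `compile()` is given bytes and picks the source encoding from the coding comment, falling back to UTF-8 when there is none. The helper `exec_bytes` and the sample setting `NAME` are hypothetical.

```python
# "\u4e2d\u6587" is a two-character Chinese string used as a sample value.
VALUE = "\u4e2d\u6587"


def exec_bytes(raw, filename="<config>"):
    """Compile and run source given as bytes; return the resulting namespace."""
    namespace = {}
    exec(compile(raw, filename, "exec"), namespace)
    return namespace


# Source stored as GBK bytes, declared with a coding comment on line 1.
gbk_source = ("# -*- coding: gbk -*-\nNAME = '%s'\n" % VALUE).encode("gbk")
gbk_settings = exec_bytes(gbk_source)

# Source stored as UTF-8 bytes with no coding comment at all.
utf8_source = ("NAME = '%s'\n" % VALUE).encode("utf-8")
utf8_settings = exec_bytes(utf8_source)

print(gbk_settings["NAME"] == utf8_settings["NAME"] == VALUE)  # True
```

In both cases the decoding is done by the compiler from the bytes themselves, so the machine's locale does not enter into it.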

@untitaker

Member

commented Dec 24, 2016

So if I'm right @xinyvz shouldn't have this problem in Python 2.

@davidism

Member

commented Dec 24, 2016

If Python handles the coding comment when the file is opened in binary mode, then yeah, we should just always open in binary mode.

@xinyvz

Author

commented Dec 25, 2016

@untitaker I have not tried it in Python 2, but I think you are right.

@xinyvz xinyvz closed this Dec 25, 2016

@davidism davidism reopened this Dec 25, 2016

@xinyvz xinyvz closed this Dec 25, 2016

@davidism

Member

commented Dec 25, 2016

This isn't fixed, don't close it.

@davidism davidism reopened this Dec 25, 2016

@xinyvz

Author

commented Dec 25, 2016

@davidism OK reopen

@xinyvz

Author

commented Dec 25, 2016

I read the Python 3.5 docs. For the open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) function, they say:

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.

So I modified the flask.config.from_pyfile function again:

def from_pyfile(self, filename, mode='r', encoding='utf-8', silent=False):
    ......
    try:
        with open(filename, mode=mode, encoding=encoding) as config_file:

untitaker added a commit that referenced this issue Dec 25, 2016

@untitaker untitaker self-assigned this Dec 25, 2016

@untitaker

Member

commented Dec 25, 2016

Please try out #2123 @xinyvz

@untitaker untitaker added bug and removed question labels Dec 25, 2016

@untitaker untitaker changed the title from "config.from_pyfile read 'gbk' codec can't decode" to "config.from_pyfile crashes on Python 3 when source isn't encoded in default encoding" Dec 25, 2016

@untitaker untitaker closed this in 789715a Dec 26, 2016

@untitaker untitaker removed the in progress label Dec 26, 2016
