Use unicode for internal processing? #1

Closed
ms8r opened this issue Apr 11, 2015 · 2 comments

Comments

ms8r commented Apr 11, 2015

Firstly, many thanks for putting this up - I think it's brilliant. Beyond code refactoring, it is tremendously useful for people editing and revising books (with LaTeX, Markdown and/or reStructuredText sources). For this purpose it would be very helpful if repren could also handle non-ASCII characters (e.g. accented characters in foreign words).

Would you consider adding an --encoding option so that users can specify the file encoding? repren could then decode the input read from the pattern and input files to Unicode, do all internal processing in Unicode, and encode again when writing output.
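To make the proposal concrete, here is a minimal sketch of the decode, process, encode round trip described above. It is not repren's code: the function name `replace_in_text` and its `encoding` parameter are hypothetical, and `--encoding` is only the option being requested, not an existing flag.

```python
import re


def replace_in_text(data: bytes, pattern: str, replacement: str,
                    encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode, substitute on text, encode again."""
    # Decode the raw file contents; this raises UnicodeDecodeError
    # if the bytes are not valid in the given encoding.
    text = data.decode(encoding)
    # All matching happens on Unicode text, so "é" is one character.
    new_text = re.sub(pattern, replacement, text)
    # Encode back to the same encoding before writing output.
    return new_text.encode(encoding)


original = "un café crème".encode("utf-8")
result = replace_in_text(original, "café", "thé")
print(result.decode("utf-8"))
```

The same call with `encoding="latin-1"` would work on Latin-1 input, which is the case an explicit encoding option is meant to cover.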

ms8r (Author) commented Apr 11, 2015

Never mind... it already works fine for non-ASCII - I forgot to switch off expandtabs when editing my pattern file.

jlevy (Owner) commented Apr 11, 2015

I actually avoided giving the program knowledge of encodings, since binary files, weird encodings, malformed UTF-8, etc. would then cause random Python encoding exceptions. By working at the byte level it should work on anything. (Well, case-insensitive matching on non-ASCII chars would require encoding knowledge, but it doesn't handle that.) I've used it on Unicode files without problems, but if you have issues with encodings, or ideas/PRs to improve it in general, do let me know.

And thanks for the kind words -- glad it's useful!
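The byte-level trade-off described in this reply can be illustrated with a short sketch. This is not repren's implementation; `replace_bytes` is a hypothetical helper that only shows how Python's `re` module behaves on `bytes` patterns.

```python
import re


def replace_bytes(data: bytes, pattern: bytes, replacement: bytes,
                  ignore_case: bool = False) -> bytes:
    """Hypothetical helper: regex substitution directly on bytes."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.sub(pattern, replacement, data, flags=flags)


# Input with stray bytes that are not valid UTF-8; data.decode("utf-8")
# would raise UnicodeDecodeError, but byte-level matching never decodes.
data = b"\xff caf\xc3\xa9 \xc3"
print(replace_bytes(data, "café".encode("utf-8"), b"tea"))

# Case-insensitive matching on bytes folds ASCII letters only...
print(replace_bytes(b"Cafe", b"cafe", b"tea", ignore_case=True))

# ...so "é" does not match "É": their UTF-8 byte sequences differ and
# the regex engine has no encoding knowledge to relate them.
upper = "É".encode("utf-8")
print(replace_bytes(upper, "é".encode("utf-8"), b"e", ignore_case=True) == upper)
```

The first call succeeds on malformed input, and the last shows the limitation mentioned in the parenthetical: without encoding knowledge, case folding stops at ASCII.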

jlevy closed this as completed Apr 11, 2015
jlevy added a commit that referenced this issue Sep 3, 2015
Good start at #1.
Still could use a lot more.