Issue370 encoding cookie #371

tjguk · 2018-03-06T13:41:34Z

DO NOT MERGE

This is a PoC PR to test the approach advocated in #370

There are a number of unresolved issues, including:

(My answers in square brackets)

Do we inject a missing encoding cookie when we load? [Yes]
Do we honour/retain an existing encoding after loading? [No]
Does our [New] file become the single-line encoding cookie? [Not sure]
Do we add an explanatory comment to the end of the cookie line? [Yes]
What do we put on the error dialog?
What happens when we extract a hex file? [I've left it untouched for now; not sure]
What's the internal newline policy when manipulating the text? [\n -- think it's worth specifying that somewhere; I note the newline='' from "Maintain OS specific end-of-lines in code files" #133]
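To make the first few questions concrete, here is a minimal sketch of cookie detection and injection. It is not the code in this PR: the function names, the `COOKIE_LINE` wording and the shebang handling are all assumptions, and the regular expression is the one given in PEP 263. It also leaves an existing cookie alone, so it does not settle the "honour/retain" question.

```python
import re

# PEP 263 pattern for an encoding declaration on line 1 or 2.
COOKIE_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
# Hypothetical cookie text, including a trailing explanatory comment.
COOKIE_LINE = "# -*- coding: utf-8 -*-  # Encoding cookie added by Mu Editor"
NEWLINE = "\n"  # assumed internal newline convention


def find_cookie(text):
    """Return the encoding declared in the first two lines, or None.

    Simplified: unlike the real tokenizer, this does not check that
    line 1 is blank or a comment before looking at line 2.
    """
    for line in text.split(NEWLINE)[:2]:
        match = COOKIE_RE.match(line)
        if match:
            return match.group(1)
    return None


def inject_cookie(text):
    """Prepend a UTF-8 cookie when the text does not already declare one."""
    if find_cookie(text) is not None:
        return text
    lines = text.split(NEWLINE)
    # Keep a shebang on line 1; the cookie is still valid on line 2.
    position = 1 if lines[0].startswith("#!") else 0
    lines.insert(position, COOKIE_LINE)
    return NEWLINE.join(lines)
```

Calling `inject_cookie` twice on the same text is a no-op the second time, which matters if the cookie is injected on every load.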

I've also added a "design note" under the new design folder in docs/. It's currently orphaned, but it's something I mooted a while ago and never got around to pursuing. If people like the idea, I'll add a design.rst to collect them together.

Refactor the file-saving mechanism so that it's consistent between, e.g., save and autosave. Also slightly rework write_and_flush to reflect that it's being passed a file object, not a file descriptor. (In principle the os.fsync ought to need the fileobj to call its .fileno() method, but it seems to be silently converting under the covers. I've confirmed, on Windows at least, that the correct underlying system API [FlushFileBuffers] is being called against the correct file path.)
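For illustration, a sketch of what a file-object-based `write_and_flush` might look like. Only the name `write_and_flush` comes from the description above; the body and the `save_text` helper are assumptions, not the code in this PR. It calls `.fileno()` explicitly rather than relying on `os.fsync` converting the file object.

```python
import os


def write_and_flush(fileobj, content):
    """Write content to an open file object and push it through to disk."""
    fileobj.write(content)
    fileobj.flush()             # empty Python's own buffer first
    os.fsync(fileobj.fileno())  # then ask the OS to commit its buffers


def save_text(path, text, newline="\n"):
    """Hypothetical single entry point that save and autosave could share."""
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        write_and_flush(f, text)
```

Routing both save and autosave through one helper like `save_text` means the encoding and newline handling only need to be right in one place.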

ntoll · 2018-03-06T17:58:52Z

This pull request introduces 1 alert when merging cd2b1ca into 8236b4b - view on lgtm.com

new alerts:

1 for Encoding error

Comment posted by lgtm.com

carlosperate · 2018-03-07T12:04:20Z

I haven't had a chance yet to look into the code or the CI failures, but if you are printing or displaying any kind of Unicode on AppVeyor, it's likely you'll have to add this before running the tests:

- cmd: chcp 65001

tjguk · 2018-03-07T12:05:35Z

Thanks, @carlosperate . I'd leave it for now; I'm still working through issues. I pushed it just so others could see the approach / progress.

ntoll · 2018-03-09T11:40:11Z

mu/__init__.py

+
+# Configure locale and language
+# Define where the translation assets are to be found.
+localedir = os.path.join('mu', 'locale')


This needs to be the same as what is currently (in master) in app.py:

localedir = os.path.abspath(os.path.join(os.path.dirname(__file__), 'locale'))

Thanks; fixed
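A short sketch of why the absolute form matters: a bare `os.path.join('mu', 'locale')` is resolved against the current working directory, so it only works when Mu is launched from the repository root. The helper names and the "mu" gettext domain below are assumptions for illustration.

```python
import gettext
import os


def locale_dir(module_file):
    """Absolute path to the 'locale' folder next to the given module file."""
    return os.path.abspath(os.path.join(os.path.dirname(module_file), "locale"))


def load_translation(module_file, domain="mu"):
    # fallback=True yields NullTranslations when no catalogue is found,
    # rather than raising. The "mu" domain name is an assumption.
    return gettext.translation(
        domain, localedir=locale_dir(module_file), fallback=True
    )


# In a package's __init__.py this would be used as:
#     _ = load_translation(__file__).gettext
```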

ntoll · 2018-03-09T11:42:38Z

Wow... this is great work. Please do say when it's ready for review etc... :-)

tjguk · 2018-03-09T11:58:28Z

So obviously this PR has sprawled somewhat. I've yet to devise tests (or to fix those which now fail because of the injection of the encoding cookie). I've raised a couple of separate issues to highlight problems which I've attempted to address in the course of the PR:

Issue #380 -- standardising on \n for newlines internally to Mu while honouring the original data
Issue #379 -- refactoring gettext support so that logic.py can be imported on its own
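As a sketch of the idea behind #380 (not the code in this PR): detect the file's newline convention from the raw bytes on load, work with \n internally, and restore the original convention on save. The function names are hypothetical, UTF-8 is assumed for decoding, and bare \r line endings and mixed conventions are ignored for brevity.

```python
import os

NEWLINE = "\n"  # internal convention


def detect_newline(raw):
    """Guess the newline convention from bytes; default to the OS one."""
    if b"\r\n" in raw:
        return "\r\n"
    if b"\n" in raw:
        return "\n"
    return os.linesep


def load(path):
    """Return (text using \\n internally, original newline convention)."""
    with open(path, "rb") as f:
        raw = f.read()
    newline = detect_newline(raw)
    text = raw.decode("utf-8").replace(newline, NEWLINE)
    return text, newline


def save(path, text, newline):
    """Write text, translating each internal \\n back to the original."""
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        f.write(text)
```

Passing the detected string as `newline=` on write lets the text layer do the translation, so the rest of the code never needs to see anything but \n.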

I still have unresolved questions from above, and I'm conscious that this has turned into a set of changes which are likely to have widespread impact, as they address loading & saving of files. I haven't, for example, considered the effect on embedded boards where we're packaging and sending code down the wire.

My next step will be to add and fix up testing so we have a clean starting point for discussion, but please feel free to comment and/or challenge the approaches I've taken here. @carlosperate I'd like to hear from you especially on the line-ending change, as I wasn't 100% sure of the motivation behind your previous PR which used newline="".

tjguk · 2018-03-09T13:29:58Z

So testing just got a bit more complicated. Because I'm opening the file twice, once in binary, once in text mode, the current mock_open framework can't cope. (Took me about 20 mins to work out why my open(..., "rb") was not returning bytes!)
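A small reproduction of the surprise, assuming the tests patch `builtins.open` with `unittest.mock.mock_open` and a `str` for `read_data` (the file name and contents here are made up):

```python
from unittest import mock

with mock.patch("builtins.open", mock.mock_open(read_data="print('hello')\n")):
    with open("script.py", "rb") as f:
        data = f.read()

# The mock ignores the mode argument: read_data was a str, so a str
# comes back even though the file was "opened" in binary mode.
print(type(data))
```

The same mock object answers every `open` call, so a binary read followed by a text read cannot return different types from one `read_data` value.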

I remember discussing before the question of mocking, tho' I can't find the conversation anywhere. For now, I'm going to try a filesystem-based approach for a couple of tests, leaving the mocks in place for the Qt stuff. If that gives a cleaner result, I'll move to leave it in. Let's see..
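One possible shape for such a filesystem-based test, using a temporary directory instead of a mocked `open`. `read_both_ways` is a hypothetical stand-in for the loader, not a function in Mu.

```python
import os
import tempfile


def read_both_ways(path):
    """Stand-in for the loader: open once in binary, once in text mode."""
    with open(path, "rb") as f:
        raw = f.read()
    # newline="" disables newline translation, so \r\n survives the read.
    with open(path, encoding="utf-8", newline="") as f:
        text = f.read()
    return raw, text


def test_read_both_ways():
    with tempfile.TemporaryDirectory() as dirpath:
        path = os.path.join(dirpath, "script.py")
        with open(path, "wb") as f:
            f.write("print('h\u00e9llo')\r\n".encode("utf-8"))
        raw, text = read_both_ways(path)
    assert isinstance(raw, bytes)
    assert text == "print('h\u00e9llo')\r\n"
```

Because real bytes hit a real file, the test exercises the actual encoding and newline behaviour of `open` on each platform, which a mock cannot do.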

tjguk · 2018-03-16T09:01:02Z

This got mired in push races and merge conflicts, so I'm replacing it by several, smaller PRs including #389 and #390 with more to come.

tjguk added 14 commits March 2, 2018 21:16

Add a failing test for UTF-8 output

e6d0791

Add a failing test for UTF-8 output

61604ce

Merge branch 'master' into issue124-no-file-encoding-specified

05deedf

# Conflicts: # tests/test_logic.py

Sort out merge conflicts

b41e512

Fix comment in line with code change

2f7a323

Generate and detect various text encodings

991eef2

Temporary working folder for issue370

ee4d6a8

Design notes on using UTF-8

ac3a9f3

Merge branch 'issue124-no-file-encoding-specified' into issue370-enco…

f3335ec

…ding-cookie

First go at saving with UTF-8 and loading with BOM/cookie detection

152052b

Whoops. Put design note into docs/ where it should have been

1543a22

Fix a few issues from ad hoc testing

34fa51f

As a starting point for discussion, add a dialog box explaining that …

cd2b1ca

…the file being opened could not be decoded

tjguk added 2 commits March 6, 2018 18:41

Add an encoding cookie

b6b25d4

Show the effect of the different newline modes

89844eb

tjguk mentioned this pull request Mar 7, 2018

No file encoding specified #364

Closed

tjguk added 2 commits March 8, 2018 06:35

Flesh out the design note on line-endings

a51541a

Merge branch 'issue370-encoding-cookie' into line-endings

c3a5ad0

tjguk mentioned this pull request Mar 8, 2018

Mu does not support Chinse cha #376

Closed

tjguk added 5 commits March 9, 2018 07:45

Refactor i18n support so mu modules can be imported independently

90d239d

Add support for detecting newline convention

Continue adding newline support to Mu

3179a4a

Plus fixing encoding stuff which didn't work previously!

For now undo the removal of the i18n support (while not removing it f…

bec377a

…rom __init__.py)

Merge branch 'master' into issue370-encoding-cookie

532b3d4

Once again, remove the gettext support from app, now that it's in __i…

8ae1aad

…nit__.py Remove a debug trace from logic.py

ntoll reviewed Mar 9, 2018

View reviewed changes

tjguk mentioned this pull request Mar 9, 2018

gettext support only implicitly imported by logic.py #379

Closed

tjguk mentioned this pull request Mar 9, 2018

Mu has no standard line-ending for programmatic use #380

Closed

Address review comments

e66f5d5

tjguk closed this Mar 16, 2018

tjguk mentioned this pull request Mar 20, 2018

Encodings #399

Merged

tjguk deleted the issue370-encoding-cookie branch December 11, 2020 08:18
