Specifying -p on command line can lead to data corruption #9142

Closed
sasumner opened this issue Nov 12, 2020 · 1 comment

@sasumner
Contributor

sasumner commented Nov 12, 2020

Description of the Issue

Similar to the data corruption that can happen with the Goto dialog's Offset option (see #9101 and #9129 (comment)), the -p command line parameter can set the current position between the bytes of a multibyte UTF-8 character, or between the CR and LF of a Windows CRLF line ending. This should not be allowed to occur.

Multibyte UTF-8 characters should be considered "atomic" (I feel strongly about this), and so should Windows line endings (I feel less strongly about this, but still fairly strongly).
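
To make the "atomic" idea concrete, here is a minimal sketch of how a requested byte offset could be snapped back to a safe boundary before it is used as the caret position. This is not Notepad++ or Scintilla code; the function name `snap_to_boundary` and the choice to move backward rather than forward are assumptions made only for illustration.

```python
def snap_to_boundary(data: bytes, pos: int) -> int:
    """Return pos, moved backward if needed so that it does not split
    a multibyte UTF-8 character or a CRLF pair (hypothetical helper)."""
    # Clamp the requested offset to the buffer.
    pos = max(0, min(pos, len(data)))

    # UTF-8 continuation bytes look like 10xxxxxx. If the byte at pos is
    # one, pos is inside a character, so step back to its lead byte.
    while 0 < pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos -= 1

    # If pos sits between CR (0x0D) and LF (0x0A), step back before the CR.
    if 0 < pos < len(data) and data[pos - 1] == 0x0D and data[pos] == 0x0A:
        pos -= 1

    return pos
```

For example, `snap_to_boundary("\U0001F600".encode("utf-8"), 1)` returns 0, and `snap_to_boundary(b"ab\r\ncd", 3)` returns 2. Snapping forward to the end of the character or line ending would be an equally reasonable policy.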

Steps to Reproduce the Issue

  1. Turn visible line-endings on via View menu > Show symbol > Show End of Line
  2. Open test_4byte_utf8.txt file (attached, below), observe the UTF-8 character (after zooming):
    image
  3. Optional, using HexEditor plugin, look at hex view, observe:
    image
  4. Open test_crlf.txt file (attached, below), observe:
    image
  5. Close all files; quit Notepad++
  6. Run notepad++.exe -p1 test_4byte_utf8.txt from the command line, using the attached file of that name
  7. After the file loads but before doing anything else, type the letter a
  8. Observe that the data is corrupted: the 4-byte UTF-8 character has been split:
    image
  9. Repeat steps 5 through 7 using the test_crlf.txt file instead of the test_4byte_utf8.txt file in step 6.
  10. Observe that the line-endings, which should be CRLF, are "corrupted": one line-ending is a lone CR, the other a lone LF:
    image

Expected Behavior

No data corruption.

Actual Behavior

The data corruption shown in steps 8 and 10.
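
Both corruptions can be reproduced at the byte level without Notepad++. The sketch below uses stand-in data, not the contents of the attached test files (which are not shown here): a single 4-byte character (U+1F600) and a two-line CRLF string are assumptions chosen for illustration, as are the helper names `insert_at` and `is_valid_utf8`.

```python
def insert_at(data: bytes, pos: int, text: bytes) -> bytes:
    """Insert text into data at byte offset pos, as typing at the caret would."""
    return data[:pos] + text + data[pos:]


def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Stand-in for a file holding one 4-byte UTF-8 character (assumed content).
four_byte_char = "\U0001F600".encode("utf-8")  # bytes F0 9F 98 80

# Typing "a" at byte offset 1 lands inside the character and splits it.
corrupted = insert_at(four_byte_char, 1, b"a")

# Stand-in for a CRLF file (assumed content); offset 6 is between CR and LF.
crlf_text = b"line1\r\nline2"
split_eol = insert_at(crlf_text, 6, b"a")
```

Here `is_valid_utf8(corrupted)` is False, and `split_eol` is `b"line1\ra\nline2"`: the CRLF pair has become a lone CR followed later by a lone LF.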

Debug Information

Notepad++ v7.9.1 (64-bit)
Build time : Nov 2 2020 - 01:07:46
Path : C:\........\npp.7.9.1.portable.x64\notepad++.exe
Admin mode : OFF
Local Conf mode : ON
OS Name : Windows 10 Enterprise (64-bit)
OS Version : 1809
OS Build : 17763.1518
Current ANSI codepage : 1252
Plugins : mimeTools.dll NppConverter.dll NppExport.dll

Test files

test_4byte_utf8.txt
test_crlf.txt

@sasumner
Contributor Author

You will probably also want to tick the box for Use DirectWrite... in the Preferences.
