Some unicode characters are replaced with bad characters when formatting document #2140

Closed
ghost opened this issue Dec 3, 2019 · 24 comments · Fixed by #2270
Labels
in editor (Relates to code editing or language features), is bug
Milestone

Comments

@ghost

ghost commented Dec 3, 2019

@ashamp commented on Nov 27, 2019, 5:02 PM UTC:

image

Can anybody tell me how to find which plugin causes this bug?

This issue was moved by DanTup from flutter/flutter#45707.

@ghost
Author

ghost commented Dec 3, 2019

@ashamp commented on Nov 27, 2019, 5:07 PM UTC:

This is a very serious bug.

@ghost
Author

ghost commented Dec 3, 2019

@ashamp commented on Nov 27, 2019, 5:18 PM UTC:

image

@ghost
Author

ghost commented Dec 3, 2019

@iapicca commented on Nov 28, 2019, 11:49 AM UTC:

Hi @ashamp
could you please provide your flutter doctor -v?
If you print the bad char do you get the correct output?
Thank you

@ghost
Author

ghost commented Dec 3, 2019

@ashamp commented on Nov 28, 2019, 3:45 PM UTC:

> Hi @ashamp
> could you please provide your flutter doctor -v?
> If you print the bad char do you get the correct output?
> Thank you

flutter doctor -v
[✓] Flutter (Channel stable, v1.9.1+hotfix.6, on Mac OS X 10.14.4 18E226, locale zh-Hans-CN)
• Flutter version 1.9.1+hotfix.6 at /Users/macbook/flutter
• Framework revision 68587a0 (3 months ago), 2019-09-13 19:46:58 -0700
• Engine revision b863200c37
• Dart version 2.5.0

[✓] Android toolchain - develop for Android devices (Android SDK version 29.0.1)
• Android SDK at /Users/macbook/Library/Android/sdk
• Android NDK location not configured (optional; useful for native profiling support)
• Platform android-29, build-tools 29.0.1
• Java binary at: /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/java
• Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)
• All Android licenses accepted.

[✓] Xcode - develop for iOS and macOS (Xcode 11.1)
• Xcode at /Applications/Xcode.app/Contents/Developer
• Xcode 11.1, Build version 11A1027
• CocoaPods version 1.8.4

[✓] Android Studio (version 3.5)
• Android Studio at /Applications/Android Studio.app/Contents
• Flutter plugin version 41.1.2
• Dart plugin version 191.8593
• Java version OpenJDK Runtime Environment (build 1.8.0_202-release-1483-b49-5587405)

[✓] VS Code (version 1.40.2)
• VS Code at /Applications/Visual Studio Code.app/Contents
• Flutter extension version 3.6.0

[✓] Connected device (1 available)
• iPhone 11 • 37700F44-81F6-408B-AA15-9D8E114DA4E4 • ios • com.apple.CoreSimulator.SimRuntime.iOS-13-1 (simulator)

• No issues found!

@ghost
Author

ghost commented Dec 3, 2019

@ashamp commented on Nov 28, 2019, 3:51 PM UTC:

I took a screenshot showing a toast. The message means "please enter the 6-digit mobile verification code".
image

@ghost
Author

ghost commented Dec 3, 2019

@zanderso commented on Dec 2, 2019, 6:15 PM UTC:

/cc @DanTup

@ghost
Author

ghost commented Dec 3, 2019

@DanTup commented on Dec 2, 2019, 7:34 PM UTC:

@ashamp am I understanding correctly that when you're running "Format Code" in VS Code, it is messing up the characters directly in the editor? (eg. it's not a Flutter bug, but VS Code or the VS Code extension?).

I tried here but have been unable to reproduce it. Does it occur in a small isolated .dart file (without Flutter)?

If so, could you paste some specific characters that trigger the issue for you and confirm which file encoding is shown in the VS Code status bar?

Screenshot 2019-12-02 at 7 30 31 pm

Can you also try running the Dart: Capture Logs command from the VS Code command palette (you can leave everything ticked) and then format the document (causing the bug), and then click Cancel on the logging notification to get a log file (note: this log may contain some of the contents of your source files - please check there is nothing sensitive before sharing).

Thanks!

@ghost
Author

ghost commented Dec 3, 2019

@ashamp commented on Dec 3, 2019, 9:52 AM UTC:

> @ashamp am I understanding correctly that when you're running "Format Code" in VS Code, it is messing up the characters directly in the editor? (eg. it's not a Flutter bug, but VS Code or the VS Code extension?).

I use Shift+Option+F to format the code before committing it to the git server, and that is when it messes up the characters directly in the editor.

> I tried here but have been unable to reproduce it. Does it occur in a small isolated .dart file (without Flutter)?

It does not always occur, and I have not found a pattern either.

> If so, could you paste some specific characters that trigger the issue for you and confirm which file encoding is shown in the VS Code status bar?
>
> Screenshot 2019-12-02 at 7 30 31 pm

The next time this bug occurs, I will try to save the file so that the bug can be reproduced. The encoding is UTF-8, and the file was created by VS Code.

> Can you also try running the Dart: Capture Logs command from the VS Code command palette (you can leave everything ticked) and then format the document (causing the bug), and then click Cancel on the logging notification to get a log file (note: this log may contain some of the contents of your source files - please check there is nothing sensitive before sharing).

OK, I will try this next time.
Thanks!

@ghost
Author

ghost commented Dec 3, 2019

@DanTup commented on Dec 3, 2019, 10:03 AM UTC:

Ok great. Let's move this to the Dart-Code repository since I don't think it's a Flutter bug.

/move Dart-Code/Dart-Code

DanTup added the "in editor" and "is bug" labels Dec 3, 2019
DanTup added this to the On Deck milestone Dec 3, 2019
DanTup added the "awaiting info" label (Requires more information from the customer to progress) Dec 3, 2019
@DanTup
Member

DanTup commented Dec 3, 2019

@ashamp please comment here next time you see this with the info above and a log. Thanks!

DanTup changed the title from "VSCode may cause bad char after format code!" to "Some unicode characters are replaced with bad characters when formatting document" Dec 3, 2019
@mstepanov214

mstepanov214 commented Feb 7, 2020

@DanTup I faced the same problem. As far as I have noticed, it appears in big files with more than 600 lines. In my case it affects Cyrillic characters.

image

image

Dart-Code-Log-2020-01-05 03-14-49.txt

@DanTup
Member

DanTup commented Feb 13, 2020

@mstepanov214 I'm having no luck reproducing - are you able to provide a complete file that you can reproduce this with that I can use? (I tried to create one from your log, but since the long lines are truncated I don't have the whole file). Thanks!

@mstepanov214

@DanTup I used this one. Just add a line break and format. It appears on the first try for me.
dart-formatter-bug-capture.dart.zip

@DanTup
Member

DanTup commented Feb 14, 2020

@mstepanov214 I'm still unable to repro :( I tried adding blank lines all over the place:

Screenshot 2020-02-14 at 11 53 09

Then ran Format Document, but there were no bad characters in the file as far as I can tell:

Screenshot 2020-02-14 at 11 53 37

Are you adding linebreaks (or seeing the characters) in any specific place?

@mstepanov214

mstepanov214 commented Feb 15, 2020

@DanTup It's weird, but I'll try to explain..
After several attempts (for this file), I noticed:

  • bad characters appear in certain strings at approximately the same interval
  • an extra line break should be added to make the formatter remove it
  • if the problem disappears, then it will return after restarting vscode

Short screen recording demonstrating this:
dart_formatter_bug_demo.mp4.zip

It probably occurs in other cases too.

@DanTup
Member

DanTup commented Feb 24, 2020

Thanks, I'm able to reproduce this now. The edit from the server looks good, but after it's applied I can see the characters are being inserted into the document:

1582548317722:Req:{"id":"101","method":"analysis.updateContent","params":{"files":{"/Users/danny/Desktop/dart_format_encoding_issue/lib/dart-formatter-bug-capture.dart":{"edits":[{"id":"","length":1,"offset":5026,"replacement":"��"},{"id":"","length":2,"offset":0,"replacement":""}],"type":"change"}}},"clientRequestTime":1582548317721}

I'll try to track down where it's coming from. Thanks!

DanTup added a commit that referenced this issue Feb 24, 2020
This seems to fix an issue where some unicode characters become "broken", likely because their bytes are split across buffers and we convert each buffer to a string before joining them back together.

Fixes #2140.
@DanTup
Member

DanTup commented Feb 24, 2020

Ok, I think I've got to the bottom of this, but it could do with some testing.

What I believe is happening, is that the (large) response from the analysis server with the formatted document is being "split" into multiple events from the stdout stream. We have code that concatenates these results until it gets one that ends with a newline, and then it's processed.

I think that these large packets are being split in the middle of multi-byte characters, and because we're converting them back to strings before joining them, we're ending up with a bad character at the end of the first part and a bad character at the start of the second part!
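The failure mode described here can be illustrated with a small sketch. This is not the extension's code: the sample string, the split position and the variable names are all assumptions chosen for the demonstration. It cuts the UTF-8 bytes of one emoji across two chunks and compares decoding each chunk separately against decoding after joining the bytes.

```typescript
// The monkey emoji (U+1F648) takes 4 bytes in UTF-8: f0 9f 99 88.
const whole = Buffer.from("a🙈b\n", "utf8");

// Pretend the stream delivered the data as two chunks, cutting the emoji in half.
const splitAt = 3; // 1 byte for "a" plus the first 2 of the emoji's 4 bytes
const first = whole.subarray(0, splitAt);
const second = whole.subarray(splitAt);

// Problematic approach: decode each chunk on its own, then join the strings.
// Each half of the emoji is invalid UTF-8 by itself, so it decodes to U+FFFD.
const joinedStrings = first.toString("utf8") + second.toString("utf8");

// Safer approach: join the raw bytes first, then decode once.
const joinedBytes = Buffer.concat([first, second]).toString("utf8");

console.log(JSON.stringify(joinedStrings)); // contains replacement characters
console.log(JSON.stringify(joinedBytes));   // the original text, intact
```

Whether a real chunk boundary lands inside a character depends on the exact byte length of everything before it, which would be consistent with the intermittent behaviour reported in this thread.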

I've made a fix that keeps these as Buffers and then uses Buffer.concat and in my testing, I don't seem to be able to repro (though as mentioned above, it's a bit intermittent - possibly due to varying lengths for things like the request ID / client timestamp).
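A rough sketch of the approach described above, assuming a hypothetical `LineBuffer` helper (the class, its method names and the sample message are illustrative, not taken from the Dart-Code source): chunks are kept as Buffers, and decoding only happens once a chunk ending in a newline arrives.

```typescript
// Hypothetical helper for illustration only; not the extension's actual code.
class LineBuffer {
  private pending: Buffer[] = [];

  // Feed one stdout chunk. Returns the complete lines it finished, if any.
  public push(chunk: Buffer): string[] {
    this.pending.push(chunk);
    // 0x0a is "\n". That byte never occurs inside a multi-byte UTF-8
    // sequence, so ending on it means no character is cut in half.
    if (chunk.length === 0 || chunk[chunk.length - 1] !== 0x0a) {
      return [];
    }
    // Join the raw bytes first, then decode the whole message once.
    const text = Buffer.concat(this.pending).toString("utf8");
    this.pending = [];
    return text.split("\n").filter((line) => line.length > 0);
  }
}

// Usage: a message whose emoji is split across two chunks.
const message = Buffer.from('{"replacement":"🙈"}\n', "utf8");
const cut = message.indexOf(0xf0) + 2; // two bytes into the 4-byte emoji
const lines = new LineBuffer();
const afterFirst = lines.push(message.subarray(0, cut));
const afterSecond = lines.push(message.subarray(cut));

console.log(afterFirst);  // nothing complete yet
console.log(afterSecond); // one line, emoji intact
```

Node's built-in `StringDecoder` (from the `string_decoder` module) is another way to decode a byte stream without breaking characters at chunk boundaries; the sketch sticks to `Buffer.concat` because that is what the comment above mentions.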

Please test out this build and see if it solves the issue for you:

Hopefully with this installed, the issue will be gone. Please let me know either way!

You can remain on this test build for now - when there's a new stable build published, you will automatically be upgraded to it.

Thanks!

@DanTup DanTup removed the awaiting info Requires more information from the customer to progress label Feb 24, 2020
@DanTup
Member

DanTup commented Feb 24, 2020

This appears to have broken some things, so hold off with the test build for now - I'm investigating and will upload a new build when resolved!

@DanTup
Member

DanTup commented Feb 24, 2020

Ok, new test build at https://github.com/Dart-Code/Dart-Code/releases/tag/v3.9.0-alpha.2 - see instructions above for installing.

DanTup added a commit that referenced this issue Feb 24, 2020
DanTup added a commit that referenced this issue Feb 24, 2020
DanTup added a commit that referenced this issue Feb 24, 2020
@DanTup
Member

DanTup commented Feb 24, 2020

I was able to make a reliable test for this (6dd51d7), and it passes with the fix, so I'm confident this was the issue. I'll land the change when the PR goes green.

You can use the test build linked above in the meantime. Thanks for the help tracking it down!

DanTup added a commit that referenced this issue Feb 25, 2020
DanTup added a commit that referenced this issue Feb 25, 2020
@DanTup DanTup modified the milestones: On Deck, v3.9.0 Feb 25, 2020
@mstepanov214

@DanTup Indeed, I can no longer reproduce the problem the old way. Thank you for your help!

@DanTup
Member

DanTup commented Feb 26, 2020

@mstepanov214 do you mean you can't reproduce with the new build, or the live stable build? It shouldn't reproduce in the test build, but in the live stable build you should be able to trigger it. I did it with a file with 130 lines of emoji:

// 🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈
// 🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈
// 🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈
// 🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈
// 🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈🙈
// x 130!

Then just insert some newlines somewhere and run Format Document and usually you'll get some corruption.
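For anyone wanting to recreate a similar test file, the content could be generated with something like the sketch below. The function name and the number of emoji per line (62) are assumptions; the comment above only specifies roughly 130 comment lines of emoji.

```typescript
// Hypothetical generator for a repro file: `lineCount` comment lines,
// each holding `emojiPerLine` copies of a 4-byte emoji.
function buildEmojiRepro(lineCount: number, emojiPerLine: number): string {
  const line = "// " + "🙈".repeat(emojiPerLine);
  return Array.from({ length: lineCount }, () => line).join("\n") + "\n";
}

const content = buildEmojiRepro(130, 62);

// Nearly every byte of the file belongs to a 4-byte character, so a chunk
// boundary at an arbitrary byte offset is likely to land inside one.
console.log(Buffer.byteLength(content, "utf8"));
```

Saving `content` as a `.dart` file (for example with `fs.writeFileSync`), adding a few blank lines and running Format Document should then match the steps described above.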

@mstepanov214

@DanTup I meant the new build, of course.

@DanTup
Member

DanTup commented Feb 29, 2020

Got it - thanks for confirming! :-)
