Display better snippet for invalid char literal #30763

gchp · 2016-01-07T16:32:23Z

This is achieved by adding the scan_back method. This method looks back
through the source_text of the StringReader until it finds the target
char, returning its offset in the source. We use this method to find
the offset of the opening single quote, and use that offset as the start
of the error.
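
For readers who want to see the idea in isolation, here is a minimal sketch of a backwards scan. It is a hypothetical free function over a plain `&str`, not the actual `scan_back` method on `StringReader`; the signature and the use of byte offsets are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the `scan_back` idea: search backwards from
/// byte offset `from` and return the byte offset of the nearest `target`.
/// Returns `None` if `from` is out of range or `target` does not occur.
fn scan_back(src: &str, from: usize, target: char) -> Option<usize> {
    // `rfind` searches from the end of the slice, i.e. backwards from `from`.
    src.get(..from)?.rfind(target)
}

fn main() {
    let src = "let _ = 'abcd';";
    // Pretend the lexer has already advanced to the closing quote.
    let closing = src.rfind('\'').unwrap();
    let opening = scan_back(src, closing, '\'').unwrap();
    // The span from the opening quote to the closing quote.
    println!("{}", &src[opening..=closing]);
}
```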

Given this code:

fn main() {
    let _ = 'abcd';
}

The compiler would give a message like:

error: character literal may only contain one codepoint: ';
let _ = 'abcd';
             ^~

With this change, the message now displays:

error: character literal may only contain one codepoint: 'abcd';
let _ = 'abcd';
        ^~~~~~~

Fixes #30033

rust-highfive · 2016-01-07T16:32:38Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

apasel422 · 2016-01-07T16:42:00Z

Ideally, the span would not include the ; there. Also, can you add a compile-fail test for this?

gchp · 2016-01-07T16:44:08Z

@apasel422 ok - sure thing.

nagisa · 2016-01-07T17:11:01Z

To me it seems we should investigate why start does not contain the correct position, and if it is not possible to fix start, add a comment explaining why.

gchp · 2016-01-10T19:16:41Z

Investigating the failing test now.

@nagisa if I understand it correctly, start does not contain the correct position because we have already parsed past the opening single quote trying to look for a closing single quote (which in this case is not found). So I think it makes sense that start doesn't point to the opening single quote position?

alexcrichton · 2016-01-12T18:02:11Z

I agree with @nagisa that the start variable should indeed be the start of the character constant, and in theory we shouldn't need to scan backwards looking for the end (e.g. we just parsed it and should have the position as well).

Would it be possible to fix the span without the scan backwards function?

gchp · 2016-01-12T18:12:15Z

@alexcrichton sure thing. I'll dig into it and update the PR!

nagisa · 2016-01-12T18:12:43Z

> So I think it makes sense that start doesn't point to the opening single quote position?

You could introduce another variable earlier in the function (or rename the start which gets set to the closing apostrophe) to deal with it, I think.
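
A rough sketch of that suggestion, under stated assumptions: `char_literal_span` is a hypothetical helper over a plain `&str`, not code from rustc's lexer, and it ignores escape sequences entirely. The only point is that the start offset is recorded before anything is consumed and is never overwritten while looking for the closing quote.

```rust
/// Hypothetical helper: returns the (start, end) byte offsets of a quoted
/// literal that begins at `pos`, or `None` if there is no opening quote at
/// `pos` or no closing quote after it. Escapes are deliberately ignored.
fn char_literal_span(src: &str, pos: usize) -> Option<(usize, usize)> {
    let rest = src.get(pos..)?;
    let mut chars = rest.char_indices();
    // Remember where the literal starts before consuming anything.
    let start = pos;
    match chars.next() {
        Some((_, '\'')) => {}
        _ => return None,
    }
    // Advance towards the closing quote; `start` is left untouched.
    for (i, c) in chars {
        if c == '\'' {
            // `i` is relative to `pos`; the end offset is exclusive.
            return Some((start, pos + i + 1));
        }
    }
    None
}

fn main() {
    let src = "let _ = 'abcd';";
    let pos = src.find('\'').unwrap();
    if let Some((start, end)) = char_literal_span(src, pos) {
        println!("{}", &src[start..end]);
    }
}
```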

gchp · 2016-01-12T18:45:08Z

@nagisa I did try that before I ended up with the current iteration. Even at the very beginning of the function the start variable was not in the right place.

I might have just overlooked something though. I'll go back and try it again and see what I find. Will update here when I have something to report.

Thanks for the feedback!

gchp · 2016-01-13T12:37:52Z

@nagisa & @alexcrichton - I've fixed the problem and updated the PR. See the latest commit message for details on the fix. Basically we were exiting the function too early, without checking for a closing single quote.

The latest commit makes the others which preceded it unnecessary. Should I squash them all together? Or is it ok to leave as is?

bors · 2016-01-13T15:09:34Z

☔ The latest upstream changes (presumably #30684) made this pull request unmergeable. Please resolve the merge conflicts.

gchp · 2016-01-13T15:44:36Z

src/liblibc

@@ -1 +1 @@
-Subproject commit 95d6a00134f284e6b889d98f4c2cb4b285950327
+Subproject commit e0c0bf439add63a6a25a25ba47e4aec9547bf9af


This showed up when I rebased. Not sure how to make it not show up as part of this pull request...

cd src/liblibc
git checkout 95d6a00134f284e6b889d98f4c2cb4b285950327
cd ../../
git add/commit --amend

is the crudest way which should work, I think.

gchp · 2016-01-14T13:46:40Z

Ok, updated from comments. Let me know what other tests you'd like, if any.
Also let me know if I should squash all these commits.

Thanks!

nagisa · 2016-01-14T14:05:08Z

Yeah, a squash is necessary. Perhaps also a test for '\x00\x00'. r=me once those are done.

gchp · 2016-01-14T16:48:04Z

@nagisa having some issues getting the error right for '\x00\x00'. As it stands, it comes out like: ...one codepoint: '\x00\x. It is missing the end of the second sequence. I'm not sure how to reliably fix that.

I could try parsing forwards until the end of the escape sequence? I'd need to do that any number of times, though, depending on the number of sequences found. Take this example:

'\x00\x00\x00\x00'

Should I just recursively parse escape sequences?

nagisa · 2016-01-14T16:54:14Z

My suggestion would be to not parse until the end – you already know what’s the error and where it is happening, you just need to ensure the error is well worded. Current nightly outputs something like:

<anon>:2:5: 2:10 error: unterminated character constant: '\x00
<anon>:2     '\x00\x00'
             ^~~~~

It might be best to fall back on an error of this sort if you can’t immediately tell whether the literal is unterminated (i.e. missing ') or too long (i.e. too many characters before the closing ').

EDIT: it could also be that we do not really want the “literal may only contain one codepoint” error and would rather always report the “unterminated character constant” instead and print an associated help message. For example:

<anon>:2:5: 2:7 error: unterminated character constant: 'h
<anon>:2     'hello'
             ^~
<anon>:2:5: 2:10 help: character literals may only contain one codepoint

… at least for the non-lifetime case.
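
The approach described above could be sketched roughly as follows. This is an illustration under assumptions, not rustc's lexer: `CharLit`, `lex_char` and `HELP` are hypothetical names, and escape sequences, lifetimes and empty literals are not handled. The sketch looks only one character past the first codepoint and, if the closing quote is not there, reports "unterminated" together with the help text instead of scanning ahead.

```rust
#[derive(Debug, PartialEq)]
enum CharLit {
    /// Exactly one codepoint followed by a closing quote.
    Valid(char),
    /// No closing quote right after the first codepoint; `shown` is the
    /// piece of source to echo in the error message.
    Unterminated { shown: String },
}

const HELP: &str = "character literals may only contain one codepoint";

/// Hypothetical classifier. `src` is expected to start at the opening quote.
fn lex_char(src: &str) -> Option<CharLit> {
    let mut chars = src.chars();
    if chars.next() != Some('\'') {
        return None; // not a character literal at all
    }
    let first = match chars.next() {
        Some(c) => c,
        None => return Some(CharLit::Unterminated { shown: "'".to_string() }),
    };
    if chars.next() == Some('\'') {
        Some(CharLit::Valid(first))
    } else {
        // Do not scan ahead: the error location is already known.
        Some(CharLit::Unterminated { shown: format!("'{}", first) })
    }
}

fn main() {
    for src in ["'a'", "'hello'"] {
        match lex_char(src) {
            Some(CharLit::Valid(c)) => println!("ok: {:?}", c),
            Some(CharLit::Unterminated { shown }) => {
                println!("error: unterminated character constant: {}", shown);
                println!("help: {}", HELP);
            }
            None => println!("not a character literal"),
        }
    }
}
```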

gchp · 2016-01-14T16:56:38Z

@nagisa ok, sounds good. Will update the PR shortly. Thanks!

Given this code:

fn main() {
    let _ = 'abcd';
}

The compiler would give a message like:

error: character literal may only contain one codepoint: ';
let _ = 'abcd';
             ^~

With this change, the message now displays:

error: character literal may only contain one codepoint: 'abcd'
let _ = 'abcd'
        ^~~~~~

Fixes rust-lang#30033

nagisa · 2016-01-14T17:46:53Z

@bors r+

bors · 2016-01-14T17:46:54Z

📌 Commit acc9428 has been approved by nagisa

nagisa · 2016-01-14T17:50:26Z

Thanks, the PR turned out nicely!

gchp · 2016-01-14T17:51:06Z

Thank you! Appreciate you going through it with me :)

bors · 2016-01-15T01:04:59Z

⌛ Testing commit acc9428 with merge 0cc6f21...

bors · 2016-01-15T01:52:00Z

💔 Test failed - auto-mac-64-nopt-t

nagisa · 2016-01-15T05:11:32Z

@bors retry

Seems like it's one of the spurious errors we've been having lately.

bors · 2016-01-15T06:38:27Z

⌛ Testing commit acc9428 with merge a70a60a...

bors · 2016-01-15T08:30:53Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-debug-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-linux-cross-opt, auto-linux-musl-64-opt, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt, auto-win-msvc-32-opt, auto-win-msvc-64-opt

rust-highfive assigned alexcrichton Jan 7, 2016

gchp force-pushed the issue/30033 branch from 96a30c8 to 214192d Compare January 13, 2016 15:42

gchp reviewed Jan 13, 2016

gchp force-pushed the issue/30033 branch from cbfd54b to acc9428 Compare January 14, 2016 17:35

bors merged commit acc9428 into rust-lang:master Jan 15, 2016

nagisa mentioned this pull request Jan 31, 2016

Fix for #31267 and additional zero-width constant bug #31291

Closed
