Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better diagnostics when concatenating strings #39018

Closed
Manishearth opened this Issue Jan 12, 2017 · 42 comments

Comments

Projects
None yet
9 participants
@Manishearth
Copy link
Member

Manishearth commented Jan 12, 2017

A lot of people come from other languages and expect "a" + "b" to work, or x + y where x and y are both Strings.

We should be suggesting "a".to_string() + "b" in the first place, x + &y in the second, and in both cases also recommend using format!() instead.

Context: https://www.reddit.com/r/rust/comments/5nl3fk/rust_severely_disappoints_me/dccdi17/

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 12, 2017

I'd like to take a stab at this if I could get some guidance as to where things in the compiler need to be changed and how.

@Manishearth Manishearth added the E-easy label Jan 12, 2017

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 12, 2017

Sure! What you need to do is find the place in the code which emits the error on "a" + "b" (hint: error code E0369)

Somewhere there you'll want to add an if checking that the operation is add, and then check that the LHS and RHS are both &str (we'll deal with the String + String/&str + String ones later). In that code, the Ty values (lhs_ty and rhs_ty_var) hold these types. They contain TypeVariants, and &str is basically TyRef(TyStr).

Then, span_note! about it. Once you have this going I can help convert the note into a suggestion.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 13, 2017

I'll be home soon so I'll get working on it then

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 13, 2017

Cool. You can ask me (or others) for help in #rust-internals if you need it.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 13, 2017

I found where after a bit of prodding so I think I have this part but if I run into problems I'll definitely jump on #rust-internals

@andersk

This comment has been minimized.

Copy link
Contributor

andersk commented Jan 13, 2017

For the String + StringString + &String part, that suggestion actually applies to anything that wants &str and is incorrectly passed String instead of &String, right? I don’t think concatenation is special.

It could even apply to anything that wants &T and is incorrectly passed U instead of &U where U: Deref<T>, though maybe that’s trickier.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 13, 2017

Yeah. It's just that the concatenation stuff is handled separately.

I was planning on investigating this (or help someone investigate this) after this bug is fixed, by refactoring the functionality introduced here into a separate function that gets used in multiple error cases.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 14, 2017

@Manishearth here's the WIP that I have now: mgattozzi@007d459

Here's the output though:

michael@eva-02 ~/Code/rustc (git)-[better-string-message] % ./build/x86_64-unknown-linux-gnu/stage1/bin/rustc string.rs
_
error[E0369]: binary operation `+` cannot be applied to type `&str`
 --> string.rs:4:20
  |
4 |     println!("{}", s1.as_str() + s2.as_str());
  |                    ^^^^^^^^^^^
  |
note: You can't use `+` between a &str and &str
 --> string.rs:4:20
  |
4 |     println!("{}", s1.as_str() + s2.as_str());
  |                    ^^^^^^^^^^^

error: aborting due to previous error

It seems the rhs_ty_var doesn't have a type that can be matched against. Is there some way to check for it here?

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 14, 2017

Hmm, _ is a TyInfer. Does it work with string literals?

You also might want to restructure the code so that the "an implementation of {} might be missing for {}" doesn't need to be repeated. It's fine for now if that note is unconditionally emitted.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 14, 2017

Let me try it out with string literals to see if I get something different.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 14, 2017

No it doesn't work. It looks like unless we identify what TyInfer is we won't be able to check the type. This is probably due to the note in the source code here: https://github.com/mgattozzi/rust/blob/007d45926f920b51cfab9924f0ee6461f527a2aa/src/librustc_typeck/check/op.rs#L178-L183

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 14, 2017

ugh. You could check the AST if it's a literal but that's not so great.

Would there be a way of extracting this info? @eddyb ?

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Jan 15, 2017

Checking for a literal is okay-ish. You might be able to type-check the RHS before emitting the error, giving you the missing information. cc @nikomatsakis

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 15, 2017

Yeah, type checking the RHS is what I want to do, but I don't know how 😄

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Jan 15, 2017

It's this call. If I were you I would keep an Option<DiagnosticWhatever> around and add the note & emit after that RHS type-checking line.
You can also duplicate that line (there's a simpler form without the coercion) and return early or something.
Oh and the call returns the actual type of the expression.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 15, 2017

@eddyb thanks for the tip. I'm going to work out a rough (but somewhat naive) implementation that works and then hopefully get that refined from there with some guidance.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 15, 2017

@Manishearth My most recent commit got it working now. While it's a bit naive (we could return early) it's doing what's needed for now. I think we can move from here into the next few parts of the issue.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 15, 2017

Okay, now try to suggest using to_string() there. Check out how span_suggestion is used elsewhere.

Make the error message mention explicitly that the first operand must be a String for concatenation.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 16, 2017

I've got it outputting a message but I'd like it to be the string literal with to_string() attached at the end. I'm unsure how to extract it from lhs_expr but my guess is it's a Symbol embedded within an Expr. Right now the message output with the expr input as a debug print is:

help: Your first argument should be a String, not &str. Try using `to_string()`
|            println!("{}", expr(34: "Hello").to_string() + " World");

How would I get it so that only the "Hello" part prints? Also having it underlined or with more spaces on the line for readability would be nice as well, but I think that's secondary compared to getting this part done right.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 16, 2017

Don't extract the string literal at all. There might not be one.

Just use span_to_snippet on the expr, and compose that with a .to_string() and the rest of the expression.

span_suggestion is used as err.span_suggestion(span_to_replace, "suggestion text", "replacement text"). So you use the span of the entire addition expr (expr.span), and construct the rest using span_to_snippet on the operand expressions.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 16, 2017

Ahhh okay I see what you're saying after digging through some other code for examples. When I get this part working I'll let you know.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 16, 2017

@Manishearth This gets it working to output the proper help message and works regardless of how many uses of + are used: mgattozzi@03ec8f1

Not sure what to put for the error message when it's destructured in the match statement there. I think next would also be matching on things like String + String and &String + &str or &String + String right as well as some cleanup of formatting here?

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 16, 2017

Yeah, probably. Refactoring this into a method might help. It's also okay to land with just this one suggestion with a compile-fail test and fix the rest in a separate PR. Regardless, you probably should clean it up a bit and make a WIP PR so that it's easier to leave inline comments.

Not sure what to put for the error message when it's destructured in the match statement there

I don't know what you mean here?

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 16, 2017

Okay sounds good. If you look here it's just a blank string. I'm not sure what to place here in the case where span_to_snippet would fail.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 16, 2017

Oh. Use a <expression> or don't emit the suggestion at all.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 16, 2017

Ahhhh okay got it. I've split it all out into a separate method that just checks for the use of &str. I'm giving it another compile to be sure. When it comes to the test cases where should I add them and is there a good example similar to this somewhere?

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 16, 2017

src/tests/compile-fail. Look at other tests in that folder, basically you can annotate errors and suggestions and stuff with // ~^. You can run an individual test with ./mach test src/tests/compile-fail -- --test-args name-of-test-file, I think.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 16, 2017

Alright I'll play around with it!

mgattozzi added a commit to mgattozzi/rust that referenced this issue Jan 17, 2017

Add clearer error message using `&str + &str`
This is the first part of rust-lang#39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.
@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 17, 2017

@Manishearth I've cleaned it up and put in a test case. The PR is here #39116

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 17, 2017

Nice work @mgattozzi!

I have to ask though -- maybe we should just make adding strings work like people expect? I think the "fear of allocations" is the only reason not to? If so, it seems a bit overblown to me personally. I feel like I often get annoyed at places that we've made it less convenient to concatenate and manipulate strings because of the cost of allocation -- but usually the code where I'm manipulating strings is not in my hot path, or even if it is I often find I need the allocations anyway, I just have to express them differently. That said, the only major example of this I can come up with something around the way that join() works and I can't remember the details. =)

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 17, 2017

I am .... conflicted about that. I would not oppose it however.

Diagnostics is the path of least resistance here; not contingent on an rfc (I think this would be) and not controversial, so I went that path 😄 . Changing this with an impl works too.

Similarly, #39029 can be replaced by a language change, I just am not as convinced if I want that language change to want to make an rfc for it.

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 17, 2017

@nikomatsakis

I have to ask though -- maybe we should just make adding strings work like people expect? I think the "fear of allocations" is the only reason not to?

I thought coherence was the main problem, not the fear? (You can't impl Add<str> for str in places where String is available.)
Having at least String + String and &str + String would certainly be nice.

@mgattozzi

This comment has been minimized.

Copy link
Member

mgattozzi commented Jan 17, 2017

Thank you @nikomatsakis!

Personally I like that Rust is explicit. Sure it's inconvenient to have to add a to_string() or a & or as_str() to get the desired behavior, but it makes one think about whether they really need to allocate memory or not. This isn't necessarily desirable for newer users, but that's why I think this issue, providing diagnostics and helpful error messages, is a better route long term then something like impl Add.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 17, 2017

@petrochenkov

I thought coherence was the main problem, not the fear? (You can't impl Add for str in places where String is available.)

Ah, this may be true. I feel like the inability to extend coherence to cover a set of crates is an annoyance I would like to overcome. :(


@mgattozzi

Personally I like that Rust is explicit.

Certainly there is something of a trade-off involved.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 17, 2017

Ah, this may be true. I feel like the inability to extend coherence to cover a set of crates is an annoyance I would like to overcome. :(

I've often wanted the ability to forward-declare an impl, saying something like

#[forward_declare(StringAddition)]
impl Add<str> for str {
    type Target = _;
    fn add(&self, other: &str) -> Self::Target;
}

This generates an instance of a new kind of item with item name StringAddition. It can be used for inherent impls as well. We could just have a different syntax for the item, after all all items have a syntax itemtype /* randomstuff */ {} but I'm not sure if this can be done without breaking the grammar.

and later,

#[fills_declaration(core::StringAddition)]
impl Add<str> for str {
    type Target = String;
    fn add(&self, other: &str) -> String { /* body */}
}

The system works the same way lang items do; these forward-declared impls are left dangling and can't be used until someone defines them, and you can only have one definition active (much like how redefining lang items doesn't work). However, from the point of view of coherence and conflict resolution, these impls do exist even if not defined (I'm not sure if any situation arises where this matters outside of #[fundamental] types).

Thus, the impl exists in two places at once. From the POV of coherence, it exists in libcore, but from the POV of codegen and type resolution, it exists in libstd.

The main purpose is to handle cases like these where you have multiple separate crates that are nevertheless maintained as a single unit. Coherence exists so that I, a Servo developer, am not forced to deal with error messages that must be fixed in, say, hyper (e.g. when I include some other crate and now we have conflicting impls that can't be fixed on the servo-side). But in this case the developers of libstd and libcore are the same and it's fine to punch through coherence for this. Similarly, Servo itself is split into crates for maintainability and compile times, but being able to punch through coherence in the context of servo is pretty useful. Being able to say "I know that this can break if upstream-crate-X changes things, but I am upstream-crate-X and want to do it anyway" is useful. We've had to merge crates before because coherence was making things problematic.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Jan 17, 2017

    type Target = _;

You don't have to do that, just don't mention it, and might as well do the same with the method.
This could be done by reusing partial impls, I think, as a coherence "beachhead" marker.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 17, 2017

You don't have to do that, just don't mention it

You will have to do something for inherent impls though. I guess you can just define the item name and leave it at that. fn foo; or something.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Jan 17, 2017

Well... inherent impls don't do anything other than wrap the items inside, so each one would have to be annotated, and it'd be a bit of a mess.

@rocallahan

This comment has been minimized.

Copy link

rocallahan commented Jan 17, 2017

FWIW I think suggesting to_owned() instead of to_string() is more clear. "Hello".to_string() reads like a no-op.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jan 18, 2017

It feels to me like String + String is reasonable to allow, since they're consumed and people are already likely used to adding a & if they want to re-use one later. It would even let the implementation re-use whichever one has a more appropriate buffer, without forcing the user to pick &str + String or String + &str a priori.

That does mean that people might end up doing "a".to_owned() + "b".to_owned() unnecessarily, as naiively seems reasonable for symmetry. That's visible cost, though, so doesn't feel too worrisome. That said, a general "you could have just used the Deref target, no ToOwner needed" lint could be interesting to help undo such accidental pessimization.

@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Jan 19, 2017

pre-rfc'd the forward declaration thing in https://internals.rust-lang.org/t/pre-rfc-forward-impls/4628

mgattozzi added a commit to mgattozzi/rust that referenced this issue Jan 25, 2017

Add clearer error message using `&str + &str`
This is the first part of rust-lang#39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.

bors added a commit that referenced this issue Jan 25, 2017

Auto merge of #39116 - mgattozzi:better-string-message, r=nrc
Add clearer error message using `&str + &str`

This is the first part of #39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.

mgattozzi added a commit to mgattozzi/rust that referenced this issue Jan 28, 2017

Add clearer error message using `&str + &str`
This is the first part of rust-lang#39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.

mgattozzi added a commit to mgattozzi/rust that referenced this issue Jan 28, 2017

Add clearer error message using `&str + &str`
This is the first part of rust-lang#39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.

bors added a commit that referenced this issue Feb 2, 2017

Auto merge of #39116 - mgattozzi:better-string-message, r=nrc
Add clearer error message using `&str + &str`

This is the first part of #39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.

king6cong added a commit to king6cong/rust that referenced this issue Feb 3, 2017

Add clearer error message using `&str + &str`
This is the first part of rust-lang#39018. One of the common things for new users
coming from more dynamic languages like JavaScript, Python or Ruby is to
use `+` to concatenate strings. However, this doesn't work that way in
Rust unless the first type is a `String`. This commit adds a check for
this use case and outputs a new error as well as a suggestion to guide
the user towards the desired behavior. It also adds a new test case to
test the output of the error.
@Manishearth

This comment has been minimized.

Copy link
Member Author

Manishearth commented Feb 19, 2017

Fixed in #39116

@frewsxcv frewsxcv closed this Feb 19, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.