Implement code completion #21323

Closed · wants to merge 4 commits


@sanxiyn
Member
sanxiyn commented Jan 17, 2015
```rust
// test.rs
struct Point { x: i32, y: i32 }
fn main() { let p = Point { x: 0, y: 0 }; p. }
```

```
$ rustc -A warnings -Z no-trans -Z unstable-options --complete-at test.rs:87 test.rs
x
y
```

cc #9769.

@rust-highfive
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@brson
Contributor
brson commented Jan 18, 2015

While I heartily endorse this effort, I hope this can be done with minimal maintenance impact on the rest of the compiler. Even this minimal patch touches quite a lot.

@sanxiyn
Member
sanxiyn commented Jan 18, 2015

I took care to make this as minimally invasive as possible, and I think I succeeded. I don't think this is a "dramatical overhaul", to use the words of the CC'd issue.

I want to merge the infrastructure changes (one adding token::GenerateCompletion, the other adding ast::ExprCompletion) as soon as possible. After that, I envision the remaining changes can be localized inside typeck. The other changes can wait; they are here basically to prove that the infrastructure works.

As for "touches quite a lot": because of exhaustive matching, any change that introduces a new AST node necessarily touches this much. In my opinion the impact here is comparable to, say, introducing the `if let` expression. I would appreciate suggestions on how to do this better so that it can be merged sooner.
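To illustrate the exhaustive-match point above, here is a minimal, hypothetical sketch (simplified names, not the PR's actual AST types): adding one variant to an exhaustively matched enum forces every `match` over that enum to grow a new arm, which is why even a small new AST node touches many files.

```rust
// Toy AST, standing in for the real `ast::Expr` (names are illustrative only).
enum Expr {
    Lit(i32),
    Field(Box<Expr>, String),
    // The new completion node, analogous in spirit to the PR's `ast::ExprCompletion`.
    Completion(Box<Expr>),
}

fn describe(e: &Expr) -> String {
    match e {
        Expr::Lit(n) => format!("literal {}", n),
        Expr::Field(base, name) => format!("{}.{}", describe(base), name),
        // Omitting this arm is a compile error, so the pretty-printer, folder,
        // visitor, etc. each need one new arm for the new node.
        Expr::Completion(base) => format!("{}.<complete here>", describe(base)),
    }
}

fn main() {
    let e = Expr::Completion(Box::new(Expr::Lit(0)));
    println!("{}", describe(&e)); // prints: literal 0.<complete here>
}
```

The upside of this cost is that the compiler statically guarantees no consumer of the AST silently ignores the new node.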

Thanks!

@huonw huonw and 2 others commented on an outdated diff Jan 23, 2015
src/libsyntax/print/pprust.rs
@@ -1909,6 +1911,10 @@ impl<'a> State<'a> {
try!(self.pclose());
}
ast::ExprMac(ref m) => try!(self.print_mac(m, token::Paren)),
+ ast::ExprCompletion(ref e) => {
+ try!(self.print_expr(&**e));
+ try!(word(&mut self.s, ".\u{B6}"));
@huonw
huonw Jan 23, 2015 Member

What is B6?

@sanxiyn
sanxiyn Jan 23, 2015 Member

U+00B6 is Pilcrow. I chose a proofreading mark, because the compiler is proofreading your code.

@lambda
lambda Jan 23, 2015 Contributor

Wouldn't U+2038 (caret) or U+2041 (caret insertion point) be more appropriate than a paragraph mark?

@sanxiyn
sanxiyn Jan 23, 2015 Member

Indeed, but both have relatively poor font support. Neither is supported by Courier New, the default monospaced font on Windows, for example.

By the way, what a bikeshed. :)

@lambda
lambda Jan 24, 2015 Contributor

Indeed, pretty much a perfect example of a bikeshed. Anyhow, font support is a good reason to choose pilcrow over caret. Carry on!

@huonw huonw and 2 others commented on an outdated diff Jan 23, 2015
src/librustc/session/config.rs
@@ -807,6 +809,7 @@ pub fn rustc_optgroups() -> Vec<RustcOptGroup> {
`everybody_loops` (all function bodies replaced with `loop {}`).",
"TYPE"),
opt::opt_u("", "show-span", "Show spans for compiler debugging", "expr|pat|ty"),
+ opt::opt("", "complete-at", "Give completions at the given position", "POS"),
@huonw
huonw Jan 23, 2015 Member

What units are POS?

@sanxiyn
sanxiyn Jan 23, 2015 Member

It is a codemap byte offset now, which is completely wrong. I plan to change it to filename:bytepos soon.

@alexcrichton
alexcrichton Jan 23, 2015 Member

Could this be opt_u for now to indicate that it's an unstable flag?
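The `filename:bytepos` form mentioned above is straightforward to parse; here is a hedged, self-contained sketch (an illustration, not the PR's actual code; the function name is made up):

```rust
// Parse a "filename:bytepos" argument such as "test.rs:87".
fn parse_complete_at(arg: &str) -> Option<(&str, usize)> {
    // Split on the *last* ':' so file names containing colons still parse.
    let idx = arg.rfind(':')?;
    let pos = arg[idx + 1..].parse().ok()?;
    Some((&arg[..idx], pos))
}

fn main() {
    assert_eq!(parse_complete_at("test.rs:87"), Some(("test.rs", 87)));
    assert_eq!(parse_complete_at("no-position"), None);
    println!("ok");
}
```

Splitting on the last colon rather than the first keeps the format unambiguous even when the path itself contains colons.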

@Valloric
Contributor

While I love to see work being done in this area, code-completion semantic engines really must be exposed as libraries, not as binaries one has to call on the command line. I have some experience with building code-completion engines and integrating them with editors, and going through the command line is a non-starter because it kills performance. You suffer the cost of process startup/teardown for every call, a serialization/deserialization roundtrip and most importantly, you lose the ability to cache the AST in memory for subsequent calls.

That last part is critical. If you have to parse the whole file (and parse/load imports) every time the user needs completions, your perf goes out the window.

libclang is a shining example of how to provide a compiler API through a library. Not everything about that API is perfect, but Rust could do much worse than emulate it.

Also note that while clang internals change extremely rapidly, the API exposed by libclang is entirely stable and backwards compatible.

@sanxiyn
Member
sanxiyn commented Jan 23, 2015

I am well aware that AST caching is critical, and am working on it. --complete-at is a test interface, comparable to Clang's -code-completion-at, and is not intended as an actual API. What's here is mostly orthogonal to defining compiler API or enabling AST caching.

@steveklabnik
Contributor

While I love to see work being done in this area, code-completion semantic engines really must be exposed as libraries, not as binaries one has to call on the command line.

https://github.com/garybernhardt/selecta#faq

Won't this be slow?

Nope: startup and termination together burn about 23 ms of CPU on my machine (a mid-2011 MacBook Air). File finding may be slow, but, speaking of that...

Now, that doesn't mean that everything will always perform the best...

@Valloric
Contributor

I am well aware that AST caching is critical, and am working on it. --complete-at is a test interface, comparable to Clang's -code-completion-at, and is not intended as an actual API.

I assumed as much, but wanted to make sure. I've seen plenty of semantic completion engines that offer only a command-line interface (to the detriment of their performance). Thank you for allaying my fears and thank you for working on this!

@steveklabnik I created and have worked on YCM for the past 2 years. I have integrated several different completion engines into it (and wrote a few myself) and I've benchmarked all of them backwards and forwards, including several that didn't end up making the cut because of perf issues.

Process startup & termination eating up 23 ms of the latency budget is IMO unacceptable. On top of that, you get the overhead of serialization & deserialization through the command line (a big issue in pathological cases). That's without mentioning that this precludes the single most important perf-related feature a semantic completion engine can provide: AST caching.

Selecta merely filters strings and doesn't appear to do any language parsing or semantic analysis, which are the bulk of the time spent in a semantic completion engine. So bringing it up is comparing apples and oranges.
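The per-invocation overhead debated above is easy to observe directly. A minimal sketch, assuming a Unix system where the trivial `true` binary is on `PATH`; it measures only spawn-and-wait time, before any parsing or typechecking a real completion call would add:

```rust
use std::process::Command;
use std::time::Instant;

fn main() {
    let start = Instant::now();
    // Spawn a do-nothing child process and wait for it to exit.
    let status = Command::new("true").status().expect("failed to spawn");
    assert!(status.success());
    // A CLI-only completion interface pays at least this much on every call.
    println!("one spawn+wait: {:?}", start.elapsed());
}
```

The exact figure varies by machine and OS; the point is that it is a fixed tax per request that a long-lived library process never pays.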

@sanxiyn
Member
sanxiyn commented Jan 24, 2015

Updated.

  • Add a comment about pilcrow and remove code duplication.
  • Change to filename:bytepos.
  • Mark as an unstable option.
@bstrie
Contributor
bstrie commented Feb 7, 2015

Mentioned in http://phildawes.net/blog/2015/02/02/racer4/, worth a read if you're still pondering the utility of this PR.

@Valloric
Contributor
Valloric commented Feb 7, 2015

@bstrie It's great that Racer exists, but IMO Rust should be taking the libclang route, which is exposing the compiler as a library. This way code-completion uses all the logic from the compiler, and thus is always correct and up-to-date with the language.

@sanxiyn sanxiyn changed the title from Implement code completion (work in progress) to Implement code completion Feb 10, 2015
@sanxiyn
Member
sanxiyn commented Feb 10, 2015

I think this is ready to land now. I split commits for easier review.

@ghost
ghost commented Feb 10, 2015

Is this for struct fields only? In that case it would be better to just print the type of the expression (as phildawes from Racer suggests), because once we have the type it's quite easy to find its members.

EDIT: Provided that there exists an external cache containing data about all the other modules, it should be doable to achieve an accurate typecheck by parsing only one method body.

@sanxiyn
Member
sanxiyn commented Feb 11, 2015

This PR implements struct field completion for demonstration purposes (and testing). The important thing is to get partial parsing working.

My current thinking on "typechecking only one method body" is to follow ideas in RFC 594.

@alexcrichton
Member

We talked about this at yesterday's meeting, and the general consensus is that we don't think that we're quite ready for this just yet. There are doubts as to the specific strategy taken here, but more broadly this feature may want to be more fleshed out before landing in the compiler.

We think that the best way to move forward on this would be to write an RFC to gather feedback about implementation strategy which may provide insight as to when/where this should land. Thanks for the PR regardless, however, and this is all quite exciting!
