Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Beginnings of type inference #327

Merged
merged 13 commits into from
Dec 24, 2018
Merged

Beginnings of type inference #327

merged 13 commits into from
Dec 24, 2018

Conversation

flodiebold
Copy link
Member

I was a bit bored, so I thought I'd try to start implementing the type system and see how far I come 😉 This is obviously still extremely WIP, only very basic stuff working, but I thought I'd post this now to get some feedback as to whether this approach makes sense at all.

There's no user-visible effect yet, but the type inference has tests similar to the ones for the parser. My next step will probably be to implement struct types, after which this could probably be used to complete fields.

I realize this may all get thrown away when/if the compiler query system gets usable, but I feel like there are lots of IDE features that could be implemented with somewhat working type inference in the meantime 😄

Ok(Ty::FnPtr(Arc::new(sig)))
}

// TODO this should probably be per namespace (i.e. types vs. values), since for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An altrnative would be to make two DefIds per source tuple struct: one DefId for type, and one DefId for Function. That's an important tricky question which I don't know how to answer.

If we go via de sugaring (creating two def ids), then intrnal compiler phases like type inference become easier, but the IDE bits become harder: you'll need to apply reverse-desugaring to go from internal compiler data structures to the surface syntax.

Going in the opposite direction of special-casing stuff in the compiler makes compiler harder, but IDE easier.


#[derive(Clone, PartialEq, Eq, Debug)]
pub struct InferenceResult {
type_for: FxHashMap<LocalSyntaxPtr, Ty>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to figure out (not in this PR) a better way to refer to things inside a function. The minimal change would be to use LocalSyntaxPtrs, with text offsets relative to the enclosing function. That way, we can reuse type-inference after most changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking about that as well, but I decided I'd just use LocalSyntaxPtr to get started :)

Though if we want to not have to recompute the type inference itself, we'd also have to have a position-independent representation of the function bodies, right? Like the hir::Exprs you mention maybe...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though if we want to not have to recompute the type inference itself, we'd also have to have a position-independent representation of the function bodies, right?

Hm, right, yeah. The current syntax tree is not even "position-based", it's "identity-based": if one parses the same file twice, the resulting trees won't be eq, and this actually causes interesting behavior because salsa relies on Eq a great deal.

In general, I hope to make syntax trees a very temporarily and short-lived thing in rust-analyzer: because they are immutable and store comments & whitespace explicitly, they have a pretty-high space overhead. Ideally, only syntax trees for files opened in the editor should be present in memory, syntax trees for other files should be recreated on demand and torn down liberally.

Using LocalSyntaxPtr is defenitelly the best solution for now!


fn infer_path_expr(&mut self, expr: ast::PathExpr) -> Cancelable<Option<Ty>> {
let ast_path = ctry!(expr.path());
let path = ctry!(Path::from_ast(ast_path));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to make a hir::Path from expr as a first step, and then work from that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem there was that FnScopes::resolve_local_name wants a NameRef. Though a helper method that takes a node and a name separately to resolve could be added to FnScopes...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I'd say that we need to fix FnScopes somehow as well (via hir::Expr probably).

@matklad
Copy link
Member

matklad commented Dec 24, 2018

😍 😍 😍 😍

Yessssss!

@matklad
Copy link
Member

matklad commented Dec 24, 2018

r=me

@flodiebold I've added you to bors reviewrs, could you bors r+ to check if it works?

@matklad
Copy link
Member

matklad commented Dec 24, 2018

I realize this may all get thrown away when/if the compiler query system gets usable, but I feel like there are lots of IDE features that could be implemented with somewhat working type inference in the meantime smile

Yeah, that's precisely the plan. However, @nikomatsakis was musing about rewriting rustc type-inference as a separate crate, in the style of chalk and poloneous, and this can be a beginning of that work.

@matklad
Copy link
Member

matklad commented Dec 24, 2018

Implementation-wise, I love everything I see :)

The biggest open question is how do we identify an expression inside a function? I am not sure that LocalSyntasPtr is the best choice. And we might want to apply some deshugaring to ast to ease the type inference (removing parens, rewriting for loops, replacing missing expressions with Expr::Missing), etc. This applies to name resolution as well.

So perhaps we should create a hir::Expr where expressions are usual enums, and create a per-function table which maps source expressions to hir::Exprs, the way FileItems table works.

However, I perfer to merge this as is, an iterate on design in separate PRs :)

static INIT: Once = Once::new();
INIT.call_once(|| Logger::with_env().start().unwrap());
dir_tests(&test_data_dir(), &["."], |text, _path| infer_file(text));
}
Copy link
Member

@matklad matklad Dec 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much love such "data-driven" tests, where you just thrown two text-files at the thing and compare results.

However, for debugging, it's very useful to have a separate test for each dataset, so that one can easily launch just a single test and get useful logging, etc. We already have that feature :)

So, let's rewrite this as a bunch of functions like

#[test]
fn infers_types_for_primitives() {
    check_inference(r#"
fn test(a: u32, b: isize, c: !, d: &str) {
     a;
     b;
     c;
     d;
     1usize;
     1isize;
     "test";
     1.0f32;
 }
"#, r"
 [33; 34) 'd': [unknown]
 [88; 94) '1isize': [unknown]
 [48; 49) 'a': u32
 [55; 56) 'b': isize
 [112; 118) '1.0f32': [unknown]
 [76; 82) '1usize': [unknown]
 [9; 10) 'a': u32
 [27; 28) 'c': !
 [62; 63) 'c': !
 [17; 18) 'b': isize
 [100; 106) '"test"': [unknown]
 [42; 121) '{     ...f32; }': ()
 [69; 70) 'd': [unknown]
")
}

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, though I like the ease of just deleting the file and rerunning the test to regenerate it, but admittedly it's just a matter of copy/pasting the expected result. Ideally, we would both be able to run single tests and to automatically update them...

@flodiebold
Copy link
Member Author

@matklad
About the tests, we could do a hybrid like

#[test]
fn infers_types_for_primitives() {
    check_inference(r#"
fn test(a: u32, b: isize, c: !, d: &str) {
     a;
     b;
     c;
     d;
     1usize;
     1isize;
     "test";
     1.0f32;
 }
"#, "data/infers_types_for_primitives.txt");
}

?
Then we could run them separately, but still update automatically. I fear a bit that once we've got a few more tests, as long as there's still so much functionality missing, we'll have to update a lot of tests every time we improve the inference.

@matklad
Copy link
Member

matklad commented Dec 24, 2018

we could do a hybrid like

Excellent idea! I've missed the "auto-update" bit of the current impl

@flodiebold flodiebold changed the title Very WIP: Beginnings of type inference Beginnings of type inference Dec 24, 2018
@flodiebold
Copy link
Member Author

Ok, did that and cleaned up a bit more, so I think we can merge now!

I'll try:
bors r+

bors bot added a commit that referenced this pull request Dec 24, 2018
327: Beginnings of type inference r=flodiebold a=flodiebold

I was a bit bored, so I thought I'd try to start implementing the type system and see how far I come 😉  This is obviously still extremely WIP, only very basic stuff working, but I thought I'd post this now to get some feedback as to whether this approach makes sense at all.

There's no user-visible effect yet, but the type inference has tests similar to the ones for the parser. My next step will probably be to implement struct types, after which this could probably be used to complete fields.

I realize this may all get thrown away when/if the compiler query system gets usable, but I feel like there are lots of IDE features that could be implemented with somewhat working type inference in the meantime 😄 

Co-authored-by: Florian Diebold <flodiebold@gmail.com>
@bors
Copy link
Contributor

bors bot commented Dec 24, 2018

Build succeeded

@bors bors bot merged commit 4befde1 into rust-lang:master Dec 24, 2018
@matklad
Copy link
Member

matklad commented Dec 24, 2018

For the next steps, I think it makes sense to focus on two bits:

  • making type-inference information indpendent of positions & syntax trees. That is, the should not be a direct dependency between type_of and source_file queries, because the latter is identify based and so is always stale. See how InputModuleItems makes name resolution independent of the syntax (typing_inside_a_function_should_not_invalidate_item_map test in particular). I think hir::Expr is the way to go here, but perhaps we can come up with something smarter?

  • type inference for adts: enums and structs. Figuring out the type of self and completing methods from the "current" impl would be awesome user-experience wise.

@flodiebold
Copy link
Member Author

Yeah, I'm working on ADTs for now.

@flodiebold flodiebold deleted the types branch February 24, 2019 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants