Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module resolution and on-demand macro expansion #60

Open
matklad opened this Issue Sep 7, 2018 · 10 comments

Comments

Projects
None yet
5 participants
@matklad
Copy link
Collaborator

matklad commented Sep 7, 2018

This is a discussion issue to figure out how to make the combination of Rust macros and module system compatible with IDE requirements.

Let's start with formulating a specific narrow task:

maintain a "tree of modules" data structure for a package

Specifically, this data structure should handle goto definition on super in super::foo, on foo in self::foo::bar and to find the root module for a given file.

To start, let's pretend that macros in Rust do not exist, and solve the problem in this setting. This is how ModuleMap datastructure works.

It maintains a set of links, where each link is a pair of module declaration (mod foo;) and a file to which this declaration resolves two. To serve "go to parent module", the links are indexed by destination. To serve "go to child module", the links are indexed by source and name, to serve "find crate root", "go to parent" is applied repeatedly (which is ok, b/c crates are shallow).

The main property which makes this data structure "IDE-ready" is that invalidation is cheep: there's roughly a constant amount of work to update a data structure after a single file is changed, independent of the size of the package. When a file is added/deleted, only links which could point to it are updated (based on file_stem), when a file is changed, only links which originate from this file are invalidated.

Another nice property of the macro-less setting is that module tree is isolated to a single package: changes in upstream and downstream crates do not affect module tree at all.

Now, when macros enter the picture, everything becomes significantly more complicated. First of all, macro interfere with name resolution, so use statements become to matter, and upstream crates become to matter. Second, we really need to expand all macros to handle everything correctly. (consider println!("{}", { #[path="sneaky.rs"] mod foo; 92})).

This last point is worth emphasizing, because it makes the complexity of reacting to modifications pretty bad for IDEs. Consider, for example, this sequence of events:

  • user comments out extern crate failure; at the crate root
  • user invokes goto definition on some super.

To handle the goto request, we need to reexpand all failure macros in the whole crate, and that's O(N) work for O(1) modification. What makes this feel wrong is that although any macro expansion could affect module tree, in reality almost none do. That is, for IDE-ready macro expansion the core requirement I think is

Do not expand macros in function bodies unless doing analysis of the function body itself

This model breaks for two reasons, one of which is mod decls with #[path] attribute, another is impls (a macro-generated impl inside function body is globally visible). I wonder if there's a lanauge-level solution here? Could we require (via a warn lint) to annotate such globaly-visible macro invocations/declarations inside function bodies with #[global_effects] or something?

If we indeed restrict ourselves to expanding only module-level macros, than I think the original macro-less solution could work reasonably if we model macro expansion as addition of new files?

@CAD97

This comment has been minimized.

Copy link
Collaborator

CAD97 commented Sep 7, 2018

Another point to consider is the difference between #[macro_use] and path-import macros. Path-import seems like it would be easier to handle (as you have a limited number of macros to import instead of globally and source-order dependent) and stabilize in the 2018 edition.

Would restricting analysis to path-imported macros and own-crate macro_rules macros dial in the complexity any? Would it be worth the limitation?

@matklad

This comment has been minimized.

Copy link
Collaborator Author

matklad commented Sep 7, 2018

In the above, I am only thinking about path-import macros :-)

For macro_rules, I think we'll have give-up on correctness for the current crate in IDE mode, or we'll have to think really hard :)

@matklad

This comment has been minimized.

Copy link
Collaborator Author

matklad commented Sep 7, 2018

Would restricting analysis to path-imported macros and own-crate macro_rules macros dial in the complexity any?

One more thing here: "own-crate macro_rules" doesn't by us too much anyway: for "read only dependencies" I think we should do a full analysis top-down once, and then cache it. And we should even be able to reuse rustc for this analysis, by relying on cargo check.

@kngwyu

This comment has been minimized.

Copy link

kngwyu commented Sep 8, 2018

For macro_rules, I think we'll have give-up on correctness for the current crate in IDE mode, or we'll have to think really hard :)

Mistype of macro_export? If so I really agree with this opinion! It's too hard 😓

If we indeed restrict ourselves to expanding only module-level macros, than I think the original macro-less solution could work reasonably if we model macro expansion as addition of new files?

Now I don't understand everything correctly... but how about just ignoring macros' effects on ModuleMap?(i.e. when we find mod foo; or use foo; in expanded macros, we don't update ModuleMap).

@matklad

This comment has been minimized.

Copy link
Collaborator Author

matklad commented Sep 8, 2018

Mistype of macro_export?

Somewhat, yeah. I mean, there's currently unstable macro keyword for defining properly namespaced macros, and we should support it 100% precisely. For old non-namespaced macros, I think we can use approximations at least initially (this is mitigated by the fact that this affects only read-write ("current") crates. Readonly deps are analyzed once, in traditional top-down fashion and are always precise). In Rust 2018, we should support macros, imported from external crates, precisely.

but how about just ignoring macros' effects on ModuleMap?(i.e. when we find mod foo; or use foo; in expanded macros, we don't update ModuleMap).

We must handle namespaced macros correctly :) Ignoring is a good strategy to at the start, to make an analyzer, useful for completion, but the end game is definitely to be able to handle all stuff, except for old macros, correctly.

@arielb1

This comment has been minimized.

Copy link

arielb1 commented Sep 9, 2018

user comments out extern crate failure; at the crate root

How common are these kinds of "crate-wide deltas"? I would personally expect the IDE to take a few seconds to recover from me poking with extern crate statements, and smart use of "red-green" incremental should prevent these deltas from propagating between crates.

@matklad

This comment has been minimized.

Copy link
Collaborator Author

matklad commented Sep 9, 2018

Probably uncommon, yeah, and a few seconds to deal with changes in the crate graph is a resonable expectation.

What I am fearing though that even with red-green, just figuring out if the edit results in crate-wide delta or not, might be expensive, at least in the pull-based invalidation model.

Say, you add a new struct. It changes the set of items declared in the module, and may change name resolution results in other modules (if there's a * import, and this struct now somehow shadows the module), and it might change the semantics of some local macro expansions. It most likely won't, but it might be the case that figuring that out itself would be costly.

I don't know how all this work out in practice tough, but it would obviously be great if we somehow were able not to care about stuff inside function bodies at all.

BTW, there's a similarlish discussion issue for red-green specifically here in #62.

@arielb1

This comment has been minimized.

Copy link

arielb1 commented Sep 9, 2018

just figuring out if the edit results in crate-wide delta or not, might be expensive, at least in the pull-based invalidation model.

As long as we can incrementalify macro resolution, we have the dependency DAG, and we can do whatever we want with it. I think we can figure something out - the DAG is not that big, and it's all in-memory.

Hopefully we can find a "fast path" for simple macros to avoid fake shadowing.

@pitaj

This comment has been minimized.

Copy link

pitaj commented Oct 18, 2018

I imagine this is a bit of scope creep, but I think a really cool feature to have eventually would be the following: See a codegen from the macro expanded AST if you, for instance, hover over the macro invocation in your IDE.

@matklad

This comment has been minimized.

Copy link
Collaborator Author

matklad commented Oct 19, 2018

@pitaj that actually would be a very easy feature to add, once we can expand macros. IntelliJ can do it, btw:

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.