Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat explicit instantiations as full uses #578

Merged
merged 4 commits into from Dec 7, 2018

Conversation

jru
Copy link
Collaborator

@jru jru commented Aug 13, 2018

Modify IsFowardDecl() to check that the decl is not an explicit template instantiation (declaration or definition) This check is also in its own function, IsExplicitInstantiation(), for convenience.

Copy link
Contributor

@kimgr kimgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

I wonder if the test case would be better if it used individual header files for the different cases, so we can see that cross-file analysis does the right thing? The current test is kind-of a no-brainer -- "here's a bunch of declarations, you're going to be happy with that." (but this test did fail before your change, though, right?)

iwyu_ast_util.cc Outdated Show resolved Hide resolved
@jru
Copy link
Collaborator Author

jru commented Aug 13, 2018

Well, in this case the header file doesn't play any role, that's why I didn't add any.

The result of the test previous to the change suggests the instantiation definition for the ExplicitlyDeclaredAndDefined template to get removed:

tests/cxx/explicit_template_instantiation.cc should add these lines:

tests/cxx/explicit_template_instantiation.cc should remove these lines:
- class ExplicitlyDeclaredAndDefined;  // lines 32-32

The full include-list for tests/cxx/explicit_template_instantiation.cc:

Honestly, I haven't analyzed completely why the third declaration and other two end up being treated as full uses (I am guessing something in the semantics changes when one follows the other), but in any case we don't want them to be confused with forward declarations, so I made sure to keep them.

@kimgr
Copy link
Contributor

kimgr commented Aug 13, 2018

I think a more typical layout/use of explicit instantiations is to have the template definition and explicit instantiation decl in a header file, and the explicit instantiation defn in a .cpp file.

So rather than a minimized test case to provoke this bug, we could put up some representative code. Something like:

// explicit_template_instantiation-template.h

// Base template
template<class T>
class Template {};

// Explicit instantiation declaration.
extern template class Template<int>;


// explicit_template_instantiation-short.h

// Another explicit instantiation declaration.
extern template class Template<short>;

// explicit_template_instantiation.cc
#include "explicit_template_instantiation-template.h"
#include "explicit_template_instantiation-short.h"

// Define explicit instantiation
template class Template<int>;

// Some uses
void f() {
    // Use the base template
    Template<bool> t0;

    // Use the locally-defined explicit instantiation
    Template<int> t1;

    // Use the declared but not defined explicit instantiation
    // (I'm guessing this should type-check, but not link?)
    // Should make sure the -short.h header is not removed
    Template<short> t2;
}

I left out the // IWYU assertions in the interest of brevity (and because I'm not entirely sure what might happen). Overall this seems like a more realistic test case to me.

@jru
Copy link
Collaborator Author

jru commented Aug 20, 2018

I played with it a bit more and I found some other cases where IWYU is not doing the right thing.
Specifically, explicit instantiation declarations of templated functions seem to be ignored.

This needs more work than previously thought..

@kimgr
Copy link
Contributor

kimgr commented Aug 22, 2018

I wonder if it's worth treating that as a separate problem?

I've noticed before that IsForwardDecl does not return true for function prototypes, and that seems to align with the behavior for explicit instantiation decls.

The current patch does a nice job of fixing class templates, I guess?

@kimgr
Copy link
Contributor

kimgr commented Sep 8, 2018

I think we should leave function templates as a known issue to make some headway with what we have here.

@jru
Copy link
Collaborator Author

jru commented Sep 8, 2018

Yes, I don't think they can be tackled the same way since they are not visited directly, unless shouldVisitTemplateInstantiations() returns true, and that breaks currently. I'll create an issue to address this later.

Class and Var templates are almost working, but I got stuck for a while. I think I could finalize this tomorrow.

@jru
Copy link
Collaborator Author

jru commented Sep 12, 2018

I finally managed to finalize this, for class templates so far.

Not sure if we can add something else to the test case. What you mentioned in your example above about verifying that a header file with a explicit instantiation declaration is included is not really possible, since such declaration is not necessary for the usage of template itself.

Copy link
Contributor

@kimgr kimgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you mentioned in your example above about verifying that a header file
with a explicit instantiation declaration is included is not really possible, since
such declaration is not necessary for the usage of template itself.

But it should be, right? We included a header saying there's a singular instantiation for short and if that include is removed, we'll implicitly instantiate it, generating more code. Or if the base template has no definition, we'll fail to compile.

iwyu.cc Outdated Show resolved Hide resolved
iwyu.cc Outdated Show resolved Hide resolved
iwyu_ast_util.h Outdated Show resolved Hide resolved
iwyu_ast_util.cc Outdated Show resolved Hide resolved
tests/cxx/explicit_template_instantiation-template.h Outdated Show resolved Hide resolved
tests/cxx/explicit_template_instantiation-template.h Outdated Show resolved Hide resolved
tests/cxx/explicit_template_instantiation-template.h Outdated Show resolved Hide resolved
@kimgr
Copy link
Contributor

kimgr commented Sep 12, 2018

I think the tests would benefit from separation like I suggested earlier. Maybe you need to be explicit with --check_also to get proper output from headers.

As for the short case, that's really the most interesting to me -- see the review feedback.

@jru
Copy link
Collaborator Author

jru commented Sep 13, 2018

We included a header saying there's a singular instantiation for short and if that include is removed, we'll implicitly instantiate it, generating more code. Or if the base template has no definition, we'll fail to compile.

I guess it makes sense. But, it'll need some work :)
So, what should be expected behaviour? Which uses of that template should make the decl stay?

For example, looking at the 7 cases below. Should we keep the include for all of them, or only the uses of the same specialization? And only full uses, or also fwd decl uses? What about implicit uses?

// file template.h
template<class>
class Template {};

// file template-decl-short.h
extern template class Template<short>;

//file main.cc
#include "template.h"
#include "template-decl-short.h"

template <class T> class FullUseArg { T t; }
template <class T> class FwdDeclUseArg { T* t; }
template <class T = Template<short>> class TemplateAsDefaultFull { T t; };
template <class T = Template<short>> class TemplateAsDefaultFwd { T* t; };

Template<short> s; // 1
template Template<short>; // 2
FullUseArg<Template<short>> fu; // 3
FwdDeclUseArg<Template<short>> fd; // 4
TemplateAsDefaultFull<> td; // 5
TemplateAsDefaultFwd<> td; // 6
Template<int> s; // 7

I would go for 1-3 and 5, but maybe that misses other cases. For example, if this is itself an include and there are only forward decl uses of the template, but it is then fully used somewhere else.

Also note, if this is implemented the include 'template-decl-short.h' will be required in all compilation units that use that specialization. Do we want that?

@kimgr
Copy link
Contributor

kimgr commented Sep 16, 2018

Also note, if this is implemented the include 'template-decl-short.h' will be required in all compilation
units that use that specialization. Do we want that?

I think so... Or you mean that it could be transitively satisfied, so the header doesn't need to be included everywhere? From a strict IWYU perspective it would be nice to explicitly include that specialization everywhere it's used because it makes the code resilient to refactors -- if the specialization include is removed from an incidental dependency, our code will silently change meaning (using the base template instead).

I find it hard to reason about the seven cases when they're all in the same file, but I'll give it a shot...

template <class T = Template<short>> class TemplateAsDefaultFull { T t; };
template <class T = Template<short>> class TemplateAsDefaultFwd { T* t; };

I'm not sure this is 100% consistent, but it seems to me IWYU's general rule is that the file that spells out the type should include the type. So template argument defaults should be attributed to the file declaring the template. I can't play this out in my head for the templates above, so not sure how forward-decl vs full uses should behave.

Template<short> s; // 1

This seems like it should obviously require template-decl-short.h.

template Template<short>; // 2

Explicit template instantiation definition of Template<short>, right? That should require template-decl-short.h, otherwise it will create a reusable instantiation of the wrong template.

FullUseArg<Template<short>> fu; // 3

Template<short> is spelled out here and fu constructor will need to see the full type, so template-decl-short.h required.

FwdDeclUseArg<Template<short>> fd; // 4

Template<short> is spelled out but constructing fd does not need the full type, so a forward-decl or explicit instantiation declaration of Template<short> should do here.

TemplateAsDefaultFull<> td; // 5
TemplateAsDefaultFwd<> td; // 6

These are tricky.

The general rule says we shouldn't need Template<short> because it's not spelled out. But it is required for constructing td. I think this is one of the author-intent rules -- Template<short> should be attributed to the file declaring TemplateAsDefaultFull unless it has a forward-declare of Template<short> to signal that users should be responsible. And I think it should always require the complete type of the default template arguments, because I can't get my head around the analysis of what might happen when the templates are instantiated.

Template<int> s; // 7

No use of Template<short>, so this doesn't count toward that header.

@kimgr
Copy link
Contributor

kimgr commented Sep 16, 2018

I also think it's hard to reason about this, because I keep confusing "explicit instantiation" with "explicit specialization".

Specializations, whether partial or total, alter the behavior of the program if visible to the compilation unit.

Explicit instantiations, on the other hand, seem to be more of a build-time optimization: we can use them to help the compiler do separate compilation of templates.

The challenge in IWYU is that we want to preserve the behavior of the program and not worsen build times. So it would seem we can never remove a specialization or explicit instantiation unless we can prove that they're unused.

I'm pretty sure this is not helping :), I just needed to type it out to understand it better. I'm really out of my depth when it comes to template handling in IWYU.

@jru
Copy link
Collaborator Author

jru commented Sep 18, 2018

I just added some work-in-progress that already satisifies 1 and 2. For the rest, there is more work to do.

Also something I noticed is that IWYU tries very hard to normalize decls and their locations. :) For usual template instantiations that's not a problem (as we usually want the definition), but for explicit instantiation decls, I really had to fight it so that it would leave the decl I pass in to ReportDeclUse alone.

I am not happy with the result so far, so I would like to gather more input before I continue with the other cases.

@jru
Copy link
Collaborator Author

jru commented Nov 2, 2018

FYI, I am still working on this. It turned out to be much harder than I thought, but I think I start to see the light at the end of the tunnel :)

@jru jru force-pushed the explicit_instantiation branch 2 times, most recently from 22966d3 to b2b3686 Compare November 7, 2018 17:18
@jru
Copy link
Collaborator Author

jru commented Nov 7, 2018

I finally managed to finish this. There are a few more commits that in the beginning, so let me explain briefly (more details in the commits themselves):

These are the three main changes to make explicit template instantiations work:

  • Treat explicit template instantiations as full uses
  • Require explicit inst. decls to use main template
  • Use the definition to check if decl is provided by template

These contain some refactoring work that simplifies the implementation:

  • Make 'comment' a default argument in Report* functions
  • Refactor CanIgnoreType for clarity and reuse

Supporting change to be able to report specific redecls:

  • Introduce an extra use flag UF_TargetRedecl

Unrelated bugfix discovered during the implementation:

  • Fix GetLocation for templated functions

Also unrelated to my changes. Seems to be failing after a new warning from clang:

  • Silence warning -Wsizeof-pointer-div in test

@kimgr
Copy link
Contributor

kimgr commented Nov 11, 2018

Very cool! I haven't had a chance to look at the body of the template instantiation logic, but it strikes me that many of the refactors would be nice on their own. Can you separate them out and post them as individual PRs? The problem with huge PRs like this is that they're both hard to review and blocked until all commits are ready to go in, and much of this looks useful even before trying to fix the explicit instantiation problems.

I'll actually start off by cherry-picking the -Wno-sizeof-pointer-div commit, to get master green again. I'll come back with what I think is a logical order of changes afterwards.

@kimgr
Copy link
Contributor

kimgr commented Nov 11, 2018

Is Fix GetLocation for templated functions as free-standing as it looks? It's nice when you can remove code to fix a broken test :) Is there a test that should be written to capture the new behavior?

If it doesn't depend on anything else, I think this would be nice to get on master first.

@kimgr
Copy link
Contributor

kimgr commented Nov 11, 2018

Introduce an extra use flag UF_TargetRedecl: I feel there's more than this going on in this commit.

It seems suspicious that UF_FunctionDfn and UF_TargetRedecl control the same behavior, but I intended for use-flags to say something about the context in which the use was detected, so if they describe different contexts, they should be different. That said, I'm not sure what UF_TargetRedecl says...?

Overall, I feel like if we could break this commit into its constituent parts, maybe there are fragments that could go in earlier and solve smaller problems than the elusive explicit template instantiation :)

There's a lot of spurious reformatting in this commit (and maybe others) -- please tread lightly; it's so much harder to review (and blame in the future) when lines are reformatted as part of a functional change.

@jru
Copy link
Collaborator Author

jru commented Nov 12, 2018

Hey,

I extracted the two refactor commits here: #607. I also rebased this branch on top of it to be sure there are no conflicts. Once it is merged I'll rebase again on top of master.

Introduce an extra use flag UF_TargetRedecl

You were right about the whitespace, I messed it up somehow. Should be fixed now! Also made some small changes to ReportFullSymbolUse to hopefully make it clearer.

That said, I'm not sure what UF_TargetRedecl says...?

The main idea here was to pass extra information to the reporting logic that says "please, treat this decl as-is". I went back and forth trying out different ideas and this seemed like the cleanest solution. Although, not perfect by any means :)

Right now, the flag is not used inside OneUse for anything else, but I can imagine this might turn out to be useful for other tings. In fact, during the implementation I had to use the UF_TargetRedecl to reimplement DeclCanBeForwardDeclared as a member of OneUse, but later on turned out not to be necessary for this case, so I removed it to make the patch smaller.

The existing UF_FunctionDfn is used to implement OneUse::is_function_being_defined so I cannot really mix the two. That said, it might not be the right thing to do to add another flag for this purpose. I am open for ideas :)

In general, it seems to me the reporting side of IWYU iwyu_output.* shouldn't know anything about Clang or the AST and instead work exclusively based on the interface provided by OneUse (tiny as of right now), so that it becomes easier to extend and maintain in the long run.

@jru
Copy link
Collaborator Author

jru commented Nov 12, 2018

Is Fix GetLocation for templated functions as free-standing as it looks?

Yes, it is. I dropped it from the branch and will try to think of a test for it.

@kimgr
Copy link
Contributor

kimgr commented Nov 12, 2018

The existing UF_FunctionDfn is used to implement OneUse::is_function_being_defined so I cannot
really mix the two. That said, it might not be the right thing to do to add another flag for this purpose. I
am open for ideas :)

Ah, I see. No, I think a new flag is probably warranted.

In general, it seems to me the reporting side of IWYU iwyu_output.* shouldn't know anything about
Clang or the AST and instead work exclusively based on the interface provided by OneUse (tiny as of
right now), so that it becomes easier to extend and maintain in the long run.

I've actually been considering going in the opposite direction -- capture and expose parts of the AST and expose to iwyu_output.*.

Some things are hard to classify correctly when traversing the AST because they require cross-examination of multiple structures (can't think of an example now, but UF_FunctionDfn came about to solve one of those problems). The idea of use-flags is to capture some information about the use-location and forward it to later analysis where we have more context. I've even played with the idea of refactoring UseFlags into a UseContext class to forward e.g. AST fragments into iwyu_output (which I think is necessary to solve #105, for example).

So I'm not sure I share the concern that iwyu_output.* should not know anything about prior phases. And I think the new use-flag should say something about the context in which the use occurred, not what the earlier phase thinks the later phase should do with it.

@jru
Copy link
Collaborator Author

jru commented Nov 13, 2018

Hi, refactors are in and branch rebased.

I've actually been considering going in the opposite direction -- capture and expose parts of the AST and expose to iwyu_output.*.

Yes, that might work too. The main point I was trying to make is that the visiting side is lower level, and as such, it doesn't have the complete picture. However, it does have access to all of the details about the AST. The analysis (reporting/iwyu_output) side, on the other hand, is much higher level, with much more context, but has lost much of the detail from the lower phase.

The main problem, as I see it, is that this separation of concerns is not well defined or maintained in the code. For example, the visiting side tries to make high-level decisions about what things to report, whereas the analysis will very often overrule those decisions at a later point after doing some low-level visiting of certain nodes.

I think this interplay between the two should be represented with better abstractions. Maybe the analysis phase needs to use the AST directly after all, but I can imagine that a higher-level model of the AST where we expose what we need and hide the rest, could be more benefitial in the long run.

@kimgr
Copy link
Contributor

kimgr commented Nov 13, 2018

The main point I was trying to make is that the visiting side is lower level, and as such, it doesn't have
the complete picture. However, it does have access to all of the details about the AST. The analysis
(reporting/iwyu_output) side, on the other hand, is much higher level, with much more context, but
has lost much of the detail from the lower phase.

Exactly, that's what I was trying to say too.

The main problem, as I see it, is that this separation of concerns is not well defined or maintained in
the code.

Agreed!

For example, the visiting side tries to make high-level decisions about what things to report, whereas
the analysis will very often overrule those decisions at a later point after doing some low-level visiting
of certain nodes.
I think this interplay between the two should be represented with better abstractions. Maybe the
analysis phase needs to use the AST directly after all, but I can imagine that a higher-level model
of the AST where we expose what we need and hide the rest, could be more benefitial in the long
run.

Yes, I think there might be a phase missing to sort out stuff like author intent. That is;

  • AST visiting to collect "raw" uses
  • Refinement phase to take stuff like author intent into account and move responsibilities around
  • Analysis phase to play out the IWYU rules

There are abstractions missing, for sure, and some amount of purity where phases don't make too many "forward" assumptions. I.e. visiting the AST should be mostly oblivious to whether a use will count for something (except possibly for CanIgnore* type things to avoid collecting too much data).

And the Clang AST data structures are mostly wonderful to work with, so it would be nice to use them as a lingua franca throughout the pipeline (with some additional decoration like parent maps, etc) instead of inventing our own representations.

I'm regularly missing an include graph abstraction, where it's possible to ask about the "path" between two files (e.g. between the main file and GetFileEntry(some_symbol)), and I'm sure there are others.

@jru
Copy link
Collaborator Author

jru commented Nov 14, 2018

Yep, all good ideas! :) Sadly, it is not obvious where to start with it all while not breaking anything..

Btw, would you like me to rework the UF_TargetRedecl commit? Or is it clear enough?

@kimgr
Copy link
Contributor

kimgr commented Nov 14, 2018

Btw, would you like me to rework the UF_TargetRedecl commit? Or is it clear enough?

Yes, I think it would be nice if it followed existing form and described the context of the use, rather than telling the next phase what to do.

@jru
Copy link
Collaborator Author

jru commented Nov 15, 2018

Yes, I think it would be nice if it followed existing form and described the context of the use

Right. The thing is that it is not really a flag for the use, but for the decl associated to it. I could name it UF_ExplicitInstantiation and explain that the use targets one.

Another option is to create another set of DeclFlags, but it seems like overdoing it a little.

@jru jru force-pushed the explicit_instantiation branch 2 times, most recently from b0469cd to 66c7c1c Compare November 15, 2018 08:31
@kimgr
Copy link
Contributor

kimgr commented Nov 15, 2018

UF_ExplicitInstantiation works for me!

@jru
Copy link
Collaborator Author

jru commented Nov 16, 2018

Done. All other comments were addressed too.
If you are good with it I'll rebase the fixups and reword the commit.

@kimgr
Copy link
Contributor

kimgr commented Nov 17, 2018

I'd rather review the changes commit-by-commit, so please rewrite the commit history to look the way you want to ship it. I started looking at the total diff, and overall it looks really nice!

@jru jru force-pushed the explicit_instantiation branch 2 times, most recently from 3c30f28 to 5b2e01a Compare November 19, 2018 08:41
@jru
Copy link
Collaborator Author

jru commented Nov 19, 2018

It is ready.

Copy link
Contributor

@kimgr kimgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this incredibly well-factored work! I feel like this really pushes IWYU's understanding of templates forward.

Some nits and questions inline, but this mostly looks good to me.

tests/cxx/explicit_instantiation.cc Outdated Show resolved Hide resolved
iwyu_ast_util.cc Outdated Show resolved Hide resolved
iwyu_ast_util.cc Outdated Show resolved Hide resolved
iwyu.cc Outdated Show resolved Hide resolved
iwyu.cc Show resolved Hide resolved
tests/cxx/explicit_instantiation2.cc Outdated Show resolved Hide resolved
tests/cxx/explicit_instantiation2.cc Show resolved Hide resolved
tests/cxx/explicit_instantiation2.cc Outdated Show resolved Hide resolved
tests/cxx/explicit_instantiation2.cc Outdated Show resolved Hide resolved
@jru
Copy link
Collaborator Author

jru commented Dec 4, 2018

Thanks for the kind words!
I think I addressed all the comments, except this one. Could you clarify what you meant?

Copy link
Contributor

@kimgr kimgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Please merge with the small inline suggestions fixed (or not, if you disagree vehemently). Thank you!

@@ -1649,12 +1652,14 @@ class IwyuBaseAstVisitor : public BaseAstVisitor<Derived> {
if (CanIgnoreDecl(target_decl))
return;

const UseFlags use_flags =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually don't use const for locals, and removing this makes the line fit in colwidth 80

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think is good practice to prevent accidental modifications, especially in long functions. It also helps document the intent.

iwyu_ast_util.cc Outdated Show resolved Hide resolved
iwyu_ast_util.cc Outdated Show resolved Hide resolved
Teach IsFowardDecl() to not confuse an explicit template instantiation
with a forward declaration.

Also make sure to report explicit instantiations (declaration or
definition) as full uses during visitation.

Fixes include-what-you-use#558
To signal that the use refers to an explicit instantiation, for which the
"canonical" decl is not suitable.

Also, since it is meant to be used in a specific context, the ReportDeclUse
prototype has been adapted to take optional extra flags as an additional
input.
If the usage of a template has visible explicit instantiations decls,
require them to be included along with the main template definition.

ScanInstantiatedType has also been changed to make sure that the template
being analyzed always corresponds to the definition to make it work as
expected when there is an explicit instantiation definition visible.

Fixes include-what-you-use#558
TypeToDeclAsWritten would not return the written definition in all cases.

For templates, for example, the selected declaration could be the one
where it was explicitly instantiatiated. That, in turn, affects the source
location that is returned and ultimately the outcome of the function.

Make sure to always get to the definition before using it to determine
whether the template being instantiated provides the decl.

Fixes include-what-you-use#558
@jru jru merged commit edfc3be into include-what-you-use:master Dec 7, 2018
@jru jru deleted the explicit_instantiation branch December 7, 2018 08:12
@kimgr
Copy link
Contributor

kimgr commented Dec 8, 2018

\o/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants