Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No diagnostic for error in instantiated context #137

Closed
HighCommander4 opened this issue Aug 25, 2019 · 9 comments
Closed

No diagnostic for error in instantiated context #137

HighCommander4 opened this issue Aug 25, 2019 · 9 comments

Comments

@HighCommander4
Copy link

HighCommander4 commented Aug 25, 2019

Given the following two files:

test.hpp:

void bar(int);

template <typename T> void foo(T t) { bar(t); }

test.cpp:

#include "main.hpp"

struct S {};

int main() { foo(S()); }

I do not get any diagnostics when I open test.cpp, even though it does not compile:

In file included from test.cpp:1:
test.hpp:3:39: error: no matching function for call to 'bar'
template <typename T> void foo(T t) { bar(t); }
                                      ^~~
test.cpp:5:14: note: in instantiation of function template specialization 'foo<S>' requested here
int main() { foo(S()); }
             ^
test.hpp:1:6: note: candidate function not viable: no known conversion from 'S' to 'int' for 1st argument
void bar(int);
     ^
1 error generated.
@sam-mccall
Copy link
Member

Unfortunately this is a pretty hard performance tradeoff I think. It's definitely a shortcoming we should document, but I'm not sure we can fix it.

foo is in the preamble, which we build once and reuse as long as the initial #includes of the file don't change. This preamble build is very slow (it's the main reason first-diagnostics are slow), and one thing that we do to speed it up is skipping parsing the bodies of functions that are declared in the header files.
This is up to a 2x speedup: https://reviews.llvm.org/D41495

When we come to parse the current file and it uses a template, this means we don't have the parsed body to instantiate, so that gets skipped. This is actually a big speedup too: 3-10x according to the same patch above. This means we don't produce any diagnostics as part of the instantiation.
I'm not sure if it's possible even in principle to instantiate the template once we've decided not to parse it in the preamble, templates need a weird mix of early and late analysis.

So per my understanding, we'll get diagnostics from class template instantiations, but not function template ones, and we'd have to give up a lot of performance all the time to get them back. @ilya-biryukov knows more about this than me, though.

@HighCommander4
Copy link
Author

Perhaps we can compute the more expensive diagnostics for the open file as a lower-priority task that runs once clangd is idle?

@sam-mccall
Copy link
Member

It seems like the feature wouldn't be all that good:

  • once we've shown the user the "full" diagnostics, and they edit the file, and we get new "light" diagnostics, we're in a bind. We don't know if the extra diagnostics are valid, and may not get a refresh for another minute. If we hide them, it's confusing and removes most of the value of the feature. If we don't, it's annoying at best and misleading at worst.
  • the behavior is unpredictable: users can't rely on having these diagnostics at any given time.

And pretty expensive:

  • If we don't store and reuse the "full, body-parsed" preambles, latency is going to be terrible, like up to a minute
  • if we do, we're eating a lot more disk and ram all the time
  • this work needs to be done on another thread (to avoid blocking interactive work), and the two need to synchronize to ensure consistent diagnostics - this is a lot of complexity and doesn't fit our current threading model
  • we'll burn a lot of CPU doing this. Even if it's idle CPU it competes with background indexing and battery life.

I'd suggest commenting out the SkipFunctionBodies and seeing if that's actually better overall. We can easily expose that as a flag. Anything else seems likely to be a lot of effort for little reward.

@HighCommander4
Copy link
Author

Good points.

I'll note that C++20 Concepts will alleviate this somewhat. In a situation like the above, where foo() cannot be instantiated with all types, a C++20 version of foo() will hopefully have its template parameters be appropriately constrained, and trying to instantiate it with a type that doesn't satisfy the constraints can be diagnosed without seeing the function body.

@sam-mccall
Copy link
Member

I don't think we have any plans or serious ideas about how to make clear improvements here without hurting performance too much. (Apart from carving out an exception for make_unique!)

C++20 concepts may help with some cases, but this is definitely a limitation of our design. I think this bug can be closed but we should probably link it from a FAQ or something.

@chriselrod
Copy link

chriselrod commented Jan 20, 2023

It really seems like we should have exceptions for emplace_back type functions, and for all headers within the projects we're developing.

Is there some workaround for this issue, too at least enable parsing functions within one's own projects?

@zyn0217
Copy link

zyn0217 commented Jan 27, 2023

I’ve noticed that the strategy we now used for optimizing(simply ignore function bodies in header files when building preamble) not only affects diagnostic messages aforementioned, but also type deduction for generic lambdas.

Type deduction on generic lambda was introduced about 1 year ago, which allows Clangd provide type info for lambda parameter types. It only applies to those templates that only instantiate once.

Not showing code block here as it's in another repo :(

The way it used to check the number of instantiation is to traverse ASTContext that built with Preamble. However, Preamble for non-main file (like headers?) is incomplete as the building process drops the function body(But std::make_unique* are reserved intentionally....), where templates may be instantiated. This could lead to inconsistent and confusing result like this:

#include <algorithm>
#include <vector>

template <typename Iter, typename F> F my_for_each(Iter b, Iter e, F f) {
  for (; b != e; ++b)
    f(*b);
  return f;
}

int main() {
  std::vector<int> V;
  my_for_each(V.begin(), V.end(), [&](auto I) { });    // I: int
  std::for_each(V.begin(), V.end(), [&](auto I) { });  // I: not deduced
}

By commenting out SkipFunctionBodies line I could get the correct deduction result.

A naive workaround that I can think of is, to allow user specify what headers they want to "parse throughly” in config file. The parsing process is transitive, meaning all headers included in specified file would be affected.

Preamble:
  DoParseFunctionBodyFor:
  [
    <algorithm>  #1
    "/abs/path/to/thirdparty/header”  #2
    "headers/relative/to/current/workspace” #3
  ]
  1. std headers. All functions defined here or transitively included by this header would be parsed, e.g. for libstdc++, std::for_each is defined in <bits/stl_algo.h>, which is included by <algorithm>. Writing <algorithm> here makes std::for_each body visible to ASTContext.

  2. Path to a third party header. This library could be installed somewhere like /usr/local/include/…

  3. Path relative to current workspace.

I don't know if something like wildcard rule should be applicable here — comments are welcomed.

As to performance issue, I have learned from the original review D41495 that parsing function body would bring significant performance downgrade. But I don’t know how much downgrade we would bring up if we allow user to config what they want to parse. Yes it is a trade-off, but shouldn't we give the user opportunity to balance performance and features?

I did a rough test — I didn’t implement the idea yet, but later I’d like to provide more metrics here 🔢 . For <bits/stdc++.h> of GCC 11.1, if we remove the SkipFunctionBodies aforementioned and run clangd -check, building time cost for Preamble increases about 0.6 seconds.
(I’m running this test on a Linux VPS machine with 32G RAM, 16 cores CPU.)


1/28 Update:

I implemented this idea tentatively. Simple usage screenshots:

With <vector> parsed completely:
1
Without <vector>, by default.
2
With <algorithm> and <vector>. Note type deduction on generic lambda is available.
3

@HighCommander4
Copy link
Author

Brainstorming a couple of related ideas:

  • As an alternative to an allowlist of headers, we could have an allowlist of function names. This would be more targeted and have a smaller performance impact, at the cost of more user configuration.
    • (This is slightly complicated by the fact that functions can be overloaded and defined in different scopes, but we could do something simple like apply the rule to all overloads in all scopes with the specified name, and accept some unintended matches.)
  • As an alternative to any user configuration, we could have heuristics to detect the sorts of functions where diagnostics from the instantiation are particularly likely to be valuable. Examples of potential heuristics:
    • Functions with a forwarding (T&&...) interface (suggested in an earlier comment).
    • Functions which take a callback as a template parameter (this would handle the generic lambda use case).

@jpy794
Copy link

jpy794 commented May 3, 2023

I encountered this issue when trying to wrap an emplace() with another emplace(). While I understand the performance concerns, I think at least there should be a flag for users to control clangd's behavior. A whitelist makes sense too. Hope there will be some progress on this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants