Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iwyu suggest to include std::map for std::regex #1110

Closed
maflcko opened this issue Sep 12, 2022 · 5 comments
Closed

iwyu suggest to include std::map for std::regex #1110

maflcko opened this issue Sep 12, 2022 · 5 comments

Comments

@maflcko
Copy link

maflcko commented Sep 12, 2022

Steps to reproduce:

$ cat /tmp/1.cpp 
#include <regex>
#include <string>

std::regex Reg(const std::string &r) { return std::regex{r}; }

int main() {}
$ ./bin/include-what-you-use  /tmp/1.cpp 

/tmp/1.cpp should add these lines:
#include <map>      // for map
#include <sstream>  // for basic_stringbuf<>::int_type, basic_stringbuf<>::p...
#include <vector>   // for vector

/tmp/1.cpp should remove these lines:

The full include-list for /tmp/1.cpp:
#include <map>      // for map
#include <regex>    // for regex, _NFA
#include <sstream>  // for basic_stringbuf<>::int_type, basic_stringbuf<>::p...
#include <string>   // for string
#include <vector>   // for vector
---
$ ./bin/include-what-you-use  --version 
include-what-you-use 0.18 (git:abd5d2b) based on Ubuntu clang version 14.0.0-1ubuntu1
@kimgr
Copy link
Contributor

kimgr commented Sep 12, 2022

Interesting. Does it behave the same if you:

  • only default-construct std::regex
  • construct std::regex with a pattern, but don't copy it

?

We have some issues with template symbols being attributed to instatiation location, so I'm looking for a pattern.

Also, is this with libc++ or libstdc++?

@maflcko
Copy link
Author

maflcko commented Sep 13, 2022

Yeah, this was with libstdc++. ./bin/include-what-you-use -stdlib=libstdc++ /tmp/1.cpp gives the same result. libc++ gives:

$ ./bin/include-what-you-use -stdlib=libc++ /tmp/1.cpp 

/tmp/1.cpp should add these lines:
#include <iosfwd>  // for string

/tmp/1.cpp should remove these lines:
- #include <string>  // lines 2-2

The full include-list for /tmp/1.cpp:
#include <iosfwd>  // for string
#include <regex>   // for regex
---

Switching to default-construction is fine:

$ ./bin/include-what-you-use -stdlib=libstdc++ /tmp/1.cpp 

(/tmp/1.cpp has correct #includes/fwd-decls)

Removing the copy keeps the bug:

$ cat /tmp/1.cpp && ./bin/include-what-you-use -stdlib=libstdc++ /tmp/1.cpp 
#include <regex>
#include <string>
int main() {std::regex{std::string{}};}

/tmp/1.cpp should add these lines:
#include <map>      // for map
#include <sstream>  // for basic_stringbuf<>::int_type, basic_stringbuf<>::p...
#include <vector>   // for vector

/tmp/1.cpp should remove these lines:

The full include-list for /tmp/1.cpp:
#include <map>      // for map
#include <regex>    // for _NFA, regex
#include <sstream>  // for basic_stringbuf<>::int_type, basic_stringbuf<>::p...
#include <string>   // for string
#include <vector>   // for vector
---

@kimgr
Copy link
Contributor

kimgr commented Sep 13, 2022

Thanks! I wonder what the string ctor is doing with the map. Will look into this in more detail as time permits.

kimgr added a commit to kimgr/include-what-you-use that referenced this issue Jan 2, 2023
A number of Visit... methods have been implemented in IwyuBaseASTVisitor
over the years without any particular rationale.

This means they affect traversal and use analysis for both template
instantiations (InstantiatedTemplateVisitor) and regular IWYU
logic (IwyuAstConsumer).

Over time, we've seen several instances of confusing attribution with
templates (particularly with the standard library), and it turns out
this is the reason.

This patch moves CXXMethodDecl, BinaryOperator, CXXCatchStmt and
CXXForRangeStmt down to IwyuAstConsumer, which means they don't see
through template instantiations any more.

Fixes issues: include-what-you-use#927, include-what-you-use#944, include-what-you-use#1003, include-what-you-use#1004, include-what-you-use#1110, include-what-you-use#1149

format
kimgr added a commit to kimgr/include-what-you-use that referenced this issue Jan 2, 2023
A number of Visit... methods have been implemented in IwyuBaseASTVisitor
over the years without any particular rationale.

This means they affect traversal and use analysis for both template
instantiations (InstantiatedTemplateVisitor) and regular IWYU
logic (IwyuAstConsumer).

Over time, we've seen several instances of confusing attribution with
templates (particularly with the standard library), and it turns out
this is the reason.

This patch moves CXXMethodDecl, BinaryOperator, CXXCatchStmt and
CXXForRangeStmt down to IwyuAstConsumer, which means they don't see
through template instantiations any more.

Fixes issues: include-what-you-use#927, include-what-you-use#944, include-what-you-use#1003, include-what-you-use#1004, include-what-you-use#1110, include-what-you-use#1149
kimgr added a commit to kimgr/include-what-you-use that referenced this issue Jan 7, 2023
InstantiatedTemplateVisitor does special traversal to collect a bunch of
AST nodes to ignore, for various reasons. It also builds a resugar-map
to associate desugared types with their original spelling (e.g.
re-binding canonical types to any typedef used for template
instantiation).

CanIgnoreType uses these node sets to determine whether Type instance
can be ignored.

But CanIgnoreType is not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Make sure InstantiatedTemplateVisitor::ReportTypeUse really does ignore
types.

Fixes: include-what-you-use#1004, include-what-you-use#1110
@kimgr
Copy link
Contributor

kimgr commented Feb 23, 2023

I wonder what the string ctor is doing with the map. Will look into this in more detail as time permits.

I've dug deep into the regex tree of headers, and found a clue. Any non-trivial constructor for std::regex calls __detail::__compile_nfa<_Rx_traits>(...), which I think eventually ends up in regex_automaton.tcc, _StateSeq<>::_M_clone(), which in turn uses std::map in its implementation.

This is not an explanation for why include-what-you-use behaves the way it does, it just establishes the link between the std::regex constructor and std::map.

The IWYU behavior is caused by a long-standing problem with how we misinterpret types in template instantiations. All related issues are tagged template-spill .

kimgr added a commit to kimgr/include-what-you-use that referenced this issue Feb 26, 2023
InstantiatedTemplateVisitor does special traversal to collect a bunch of
AST nodes to ignore, for various reasons. It also builds a resugar-map
to associate desugared types with their original spelling (e.g.
re-binding canonical types to any typedef used for template
instantiation).

CanIgnoreType uses these node sets to determine whether Type instance
can be ignored.

But CanIgnoreType is not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Make sure InstantiatedTemplateVisitor::ReportTypeUse really does ignore
types.

Fixes: include-what-you-use#1004, include-what-you-use#1110
kimgr added a commit to kimgr/include-what-you-use that referenced this issue Mar 5, 2023
CanIgnoreType is a little subtle, being a virtual method with very
different implementations for the primary visitor vs. the
instantiated-template visitor.

InstantiatedTemplateVisitor builds a resugar-map to associate canonical
types with their original spelling (e.g.  re-binding canonical types to
any typedef used for template instantiation). Its derived CanIgnoreType
uses these node sets to determine whether Type instance can be ignored,
e.g. because it's a default template param or it's supposed to be
provided by the template.

But CanIgnoreType was not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Change both the base and the instantiation-template implementations of
ReportTypeUse to call CanIgnoreType and always skip over ignored types.

Fixes: include-what-you-use#1004, include-what-you-use#1110
kimgr added a commit to kimgr/include-what-you-use that referenced this issue Mar 5, 2023
CanIgnoreType is a little subtle, being a virtual method with very
different implementations for the primary visitor vs. the
instantiated-template visitor.

InstantiatedTemplateVisitor builds a resugar-map to associate canonical
types with their original spelling (e.g. re-binding canonical types to
any typedef used for template instantiation). Its derived CanIgnoreType
uses this map to determine whether Type instance can be ignored,
e.g. because it's a default template param or it's supposed to be
provided by the template, etc.

But CanIgnoreType was not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Change both the base and the instantiation-template implementations of
ReportTypeUse to call CanIgnoreType and always skip over ignored types.

Fixes: include-what-you-use#1004, include-what-you-use#1110
kimgr added a commit to kimgr/include-what-you-use that referenced this issue Mar 5, 2023
CanIgnoreType is a little subtle, being a virtual method with very
different implementations for the primary visitor vs. the
instantiated-template visitor.

InstantiatedTemplateVisitor builds a resugar-map to associate canonical
types with their original spelling (e.g. re-binding canonical types to
any typedef used for template instantiation). Its derived CanIgnoreType
uses this map to determine whether Type instance can be ignored,
e.g. because it's a default template param or it's supposed to be
provided by the template, etc.

But CanIgnoreType was not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Change both the base and the instantiation-template implementations of
ReportTypeUse to call CanIgnoreType and always skip over ignored types.

This harmonization would not have been possible without the recent
tireless work of bolshakov-a improving resugaring and template handling
in general.

Fixes: include-what-you-use#1004, include-what-you-use#1110
kimgr added a commit to kimgr/include-what-you-use that referenced this issue Mar 5, 2023
CanIgnoreType is a little subtle, being a virtual method with very
different implementations for the primary visitor vs. the
instantiated-template visitor.

InstantiatedTemplateVisitor builds a resugar-map to associate canonical
types with their original spelling (e.g. re-binding canonical types to
any typedef used for template instantiation). Its derived CanIgnoreType
uses this map to determine whether Type instance can be ignored,
e.g. because it's a default template param or it's supposed to be
provided by the template, etc.

But CanIgnoreType was not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Change both the base and the instantiation-template implementations of
ReportTypeUse to call CanIgnoreType and always skip over ignored types.

This harmonization would not have been possible without the recent
tireless work of bolshakov-a improving resugaring and template handling
in general.

Fixes: include-what-you-use#1004, include-what-you-use#1110
kimgr added a commit that referenced this issue Mar 5, 2023
CanIgnoreType is a little subtle, being a virtual method with very
different implementations for the primary visitor vs. the
instantiated-template visitor.

InstantiatedTemplateVisitor builds a resugar-map to associate canonical
types with their original spelling (e.g. re-binding canonical types to
any typedef used for template instantiation). Its derived CanIgnoreType
uses this map to determine whether Type instance can be ignored,
e.g. because it's a default template param or it's supposed to be
provided by the template, etc.

But CanIgnoreType was not used everywhere, so several paths would still
report types that were really known to be undesirable. This would lead
to types used inside templates to be reported in the instantiation
location, for example.

Change both the base and the instantiation-template implementations of
ReportTypeUse to call CanIgnoreType and always skip over ignored types.

This harmonization would not have been possible without the recent
tireless work of bolshakov-a improving resugaring and template handling
in general.

Fixes: #1004, #1110
@kimgr kimgr added this to the iwyu 0.20 milestone Mar 5, 2023
@kimgr
Copy link
Contributor

kimgr commented Mar 5, 2023

Fixed on master as of ffe6c3f.

@kimgr kimgr closed this as completed Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants