Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iwyu proposes include without path, albeit include with path exist already #364

Closed
gunrot opened this issue Oct 11, 2016 · 9 comments
Closed

Comments

@gunrot
Copy link

gunrot commented Oct 11, 2016

blabla.h should add these lines:
# include "x.h"             // for stuff

blbla.h should remove these lines:
- #include "a/b/c/x.h"  // lines 17-17

x.h is in somepath/a/b/c, somepath and somepath/a/b/c are in include path.

@kimgr
Copy link
Contributor

kimgr commented Feb 19, 2017

I think I agree :-)

Can you come up with a self-contained repro case? Without it, I don't think we have much to work on, and I would vote to close.

@bungeman
Copy link
Contributor

bungeman commented Mar 2, 2018

Here is a testcase for what I think is the same issue iwyu-testcase.zip.

Depending on the order of the original includes in src/makefoo.cpp iwyu will either be happy or unhappy. In the original project the reason for a setup like this is because some header files need to be able to work with fewer include paths than most of the source files. As a result, some of the header files need to be more explicit about the path to other header files. The same header file may be included differently in multiple places and iwyu seems to just pick one (the last one?).

There are multiple ways to resolve this. One way is to simply prefer the name in 'source' file if it's already there. Another is to try to always use the 'simplest' way available.

@bungeman
Copy link
Contributor

bungeman commented Mar 9, 2018

To clarify what is happening, consider the following case (which is like the iwyu-testcase above)

private/foo.h

#ifndef Foo_DEFINED
#define Foo_DEFINED
struct Foo { Foo() { } };
#endif

private/bar.h

#ifndef Bar_DEFINED
#define Bar_DEFINED
#include "../private/foo.h"
class Bar { Foo foo; };
#endif

src/makefoo.cpp

#include "foo.h"
#include "bar.h"
Foo MakeFoo() { Bar bar; (void)bar; return Foo(); }

Run iwyu

$ include-what-you-use -Iprivate -c src/makefoo.cpp
src/makefoo.cpp should add these lines:
#include "../private/foo.h"  // for Foo

src/makefoo.cpp should remove these lines:
- #include "foo.h"  // lines 1-1

The full include-list for src/makefoo.cpp:
#include "../private/foo.h"  // for Foo
#include "bar.h"             // for Bar

Verbose 8 output: foo_first.txt

However, had the includes been in a different order beforehand making src/makefoo.cpp look like

#include "bar.h"
#include "foo.h"
Foo MakeFoo() { Bar bar; (void)bar; return Foo(); }

Run iwyu

$ include-what-you-use -Iprivate -c src/makefoo.cpp
(src/makefoo.cpp has correct #includes/fwd-decls)

Verbose 8 output: foo_second.txt

then everything looks ok. In some sense, I think iwyu is wrong both technically and also not doing what a user wants.

From the technical side when "foo.h" comes first in makefoo.cpp, all declarations should come from its "foo.h" include (since the "../private/foo.h" include from private/bar.h has no content due to the guard macro). When "foo.h" comes second, no declarations should come from its "foo.h" (since the "../private/foo.h" include from private/bar.h has the content and defines the guard macro). So the results here seem backward from what I would expect.

From the user side the desired outcome is to always get the "simplest" path to the include for this translation unit. This may just mean that if the main compilation unit already has a spelling of that path, that's the one which should be used.

@kimgr
Copy link
Contributor

kimgr commented Mar 10, 2018

Thanks! Your comment from a week ago had escaped me. Isn't include name length a pretty good approximation for "simplest"? I can't come up with any examples where we should prefer longer names if a shorter one was used in the translation unit. Maybe public API includes that are also included with a relative name?

@kimgr
Copy link
Contributor

kimgr commented Mar 10, 2018

There was an idea floating around years ago that IWYU should primarily operate on existing #include lines and just move, copy or delete them, depending on use circumstances (the exception would be "discovered" includes, such as <new> for placement new, etc).

I think that would need some kind of internal/external context so that includes can be freely moved and deleted in the internal context (close to the main source file), but only copied from the external context (system or third-party headers) to the internal.

That's about how clear this is in my head, i.e. not at all, but the idea has a certain je-ne-sais-quoi.

@kimgr kimgr changed the title iwyu proposes include without path , albeit include with path exist already iwyu proposes include without path, albeit include with path exist already Mar 10, 2018
@kimgr
Copy link
Contributor

kimgr commented Mar 10, 2018

I've been looking around the stack of issues, and there are several that seem related to this. I'm reluctant to mention them, because they will be linked for eternity, but here's a list of arbitrary numbers you may want to do something with: 16, 144, 273, 374, 401.

It looks like IWYU already does choose the shortest spelling of an include, in ConvertToQuotedInclude:
https://github.com/include-what-you-use/include-what-you-use/blob/master/iwyu_path_util.cc#L180

I'll try and debug your example from that focus point when I find some time.

@kimgr
Copy link
Contributor

kimgr commented Mar 10, 2018

This diff makes your testcase pass:

diff --git a/iwyu_path_util.cc b/iwyu_path_util.cc
index cc8eed6..b30bbe5 100644
--- a/iwyu_path_util.cc
+++ b/iwyu_path_util.cc
@@ -127,7 +128,7 @@ string GetCanonicalName(string file_path) {
 
 string NormalizeFilePath(const string& path) {
   llvm::SmallString<128> normalized(path);
-  llvm::sys::path::remove_dots(normalized);
+  llvm::sys::path::remove_dots(normalized, /*remove_dot_dot=*/true);
 
 #ifdef _WIN32
   // Canonicalize directory separators (forward slashes considered canonical.)

I would be surprised if that solved all problems, but it seems reasonable for path normalization to collapse dots. That said, I think LLVM's remove_dots is too simplistic:

  • It only works for one path style at a time (windows, posix, native) -- so a path on Windows with mixed separators like path/to\../file.h doesn't look like it will be properly collapsed
  • Its docs say it collapses all ../ except leading. That's probably not good enough for us

We can build our own normalization logic to work around these micro-issues, but it seems like there is still some bigger issue lingering.

@bungeman
Copy link
Contributor

While there may be other issues as well, I've been using your diff for the past few days and it sure makes my life easier, so I'd support it.

The only thing I can think of to add (and this seems to be an issue for the existing code) is to ensure after removing dots that the new relative path when resolved against the current base refers to the same underlying file as the original relative path when resolved against its original base. In other words, guard against symbolic links mucking things up.

@kimgr
Copy link
Contributor

kimgr commented Jul 21, 2020

The remove_dotdot fix went in for #690, so let's close this for now.

@kimgr kimgr closed this as completed Jul 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants