-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Fix #12821 (GUI: exclude folders in imported project) #7645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f29f42d
to
572dedf
Compare
See #7540. I had some issues with the behavior which I have not outlined in that PR yet because the code contains a bug which made the tests all pass although that should not have. Also |
@@ -157,7 +157,7 @@ def test_gui_project_loads_relative_vs_solution_2(tmp_path): | |||
def test_gui_project_loads_relative_vs_solution_with_exclude(tmp_path): | |||
proj_dir = tmp_path / 'proj2' | |||
shutil.copytree(__proj_dir, proj_dir) | |||
create_gui_project_file(os.path.join(tmp_path, 'test.cppcheck'), root_path='proj2', import_project='proj2/proj2.sln', exclude_paths=['b']) | |||
create_gui_project_file(os.path.join(tmp_path, 'test.cppcheck'), root_path='proj2', import_project='proj2/proj2.sln', exclude_paths=['b/']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike that we need to rely on some options ending with /
so they are treated as a path. That also complicated things while working on my PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I think the path matching is currently trying to do too much which makes the syntax weird and difficult to implement. Ideally I would like it to make it simpler but I've been hesitant to do so because I don't know what might break because of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally I would like it to make it simpler but I've been hesitant to do so because I don't know what might break because of it.
I dislike it also.. but it is good imho that you try to limit the effects for now.. it's already a big PR anyway.
@@ -450,7 +450,7 @@ class TestImportProject : public TestFixture { | |||
project.fileSettings = {std::move(fs1), std::move(fs2)}; | |||
|
|||
project.ignorePaths({"*foo", "bar*"}); | |||
ASSERT_EQUALS(2, project.fileSettings.size()); | |||
ASSERT_EQUALS(1, project.fileSettings.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this match less? That obviously looks like a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because *
was changed to match any number of characters, including zero. I don't really have a strong opinion on that, but that's how globs usually behave.
lib/pathmatch.cpp
Outdated
if (caseSensitive) | ||
mRegex = std::regex(regex_string, std::regex_constants::extended); | ||
else | ||
mRegex = std::regex(regex_string, std::regex_constants::extended | std::regex_constants::icase); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to print the regular expression and its respective input with the --debug-ignore
option.
See also https://trac.cppcheck.net/ticket/12268. And if you work on a ticket please assign yourself to it in Trac. |
I only noticed you were also working on PathMatch after I made this, sorry about that. I agree the standard regex is slow and I'd rather use pcre, but I didn't want to make it a hard dependency just for this. For the common use case of PathMatch, which is matching maybe a few hundred files or so against a handful of patterns, the performance penalty of using |
No problem. I started working on it thinking it would a low-hanging fruit which wasn't the case. And I forgot to tag it with the ticket. And the one you worked on was no properly triaged.
Right, PCRE is currently not a hard dependency. So PCRE2 might not be a good replacement.
That might depend on the scenario. If there might be thousands of files it could be slow. It has been ages since I tried using
I will try to get the regex refactoring in. The first step is awkward but with some help we might get it into better shape fast. |
With a glob implementation inline in That is probably where my performance concerns might have come from since the latter is also used outside of path handling stuff. |
I mostly agree with this, for the three places where matchglob is currently used for something other than paths it makes sense to keep it that way, all other uses of matchglob should be changed to use PathMatch instead. |
I agree since it makes things more explicit. But I would still prefer if we only have a single implementation of the glob logic. |
lib/pathmatch.cpp
Outdated
if (Path::isAbsolute(path)) | ||
p = Path::simplifyPath(path); | ||
else | ||
p = Path::simplifyPath(Path::getCurrentPath() + "/" + path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the feeling it would be good to create a member mCurrentPath
that is used instead of the function call. It will be good for unit testing purposes. And avoids repeated getcwd
system calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PathMatch
already has mWorkingDirectories
which also uses CWD. That should be configured from the outside. The implementation should best be filesystem-agnostic (not taking the flawed case-sensitivity handling into mind right now).
417d477
to
692cb54
Compare
8819409
to
3348484
Compare
I've updated this to use PathMatch everywhere a pattern is matched against a file, so that file matching is consistent. While doing so I found that there are places where the performance of PathMatch is actually quite important (especially in suppressions), so I've opted to use a hand-written matcher instead of regex. Another noteworthy change is that directories can now be matched without a trailing '/'. This has the awkward side-effect that 'foo.cpp' matches 'foo.cpp/bar.cpp' when 'foo.cpp' happens to be a directory, and 'foo/' matches 'foo' even if foo is not a directory (enforcing directories for such patterns would require a file system lookup), but I still think it's preferable. Regarding matchglob, I think it makes sense to have a more general wildcard implementation in matchglob, and a separate implementation in PathMatch because I'm not convinced that the rules for path globs should be the same as for matching general wildcards in error id's and such. Path globs need to match on path component boundaries, and '*' and '?' do not match path separators. For matching path separators as well, '**' is used instead. These changes fix #12821, #12268, and #13997. Possibly related: #13983. |
{ | ||
const std::string& nativePath = toNativeSeparators(path); | ||
return c == '/' || c == '\\'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm techically in linux, backslash is not a path separator. you can create a directory or file with backslash in the name.
case '*': { | ||
bool slash = false; | ||
++s; | ||
if (*s == '*') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a change of logic that will probably mean that lots of users will have to adapt their configurations. We have documented that a single star matches path separators in some places I believe so it would make sense to look through documentation .. But overtime I have the feeling it is better to use **
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would make sense to look through documentation
I tried to quickly locate some such documentation and did not see it.
But adding a note about it in the releasenotes.txt wouldn't hurt at least.
*/ | ||
explicit PathMatch(std::vector<std::string> paths, bool caseSensitive = true); | ||
enum class Mode : std::uint8_t { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ehm.. Writing a enum is better. But not sure about the names here. I think I'd prefer to not abbreviate. I don't have a clear idea about what names I would prefer instead of Mode,icase,scase.
static bool isRelativePattern(const std::string &pattern) | ||
{ | ||
if (pattern.empty() || pattern[0] != '.') | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spontanously I feel that file1.c
should be a "relative pattern"?
@@ -157,7 +157,7 @@ def test_gui_project_loads_relative_vs_solution_2(tmp_path): | |||
def test_gui_project_loads_relative_vs_solution_with_exclude(tmp_path): | |||
proj_dir = tmp_path / 'proj2' | |||
shutil.copytree(__proj_dir, proj_dir) | |||
create_gui_project_file(os.path.join(tmp_path, 'test.cppcheck'), root_path='proj2', import_project='proj2/proj2.sln', exclude_paths=['b']) | |||
create_gui_project_file(os.path.join(tmp_path, 'test.cppcheck'), root_path='proj2', import_project='proj2/proj2.sln', exclude_paths=['b/']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally I would like it to make it simpler but I've been hesitant to do so because I don't know what might break because of it.
I dislike it also.. but it is good imho that you try to limit the effects for now.. it's already a big PR anyway.
@@ -3345,7 +3345,7 @@ class TestCmdlineParser : public TestFixture { | |||
const char * const argv[] = {"cppcheck", "-isrc\\file.cpp", "src/file.cpp"}; | |||
ASSERT(!fillSettingsFromArgs(argv)); | |||
ASSERT_EQUALS(1, parser->getIgnoredPaths().size()); | |||
ASSERT_EQUALS("src/file.cpp", parser->getIgnoredPaths()[0]); | |||
ASSERT_EQUALS("src\\file.cpp", parser->getIgnoredPaths()[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho it is preferable to always use /
as path separator internally. we don't have to ensure that backslash is handled properly everywhere then. would it cause problems to keep this behavior?
|
No description provided.