Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter redesign (forward slashes on Windows) #257

merged 6 commits into from Jun 3, 2018


Copy link

@latk latk commented May 22, 2018

As discussed in #191, this PR redesigns the filter system.

TL;DR; This is a breaking change that simplifies filters on all platforms. Filters will always use forward slashes, even on Windows. Please test and provide feedback.

How filters currently work

  • build_filter(): filters are compiled into regexes. Before that, the filter is transformed to a realpath(), i.e. an absolute path with symlinks resolved. We can't do that on Windows because backslashes in the filter pattern might be a regex escape, not a path separator. Instead, we prepend the current directory. This fails to produce a filter that can match if it starts with ..\\.

  • apply_filter_include_exclude(): The filters are matched against file paths that happen to be absolute paths.

How this PR implements filtering

A Filter is an object with a match() method. This object manages both the filter regex and the path transformation. Showing the relevant code snippets is easier than explaining:

def build_filter(regex):
    if os.path.isabs(regex):
        return AbsoluteFilter(regex)
        return RelativeFilter(os.getcwd(), regex)

class Filter(object):
    def __init__(self, pattern):
        self.pattern = re.compile(pattern)

    def match(self, path):
        os_independent_path = path.replace(os.path.sep, '/')
        return self.pattern.match(os_independent_path)

class AbsoluteFilter(Filter):
    def match(self, path):
        abspath = os.path.realpath(path)
        return super().match(abspath)

class RelativeFilter(Filter):
    def __init__(self, root, pattern):
        self.root = root

    def match(self, path):
        abspath = os.path.realpath(path)
        relpath = os.path.relpath(abspath, self.root)
        return super().match(relpath)

class AlwaysMatchFilter(Filter):
    def __init__(self):

    def match(self, path):
        return True

class DirectoryPrefixFilter(Filter):
    def __init__(self, directory):
        os_independent_dir = directory.replace(os.path.sep, '/')
        pattern = re.escape(os_independent_dir + '/')

So the build_filter() method just dispatches to the relative or absolute case. The AlwaysMatchFilter is used as a null filter for --gcov-filter, and DirectoryPrefixFilter is used when no --filter was specified, in which case the --root is used. This is in line with the current behaviour, though gcovr -r /foo and gcovr -r /foo -f /foo/ will continue to mean slightly different things iff that path contains a symlink.

Importantly, the filter pattern is no longer transformed as a path before being compiled into a regex! This means we can now run the same filtering code on Windows and Unix. To use filters that match relative paths, we turn that path into a relative path.

One remaining difference is that Windows doesn't seem to support Unix-like symlinks. The relevant test cases still fail, because ln -s creates a copy instead of a symlink. If anyone knows how to enable “real” symlink simulations for the test cases, please submit a PR.

This is a breaking change on all systems. Because the filter regexes are no longer treated as paths, there are some subtle changes on Unix (e.g., this filter will no longer work: -f ./subdir/). The large change for Windows is that filters must use forward slashes. All of these are OK:

gcovr --filter C:/foo/bar/  --filter subdirectory/ --filter ../

These are not:

gcovr --filter C:\wrong\slashes --filter ./wrong-current-directory

Please test this PR on your real projects. I am determined to push this through because it fixes many problems, but I'm hesitant to merge before it's been properly validated. And if you have any suggestions regarding the behaviour of these filters, now is the time to share them.

latk added 5 commits May 20, 2018
This introduces a Filter class with various subclasses for relative and
absolute patterns. This should bring filters on Windows to feature
parity with Unix.
filter-relative tests relative filters,
filter-absolute tests absolute filters.
This showed that continuing to use "realpath" is important for
filtering, in order to resolve symlinks. Doesn't work on Windows.
Copy link

@codecov codecov bot commented May 22, 2018

Codecov Report

Merging #257 into master will increase coverage by 0.73%.
The diff coverage is 97.18%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #257      +/-   ##
+ Coverage   88.76%   89.5%   +0.73%     
  Files          13      13              
  Lines        1478    1515      +37     
  Branches      267     269       +2     
+ Hits         1312    1356      +44     
+ Misses        108     105       -3     
+ Partials       58      54       -4
Impacted Files Coverage Δ
gcovr/ 83.1% <100%> (-0.42%) ⬇️
gcovr/tests/ 100% <100%> (ø) ⬆️
gcovr/tests/ 97.59% <100%> (+0.02%) ⬆️
gcovr/ 94.15% <93.33%> (+5.12%) ⬆️
gcovr/ 68.75% <97.56%> (+6.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 17dfaf6...bbc2155. Read the comment docs.

Copy link

@lisongmin lisongmin commented May 22, 2018

Great, I like it. I will test it later when i free from work.

Copy link

@lisongmin lisongmin commented May 23, 2018

Test on mingw64 with mingw64-x86_64-python3

test relative path with regex : pass

-> % gcovr -s -p  --html-detail -j 4 -f "[.]{2}/(src|include)/" -o coverage/c.html
lines: 99.8% (2195 out of 2200)
branches: 60.7% (1618 out of 2664)

test abs path with unix syntax : pass

-> % gcovr -s -p -j 4 -f "/work/skybilityha/ha-engine/src" -o coverage.txt
lines: 100.0% (948 out of 948)
branches: 64.7% (741 out of 1145)
-> % gcovr -s -p -j 4 -f "C:/msys64/work/skybilityha/ha-engine/src" -o coverage.txt
lines: 100.0% (948 out of 948)
branches: 64.7% (741 out of 1145)

test abs path with windows syntax: needed a check here

-> % gcovr -s -p -j 4 -f "C:\\msys64\\work\\skybilityha\\ha-engine\\src" -o coverage.txt
Traceback (most recent call last):
  File "C:\msys64\home\lison\.local\bin\", line 11, in <module>
    load_entry_point('gcovr', 'console_scripts', 'gcovr')()
  File "C:/msys64/work/github/gcovr\gcovr\", line 476, in main
    options.filter = [build_filter(f) for f in options.filter]
  File "C:/msys64/work/github/gcovr\gcovr\", line 476, in <listcomp>
    options.filter = [build_filter(f) for f in options.filter]
  File "C:/msys64/work/github/gcovr\gcovr\", line 224, in build_filter
    return AbsoluteFilter(regex)
  File "C:/msys64/work/github/gcovr\gcovr\", line 231, in __init__
    self.pattern = re.compile(pattern)
  File "C:/msys64/mingw64/lib/python3.6\", line 233, in compile
    return _compile(pattern, flags)
  File "C:/msys64/mingw64/lib/python3.6\", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:/msys64/mingw64/lib/python3.6\", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "C:/msys64/mingw64/lib/python3.6\", line 855, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "C:/msys64/mingw64/lib/python3.6\", line 416, in _parse_sub
    not nested and not items))
  File "C:/msys64/mingw64/lib/python3.6\", line 502, in _parse
    code = _escape(source, this, state)
  File "C:/msys64/mingw64/lib/python3.6\", line 401, in _escape
    raise source.error("bad escape %s" % escape, len(escape))
sre_constants.error: bad escape \m at position 2

Copy link

@lisongmin lisongmin commented May 23, 2018

Read design again, it seems "\" is not valid in filter, so above testcase is not a problem.

But it is strange we select Abs or relative filter via input regex, and the input regex is not treated as a path(abs path) later.

Copy link
Member Author

@latk latk commented May 23, 2018

Thank you @lisongmin for your tests.

The “bad escape” error from the regex module doesn't look very user friendly, and only occurs on recent Python 3 versions. I'll try to implement a warning when a user writes a filter that cannot be a regex for a path, e.g. if it contains an escaped backslash \\ or an unknown escape.

The Absolute and Relative filters describe what kind of path they are trying to match. So we do not transform the filter, but transform the path into the format that the filter is expecting.

@latk latk force-pushed the unified-filters branch from ccb53ff to ac36380 Jun 3, 2018
@latk latk force-pushed the unified-filters branch from ac36380 to bbc2155 Jun 3, 2018
@latk latk merged commit 7657fdf into gcovr:master Jun 3, 2018
4 checks passed
@latk latk deleted the unified-filters branch Jun 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants