-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Project specific exclusion (configuration?) filter file #229
Comments
This might indeed be a useful addition – thanks for the suggestion. If we are going to add any configuration file, it shouldn't only be possible to ignore/exclude some patterns, but also add other options (there are multiple different kinds of filters and excludes). To implement a configuration file, we need to determine
In the meanwhile, invoking gcovr through a shell script etc. would allow you to specify default arguments. E.g. as a Posix-Shell script: #!/bin/sh
exec gcovr --some default --args "$@" |
Trouble I found when doing a quick proof of concept implementation (https://github.com/thejk/gcovr/tree/ignore) is finding the files and in what directories to expect them. If we concentrate just on the different filters:
When searching for gcov/gcda/gcno files --exclude-directories is used. I mostly use meson for new projects and have old projects using autofoo, meson enforces that you use a separate build directory and autofoo these days encourage it. The trouble I have here is figuring out where a good place for the ignore files would be. Another more generic option would perhaps be to use --config-file or something but then we're back at having to add a option to the gcovr command which was what I wanted to skip. So, the POC branch that can only exclude source files works for my little case but it's probably not generic enough |
As you've pointed out, the build directory would not be a good place for gcovr config files because the build directory should not be version-controlled. Instead, some combination of
might be sensible. The root would usually be the appropriate place. But before we continue, can you explain an example scenario where persistent excludes through a config file would be helpful? I'm not familar with Meson etc., but in most build systems I would either drive gcovr through a separate script that contains my defaults, or through a target in the build system. If that target is written properly, I can add control extra options through variables, no extra config files needed. I want gcovr to be a helpful tool, but I don't want to add features that don't quite solve actual problems that I'm then stuck maintaining for backwards-compatibility: in open source, “no” is temporary, but “yes” is forever. I think the main use case of a gcovr ignore file would be to permanently exclude libraries or autogenerated code that lives within the project root (but not actually your src dir)? Perhaps an invocation style like |
meson generates default gcovr targets, current released version generates I have a pull request (mesonbuild/meson#3145) for improving support for gcovr 3.1 that has not yet been merged. Then the target becomes Now I learn that you can do Trouble with making more clever targets is that meson (as most buildsystems) will allow you to put your source wherever, I mean it's common to put source files in $project_root/src and test code in $project_root/test but you could go $project_root/code ;) So my main case that got me thinking about this: When I generate a report in either build-coverage or build-test I don't want to include any *.gcda from the other build dir, hence my meson patch to change --root to $build_dir. For the case above So, all that said, I can understand if the reaction here is that meson should make better targets and leave it possible for users to give a list of excludes or such. I just thought a directory sensitive ignore file could be kind of neat ;) |
Thank you @thejk for that explanation. It was very helpful. My opinion on this is:
I would love to see a prototype for configuration file support. My general requirements would be:
|
A quick test to see what ArgumentParser could do: Uses a simple key = value config format with # as comment. Non argument options (such as --print-summary) is supported but looks a bit weird (syntax is key without = or value). See commit message for a more complete example. So problems with this approach:
|
Thank you, that looks fantastic! Pretty much exactly what I had in mind. One fundamental difference I'd like to see is that config files are processed before other arguments. That way, the command line args can override config file defaults. This can be done by extracting the --config argument with a separate parser. It may also be helpful to disallow config files to reference other config files, until a particular behaviour can be designed. Regarding your problems:
TBH at this point I'm 40% convinced to just write custom argument parsing code that solves these problems. |
(… time passes) I found the ConfigArgParse module which enhances argparse with better config file support. Maybe we can just use that? It seems pretty much ideal from their README, but I cannot figure out the necessary details due to their lack of documentation. Unfortunately, their bw2/ConfigArgParse#117 PR suggests dropping Python 2.6 support. If that is merged, we can't switch directly. Instead, it may be necessary to fall back to argparse or an older ConfigArgParse version for 2.6, so that gcovr can still be used, even without config file support. |
It's hard to see from reading the code but the config file is parsed before the command line arguments (in the second call to parser.parse_args). It works because I prepend '@configfile' to the list of arguments. If configfile has |
Based on some light research I think an ini style config file may cover all the bases with the added benefit of it being part of the standard lib via the Sample ini content [gcovr]
html = True
html_details = True
fail_under_line = 90.0
exclude =
build
test
vendor Minimal functional example: import configparser
ini_content = r"""
[gcovr]
html = True
html_details = True
fail_under_line = 90.0
exclude =
build
test
vendor
"""
config = configparser.ConfigParser()
config.read_string(ini_content)
print(print(config.sections()))
# => ['gcovr']
print(dict(config['gcovr']))
# => {'html': 'True', 'html_details': 'True',
# 'fail_under_line': '90.0', \
# 'exclude': '\nbuild\ntest\nvendor'} The key detail here will be crafting an appropriate list for argparse from the Config file naming and precedence is another issue from a user standpoint. I think a basic approach is best. Look for a single file in the CWD from a predetermined list of names and the first one that is found we take and ignore the rest. Potential list of "default" config names in order of precedence: Any initial thoughts? If I'm way off base feel free to hurl tomatoes or other rotten fruits. |
What about parsing the commandline for @file and replacing it with the parsed content oft the file. After this you can usw the "normal" commandline parser. This is the way I did it for Perl's Getopt::Long also supporting @file in a file. |
Thank you @Spacetown for that suggestion. Python's argparse has built-in support for
I'm still strongly considering to implement that solution because it's comparatively simple – literally less code & documentation than discussion in this issue. The only realistic alternative would be to write a custom argument parser. That's not impossible, but still a tremendous amount of work (I tried). Or maybe we only have to write custom validation and can keep the core argument parser? I don't know. I took a second look at ConfigArgParse which I still like, but it doesn't seem to support multiple values per argument well (it requires a TOML-style list like I took a peek at pytest which was mentioned here, and they have a lot of code just for config/argument parsing support (see https://github.com/pytest-dev/pytest/tree/master/src/_pytest/config). To be fair they also have to deal with plugins which adds more complexity. Maybe some part of their approach is reusable? I do however take those massive customizations on top of argparse as a strong indication that vanilla argparse is insufficient. Regarding using bog-standard configparser files as suggested by @stadelmanma: This trick with multiline arguments is good to know. But parsing the file is the easier part, while integrating it into the argument parsing is very tricky. The trick is to know the config file name before parsing the arguments so that the command line arguments can override the config file entries, but the command line arguments might override the default config file. Also, a config file has no data model – everything is a string. Does TL;DR: config files are difficult. I still haven't seen an elegant solution. I'm probably a bit too perfectionist here. But: no is temporary, yes is forever. I'll happily merge an imperfect solution as long as it doesn't introduce excessive technical debt and forwards compatibility issues. (Premature lock-in into the complete configparser syntax could be such an issue.) What now? This issue doesn't have high priority for me. If anyone wants to take a shot at it:
|
I find the |
The interpretation of Merging options and handling types are a bit of a pain, a good workaround might be a lightweight subclass of argparse.ArgumentParser that allows for more introspection on the arguments registered. With the additional feature of adding a hook to tell if the argument was actually supplied on the command line or not in order to differentiate between someone using a default value and the parser applying the default value. If I have time I'll try to get a minimally functional example partially based on the above prototype working with the exception of using ini file syntax instead of the basic key, value store. In the final product |
I did a quick test with the Another quick hack (7fd2c93) accepts I have added the test config but I forgot to run the stylecheck. |
Just a quick update that I do read all of the comments and code examples. Thank you to everyone for your suggestions. I won't have much time during the next two weeks to continue work on this, but will revisit later. One alternative that I want to think more about before committing to any approach is to build a layer on top of argparse that performs validation, and that layer can include config file parsing. This would avoid many of the problems of having to work within the argparse constraints, at the cost of maybe 300 lines of extra code plus tests. This is similar to the “introspect argparse to interpret a config file” idea, but is not constrained to argparse's data model because we can implement our own. |
@latk I'm working on moving the validation up and out of argparse. However, I'm setting things up in a very similar way since there system works well and I didn't see a need to reinvent the wheel. It will be awhile before the code is functional and tested since I have very limited free time right now. Relevant Branch: https://github.com/stadelmanma/gcovr/tree/move-argument-validation-outside-of-argparse |
@latk let me know what you think about some of the changes in the branch mentioned above. It doesn't trim down the code much as things stand but it moves a majority of the logic associated with config option setup to not be specific to argparse. That would allow us to define alternative config setup schemes with minimal code duplication (i.e. which ever file format(s) are chosen). Using this system most of the validation can still occur in the same place and the remainder of the code in I haven't added anything specific to config files yet since I didn't want to build up a huge body of work dependent on a config option setup scheme you didn't like. |
@stadelmanma I just read the diff. Thank you so much for working on this! I think your design absolutely moves in the right direction :) One great thing that I want to call out is that you write docstrings for everything! Two comparatively small concerns:
Oh, and this is the first time I've seen a legitimate use of Python's Anyway, I'd prefer to merge a (possibly imperfect) configuration refactoring before moving on to implement actual config files, simply to keep your work closer to the master branch. What do you think? |
Re point 1: I agree the dictionary method is not the best route. My first inclination was to just use a list but then I got hung up on how to deal with the option groups and instances where the In principle it wouldn't be an issue to add structure to support option groups as well as additional logic to provide the actual option flag strings. However it raises the issue of how argparse "specific" do we want to make this system? To a point I don't see a whole lot of harm in it because eventual config file logic can just ignore that info (it might even be useful somehow?) and argparse makes up the primary API for gcovr usage. I'd love some more thoughts on this from your point of view. Re point 2: Your assumption is correct that class only exists to tell us what values the user provided. A better route would be the creation of a sparse dictionary since we could then drop the class altogether. I was shooting for the same level of convenience by trying to make the default logic work. However, we can use the Lastly, I'm all for merging a smaller set of changes in and handling new features as separate pull requests. Smaller diffs make everyone happy 👍 |
The last couple of commits address point number 2 using |
@latk when you have time let me know what you think of the current diff in https://github.com/stadelmanma/gcovr/tree/move-argument-validation-outside-of-argparse. Essentially the GcovrConfigOption class now just wraps the argparse option arguments in an easily accessible fashion with some extra logic to automatically build the option groups. While the class is mainly argparse specific it should be descriptive enough to support a myriad of future config schemes. |
@stadelmanma Thank you for your work, that looks like a solid foundation for integrating config files. Please open a pull request! There are some smaller issues that need to be addressed in a review, but the design is fine :) Your branch has accumulated a lot of commits so I'll probably squash-merge them, or you can rewrite the history first. |
@latk PR is up. I'll let you handle merging commits in your own preferred fashion but I did go ahead and rebase against master to avoid any potential merge conflicts. |
Thanks to @stadelmanma's work in #279 there's now a config abstraction layer over argparse. It is now possible with moderate effort to integrate config file support. Roughly:
|
@latk thanks for merging that in, I've not had the spare time available to do much coding. I hope to have some time to work on an actual config file implementation in the coming weeks. What are your thoughts on the range of config files to support? For sanity I'd say we should only parse a single config file if multiple "candidates" are present in the same project and only look for files in the CWD. The Given that Next up would be any language specific project files we'd like to piggy back off of (helps to keep directory clutter to a minimum). I think each of these would best be handled as separate PRs provided by users who would like the feature assuming the language isn't too esoteric and the format isn't something hard to parse.
I don't think C/C++ has a standard project file? |
Finding the config file should be kept simple, so that it could be extended in the future. E.g., use only the first file that is found of:
I would prefer a config file format that is reasonably close to the command line arguments, in particular that allows multiple values for the same option. This is necessary to support multiple Therefore, I'd prefer a simple ad-hoc format, e.g.:
Sketch of a parser: def parse_rcfile(filename):
with io.open(filename) as f: # TODO encoding
for lineno, line in enumerate(f, 1):
line = re.sub(r'#.*$', '', line) # remove (trailing) comments
line = line.strip()
if not line: # skip empty lines, incl. comment lines
continue
# TODO handle syntax errors
key, value = re.match(r'([\w-]+) \s* = \s* (.*)', line, re.X).group(1, 2)
yield filename, lineno, key, value I don't want to provide explicit support for specific build systems, because that is a lot of complexity and difficult to test and maintain. However, I'd be open to declaring a config plugin interface after we've gained more experience with one config system. |
Agreed on keeping config file search limited and simple as well as deferring support for build systems until a later date. However, I'm not a big fan of creating our own config syntax, even if it is kept simple, since there are already a plethora of config file formats in the world for people to keep track of. Going with a well known format has the benefit of immediate familiarity for a vast majority of the user base (probably) and it can be parsed, generated, etc using a well tested set of 3rd party tools (i.e. pyyaml, and toml). Those tools also remove the need for us to implement our own parser code so the complexity of TOML/YAML parsing is bypassed (both work on Py2.X and Py3.X). If we would want to go the route of using our own config file I'd suggest using a syntax like .clang_complete where it is literally just a listing of command line options. The main benefit I see with this route is that we won't need to document how command line options map back to config file keys. We would also avoid adding another dependency to gcovr. The file could be parsed like so: opts = []
config_options = argparse.Namespace()
def parse_rcfile(filename):
with io.open(filename) as f: # TODO encoding
for lineno, line in enumerate(f, 1):
opts = shlex.split(line.strip(), comments=True) # this ignores content after a '#' by default
parser.parser_args(opts, namespace=config_options)
# TODO catch error and emit error with lineno
# TODO merge namespaces based on action type |
Config files have been implemented in PR #281. Please install the development version from GitHub to test them. In the docs I've marked this feature as experimental: if you find any issues with the design I'm willing to make backwards-incompatible fixes (such as renaming config keys, or tweaking the syntax). Thank you very much to everyone who contributed in this discussion :) |
When using gcovr in multiple projects and especially when used as generic targets like in meson build system I often wished I could add just one or more exclude rule. But also, would be nice to just be able to run
gcovr -r .. -o report
without having to remember exactly what excludes are useful/needed for this project.So, I purpose to add something like a .gitignore file but for gcovr, like .gcovrignore (yes imaginative I know) where each line will act as an extra --exclude filter relative to the ignore file.
To start with perhaps just look in the root dir but would be nice to check all searched directories.
The text was updated successfully, but these errors were encountered: