Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Project specific exclusion (configuration?) filter file #229

Closed
thejk opened this issue Mar 1, 2018 · 29 comments

Comments

@thejk
Copy link
Contributor

@thejk thejk commented Mar 1, 2018

When using gcovr in multiple projects and especially when used as generic targets like in meson build system I often wished I could add just one or more exclude rule. But also, would be nice to just be able to run gcovr -r .. -o report without having to remember exactly what excludes are useful/needed for this project.

So, I purpose to add something like a .gitignore file but for gcovr, like .gcovrignore (yes imaginative I know) where each line will act as an extra --exclude filter relative to the ignore file.
To start with perhaps just look in the root dir but would be nice to check all searched directories.

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Mar 2, 2018

This might indeed be a useful addition – thanks for the suggestion.

If we are going to add any configuration file, it shouldn't only be possible to ignore/exclude some patterns, but also add other options (there are multiple different kinds of filters and excludes).

To implement a configuration file, we need to determine

  • how this integrates with argument parsing, to avoid duplicate validation code, and
  • whether a config file leads to any security considerations.

In the meanwhile, invoking gcovr through a shell script etc. would allow you to specify default arguments. E.g. as a Posix-Shell script:

#!/bin/sh
exec gcovr --some default --args "$@"
@thejk

This comment has been minimized.

Copy link
Contributor Author

@thejk thejk commented Mar 6, 2018

Trouble I found when doing a quick proof of concept implementation (https://github.com/thejk/gcovr/tree/ignore) is finding the files and in what directories to expect them.

If we concentrate just on the different filters:

  • --filter and --exclude
  • --gcov-filter and -gcov-exclude
  • --exclude-directories

When searching for gcov/gcda/gcno files --exclude-directories is used.
When processing the gcov files --gcov-filter and --gcov-exclude are used.
When finding sources from the gcov files --filter and --exclude are used.
(Please correct me if I'm wrong, but to the best of my understanding that's the gist of it)

I mostly use meson for new projects and have old projects using autofoo, meson enforces that you use a separate build directory and autofoo these days encourage it.
So, I have my build directory which is not commited and rather temporary,
in the build directory all my gcda, gcno (or gcov) files are so I run gcovr with --root=build, it will find the gcda files which will have source paths like ../src/foo.c.

The trouble I have here is figuring out where a good place for the ignore files would be.
For --exclude/--filter, ie for source files, the answer is easy enough, put the file in .. or ../src (relative --root) and we can search for the ignore rules while walking the tree to find the source files. This is what my POC branch does.
For --exclude-directories and --gcov-exclude/--gcov-include the answer isn't as clear to me, I mean putting them in the build directory would work and it's nice that they are in the root directory but the build directory shouldn't have any commited or persistent files. Putting the files in the project directory would be a good place but a rule that says that we search for files at the parent of --root just feels wrong.
Could give a --config-dir option or something but feels forced and not really helpful.

Another more generic option would perhaps be to use --config-file or something but then we're back at having to add a option to the gcovr command which was what I wanted to skip.

So, the POC branch that can only exclude source files works for my little case but it's probably not generic enough

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Mar 6, 2018

As you've pointed out, the build directory would not be a good place for gcovr config files because the build directory should not be version-controlled. Instead, some combination of

  • an explicit --config location,
  • the --root directory, or
  • any parent directories of the CWD

might be sensible. The root would usually be the appropriate place.

But before we continue, can you explain an example scenario where persistent excludes through a config file would be helpful? I'm not familar with Meson etc., but in most build systems I would either drive gcovr through a separate script that contains my defaults, or through a target in the build system. If that target is written properly, I can add control extra options through variables, no extra config files needed.

I want gcovr to be a helpful tool, but I don't want to add features that don't quite solve actual problems that I'm then stuck maintaining for backwards-compatibility: in open source, “no” is temporary, but “yes” is forever.

I think the main use case of a gcovr ignore file would be to permanently exclude libraries or autogenerated code that lives within the project root (but not actually your src dir)? Perhaps an invocation style like cd build; gcovr . --root ../src -o ../coverage.txt might already satisfy that, using the undocumented but stable search path feature?

@thejk

This comment has been minimized.

Copy link
Contributor Author

@thejk thejk commented Mar 7, 2018

meson generates default gcovr targets, current released version generates gcovr -r $project_root -o $build_dir/meson-logs/coverage.txt. cwd is $build_dir. $project_root and $build_dir are absolute paths and $build_dir is a subdir of $project_root. As you can see this doesn't work well since the changes to --root in gcovr 3.1 (all *.gcda files under $project_root are picked up instead of just from $build_dir).

I have a pull request (mesonbuild/meson#3145) for improving support for gcovr 3.1 that has not yet been merged. Then the target becomes gcovr -r $build_dir -o $build_dir/meson-logs/coverage.txt (if gcovr version >= 3.1).

Now I learn that you can do gcovr $build_dir -r $project_dir -o ... the achieve almost the same result (output source paths become build_dir/../src instead of ../src).

Trouble with making more clever targets is that meson (as most buildsystems) will allow you to put your source wherever, I mean it's common to put source files in $project_root/src and test code in $project_root/test but you could go $project_root/code ;)

So my main case that got me thinking about this:
project dirs:
src - source code
test - tests
subprojects/googletest-release-1.8.0 - wrapped gtest
build-coverage - build dir with coverage flags, using gcc
build-test - build dir with coverage flags and release flags, using gcc
build - build dir without coverage flags, using clang

When I generate a report in either build-coverage or build-test I don't want to include any *.gcda from the other build dir, hence my meson patch to change --root to $build_dir.
Then I want to exclude all sources from subprojects/ as I have no wish to count coverage on gtest headers.

For the case above gcovr . --root ../src would be too limiting, I want to include test and I have generated sources in the $build_dir that also should be included. Also somewhat impossible to make generic as someone else could put their code in ../code as said.

So, all that said, I can understand if the reaction here is that meson should make better targets and leave it possible for users to give a list of excludes or such. I just thought a directory sensitive ignore file could be kind of neat ;)

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Mar 14, 2018

Thank you @thejk for that explanation. It was very helpful. My opinion on this is:

  1. Generic build system targets are usually going to be futile. If there is a target, it should be configurable, similar to how the compiler invocation is usually quite configurable with many options and flags.

  2. Many of the more complicated use cases are better expressed by includes/--filter than by --exclude. A gcovrignore file would then be a bit too simple to serve many users. Also, I do not want to add another filtering mechanism on top of the existing filters.

  3. Gcovr would benefit from a built-in way to store configuration, since the necessary command line options can sometimes get rather lengthy. “Just write a shell script” is not a suitable solution, especially for cross-platform development. I would prefer this configuration mechanism to be a project-specific config file that can be extended to store arbitrary configuration, even if at the start only filterand exclude fields are allowed.

I would love to see a prototype for configuration file support. My general requirements would be:

  • should use an existing config parser/format
  • should integrate with argparse so that validation code can be reused
  • must allow multiple entries for the same key
  • must allow comments in the config file
  • must discuss how command line flags can override or extend configuration file entries
@thejk

This comment has been minimized.

Copy link
Contributor Author

@thejk thejk commented Mar 19, 2018

A quick test to see what ArgumentParser could do:
thejk@e8be4af

Uses a simple key = value config format with # as comment. Non argument options (such as --print-summary) is supported but looks a bit weird (syntax is key without = or value). See commit message for a more complete example.

So problems with this approach:

  • Somewhat confusing errors, unknown options in the config file will give given as errors by argparser but not reference the line number (or even the file)
  • Positional arguments such as search_paths are not supported
  • I don't like the syntax for options without arguments
  • Arguments that start with @ will be treated as config file to be read if and only if there is a gcovr.conf file or --config is given which can be confusing.
@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Mar 20, 2018

Thank you, that looks fantastic! Pretty much exactly what I had in mind.

One fundamental difference I'd like to see is that config files are processed before other arguments. That way, the command line args can override config file defaults. This can be done by extracting the --config argument with a separate parser. It may also be helpful to disallow config files to reference other config files, until a particular behaviour can be designed.

Regarding your problems:

  • error messages: are basically impossible to fix. One strategy could be to catch any argument errors, and amend the message with the config file name if applicable.
  • search_paths in config files: can perhaps be solved by adding a named parameter
  • keys without values: this is difficult/impossible. If we were to allow YAML-ish values such as option = yes for an --option flag, the config system would have to have a concept of types (like the standard library ConfigParser), or we'd have to change the argument parser to allow that as well. Seems tricky.
  • @config: can only be avoided ignoring the built-in config file support, and feeding args into the parser manually.

TBH at this point I'm 40% convinced to just write custom argument parsing code that solves these problems.

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Mar 20, 2018

(… time passes)

I found the ConfigArgParse module which enhances argparse with better config file support. Maybe we can just use that? It seems pretty much ideal from their README, but I cannot figure out the necessary details due to their lack of documentation.

Unfortunately, their bw2/ConfigArgParse#117 PR suggests dropping Python 2.6 support. If that is merged, we can't switch directly. Instead, it may be necessary to fall back to argparse or an older ConfigArgParse version for 2.6, so that gcovr can still be used, even without config file support.

@thejk

This comment has been minimized.

Copy link
Contributor Author

@thejk thejk commented Mar 21, 2018

It's hard to see from reading the code but the config file is parsed before the command line arguments (in the second call to parser.parse_args). It works because I prepend '@configfile' to the list of arguments.

If configfile has root = foo and the command line is --root bar then the final list of arguments will be --root foo --root bar which argparse will parse as root: bar

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Jun 14, 2018

Based on some light research I think an ini style config file may cover all the bases with the added benefit of it being part of the standard lib via the configparser module. Value-less arguments would get applied using arg_name = True (but their presence alone is all that is really needed). Multiple value arguments like filter aren't "directly" supported but multi-line strings are, so splitting a key's value via newline characters would be sufficient to mimic the behavior.

Sample ini content

[gcovr]
html = True
html_details = True
fail_under_line = 90.0
exclude = 
    build
    test
    vendor

Minimal functional example:

import configparser

ini_content = r"""
[gcovr]
html = True
html_details = True
fail_under_line = 90.0
exclude = 
    build
    test
    vendor
"""

config = configparser.ConfigParser()
config.read_string(ini_content)

print(print(config.sections()))
# => ['gcovr']

print(dict(config['gcovr']))
# => {'html': 'True', 'html_details': 'True',
#     'fail_under_line': '90.0', \
#     'exclude': '\nbuild\ntest\nvendor'}

The key detail here will be crafting an appropriate list for argparse from the config object. I don't exactly remember the semantics but I don't think we will have to worry about escaping things like whitespace since I think sys.argv is already treated "as is". I chose a convention of turning dashes into underscores when mapping command line args to keys (just like argparse) but they could be left as dashes all the same.

Config file naming and precedence is another issue from a user standpoint. I think a basic approach is best. Look for a single file in the CWD from a predetermined list of names and the first one that is found we take and ignore the rest. Potential list of "default" config names in order of precedence: gcovr.ini, setup.cfg. This mimic's the python library pytest's treatment of config files however any naming convention could be used.

Any initial thoughts? If I'm way off base feel free to hurl tomatoes or other rotten fruits.

@Spacetown

This comment has been minimized.

Copy link
Contributor

@Spacetown Spacetown commented Jun 16, 2018

What about parsing the commandline for @file and replacing it with the parsed content oft the file. After this you can usw the "normal" commandline parser. This is the way I did it for Perl's Getopt::Long also supporting @file in a file.
Edit:
I see that there is already a option fromfile_prefix_chars='@' for the ArgumentParser. What about using this?

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Jun 16, 2018

Thank you @Spacetown for that suggestion. Python's argparse has built-in support for @args file arguments; @thejk's above prototype (e8be4af) uses this functionality to implement config files. My criticism of that (in addition to his points in #229 (comment)) is:

  • it introduces another metacharacter that users might have to escape (in theory – but in practice no one will name their files @foo)
  • this @argssyntax violates my personal aesthetics
  • the file syntax is very inflexible
  • The above prototype uses a multi-pass approach that first parses the arguments to see if a --config option was given, then re-parses the arguments with the given or default config file. It works, but it's weird.

I'm still strongly considering to implement that solution because it's comparatively simple – literally less code & documentation than discussion in this issue.

The only realistic alternative would be to write a custom argument parser. That's not impossible, but still a tremendous amount of work (I tried). Or maybe we only have to write custom validation and can keep the core argument parser? I don't know.

I took a second look at ConfigArgParse which I still like, but it doesn't seem to support multiple values per argument well (it requires a TOML-style list like exclude = [foo, bar, baz]). That's not good enough here, because each of those arguments could be a complicated regex.

I took a peek at pytest which was mentioned here, and they have a lot of code just for config/argument parsing support (see https://github.com/pytest-dev/pytest/tree/master/src/_pytest/config). To be fair they also have to deal with plugins which adds more complexity. Maybe some part of their approach is reusable? I do however take those massive customizations on top of argparse as a strong indication that vanilla argparse is insufficient.

Regarding using bog-standard configparser files as suggested by @stadelmanma: This trick with multiline arguments is good to know. But parsing the file is the easier part, while integrating it into the argument parsing is very tricky. The trick is to know the config file name before parsing the arguments so that the command line arguments can override the config file entries, but the command line arguments might override the default config file. Also, a config file has no data model – everything is a string. Does foo = True translate to the argument --foo=True or --foo? This would be solvable by introspecting the argparse parser internals, but those are underdocumented (I tried).


TL;DR: config files are difficult. I still haven't seen an elegant solution.

I'm probably a bit too perfectionist here. But: no is temporary, yes is forever. I'll happily merge an imperfect solution as long as it doesn't introduce excessive technical debt and forwards compatibility issues. (Premature lock-in into the complete configparser syntax could be such an issue.)

What now? This issue doesn't have high priority for me. If anyone wants to take a shot at it:

  • please perform more analysis of other projects. How are they addressing these problems? (In particular: multiple values for arguments, changing the source file location through command line parameters, integration with argparse).
  • optionally, write an issue comment with a small design document for this feature.
  • most likely it makes sense to flesh out @thejk's prototype and add tests + documentation.
@latk latk removed this from the gcovr 4.0 milestone Jun 17, 2018
@Spacetown

This comment has been minimized.

Copy link
Contributor

@Spacetown Spacetown commented Jun 17, 2018

I find the @args syntax not wired, even gcc uses it.
@thejk's above prototype (e8be4af) uses this but overrides the convert_arg_line_to_args to remove comments and add the --. I would prefer to write the -- into the config file. The second parse step is only needed for searching the config files in the default directory if no --config=... is given in the commandline.
It would be much clearer if the configfiles are prepend to the command line with a @ in the order "current working directory" and than the value of --root.

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Jun 18, 2018

The interpretation of foo = True and alternatively foo = False is up to the code implementation. It would be simple to add a basic "truth" check for some common values i.e. false and 0 causing the key to be ignored and everything else counting as "truthy".

Merging options and handling types are a bit of a pain, a good workaround might be a lightweight subclass of argparse.ArgumentParser that allows for more introspection on the arguments registered. With the additional feature of adding a hook to tell if the argument was actually supplied on the command line or not in order to differentiate between someone using a default value and the parser applying the default value.

If I have time I'll try to get a minimally functional example partially based on the above prototype working with the exception of using ini file syntax instead of the basic key, value store. In the final product --root would effect config file discovery and if desired the new options --config or @file option could reign as king and no config file discovery would take place at all.

@Spacetown

This comment has been minimized.

Copy link
Contributor

@Spacetown Spacetown commented Jun 18, 2018

I did a quick test with the @syntax (314c724). It works also recursive but relative paths are relative through the cwd and not to the configuration file because of the implementation in argparse.

Another quick hack (7fd2c93) accepts @and --config=...but overrides the _read_args_from_files of argparse which is called if fromfile_prefix_chars is called. Extensins are encapsulated with # Added: begin and # Added: end.

I have added the test config but I forgot to run the stylecheck.

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Jun 30, 2018

Just a quick update that I do read all of the comments and code examples. Thank you to everyone for your suggestions. I won't have much time during the next two weeks to continue work on this, but will revisit later.

One alternative that I want to think more about before committing to any approach is to build a layer on top of argparse that performs validation, and that layer can include config file parsing. This would avoid many of the problems of having to work within the argparse constraints, at the cost of maybe 300 lines of extra code plus tests. This is similar to the “introspect argparse to interpret a config file” idea, but is not constrained to argparse's data model because we can implement our own.

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Jul 3, 2018

@latk I'm working on moving the validation up and out of argparse. However, I'm setting things up in a very similar way since there system works well and I didn't see a need to reinvent the wheel. It will be awhile before the code is functional and tested since I have very limited free time right now.

Relevant Branch: https://github.com/stadelmanma/gcovr/tree/move-argument-validation-outside-of-argparse

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Jul 25, 2018

@latk let me know what you think about some of the changes in the branch mentioned above. It doesn't trim down the code much as things stand but it moves a majority of the logic associated with config option setup to not be specific to argparse. That would allow us to define alternative config setup schemes with minimal code duplication (i.e. which ever file format(s) are chosen).

Using this system most of the validation can still occur in the same place and the remainder of the code in __main__.py doesn't need changed. Lastly, it provides a way to differentiate true default values and instances where the user provided a default.

I haven't added anything specific to config files yet since I didn't want to build up a huge body of work dependent on a config option setup scheme you didn't like.

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Jul 25, 2018

@stadelmanma I just read the diff. Thank you so much for working on this!

I think your design absolutely moves in the right direction :) One great thing that I want to call out is that you write docstrings for everything!

Two comparatively small concerns:

  • I'm not a fan that the argument information is now split between two places and that the ArgumentParser is assembled by pulling option objects out a dictionary. That is somewhat error-prone and verbose.

    I'd rather be able to write a loop over all gcovr options and stuff them into the correct argument group. This would require that the GCOVR_CONFIG_OPTIONS list gets some structure, e.g. by being turned into a dict of lists or by adding another field to the option class.

    It's not important how, but I would like to see this concern addressed.

  • I'm not sure whether the GcovrConfigOptionValue class is necessary. If I read this correctly you need the user_provided metadata in order to merge the config namespaces

    I think this might be unnecessary if the argument parsing process (and later the config parsing process) creates a sparse dictionary without defaults. Then it might be possible to resolve this through simple precedence. First take the config values, then overwrite or append the argparse values, then fill in any missing values with defaults. Should we stop telling argparse about the defaults, we will have to pre-format the help message strings.

    But I'll trust your judgement here. You wrote something that (presumably?) works and that is good :)

Oh, and this is the first time I've seen a legitimate use of Python's del operator^^

Anyway, I'd prefer to merge a (possibly imperfect) configuration refactoring before moving on to implement actual config files, simply to keep your work closer to the master branch.

What do you think?

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Jul 26, 2018

@latk

Re point 1: I agree the dictionary method is not the best route. My first inclination was to just use a list but then I got hung up on how to deal with the option groups and instances where the metavar wasn't the same as the long option name (short option names aren't always consistent either).

In principle it wouldn't be an issue to add structure to support option groups as well as additional logic to provide the actual option flag strings. However it raises the issue of how argparse "specific" do we want to make this system? To a point I don't see a whole lot of harm in it because eventual config file logic can just ignore that info (it might even be useful somehow?) and argparse makes up the primary API for gcovr usage. I'd love some more thoughts on this from your point of view.

Re point 2: Your assumption is correct that class only exists to tell us what values the user provided. A better route would be the creation of a sparse dictionary since we could then drop the class altogether. I was shooting for the same level of convenience by trying to make the default logic work. However, we can use the add_option_to_parser function to automatically format our help messages with the default value instead of hardcoding it into the string.

Lastly, I'm all for merging a smaller set of changes in and handling new features as separate pull requests. Smaller diffs make everyone happy 👍

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Jul 26, 2018

The last couple of commits address point number 2 using argparse.SUPPRESS. Now instead of a loop to strip out the wrapper object there is a basic "for in" loop to add in default values that are missing from the options namespace.

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Sep 9, 2018

@latk when you have time let me know what you think of the current diff in https://github.com/stadelmanma/gcovr/tree/move-argument-validation-outside-of-argparse.

Essentially the GcovrConfigOption class now just wraps the argparse option arguments in an easily accessible fashion with some extra logic to automatically build the option groups. While the class is mainly argparse specific it should be descriptive enough to support a myriad of future config schemes.

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Sep 9, 2018

@stadelmanma Thank you for your work, that looks like a solid foundation for integrating config files. Please open a pull request! There are some smaller issues that need to be addressed in a review, but the design is fine :) Your branch has accumulated a lot of commits so I'll probably squash-merge them, or you can rewrite the history first.

@stadelmanma stadelmanma referenced this issue Sep 9, 2018
0 of 1 task complete
@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Sep 9, 2018

@latk PR is up. I'll let you handle merging commits in your own preferred fashion but I did go ahead and rebase against master to avoid any potential merge conflicts.

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Sep 14, 2018

Thanks to @stadelmanma's work in #279 there's now a config abstraction layer over argparse. It is now possible with moderate effort to integrate config file support. Roughly:

  • find or write a config file parser that can produce (filename, lineno, key, value) tuples

  • process those tuples in accordance to the new GcovrConfigOption datamodel, i.e. coerce the typeand perform the action, respecting the nargs setting if necessary. If anything fails, the filename/lineno can be used to produce helpful error messages.

  • integrate these parts into the main() so that command line arguments (--config?) are already available, then merge the option values before filling in any remaining defaults. This allows the command line to override config file values, or e.g. append additional --filters to those in a config file.

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Sep 19, 2018

@latk thanks for merging that in, I've not had the spare time available to do much coding. I hope to have some time to work on an actual config file implementation in the coming weeks.

What are your thoughts on the range of config files to support? For sanity I'd say we should only parse a single config file if multiple "candidates" are present in the same project and only look for files in the CWD. The --root option should affect the config file search and if added a --config option could allow manual file selection.

Given that gcovr is "project" agnostic you may want some config file specifically for it (i.e. .gcovr.(ext)) or some other name, and in some reasonably human readable format (i.e. INI, TOML, YAML, etc.). This file would take precedence over any others present. The name and format I leave up to your personal preference and/or what you think would work best for the majority of your users.

Next up would be any language specific project files we'd like to piggy back off of (helps to keep directory clutter to a minimum). I think each of these would best be handled as separate PRs provided by users who would like the feature assuming the language isn't too esoteric and the format isn't something hard to parse.
Examples:

  • Python - "setup.cfg" and "pyproject.toml"
  • Node.js - "package.json"

I don't think C/C++ has a standard project file?

@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Sep 21, 2018

Finding the config file should be kept simple, so that it could be extended in the future. E.g., use only the first file that is found of:

  • --config
  • .gcovr file in the --root
  • .gcovr file in the current working directory.

I would prefer a config file format that is reasonably close to the command line arguments, in particular that allows multiple values for the same option. This is necessary to support multiple filters. As mentioned earlier I dislike Python's INI/ConfigParser implementation because it cannot do this (leading to multiline-value workarounds). YAML and TOML allow arbitrary JSON-like data structures, but YAML is an extremely complex format and TOML does not feature convenient syntax for lists of strings.

Therefore, I'd prefer a simple ad-hoc format, e.g.:

filter = lib/foo
filter = src/
# this is a comment
output = build/coverage/index.html
# a boolean option; the value should be parsed for store_true/store_false options
html-details = yes

Sketch of a parser:

def parse_rcfile(filename):
  with io.open(filename) as f:  # TODO encoding
    for lineno, line in enumerate(f, 1):
      line = re.sub(r'#.*$', '', line)  # remove (trailing) comments
      line = line.strip()

      if not line:  # skip empty lines, incl. comment lines
        continue

      # TODO handle syntax errors
      key, value = re.match(r'([\w-]+) \s* = \s* (.*)', line, re.X).group(1, 2)
      yield filename, lineno, key, value

I don't want to provide explicit support for specific build systems, because that is a lot of complexity and difficult to test and maintain. However, I'd be open to declaring a config plugin interface after we've gained more experience with one config system.

@stadelmanma

This comment has been minimized.

Copy link
Contributor

@stadelmanma stadelmanma commented Sep 22, 2018

Agreed on keeping config file search limited and simple as well as deferring support for build systems until a later date.

However, I'm not a big fan of creating our own config syntax, even if it is kept simple, since there are already a plethora of config file formats in the world for people to keep track of. Going with a well known format has the benefit of immediate familiarity for a vast majority of the user base (probably) and it can be parsed, generated, etc using a well tested set of 3rd party tools (i.e. pyyaml, and toml). Those tools also remove the need for us to implement our own parser code so the complexity of TOML/YAML parsing is bypassed (both work on Py2.X and Py3.X).

If we would want to go the route of using our own config file I'd suggest using a syntax like .clang_complete where it is literally just a listing of command line options. The main benefit I see with this route is that we won't need to document how command line options map back to config file keys. We would also avoid adding another dependency to gcovr.

The file could be parsed like so:

opts = []
config_options = argparse.Namespace()
def parse_rcfile(filename):
  with io.open(filename) as f:  # TODO encoding
    for lineno, line in enumerate(f, 1):
      opts = shlex.split(line.strip(), comments=True) # this ignores content after a '#' by default
      parser.parser_args(opts, namespace=config_options)
      # TODO catch error and emit error with lineno
  # TODO merge namespaces based on action type
@latk latk referenced this issue Sep 29, 2018
@latk latk removed the help wanted label Oct 11, 2018
@latk

This comment has been minimized.

Copy link
Member

@latk latk commented Oct 11, 2018

Config files have been implemented in PR #281. Please install the development version from GitHub to test them. In the docs I've marked this feature as experimental: if you find any issues with the design I'm willing to make backwards-incompatible fixes (such as renaming config keys, or tweaking the syntax).

Thank you very much to everyone who contributed in this discussion :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.