[RFC] [PoC] Configuration generation #10058


leandrolanzieri
Contributor

@leandrolanzieri leandrolanzieri commented Sep 27, 2018

What's the goal of this PR and its scope?

This PR proposes a first workaround for "How to handle the configuration of modules in RIOT?".

This work is in conjunction with @jia200x

Here we are presenting an idea of what a solution to some of the issues found at the RIOT summit might look like.

This is a PoC intended to find consensus on the usage of configuration files (which generate header files as output), the YAML serialization format, and the proposed schema.

The scope of this PR is configuration format and mechanism (not how to integrate it to the build system), and only defines application configurations (but can be extended to all RIOT modules). We took the LoRaWAN example as a reference.

This PR is not by any means intended to be merged.

What's the proposal?

Draft 2:

  1. Declare and describe configurations in a well-known data structure (YAML is proposed):
    Define name, default values, description, types and properties. See here for an example. (The purpose of the header file on the PR is just to show the output of the configuration generator)

  2. Generate xxx_config.h files from these configuration files, with every value wrapped as #ifndef CONFIG_XXX #define CONFIG_XXX VALUE #endif.

  3. Either:
    3a. RIOT is shipped with these xxx_config.h files in modules directories: They are regenerated only if the configuration file changes, and the config header is never meant to be edited by developers.
    3b. xxx_config.h files are generated on every build (make all) in the build directory.

Please note that adding tools on top (e.g. a CLI/GUI for exposing configurations) is almost straightforward if we parse configuration files instead of header files.

How to override default configuration values is the next discussion. With both 3a and 3b it is possible to use CFLAGS, a standard header file, riotbuild.h, or any other mechanism.

Important: xxx_config.h are a subset of the information in the Configuration files (YAML). All metadata should be included in the YAML file, and the xxx_config.h file is only the interface for injecting configurations in the corresponding C files.
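
A minimal sketch of step 2, assuming the YAML has already been parsed into a Python dict (for example with PyYAML's yaml.safe_load) and follows the list-of-groups layout shown in this PR. The helper name render_config_header and the CONFIG_<GROUP>_<NAME> naming rule are illustrative assumptions, not the PoC's actual code.

```python
def render_config_header(config, guard="APP_CONFIG_H"):
    """Render a C header from a parsed configuration dict (hypothetical helper)."""
    lines = [
        "/* This is an automatically generated configuration file",
        " * DO NOT EDIT, the content will be overwritten.",
        " */",
        f"#ifndef {guard}",
        f"#define {guard}",
        "",
    ]
    for entry in config.get("configuration", []):
        group = entry["group"]
        for param in group.get("parameters", []):
            # Assumption: macro names are CONFIG_<GROUP>_<NAME> in upper case.
            macro = f"CONFIG_{group['name']}_{param['name']}".upper()
            lines += [
                f"/* {param.get('description', '')} */",
                f"#ifndef {macro}",
                f"#define {macro} {param['value']}",
                "#endif",
                "",
            ]
    lines.append(f"#endif /* {guard} */")
    return "\n".join(lines) + "\n"


# Same shape as the dict a YAML parser would return for the example file.
example = {
    "name": "my-lib",
    "configuration": [
        {"group": {
            "name": "timing",
            "parameters": [
                {"name": "measurement_period",
                 "description": "Period for reading sensor in seconds",
                 "value": 10, "type": "int", "properties": {"max": 60}},
            ],
        }},
    ],
}
header = render_config_header(example)
print(header)
```

Because every value is wrapped in #ifndef, the generated header only supplies defaults, and any earlier definition (CFLAGS, another header) still wins.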

FAQ

What's the problem with declaring configurations in header files?

We started working with the following proposal:

/**
 * @brief maximum number of cooks per soup
 *
 * fmt:int:min=1:max=10
 */
#ifndef CONFIG_KITCHEN_MAX_COOKS_PER_SOUP
#define CONFIG_KITCHEN_MAX_COOKS_PER_SOUP 1U
#endif

We were planning to add these configurations in xxx_config.h files on each module, so they can be included in C files that require these configurations.

These files were supposed to be used to extract information about params (name, default values, restrictions) and override values via CFLAGS or another header file.

However, we found some issues:

  • Extracting parameter information is not straightforward. In general, C files are not intended to be parsed. E.g. Doxygen comments and the per-parameter meta-configuration are mixed together, so there can be corner cases.
  • Documentation generation. With this format we would need to replace the fmt line with the configuration description, and probably add || defined(DOXYGEN). Also, the proposed format depends on Doxygen.
  • It's hard to override values across a multi-level hierarchy (e.g. the CPU defines configurations that could be overridden by the board, and then also by the application).
  • It's not easy to describe dependencies with this format (e.g. the current configuration needs another configuration to have a certain value).
  • Validation of the format doesn't come out of the box, so we would need to add a custom tool.
  • This doesn't prevent naming collisions.
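
To illustrate the "custom tool" point, here is a rough sketch of what extracting the fmt line from the header comment above could look like. The regular expression and the function name parse_fmt_line are assumptions made for this example; it only handles the exact fmt:type:key=value shape shown and leaves all values as strings.

```python
import re

# Matches a comment line such as " * fmt:int:min=1:max=10".
FMT_RE = re.compile(r"^\s*\*\s*fmt:(?P<type>\w+)(?P<props>(?::\w+=\w+)*)\s*$")


def parse_fmt_line(line):
    """Return the type and properties of a fmt line, or None for other lines."""
    match = FMT_RE.match(line)
    if match is None:
        return None
    props = {}
    for item in match.group("props").split(":"):
        if item:  # the first element is empty because props starts with ":"
            key, value = item.split("=")
            props[key] = value
    return {"type": match.group("type"), "properties": props}


parsed = parse_fmt_line(" * fmt:int:min=1:max=10")
skipped = parse_fmt_line(" * @brief maximum number of cooks per soup")
```

Even this small parser has to decide how to treat Doxygen lines, value types and malformed entries, which is the kind of corner case mentioned in the list above.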

Why YAML?

  • It's 100% standard, well known and tested (and easy to replace with another serialization format):
    YAML is a serialization standard across multiple languages, and is widely used in different projects. It has multiple features such as comments, support for many data types and field references. It can be easily transformed to JSON.
  • There are off-the-shelf solutions to parse:
    There is support for: C/C++/Crystal/Ruby/Python/Java/Perl/C#/.NET/Golang/PHP/JavaScript/OCaml/ActionScript/Haskell/Dart/Rust/Nim
  • There's a standard mechanism for semantics validation (Schema):
    By defining a SCHEMA (in another YAML or JSON file) you can validate the format of the configuration, and document and version any changes that may be applied in the future. There are standard tools like Rx.
  • It's extensible and human-readable:
    Can be easily read and modified. Also easy to generate.
    version: 0.1.1
    name: my-lib
    authors: [just me]
    license: LGPL-2.1
    copyright: [2018 just]
    configuration:
    - group:
        name: timing
        parameters:
        - name: measurement_period
          description: Period for reading sensor in seconds
          value: 10
          type: int
          properties:
            max: 60
    
    In the future more types, restrictions or other metadata can be added in an easy way.
  • Supports multiple lines and comments:
    name: period
    value: 10
    properties:
        max: 60 # Something bad happens if it's more than 1 minute
    

Some possible features (for further discussions)

  • Configuration parameter validations:
    Indicating constraints on the accepted parameters.
  • Configuration of dependencies:
    • value: othermodule.parameter
  • Declare dependencies (USEMODULEs) in that same file:
    Also any configuration options for those dependencies.
  • All modules have a configuration file:
    This way it's easy to declare metadata such as versions.
  • Group configuration modules:
    • Use wildcards (e.g enable debug in gnrc.* modules)
  • Have configuration values generated as static const instead of preprocessor #defines.
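
As a sketch of the first item (parameter validation), the following checks one parsed parameter against its type and its min/max properties. The function name validate_parameter is hypothetical, and only the int type is handled here.

```python
def validate_parameter(param):
    """Return a list of problems found in one parsed parameter (empty if valid)."""
    name = param["name"]
    value = param["value"]
    props = param.get("properties", {})
    if param.get("type") != "int":
        return []  # other types are out of scope for this sketch
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return [f"{name}: expected an int, got {type(value).__name__}"]
    problems = []
    if "min" in props and value < props["min"]:
        problems.append(f"{name}: {value} is below min {props['min']}")
    if "max" in props and value > props["max"]:
        problems.append(f"{name}: {value} is above max {props['max']}")
    return problems


ok = validate_parameter(
    {"name": "period", "value": 10, "type": "int", "properties": {"max": 60}})
too_big = validate_parameter(
    {"name": "period", "value": 90, "type": "int", "properties": {"max": 60}})
```

Returning a list of messages rather than raising on the first problem lets a tool report every violation in a file at once.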

Archived drafts:

Draft 1

  • Declare configurations in YAML files for each module:
    Define name, default values, description, types and properties. See here for an example. (The purpose of the header file on the PR is just to show the output of the configuration generator)
  • Be able to override params from configuration files:
    By using an override parent instead of configuration (could look like override: gnrc.ieee802154.default_channel=11). The inheritance priority is not in the scope of this PR. The key feature, however, is that we will be able to handle conflicts in a well-defined and safe way.
  • Generate header_files during make all:
    We were thinking of keeping the module_config.h format, and these files should be imported with #include "module_config.h".
  • Optional: Generate YAML files with a GUI/CLI:
    Such as ncurses, ENV vars, etc. Here we are limiting ourselves to defining a core mechanism, not an interface.
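
One possible reading of "handle conflicts in a well-defined and safe way", sketched under the assumption that overrides arrive as (source, values) pairs and that two sources setting different values for the same key is an error to report rather than something to resolve. The function name apply_overrides and the default value 26 are placeholders for this example.

```python
def apply_overrides(defaults, override_sets):
    """Apply overrides to defaults and fail loudly on conflicting values.

    defaults: dict of dotted parameter name -> default value
    override_sets: list of (source_name, {parameter: value}) pairs
    """
    result = dict(defaults)
    origin = {}  # parameter -> source that last overrode it
    for source, overrides in override_sets:
        for key, value in overrides.items():
            if key not in defaults:
                raise KeyError(f"{source}: unknown parameter {key}")
            if key in origin and result[key] != value:
                raise ValueError(
                    f"{key}: conflicting overrides from {origin[key]} and {source}")
            result[key] = value
            origin[key] = source
    return result


defaults = {"gnrc.ieee802154.default_channel": 26}  # placeholder default
merged = apply_overrides(
    defaults, [("board", {"gnrc.ieee802154.default_channel": 11})])
```

A priority order between sources (CPU, board, application) could replace the ValueError later; that question is left open in this PR.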

@jia200x added the "TF: Config" and "Discussion: RFC" labels on Sep 27, 2018
@jia200x
Member

jia200x commented Sep 27, 2018

maybe @kaspar030, @kYc0o , @aabadie and @cladmi are interested in this

@kYc0o
Contributor

kYc0o commented Sep 27, 2018

If I understand that correctly, do you propose to use both .yaml and .h?

@jia200x
Member

jia200x commented Sep 27, 2018

@kYc0o we forgot to indicate the header file here is just to show the output. We will update the description. (EDIT: Done)

We are originally proposing to generate the documentation from YAML, and to produce these xxx_config.h files to be consumed from within the C files, similar to how the riotbuild.h file is used.

The question is: should these generated C files with default values be included upstream as well? (e.g. put an at86rf2xx_config.h in the at86rf2xx.h). The current PR assumes these files don't exist until you generate them.

I've seen this in other OSs, and although the information is duplicated in the YAML and header file, the header file is never meant to be touched by developers.

Contributor

@jcarrano jcarrano left a comment

Some of my comments were discussed separately with @leandrolanzieri and @jia200x . I'm putting them here so that we can all consider the alternatives.

I have not looked at the implementation of the tool (and I hope others don't either) because I think it is beside the point (it's a proof of concept).

I think we should write a doc explaining the format.

@@ -0,0 +1,23 @@
/* This is an automatically generated configuration file
Contributor

Just to clarify: this file is for demonstration purposes, so that the app can be compiled without the tool, but would normally not be committed.

/* This is an automatically generated configuration file
* DO NOT EDIT, the content will be overwritten.
*/
#ifndef APP_CONFIG_CONFIG_H
Contributor

Do we need #ifndef if the config format already includes a way for overriding / changing variables?


RIOT_CONFIG_FILE = riot.yaml

APP_H = app_config.h
Contributor

The app_config.h, being a build artifact, should be placed somewhere in the build dir. @cladmi can you suggest something?

return value


if __name__ == '__main__':
Contributor

We could integrate lazysponge (#9634) functionality in this thing.
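
A sketch of the idea, assuming the relevant behaviour is "do not rewrite the output file when its content would be unchanged", so that make does not see a newer timestamp and rebuild dependents. The helper name write_if_changed is hypothetical and this is not the code from #9634.

```python
import os
import tempfile


def write_if_changed(path, content):
    """Write content to path only if it differs; return True if a write happened."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            if handle.read() == content:
                return False  # keep the old file and its modification time
    except FileNotFoundError:
        pass  # first generation: fall through and create the file
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app_config.h")
    first = write_if_changed(target, "#define PERIOD 20\n")   # file is created
    second = write_if_changed(target, "#define PERIOD 20\n")  # nothing to do
```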

@@ -32,8 +32,7 @@
#include "net/loramac.h"
#include "semtech_loramac.h"

/* Messages are sent every 20s to respect the duty cycle on each channel */
#define PERIOD (20U)
#include "app_config.h"
Contributor

+1 for requiring an explicit #include of the config header and not having defines magically rain from the heavens.

name: lorawan
parameters:
- name: appkey
macro: APPKEY
Contributor

The macro field can be optional, and default to the name in caps.
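
In the generator this could be a one-line fallback. The helper name macro_name is an assumption for illustration:

```python
def macro_name(param):
    """Use the explicit 'macro' field when present, else the name in capitals."""
    return param.get("macro") or param["name"].upper()


implicit = macro_name({"name": "appkey"})
explicit = macro_name({"name": "appkey", "macro": "APP_KEY"})
```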

macro: APPEUI
description: LoRaWAN Application EUI
value: 0xBBBBBBBBBBBBBBBB
type: byte
Contributor

Maybe it's better to stick to names defined in stdint.h

Contributor

+1

description: LoRaWAN Application Key
value: 0xAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
type: byte
properties:
Contributor

It is not shown here, but max and min are also properties.

@jcarrano
Contributor

What do people think about the type of config variables? I'd like to have some kind of enum type, with a limited number of options, which are symbolic and not numeric.

For example, if you want to select the region for LoRa, you have to choose from a handful of values and the number used to represent them does not matter.
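
A rough sketch of how a generator might emit such a symbolic option, assuming a hypothetical list of choices in the parameter description. The macro name CONFIG_LORAMAC_REGION and the function render_choice are made up for this example; the numbers are assigned by the tool and never written by the user.

```python
def render_choice(macro, choices, selected):
    """Emit one #define per symbolic choice plus the guarded selection."""
    if selected not in choices:
        raise ValueError(f"{macro}: {selected!r} is not one of {choices}")
    lines = []
    for index, choice in enumerate(choices):
        # The numeric value is an implementation detail of the generator.
        lines.append(f"#define {macro}_{choice.upper()} {index}")
    lines += [
        f"#ifndef {macro}",
        f"#define {macro} {macro}_{selected.upper()}",
        "#endif",
    ]
    return "\n".join(lines)


region = render_choice("CONFIG_LORAMAC_REGION", ["EU868", "US915"], "EU868")
print(region)
```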

@jcarrano
Contributor

I've seen this in other OSs, and although the information is duplicated in the YAML and header file, the header file is never meant to be touched by developers.

The point in including generated files is so that the user does not need to run the tool again (maybe he doesn't have it). For example, in Python packages, if you are using Cython it is common to include the generated files so that the package can be compiled without having Cython installed.

I don't think this will be the case in RIOT, as the tool will always be available and in any case it should be a super easy to install tool.

license: LGPL-2.1
copyright: [2018 Inria]
configuration:
- group:
Contributor

The number of nesting levels and the verbosity can be reduced by using dicts of dicts instead of lists of dicts:

version: 0.1.0
name: lorawan-example
authors:
- name: Alexandre Abadie
  email: alexandre.abadie@inria.fr
license: LGPL-2.1
copyright: [2018 Inria]
configuration:
  lorawan: # group disappears
    appkey: # name disappears and so does macro
      description: LoRaWAN Application Key
      value: 0xAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      type: byte
      properties:
        size: 16
    appeui:
      description: LoRaWAN Application EUI
      value: 0xBBBBBBBBBBBBBBBB
      type: byte
      properties:
        size: 8
    deveui:
      description: LoRaWAN Device EUI
      value: 0xCCCCCCCCCCCCCCCC
      type: byte
      properties:
        size: 8
  period:
    description: Seconds to wait between messages
    type: int
    value: 20

Contributor

Open question with this approach: how do you know that some name is an option and not a namespace for other options without complicated guessing? Maybe option containers can be prefixed with a character (@ ??) or start with a capital letter.
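
One simple heuristic, sketched here only to make the question concrete: treat any mapping that has a "value" key as an option and everything else as a namespace. This assumes "value" is reserved and can never be the name of an option, which is exactly the kind of rule that would have to be agreed on. The function name flatten_options is hypothetical.

```python
def flatten_options(tree, prefix=()):
    """Flatten a dict-of-dicts configuration into {dotted.name: option}."""
    options = {}
    for name, node in tree.items():
        path = prefix + (name,)
        if not isinstance(node, dict):
            raise TypeError(f"{'.'.join(path)}: expected a mapping")
        if "value" in node:
            # Assumption: a mapping with a "value" key is an option.
            options[".".join(path)] = node
        else:
            # Otherwise it is a namespace containing further options.
            options.update(flatten_options(node, path))
    return options


configuration = {
    "lorawan": {
        "appeui": {"description": "LoRaWAN Application EUI",
                   "value": 0xBBBBBBBBBBBBBBBB, "type": "byte",
                   "properties": {"size": 8}},
    },
    "period": {"description": "Seconds to wait between messages",
               "type": "int", "value": 20},
}
flat = flatten_options(configuration)
```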

@kYc0o
Contributor

kYc0o commented Sep 27, 2018

Extracting params information is not straightforward. In general C files are not intended to be parsed. E.g there's mixed Doxygen and a meta-configuration for each parameter, there can be corner cases.

Are we forced to use Doxygen in these header files? I think the doc can still be there and the parsers (yes, very probably you'll need to write your parser yourself) may skip it.

Be able to override params from configuration files:
By using an override parent instead of configuration (could taste like override: gnrc.ieee802154.default_channel=11). The inheritance priority is not in the scope of this PR.

The advantage of using header files is that you can do #ifndef for the parameters and just define them somewhere else, no need to state explicitly you want to override them.

The key feature, however, is that we will be able to handle conflicts in a well-defined and safe way.

You mean that the variable might be defined in two places? If that occurs, then we have another problem that cannot be solved by a "conflict detector".

Generate header_files during make all:
We were thinking of keeping module_config.h format, and these files should be imported with #include module_config.h.

This would slow down the CI a lot. The proposal was to create them one by one by hand, which takes a lot of time but only needs to be done once.

Configuration parameter validations:
Indicating constraints on the accepted parameters.

This is interesting indeed, but I'd say it is really complicated to do at a low level. I'd say, if you're touching the files directly you know what you're doing. If not, use the IDE, which already restricts the parameters. I don't know all the possibilities but I guess there should be a way to do it, though not a straightforward one.

Configuration of dependencies:

    value: othermodule.parameter

Please don't! We are trying here to separate concerns, and here you're mixing things. Configuration values should be independent of anything, only restricted, if anything, as you stated before.

Declare dependencies (USEMODULEs) in that same file:
Also any configuration options for those dependencies.

Again, please don't, that's another topic which needs a different approach (probably yaml but in a different way).

All modules have a configuration file:
This way it's easy to declare metadata such as versions.

+1! (can be done in headers)

Group configuration modules:
Use wildcards (e.g enable debug in gnrc.* modules)

That IMHO needs another approach, some kind of config inheritance but always through header files. Using an external tool (yes, YAML is a tool after all) makes things more complex.

Have configuration values generated as static const instead of preprocessor #defines.

This reduces the scope of the variables a lot, and they are less easy to handle as configuration options.

Overall, I think there's still a misunderstanding about what module configurations and application variants are.

  • Module configurations are limited to the scope of the module; they shouldn't touch other modules unless there's an inheritance scheme.

  • Application variants are applications which use, or don't use, certain modules and thus behave differently. For this, dependencies need to be explicitly declared, as modules depend directly on others. You can still configure this variant on a per-module basis.

So, I propose to stay on the configuration dilemma and not mix it with build system config, which will need some love but in a separate issue.

@kYc0o
Contributor

kYc0o commented Sep 27, 2018

@jcarrano I suggest first to reach consensus on the approach before tackling the approach in itself. I don't see any advantages of discussing what's done right or wrong in the implementation if the implementation can be changed anyways.

@jia200x
Member

jia200x commented Sep 27, 2018

hi @kYc0o

This would slow down the CI a lot. The proposal was to create them one by one by hand, which takes a lot of time but only needs to be done once.

This can be solved by keeping the _config.h files upstream, so there is no need to write them by hand (and the CI won't generate them every time). When there are changes in the configuration, they would need to be regenerated (it's possible to check this from within the CI).

The advantage of using header files is that you can do #ifndef for the parameters and just define them somewhere else, no need to state explicitly you want to override them.

If we follow the approach from above, we can still do this. The YAML file would only define how the _config.h will look, but we are still able to define configs everywhere. In fact, that's why we are surrounding the configurations in #ifndef.

So, I propose to stay on the configuration dilemma and not mix it with build system config, which will need some love but in a separate issue.

I think this is accurate and probably we shouldn't mix build system config, dependencies, etc yet.

Please don't! We are trying here to separate concerns, and here you're mixing things. Configuration values should be independent of anything, only restricted, if anything, as you stated before.

I think this is more in the direction of global configurations. E.g. a lot of modules might need to read "THREAD_STACKSIZE"-like variables. Maybe it is more accurate to rename it to "read global configurations". What do you think?

My whole point is:

  1. A YAML file can be really good for describing configurations: it is standard, very easy to extend, and has a standard mechanism for semantic validation.
  2. If we treat these configuration files as metadata descriptions, they can be invisible to RIOT. Let's see it this way:

Instead of manually adding _config.h files to the modules (e.g. drivers/at86rf2xx), we can generate them from these riot.yaml files. So, every module will have its config header files with params surrounded by #ifndef ... #endif. We would still have the benefits of header files (define variables from wherever we want, not overloading the CI, etc). If a riot.yaml file changes, we run a test to check it matches the corresponding _config.h file.

So, header files declaring params are never touched by the developer.
What do you think?
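
The consistency test mentioned above could be as small as a text comparison between the committed header and a freshly generated one. This sketch assumes both are available as strings; the function name header_diff is made up, and how the header is generated is left out.

```python
import difflib


def header_diff(committed_text, generated_text):
    """Return unified-diff lines; an empty list means the header is up to date."""
    return list(difflib.unified_diff(
        committed_text.splitlines(),
        generated_text.splitlines(),
        fromfile="committed",
        tofile="generated",
        lineterm="",
    ))


up_to_date = header_diff("#define PERIOD 20\n", "#define PERIOD 20\n")
stale = header_diff("#define PERIOD 20\n", "#define PERIOD 30\n")
```

A CI job could print the non-empty diff and fail, which tells the contributor exactly which lines of the committed header need regenerating.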

@jcarrano I suggest first to reach consensus on the approach before tackling the approach in itself. I don't see any advantages of discussing what's done right or wrong in the implementation if the implementation can be changed anyways.

Sure, let's try to find consensus on the usage of an explicit config file for generating header files.

@jia200x
Member

jia200x commented Sep 27, 2018

One more thing... if we add other layers on top (validation of params, CLI, etc.) it's way easier to extend a YAML file than a header file. I can agree with some benefits of having header files with preprocessor macros and overriding values with new #defines. But I also think parsing a header file for exposing params, validations, etc. is not the best way to go.

@dylad
Member

dylad commented Sep 27, 2018

This looks promising, but how are we supposed to use it and what Python dependencies are required?

@leandrolanzieri
Contributor Author

leandrolanzieri commented Sep 28, 2018

This looks promising, but how are we supposed to use it and what Python dependencies are required?

With this simple implementation you would just place your app's configurable parameters in the YAML file, add the Makefile dependency for the xx_config.h file, and the generator is called on make all. Right now the script only depends on argparse and PyYAML.

@jcarrano
Contributor

@kYc0o

Are we forced to use Doxygen in these header files?

No, but then there is even MORE work to be done in parsing.

The advantage of using header files is that you can do #ifndef for the parameters and just define them somewhere else, no need to state explicitly you want to override them.

Same answer as @jia200x. Let me add that it's better if one does not try to circumvent the mechanism.

If that occurs, then we have another problem that cannot be solved by a "conflict detector".

The idea is not to solve the conflict, only to detect and report it and avoid hard-to-find bugs.

This is interesting indeed, but I'd say is really complicated to do it at low level

If C had static asserts we would not need it so much. In any case, the limits also act as documentation.

value: othermodule.parameter
Please don't! We are trying here to separate concerns, and here you're mixing things.

I'm not very convinced of this either, but we wanted to put it under consideration.

Declare dependencies (USEMODULEs) in that same file:
Also any configuration options for those dependencies.

Again, please don't, that's another topic which needs a different approach (probably yaml but in a different way).

The idea with this is:

  • Modules declare their configuration variables
  • An application declares its configuration variables and how it wants the modules it's using to be configured

That IMHO needs another approach, some kind of config inheritance but always through header files. Using an external tool (yes, YAML is a tool after all) makes things more complex.

We are already stretching the limits of what can be done with Make (sometimes because what we are doing is inherently complex, sometimes because make is an awkward language and people tend to get things wrong). If we use headers (that means the C preprocessor) I'm afraid we will run into the same situation, and end up with something that looks more like a mashup of hacks.

That for me is the main argument against a system using header files. The thing is, I still have to hear a good argument for them. That they don't need any extra tooling is an illusion. What is the RIOT make system but a giant tool distributed in many files and directories? And the current system of "config.h" does not work without this tooling unless you do some non-trivial manual work.

This would slow down the CI a lot.

I'd not rush any performance prediction yet. And even if it slows down the CI at first, it has the potential to speed up things in the long run by preventing unnecessary rebuilds and by isolating which parts of the code could have possibly been affected by a PR.

Overall, I think there's still a misunderstanding about what module configurations and application variants are.

How is a module different from an application?

Similarities

  • Both depend on other modules.
  • Both have configurable parameters.
  • Both are compiled into a static archive and then linked.

Differences

  • The application is the top of the hierarchy (no one depends on the application)
  • The final target of the application is an executable and not only an archive.

So, I propose to stay on the configuration dilemma and not mix it with build system config, which will need some love but in a separate issue.

Ok, but what do you do with the stuff that is configured at build time?

@jcarrano I suggest first to reach consensus on the approach before tackling the approach in itself. I don't see any advantages of discussing what's done right or wrong in the implementation if the implementation can be changed anyways.

Agreed.

@jcarrano
Contributor

I know this is not about implementation, but I want to clear up any doubts regarding build times. Hopefully I can convince everyone it is not an issue and it's not worth discussing it now.

Parsing the config files and generating the H files can be done extremely fast:

  • The parser in this PoC is a library written in C (an argument against having our home-grown format).
  • The rest is string manipulation.
  • In short scripts like this, interpreter startup time can be a significant percentage of the run time.

If anything should be of concern, that's the last point. Solving it is not impossible: make a single invocation of the script process several files. Not trivial, but not impossible either.
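
A sketch of what "a single invocation processes several files" could look like on the command-line side, using argparse (which the PoC already depends on). The option name --outdir and the example paths are assumptions for illustration.

```python
import argparse


def build_parser():
    """Command line for a generator that handles many config files per run."""
    parser = argparse.ArgumentParser(
        description="Generate xxx_config.h headers from configuration files")
    # nargs="+" accepts one or more files, so the interpreter starts only once.
    parser.add_argument("configs", nargs="+", metavar="CONFIG",
                        help="configuration files to process")
    parser.add_argument("--outdir", default=".",
                        help="directory for the generated headers")
    return parser


args = build_parser().parse_args(
    ["--outdir", "bin", "app/riot.yaml", "drivers/at86rf2xx/riot.yaml"])
```

The build system would then collect all configuration files of an application and pass them in one call instead of spawning the interpreter once per module.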

@jia200x
Member

jia200x commented Sep 28, 2018

After some offline discussions, we will open a PR with the proposal for the xxx_config.h files and their Doxygen documentation, which is to some extent independent of this PR.

@jcarrano
Contributor

jcarrano commented Oct 1, 2018

As an example of what our goals are with a configuration system, consider this (from #10075):

Ensure #define DFLL_CLK1 and #define CLOCK_CORECLOCK 48000000u are correctly set in periph_conf.h

I would like not to have to touch board code to change a setting.

@jia200x jia200x added this to In progress in Configurations TF via automation Dec 6, 2018
@jia200x jia200x moved this from In progress to Postponed in Configurations TF Dec 6, 2018
@tcschmidt
Member

How do we proceed, @leandrolanzieri @jia200x, with #10626 merged and #10077 closed?

@stale

stale bot commented Jan 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Jan 3, 2020
@fjmolinas
Contributor

@leandrolanzieri @jia200x with all the work going on in kconfig I'm thinking this can be closed? Current work seems to have diverged...

@stale stale bot removed the State: stale State: The issue / PR has no activity for >185 days label Jan 3, 2020
@leandrolanzieri
Contributor Author

Yes, this can be closed now. Thanks!

Configurations TF automation moved this from Postponed to Done Jan 3, 2020