Discussion: Configuration file format #832

Closed
Niraj-Kamdar opened this issue Jul 19, 2020 · 5 comments · Fixed by #846

Comments

@Niraj-Kamdar
Contributor

There are a variety of config formats out there, but the most popular are TOML, YAML, JSON and INI. PEP 518 chose TOML for its config file (pyproject.toml); see PEP 518 for details.
Here is a summary of the features of each format:

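| Feature                 | TOML | YAML | JSON | CFG/INI |
|-------------------------|------|------|------|---------|
| Well-defined            | yes  | yes  | yes  |         |
| Real data types         | yes  | yes  | yes  |         |
| Reliable Unicode        | yes  | yes  | yes  |         |
| Reliable comments       | yes  | yes  |      |         |
| Easy for humans to edit | yes  | ??   |      | ??      |
| Easy for tools to edit  | yes  | ??   | yes  | ??      |
| In standard library     |      |      | yes  | yes     |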

So, parsers for TOML and YAML aren't in the standard library, but popular packages for both formats exist on PyPI. We could use JSON, but it isn't very easy for humans to edit and doesn't support comments. Python does have a parser for the INI format in the standard library, but the INI file format does not have any formal specification.

@terriko, @pdxjohnny what are your opinions on the matter?

@Niraj-Kamdar
Contributor Author

Snippets of how our config file would look in each of the formats:

TOML

[input]

# Directory to scan
directory = "test/assets"

# To supplement triage data of previous scan
# Currently we only support csv and json file.
input_file = "test/csv/triage.csv"

[checker]

# list of checkers you want to skip
skips = ["python", "bzip2"]

# list of checkers you want to run
runs = ["curl", "binutils"]

YAML

input:
  # Directory to scan
  directory: test/assets
  # To supplement triage data of previous scan currently we only support csv and json file.
  input_file: test/csv/triage.csv

checker:
  skips:  # list of checkers you want to skip
    - python
    - bzip2
  runs:  # list of checkers you want to run
    - curl
    - binutils

JSON

{
  "input": {
    "directory": "test/assets",
    "input_file": "test/csv/triage.csv"
  },
  "checker": {
    "skips": [
      "python",
      "bzip2"
    ],
    "runs": [
      "curl",
      "binutils"
    ]
  }
}

INI

[input]

; Directory to scan
directory = test/assets

; To supplement triage data of previous scan
; Currently we only support csv and json file.
input_file = test/csv/triage.csv

[checker]

; list of checkers you want to skip
skips = [python,bzip2]

; list of checkers you want to run
runs = [curl,binutils]

@anthonyharrison
Contributor

@Niraj-Kamdar I was thinking this morning of raising an issue suggesting that a config file be created, as there are now an increasing number of options within the tool.

I have never come across TOML before, but I see that it is gaining traction in the Python community, although the use case seems to be aimed at build tools. I think that being able to include comments in any configuration file is essential, so that effectively removes JSON as an option. My preference would be INI as it is well known (across both Linux and Windows), is supported by the Python standard library (using configparser) and is easy to edit in a standard text editor. I don't see any difference between the format of an INI file and a TOML file (I understand TOML supports nested sections; is that necessary?), and INI comments can also use the # format.
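The two points about comments can be checked with the standard library alone. The following is only a minimal sketch: the sample strings are made up for illustration, and the variable names are hypothetical. It shows `json.loads` rejecting a comment, while `configparser` accepts both `;` and `#` full-line comments with its default settings.

```python
import configparser
import json

# Illustrative JSON text containing a comment (not valid JSON).
JSON_WITH_COMMENT = """
{
  // Directory to scan
  "directory": "test/assets"
}
"""

try:
    json.loads(JSON_WITH_COMMENT)
    json_accepts_comments = True
except json.JSONDecodeError:
    # The standard JSON parser has no comment syntax, so it fails here.
    json_accepts_comments = False

# Illustrative INI text using both comment styles.
INI_WITH_COMMENTS = """
[input]
; Directory to scan
directory = test/assets
# To supplement triage data of previous scan
input_file = test/csv/triage.csv
"""

parser = configparser.ConfigParser()
parser.read_string(INI_WITH_COMMENTS)

# Comment lines are dropped; only the key/value pairs remain.
ini_values = dict(parser["input"])

print(json_accepts_comments)
print(ini_values)
```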

@Niraj-Kamdar
Contributor Author

Yes, TOML is very similar to INI, but INI has no built-in type support and also lacks a formal specification. configparser parses every value as a string, so we have to post-process the parsed data to convert it into something usable.
Our example data would be parsed into the following dictionary:

{
    "checker": {
        "runs": "[curl,binutils]",  # This has to be transformed into list 
        "skips": "[python,bzip2]"
    },
    "input": {
        "directory": "test/assets",
        "input_file": "test/csv/triage.csv"
    },
}

So, parsing an INI file won't be as easy as parsing TOML or YAML, which support complex data types by default. Other data types such as integers and floats also have to be converted explicitly.
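A minimal sketch of what that extra processing might look like with `configparser`. The `parse_list` helper is hypothetical and only handles the simple `[a,b]` notation from the INI snippet above; it is not an existing cve-bin-tool function.

```python
import configparser

INI_TEXT = """
[input]
directory = test/assets
input_file = test/csv/triage.csv

[checker]
skips = [python,bzip2]
runs = [curl,binutils]
"""


def parse_list(raw: str) -> list[str]:
    """Hypothetical helper: turn the string "[a,b]" into ["a", "b"]."""
    inner = raw.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Every value comes back as a plain string, including the "lists".
raw_config = {section: dict(parser[section]) for section in parser.sections()}

# The list has to be rebuilt by hand after parsing.
runs = parse_list(raw_config["checker"]["runs"])
skips = parse_list(raw_config["checker"]["skips"])

print(raw_config["checker"]["runs"])
print(runs)
```

Any convention for lists (brackets, commas, whitespace) would be our own invention here, which is the cost of INI having no formal specification.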

On the other hand, TOML supports complex data types by default, so the same data is parsed as:

{
    'checker': {
        'runs': ['curl', 'binutils'],  # this is correctly parsed as list
        'skips': ['python', 'bzip2']
    },
    'input': {
        'directory': 'test/assets',
        'input_file': 'test/csv/triage.csv'
    },
}
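For comparison, a minimal sketch of loading the TOML snippet. One assumption to flag: it uses `tomllib`, which only joined the standard library in Python 3.11, after this discussion took place. On older versions the third-party `tomli` package offers the same `loads()` call, and the sketch falls back to it.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party package with the same loads() API

TOML_TEXT = """
[input]
# Directory to scan
directory = "test/assets"
input_file = "test/csv/triage.csv"

[checker]
skips = ["python", "bzip2"]
runs = ["curl", "binutils"]
"""

# No post-processing needed: arrays arrive as Python lists.
config = tomllib.loads(TOML_TEXT)

print(config["checker"]["runs"])
print(type(config["checker"]["runs"]).__name__)
```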

@Niraj-Kamdar
Contributor Author

Niraj-Kamdar commented Jul 19, 2020

So, INI is easy for humans to read and edit, but it's not so easy for machines to parse. JSON, by contrast, is very easy for machines to parse but not so easy for humans to edit. I have included them in our options because they are still used for config files and their parsers are in the standard library.

TOML and YAML, on the other hand, are both easily readable by machines and humans, and both are now widely used instead of the traditional formats above.

@terriko
Contributor

terriko commented Jul 22, 2020

To summarize our discussion today:

The top contenders among our team seem to be TOML (readable, familiar to Python folk and close enough to INI for skill transfer for Windows folk) and YAML (which might be a better fit for the dev-ops community that we hope will be among the biggest users of cve-bin-tool).

Parsers for both formats produce similar python structures, so it should be possible to swap between them (or support multiple formats if needed, though there's reasonable concern about that being confusing for users or dangerous when people cut/paste from multiple formats into a single file).

I think consensus was to start with TOML but be open to YAML if it turns out to be a sticking point for users going forwards.
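Because both parsers return plain dictionaries and lists, swapping formats could be as small as choosing a parser by file extension. This is only a sketch under assumptions: `load_config`, `PARSERS` and `_load_yaml` are hypothetical names, `tomllib` needs Python 3.11+ (with `tomli` as a fallback), and YAML support depends on the optional third-party PyYAML package.

```python
import json
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party package with the same loads() API

try:
    import yaml  # PyYAML, optional
except ModuleNotFoundError:
    yaml = None


def _load_yaml(text):
    """Parse YAML text, failing clearly if PyYAML is not installed."""
    if yaml is None:
        raise RuntimeError("PyYAML is required to read YAML config files")
    return yaml.safe_load(text)


# Hypothetical mapping from file extension to parser function.
PARSERS = {".toml": tomllib.loads, ".yaml": _load_yaml, ".yml": _load_yaml}


def load_config(path):
    """Hypothetical loader: pick a parser from the file extension."""
    path = Path(path)
    try:
        parse = PARSERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"Unsupported config format: {path.suffix!r}") from None
    return parse(path.read_text(encoding="utf-8"))


# Usage: write a small TOML file to a temporary directory and load it.
with tempfile.TemporaryDirectory() as tmp:
    toml_path = Path(tmp) / "config.toml"
    toml_path.write_text('[checker]\nruns = ["curl", "binutils"]\n', encoding="utf-8")
    config = load_config(toml_path)

print(json.dumps(config))
```

Keying on the extension means one file is always parsed as one format, which may help with the cut/paste concern raised above.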

@Niraj-Kamdar Niraj-Kamdar changed the title Discussion: Configuration file fomat Discussion: Configuration file format Jul 26, 2020