Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert default.cfg to YAML files #274

Merged
merged 30 commits into from
Feb 1, 2022
Merged

Convert default.cfg to YAML files #274

merged 30 commits into from
Feb 1, 2022

Conversation

joaomcteixeira
Copy link
Member

@joaomcteixeira joaomcteixeira commented Jan 14, 2022

Convert modules' default.cfg to YAML files for better integration with other platforms (such as i-VRESSE/workflow-builder#1) and as a documentation source for documentation generating scripts. This PR is the result of our bi-weekly meetings.

These changes do NOT affect the HADDOCK3 toml-like user configuration file.

The proposed yaml structure for modules' default config is as follows (OUTDATED):

# this was the first version and was DISCARDED after several considerations - read further
basic:
  parameter_name:
     default: 1
     min: 0
     max: 9999
     type: integer
     hoover: mini doc sentence
     doc: extended information about the parameter
  parameter_name_2:
     default: some string
     max: 9999
     type: string
     hoover: mini doc sentence
     doc: extended information about the parameter
  # more parameters here
intermediate:
  # ... intermediate parameters here
guru:
  # ... guru parameters here

I think it is easier to maintain if the expert levels are subdictionaries in the yaml file (this is how it is implemented in this PR, but can be changed). Nested dictionaries are also supported as is the case for the topoaa module.

Personally, I like the yaml structure and is giving very nice feeling and results while developing this PR.

The original default.cfg files were converted to YAML with this script.

@sverhoeven what other keys per parameter do we need for IVRESSE, any additional requirements?

YAML pros against JSON:

  1. accepts comments
  2. accepts multiline strings easily

YAML cons against the current format:

  1. it's harder to quickly inspect the parameters by eye because there are much more lines
  2. (edit) We solved this by creating the haddock-cfg CLI that translates default.yml to user readable haddock3 config files
  3. I am not sure how sensitive is YAML to indentation.

Implemented a CLI to retrieve the default configuration in the nice human-readable haddock3 format, for example:

haddock3-cfg -h
haddock3-cfg -m topoaa

It will generate a haddock3 config with helper messages like the example below:

[module]
# basic
par1 = value # $min 0 $max 10 $hoover This is one sentence help text $doc This is a long doc to explain the variable correctly
par2 = value # $min 10 $max 100 $hoover This is one sentence help text $doc This is a long doc to explain the variable correctly

# intermediate
par3 = value

This generated config can be copy-pasted directly to the user's haddock3 config file.

TODO:

  • Once we remove the unnecessary parameters from the defaults.cfg (in another Pull Request) the new defaults.yml can be regenerated before merging this PR.
  • remove the old defaults.cfg

After this PR is accepted you need to perform the following commands from inside the haddock3 folder and conda environment:
conda env update --file requirements.yml --prune
python setup.py develop --no-deps

@joaomcteixeira joaomcteixeira self-assigned this Jan 14, 2022
@joaomcteixeira joaomcteixeira added the feature New feature request label Jan 14, 2022
@joaomcteixeira joaomcteixeira added this to In Progress in Features via automation Jan 14, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jan 14, 2022

Codecov Report

Merging #274 (883e5f6) into main (0bc4777) will increase coverage by 1.04%.
The diff coverage is 95.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #274      +/-   ##
==========================================
+ Coverage   55.01%   56.06%   +1.04%     
==========================================
  Files          53       55       +2     
  Lines        3439     3530      +91     
==========================================
+ Hits         1892     1979      +87     
- Misses       1547     1551       +4     
Impacted Files Coverage Δ
src/haddock/modules/analysis/clustfcc/__init__.py 11.59% <66.66%> (-0.64%) ⬇️
src/haddock/modules/__init__.py 41.90% <80.00%> (+6.03%) ⬆️
src/haddock/gear/yaml2cfg.py 95.55% <95.55%> (ø)
src/haddock/__init__.py 95.00% <100.00%> (+0.26%) ⬆️
src/haddock/libs/libhpc.py 20.56% <100.00%> (ø)
src/haddock/libs/libio.py 66.66% <100.00%> (+16.66%) ⬆️
src/haddock/modules/analysis/caprieval/__init__.py 22.64% <100.00%> (ø)
src/haddock/modules/analysis/seletop/__init__.py 47.82% <100.00%> (ø)
...haddock/modules/analysis/seletopclusts/__init__.py 28.57% <100.00%> (ø)
src/haddock/modules/refinement/emref/__init__.py 24.19% <100.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0bc4777...883e5f6. Read the comment docs.

@joaomcteixeira
Copy link
Member Author

Note, these only affect the defaults inside the core code. The user config will remain as is, simple and beautiful 😜

@amjjbonvin
Copy link
Member

amjjbonvin commented Jan 14, 2022 via email

@joaomcteixeira
Copy link
Member Author

Thinking deeply about it, considering issues related to the architecture, maintainability, and user perspective, I think what is best is to prepare a CLI where the user can retrieve a default HADDOCK3 configuration with all the parameters and explanations. Or (maybe even better) we can have all the parameters nicely explained in the documentation web page by retrieving automatically all the information from the YAML files. Updating the YAML would automatically update the configuration documentation page. I think this is best.

@amjjbonvin
Copy link
Member

I suggest to first check carefully the current cfg files and remove anything not needed making sure that all examples still work. And only then convert to this yaml format. Will save us effort in editing the yaml... We need to process a lot of data and this requires manual editing to define the correct min/max values and all the relevant descriptions.

@joaomcteixeira
Copy link
Member Author

Sounds good. I have the script to regenerate the yamls, so we can leave this as a draft. I will prepare what's left, and when the parameters have been parsed, we recreate the yamls and merge this one. Also, I thought we can have the basic, intermediate and guru as the first layers of the YAML. Might be easier that way to move parameters up and down the layers and for us to quickly know which ones belong to which layer.

@amjjbonvin
Copy link
Member

amjjbonvin commented Jan 16, 2022 via email

@joaomcteixeira
Copy link
Member Author

joaomcteixeira commented Jan 16, 2022

Added a CLI to extract the yaml configuration defaults to a haddock3 config file with the corresponding help message and ready to be copy+pasted to any haddock3 config file. If a proof of concept that the YAML is infact a good way to go because any API can extract all the info from there and there is a single place where we define everything related to the parameters.

Fetch this branch and run:

python src/haddock/clis/cli_cfg.py -h or try python src/haddock/clis/cli_cfg.py -m topoaa

You will get a file haddock3_topoaa.cfg in the CWD.

Once this PR is accepted, the python setup.py develop --no-deps will render the 'CLI` system-wide available. It is a good way for users to obtain info also.

And, naturally, we ended up answering #55

@joaomcteixeira
Copy link
Member Author

Done. I have updated the first message of this PR, please read it.
This PR is ready to review and merge after the defaults.cfg files are clean from the unnecessary parameters.
I like the result of the default YAML idea.

@joaomcteixeira
Copy link
Member Author

Made some updates: git pull origin yamlcfg, they are as follows:

  • @amjjbonvin that's a good suggestion for the CLI. I have implemented it.
  • Updated code also after Remove input layer in topoaa module. #300
  • @amjjbonvin YAML accepts multiline strings. I have updated your topoaa/defaults.yml config file. See the differences here. If you use colon : you need to have the strings in quotes also.
  • If you run tox -e py39 -- tests/test_modules_general.py unittest will try to read all the yml configs and spot any errors. It might be tricky at first the tox output but at least you'll see if there are some red lines appearing.

@joaomcteixeira joaomcteixeira marked this pull request as ready for review February 1, 2022 10:50
@joaomcteixeira
Copy link
Member Author

@amjjbonvin this PR is ready to be merged. You can continue adding documentation to the defaults.yml files here or we can merge and continue adding documentation later in other PRs. I think it's best we merge. Let me know.

@amjjbonvin
Copy link
Member

amjjbonvin commented Feb 1, 2022 via email

Copy link
Member

@amjjbonvin amjjbonvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to merge indeed and then keep adding description to the yams files.

@joaomcteixeira
Copy link
Member Author

Since you expect the text to be between quotes should this also be the case when there is “No help yet” ?

No. You only need quotes when your strings have special characters like colon :. We have unittest to ensure all yaml configs are parseable. When you are editing a YAML config you can then run tox -e py39 -- tests/test_modules_general.py to see if any errors arise. You need to be in the haddock3 conda environment for this to work and in the main haddock3 repository folder.

@sverhoeven we are merging 😉

@joaomcteixeira joaomcteixeira added this to the v3.0.0 stable release milestone Feb 1, 2022
@joaomcteixeira joaomcteixeira merged commit aea7503 into main Feb 1, 2022
Features automation moved this from In Progress to Done Feb 1, 2022
@joaomcteixeira joaomcteixeira deleted the yamlcfg branch February 1, 2022 13:32
@joaomcteixeira joaomcteixeira added the integration Integration with other platforms: servers, workflows, interfaces, etc label Feb 2, 2022
@joaomcteixeira joaomcteixeira added the yaml default parameters Anything related to the YAML configuration files for default parameters label Jul 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improve docs feature New feature request integration Integration with other platforms: servers, workflows, interfaces, etc yaml default parameters Anything related to the YAML configuration files for default parameters
Projects
Features
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

5 participants