-
Notifications
You must be signed in to change notification settings - Fork 766
Improve combine_remediations.py #7810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve combine_remediations.py #7810
Conversation
|
Skipping CI for Draft Pull Request. |
7101e23 to
ef14ad6
Compare
ef14ad6 to
792646f
Compare
|
rebased && renamed cmake targets and script combine_remediations.py |
yuumasato
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me, thank you for the great commit messages.
Possibly should speed up the build, but I haven't measured the time yet.
It seems to have improved by almost a minute, from 7m20s to 6m35s.
matejak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR introduces great changes, and the only things that I suggest are related to the code layout.
|
|
||
|
|
||
| def parse_args(): | ||
| p = argparse.ArgumentParser() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about describing the function that the script performs, i.e. that it puts all remediations, templated or not, to the output directory.
| remediation_obj = remediation_cls(fix_path) | ||
| remediation_obj.associate_rule(rule) | ||
| fix = remediation.process(remediation_obj, env_yaml) | ||
| if fix: | ||
| output_file_path = os.path.join(output_dirs[lang], rule.id_ + ext) | ||
| remediation.write_fix_to_file(fix, output_file_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to extract this to a function as well - given a path, we either write the fix into a directory, or we don't do anything.
| fix_path = None | ||
| # first look for a static remediation | ||
| rule_dir_remediations = remediation.get_rule_dir_remediations( | ||
| rule_dir, lang, product) | ||
| if len(rule_dir_remediations) > 0: | ||
| # first item in the list has the highest priority | ||
| fix_path = rule_dir_remediations[0] | ||
| if fix_path is None: | ||
| # check if we have a templated remediation instead | ||
| if os.path.isdir(language_fixes_from_templates_dir): | ||
| templated_fix_path = os.path.join( | ||
| language_fixes_from_templates_dir, rule.id_ + ext) | ||
| if os.path.exists(templated_fix_path): | ||
| fix_path = templated_fix_path | ||
| if fix_path is None: | ||
| # neither static nor templated remediation found | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to extract this into a function that returns the "best guess" fix path.
The interesting thing is that if a rule has a "manual" remediation, it is used in favor of the template remediation. I wonder whether this is a good approach - one can disable template content easily in the template invocation, so I would say that template should have precedence, or there could be a warning that there is ambiguous content available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But disabling a language in the rule.yml disables it for all products.
If there is a static remediation it may be because the template doesn't suit a specific product and (probably) cannot be easily incorporated into the template.
The directory build/${PRODUCT}/fixes_from_templates/shared/ is
never created, doesn't contain anything and isn't used anywhere.
This is probably a leftover from past.
Currently the combine_remediations.py script supports to pass multiple
directories as `fix_dirs` positional arguments. However, the
combine_remediations.py is called only at a single place where only a
single directory is provided as an argument. Therefore, we don't need to
have the script to support multiple positional arguments. Instead, we
can rename the argument to better describe its purpose and change it to
an option. As can be seen from the single call of
combine_remediations.py in CMake, this argument contains a path to the
directory where the remediations generated by the templating engine are
stored: `build/${PRODUCT}/fixes_from_templates/${LANGUAGE}` So we think
that a better name for the option will be `--fixes-from-templates-dir`.
The combine_remediation.py is called only at a single place. We know that the last part of the directory path in the --output-dir and --fixes-from-templates-dir is the language and has the same value as the value of --remediation-type argument which is a required argument. We can construct the path dynamically inside the script instead of having it hard-coded in CMake.
The combine_remediations.py script will be now able to collect multiple remediations types in a single run if the --remediation-type option will be specified multiple times.
In previous commit we have introduced the ability to combine multiple
remediation types at a single run of combine_remediations.py by
providing multiple --remediation-type options on the command line. In
this commit, we leverage this new feature, we execute
combine_remediations.py only once for each product. Therefore we don't
execute it for each remediation type. All the
generate-internal-${PRODUCT}-${LANGUAGE}-all-fixes CMake custom targets
are merged to the generate-internal-${PRODUCT}-all-fixes custom target.
That simplifies the build system.
This is done to allow further refactoring inside the
combine_remediations.py.
Instead of the rule YAML file path we will pass a Rule object so that we can use already existing instances of Rule object.
The function no longer need to modify the `fixes` dictionary, it's up to caller to put the processed remediation to the structure they wish. This change is done to be able to use the processed remediation separately in future code.
This is a major change of the flow of the program. The goal is to load and process compiled rule YAMLs only once, even if multiple remediation languages are generated. We will now iterate over compiled rules in `/build/product/rules` instead of iterating over all available remediation files.
The names of _ssg_combine_remediations and combine_remediations.py are
misleading a bit. AFAIU the macro and script are selecting or filtering
the appropriate remediations. Same for
_ssg_build_remediations_for_language, the macro that is actually
combining the fixes into a ${LANGUAGE}-fixes.xml.
This commit splits the code into 2 new functions: find_remeditation, which locates the path of the remediation to be used and process_remediation which processes the previously found file and saves it to the output directory.
792646f to
f3af3dd
Compare
|
refactores, updated according to review and rebased |
yuumasato
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me.
Pending approval from @matejak.
|
Perfect, I have performed manual XCCDF XML diff, and there were no changes, so congratulations for finishing this great piece of work! |
Description:
The script combine_remediations.py will be executed only once for each product instead of executing it for each remediation type. The script is now iterating directly over compiled rules. As a result, each compiled rule will be loaded and processed only once.
For more details please read the commit message of every commit.
Rationale:
Tries to simplify the build process, reduces the amount of CMake targets.
Possibly should speed up the build, but I haven't measured the time yet.