Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor templates v2 #9870

Merged
merged 13 commits into from
Nov 25, 2022
Merged

Conversation

evgenyz
Copy link
Member

@evgenyz evgenyz commented Nov 24, 2022

Description:

Replacement for #9835.

Improvement of previous attempt. The template generation is now more abstracted
with Entity-specific code delegated to the new Templatable mix-in.

See individual commits for more information.

Rationale:

  • This is a preparation for introduction of templated Platfroms/CPEs.

Review Hints:

@evgenyz evgenyz added the CPE-AL CPE Applicability Language label Nov 24, 2022
@github-actions
Copy link

Start a new ephemeral environment with changes proposed in this pull request:

Fedora Environment
Open in Gitpod

Oracle Linux 8 Environment
Open in Gitpod

@evgenyz evgenyz force-pushed the refactor_templates_v2 branch 2 times, most recently from 9903e46 to 4176d64 Compare November 24, 2022 05:12
Unify the code and remove duplication.
Introduce a basic test for template builder.
Move 'title' attribute to the base class XCCDFEntity. It is a base
type's (Item/Entity) element according the specs.

Introduce sidekick Templatable class for XCCDFEntities that
could be templated. It's a mix-in class that will help in rendering
templates for various XCCDFEntities in the future.
Move make_items_product_specific method from Rule into common module
as static function along with GLOBAL_REFERENCES (constants). This
would allow us to reuse code beyond Rule component. Now Templatable
can normalize/productify template variables on its own.

Clean up duplicate definition of add_sub_element function.
Clean up template.Builder and delegate entity-specific functions
to Templatable. The problem with legacy 'rule_id', 'rule_title' and
'_rule_id' template variables is localized in Templatable.

Rename LANGUAGES constant, TemplatingLang and TemplateType.XXX
in accordance with PEP8 and co.

Clean up and decompose methods of template.Builder.
Copy link
Collaborator

@jan-cerny jan-cerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the CI fail seems legit, please ta a look at the failed tests

ssg/build_yaml.py Show resolved Hide resolved
@@ -309,3 +349,85 @@ def update_with(self, rhs):
updated_refinements = self._subtract_refinements(extended_refinements)
updated_refinements.update(self.refine_rules)
self.refine_rules = updated_refinements


class Templatable(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to add some doc text here to explain what "Templatable" is.

Copy link
Member Author

@evgenyz evgenyz Nov 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On it.

# TODO: The name _rule_id is a legacy from the era when rule was the only
# context for a template. Preprocessors implicitly depend on this name.
# A better name is '_entity_id' (as in XCCDF Entity).
template_vars["_rule_id"] = self.id_
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a good comment above it but I don't understand why a _rule_id is add here and at the same time a rule_id is add in get_template_context. Both items are set to self.id_ . I have an impression that finally both of these variables will be passed to the template so it seems to be duplicate. Any thoughts?

Copy link
Member Author

@evgenyz evgenyz Nov 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason all template.py operate with _rule_id and template.oval,ansible,etc along with marcos operate with rule_id.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be no deliberate decision regarding the initial underscore. The variable has been introduced long ago (see https://github.com/ComplianceAsCode/content/blame/453a1d53dc0e2b129fc3704545264dd076e17889/ssg/templates.py#L359) and the leading underscore was probably supplied as a preemptive measure to prevent overwrites. I would drop the assignment here, and also resort to the underscore-less usage elsewhere. We use short IDs everywhere, there is nothing to fear.

Copy link
Member Author

@evgenyz evgenyz Nov 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dictionary element is used in:

  • shared/templates/audit_rules_privileged_commands/template.py
  • shared/templates/yamlfile_value/template.py
  • shared/templates/file_groupowner/template.py
  • shared/templates/file_owner/template.py
  • shared/templates/file_permissions/template.py
  • shared/templates/sysctl/template.py
  • shared/templates/sebool/template.py
  • ssg/utils.py

The utils.py parts are also being called from different template.pys.

What is more important templates.Teamplate.preprocess is executed without extra environment (where rule_id and rule_title reside). On top of that _rule_id is transformed into _RULE_ID as the result of this process. It's a bit messy. So, we can't just drop it.

My suggestion is to leave it for another self-contained round of refactoring.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is to leave it for another self-contained round of refactoring.

I agree, please don't do this in this PR, we can get to that later, I have created a ticket to track this effort: #9882

ssg/templates.py Outdated

def build_rule(self, rule_id, rule_title, template, langs_to_generate, identifiers,
platforms=None):
def build_templatable_lang(self, templatable, lang):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer build_lang_for_templatable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, can do that.

@jan-cerny jan-cerny self-assigned this Nov 24, 2022
@jan-cerny jan-cerny added this to the 0.1.66 milestone Nov 24, 2022
@jan-cerny jan-cerny added the Infrastructure Our content build system label Nov 24, 2022
Clean up imports and fix small problems.
@evgenyz
Copy link
Member Author

evgenyz commented Nov 24, 2022

the CI fail seems legit, please ta a look at the failed tests

Fixed.

Returns list of languages that should be generated from a template
configuration, controlled by backends.
"""
from ..templates import LANGUAGES
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the import here and not at the top?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point there was a circular dependency problem. I'll check if it is solved now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still there. Unfortunately in order to solve it I would have to extract Builder from templates.py. It totally makes sense, but it will mess this PR. How about solving it in the next one?

The problem is that Builder wants build_yaml.Rule, Rule wants common, common wants templates. Also I'm open to alternative layout suggestions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that. Fixed.

def get_template_backend_langs(self):
"""
Returns list of languages that should be generated from a template
configuration, controlled by backends.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
configuration, controlled by backends.
configuration, controlled by 'backends' key of the template.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ssg/templates.py Outdated

@classmethod
def load_template(cls, template_root_directory, name):
maybe_template = cls(template_root_directory, name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent nameing variables - here it's template_root_directory but in the constructor it's templates_root_directory with "s" so I think here it should also be templates_root_directory with "s".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, missed this one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

"kubernetes": TemplatingLang("kubernetes", ".yml", TemplateType.REMEDIATION, "kubernetes"),
"oval": TemplatingLang("oval", ".xml", TemplateType.CHECK, "oval"),
"puppet": TemplatingLang("puppet", ".pp", TemplateType.REMEDIATION, "puppet"),
"sce-bash": TemplatingLang("sce-bash", ".sh", TemplateType.REMEDIATION, "sce")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not suggesting any change, as it is code from before, but isn't it strange that "script check engine" is treated as "remediation"? The template type probably means something else than what it says.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. No idea, but it might be somehow connected to the way we process shell scripts. SCE implementation has a lot of quirks all over the place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a bug, it's used in _init_lang_output_dirs in the following way:

            if lang.template_type == TemplateType.CHECK:
                output_dir = self.checks_dir
            else:
                output_dir = self.remediations_dir

It might be hidden because we don't have any SCE template in our repository.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But let's not change it in this PR.

# templates and macros even if they are not rendered in a rule context.
# Better name for these variables are 'entity_id' and 'entity_title'.
return {
"rule_id": self.id_,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add entity_id as well, so the transition period can start.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather just rename it in a separate self-contained PR, without any backward-compatibility. We have enough legacy lingering around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

Copy link
Collaborator

@jan-cerny jan-cerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evgenyz Thanks for the changes. Now I think that the only thing that remains to be done here before merging is to address the issues reported by CodeClimate. There are 5 of them. I'm fine with waiving the issue about too many arguments in template class constructor. So please fix the congestive complexity in function make_items_product_specific and 3 very long lines.

Better names for templatable-related methods of the Bulder class.
Add/improve documentation.

Rearrange funtions. Fix formatting.

Remove dependency on templates.LANGUAGES from
extract_configured_backend_lang method. This method would now
take language dictionary from the caller and filter it based
on the template configuration.
Rearrange imports. Fix PEP8 formatting.

Fix inconsistent argument name in Template class factory.
Remove redundant check in make_items_product_specific. If anything
this will lower the cognitive complexity of the function.
This parameter is an extension to the standard specification:
https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties#supported-by-a-limited-number-of-editors

The value 99 is in sync with sanity CI configuration for PEP8:

setup.cfg:
[pycodestyle]
max-line-length = 99
@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

@evgenyz Thanks for the changes.

I just realized that I haven't pushed the branch last time. Can you please review the changes once again?

So please fix the congestive complexity in function make_items_product_specific and 3 very long lines.

Long lines are fixed. But I don't know what could be done with the function. It's complex but there is a reason for that.
I've removed one redundant expression from the if clause, though.

@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

I've removed one redundant expression from the if clause, though.

And it cut the complexity from 11 to 10. Oh, well.

@jan-cerny
Copy link
Collaborator

But I don't know what could be done with the function. It's complex but there is a reason for that.

I think that you might silence the warning by extracting the block inside the for loop to a private function.

@jan-cerny
Copy link
Collaborator

But I don't know what could be done with the function. It's complex but there is a reason for that.

I think that you might silence the warning by extracting the block inside the for loop to a private function.

Another option would be to extract the check condition - eg. move

        if label in items_dict and not allow_overwrites and value != items_dict[label]:
            msg = (
                "There is a product-qualified '{item_q}' item, "
                "but also an unqualified '{item_u}' item "
                "and those two differ in value - "
                "'{value_q}' vs '{value_u}' respectively."
                .format(item_q=full_label, item_u=label,
                        value_q=value, value_u=items_dict[label])
            )
            raise ValueError(msg)

to a new function product_qualified_label_differs

@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

product_qualified_label_differs

I don't fancy the idea of a function that return nothing but throw an exception (that won't be handled) from time to time just to pleasure some weird algorithm.

Extract shadowing / global refs check from make_items_product_specific.
It supposed to lower the cognitive complexity of the function.
@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

Okay, that will do.

@matejak
Copy link
Member

matejak commented Nov 25, 2022

I don't fancy the idea of a function that return nothing but throw an exception (that won't be handled) from time to time just to pleasure some weird algorithm.

That's not the case of "weird algorithms". The original sensitivity of CodeClimate has been relaxed already, and the original threshold was anyway above the Clean Code recommendations that we liked.

In this case, the condition could use an explanatory variable, and extracting the whole thing to a method or function could work the same explanatory way. The new function would evaluate the condition, compose the message, and then throw an exception - that's three things, so fine by me.

@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

the condition could use an explanatory variable

That did not work FYI. It wanted exceptions with messages out of the function.

@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

That's not the case of "weird algorithms".

So, in your subjective opinion, the complexity of the original function was unbearable (11 of 7 that we consider borderline manageable)?

IMHO the function was longish (because of messages and docs), but quite readable and understandable.

@matejak
Copy link
Member

matejak commented Nov 25, 2022

I see that the complexity has been handled using different means, so that one is done.

Overall, I think that the PR is a step in the right direction, and it handles the tricky part in a way that is reasonable, so I am fine with merging it whenever @jan-cerny is.

@jan-cerny
Copy link
Collaborator

@evgenyz take a look at the github pages CI job, it seems to be a problem with the thing that we touched

@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

That's not the case of "weird algorithms".

It considers

if not allow_overwrites and label in items_dict and value != items_dict[label]:
    ...

more readable than

if not allow_overwrites
    if label in items_dict and value != items_dict[label]:
        ...

and scores 1 penalty point for it.

With that approach and line length of 99 it would force us to do

if not allow_overwrites and label in items_dict and value != items_dict[label]\
   and something_else_pretty_long_and_informative_here():
   ...

which goes completely against common sense.

IMHO.

@evgenyz
Copy link
Member Author

evgenyz commented Nov 25, 2022

@evgenyz take a look at the github pages CI job, it seems to be a problem with the thing that we touched

Uh-oh. So, with the last change I tightened reference "deproductization" procedure and it uncovered a bug in the content.

In apple_os/auditing/service_com_apple_auditd_enabled there are two entries

...
    stigid: AOSX-14-001013
    stigid@ubuntu2004: UBTU-20-010182
...

and the second one is the offender for 2 reasons: 1) it's for Ubuntu 2) it breaks the non-shadowing policy.

My suggestion: fix the bugger (by removing it).

@jan-cerny
Copy link
Collaborator

yes please!

…td_enabled rule

The "stigid@ubuntu2004: UBTU-20-010182" entry is for Ubuntu
and it violates not-shadowing policy.
@codeclimate
Copy link

codeclimate bot commented Nov 25, 2022

Code Climate has analyzed commit 85e3553 and detected 1 issue on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 1

The test coverage on the diff in this pull request is 75.4% (50% is the threshold).

This pull request will bring the total coverage in the repository to 48.6% (1.3% change).

View more on Code Climate.

@jan-cerny jan-cerny merged commit c99a90e into ComplianceAsCode:master Nov 25, 2022
@evgenyz evgenyz deleted the refactor_templates_v2 branch November 25, 2022 18:19
@@ -22,7 +22,7 @@
def make_items_product_specific(items_dict, product_suffix, allow_overwrites=False):
new_items = dict()
for full_label, value in items_dict.items():
if "@" not in full_label and full_label not in new_items:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a modification of condition without any compensation, so no wonder that it may change the behavior of the function, as reported in #9894
new_items is not an empty dict for the whole time, but the items_dict ordering is different between Python2 and Python3, so the loop may behave differently in those interpreters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it is the case in that issue - pkgname@rhel7: pam_pkcs11 is supposed to take precedence over pkgname: openssl-pkcs11. What happened probably was that new_items["pkgname"] was set to pam_pkcs11, and it got overwritten by openssl-pkcs11. The condition protected product-qualified items from being overwritten by generic items.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like this case should fall right into https://github.com/evgenyz/content/blob/5c7cf430a1012bf3151872519f7873b99218f5cb/ssg/entities/common.py#L46 instead of being short-circuited.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise this no-overwrite policy depends on the order of dictionary items (which is not ordered in any way by definition).

Copy link
Member Author

@evgenyz evgenyz Nov 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it formulated anywhere besides the exception message? I can't find anything in the docs. Maybe I don't even understand it to the full extent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The funny thing is that CodeClimate was right about that method being too cognitively complex. I think (may be wrong though) that

        if not full_label.endswith(product_suffix):
            continue

is triggered in case that there is an override attempt of a product-qualified value by a generic value, so now it looks like to me that the exception is not thrown, but the value also isn't overwritten, so it works fine on Python3.
Anyway, only exhaustive unit-testing could bring sufficient level of confidence, and although we would like that, it's out of scope of this PR. The truth is that removing the second part of that conditional was a mistake, and the best thing to do is to fix it - we all know that anybody can make mistakes, and this one was a very educational one that plenty of people can learn from.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truth is that it ain't over yet: #9910

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not full_label.endswith(product_suffix):
            continue

happens after the possible exception.

Python 2 and 3 difference is in the order of items storage/emission. I've reproduced it in a test case using update().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the exception happened after this sequence. FTR, the original code can be viewed here:

def _make_items_product_specific(self, items_dict, product_suffix, allow_overwrites=False):

The issue #9894 was about product-qualified package, which is definitely not a "global reference". Therefore, the next line if label in items_dict and not allow_overwrites and value != items_dict[label] was one that could trigger an exception, but I don't recall our policy regarding allow_overwrites. That clause probably aims to enforce existence of either only product-qualified specs, or no product-qualified specs at all.
Naming sucks, but at least the commit messages that explain why the function got into that shape are quite solid.

Copy link
Member Author

@evgenyz evgenyz Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has not been used for de-qualifying template variables before refactoring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CPE-AL CPE Applicability Language Infrastructure Our content build system
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants