Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change: Make paths bundle dynamic #339

Closed

Conversation

nickanderson
Copy link
Member

DO NOT MERGE: THIS IS ONLY FOR CONSIDERATION AND FEEDBACK

Instead of having a list of each path that we want to know about we could find
bins based on a pre defined path, just like PATH behaves on the cli.

This is risky, possibly too risky. I am unsure if this is offset by making the
paths bundle more usable.

Additionally some bins have characters in the name that make them generate
erros. This could probably be worked around.

Currently this implementation relies on GNU find. That may be an unacceptable
dependancy.

DO NOT MERGE: THIS IS ONLY FOR CONSIDERATION AND FEEDBACK

Instead of having a list of each path that we want to know about we
could find bins based on a pre defined path, just like PATH behaves on
the cli.

This is risky, possibly too risky. I am unsure if this is offset by
making the paths bundle more usable.

Additionally some bins have characters in the name that make them
generate erros. This could probably be worked around.

Currently this implementation relies on GNU find. That may be an
unacceptable dependancy.
@nickanderson
Copy link
Member Author

Get PATH from def.cf?
Tag path classes with path to bin?

@nickanderson
Copy link
Member Author

Perl instead of find?
Or go back to findfiles?

@zzamboni
Copy link
Contributor

zzamboni commented Oct 9, 2014

Very nice.

I would suggest removing the dependency on GNU find by using only standard find, but post-processing the output with perl (also a dependency, but arguably a more common one). The find command could be rewritten like this (untested in CFEngine, but I tested the command on my shell):

      "/usr/bin/find"
        args =>"$(SEARCH_PATH) -maxdepth 1 -type f -perm +rx -print | perl -pe 'BEGIN { print "^context=paths\n" }; chomp; s!^(.+/([^/]+))$!=path[$2]=$1\n!'",
    useshell => "true",
        comment => "Format the find commands output so that we end up with an
                    array of bins and paths for easy reference from within
                    policy.";

The index errors for weird characters could also be fixed by munging $2 before using it in the Perl snippet.

@nickanderson
Copy link
Member Author

If I'm going to hit Perl, maybe the find should be done in Perl as well.

@basvandervlies
Copy link
Contributor

Nice. I do not like the dependency on Perl. Just use the find utility it is installed on most platforms

@tzz
Copy link
Contributor

tzz commented Oct 14, 2014

Unfortunately findfiles() doesn't have filters like find. So for a pure CFE implementation, you'd have to collect into an array, then getvalues() on all the found files, and use filestat() or something similar to check their permissions. It will be slow. I recommend instead accepting -perm and -type switches (among the other standard find options) in findfiles(), which will not break anything since - is not a valid character at the beginning of an absolute path. For older agents, the fallback can use a Perl one-liner, which is fairly easy using File::Find.

@tzz
Copy link
Contributor

tzz commented Oct 15, 2014

@dannysauer may have a comment (from his help-cfengine post) :)

@dannysauer
Copy link

I feel like I shouldn't just leave Ted hanging, since he took the time to tag me. :)

perl -le 'print "^context=paths";map{map{print "=path[$_]=$D/$_" if(-f "$D/$_" and -x _)} readdir(D) if opendir D,$D=$_} split(/:/,$ENV{PATH});'

However, as a coworker found on Friday (while debugging our code which builds the audit policy we use), the "[" binary on a bunch of platforms will cause problems in CFEngine arrays like this. So, that's something to keep in mind with anything that dynamically builds arrays based on the contents of /usr/bin. So, to handle that specific case...

perl -le 'print "^context=paths";map{map{print "=path[$_]=$D/$_" if(!/\[/ and -f "$D/$_" and -x _)} readdir(D) if opendir D,$D=$_} split(/:/,$ENV{PATH});'

Personally, I like the idea of expanding findfiles as an even better option (though I'd rather see find_files just support naming a file_select body as one optional argument and allow CFEngine's file select capabilities to filter it rather than adding a bunch of arguments). :)

@kacf
Copy link
Contributor

kacf commented Oct 20, 2014

How about using the same approach as in https://github.com/cfengine/core/blob/master/tests/acceptance/dcs.cf.sub, which is tested and proven on all platforms we support?

@tzz
Copy link
Contributor

tzz commented Oct 20, 2014

@kacfengine listing all the possible executables is a losing proposition. Different users need different paths, from ifconfig (common) to feh (obscure) to git-subtree (optional). dcs.cf.sub has a very limited scope, that's why it works there.

@dannysauer if you're on 3.6, consider generating JSON data to avoid making a new variable for every array entry. The module protocol supports it.

I also agree the file_select option would be a nice solution, though it is much, much more verbose than the inline version, and introduces an external dependency (so you can't just copy+paste it anymore, like a typical find call).

@cfengine-review-bot
Copy link

Can one of the admins verify this patch?

@tzz
Copy link
Contributor

tzz commented Jan 5, 2015

I think this should be prioritized, it would make life easier. I would use findfiles though. If it's slow, let's fix that.

@nickanderson
Copy link
Member Author

I tried again for a bit to use findfiles()
It is not apparent to me how I can reliably pick the first or last path listed in the event of a duplicate bin like /bin/echo vs /usr/local/bin/echo.

bundle agent dynamic_paths
{
  vars:
#    any::
      "PATH"
        slist => {
                  "/tmp/one/*",
                  "/tmp/three/*",
                  "/tmp/two/*",
                 };
#                  "/var/cfengine/bin",
#                  "/usr/local/sbin",
#                  "/usr/local/bin",
#                  "/usr/sbin",
#                  "/sbin",
#                  "/bin",
#                };
# 
#      "REV_PATH"
#        slist => reverse("PATH"),
#        comment => "We reverse the path so that the first listed
#                    path is what wins";

      #"paths" slist => findfiles("/var/cfengine/bin/*", "/usr/local/sbin/*", "/usr/local/bin/*", "/usr/sbin/*", "/sbin/*", "/bin/*");

      # This produces an array with the path as the index, and the
      # value as the canonified form of the path.
      "canonified_path_map[$(PATH)]" string => canonify($(PATH));

      # This produces an array with the canonified form of the path as
      # the index, and the path as the value.
      "all_paths[$(canonified_path_map[$(PATH)])]" slist => findfiles("$(PATH)");

      # Generate a list of paths for each path we are searching.
      # This list will have to be further filtered to ensure the files are executables      "paths_$(canonified_path_map[$(PATH)])" slist => { "@(all_paths[$(canonified_path_map[$(PATH)])])" }; 

      # Would need to use lastnode() and build an array of paths[lastnod()]
      # 

  reports:
    # This will report each path that is found.
    "#paths_$(canonified_path_map[$(PATH)]) = $(paths_$(canonified_path_map[$(PATH)]))";
    "paths__tmp_one__ =  $(paths__tmp_one__)";
}

@neilhwatson
Copy link
Contributor

How is find any different that setting a path?

@nickanderson
Copy link
Member Author

I'm wondering if we do really want this feature if it would be best implemented in the C as a language level feature.

body agent control
{
default_command_paths_body => no_paths;
}

body command_paths no_paths
{

Current behavior

}

body command_paths common_paths
{

first one wins

search_path => {}
}

I don't know.

@tzz
Copy link
Contributor

tzz commented Jan 6, 2015

IMO the path list source should be a special environment variable, not PATH, and should be set by the user in def.cf (so they can make it PATH if they want).

bundle agent main
{
  vars:
      "paths" slist => string_split(getenv("CFENGINE_SEARCH_PATH", 99k), ":", 1k);
      "pfound[$(paths)]" slist => findfiles("$(paths)/*");
      "found_all" slist => getvalues(pfound);
      "found" slist => filter(".*[[].*", found_all, true, true, 100k); # some systems have the [ executable
      "base[$(found)]" string => lastnode($(found), "/");
      "path[$(base[$(found)])]" string => $(found);

  classes:
      "have_$(base[$(found)])" expression => "any";

  reports:
      "Searching path: $(paths)";
      "Found executables: $(found) (basename $(base[$(found)]))";
      "Path for $(base[$(found)]) =  $(path[$(base[$(found)])])";
}

output (note the first searched path wins):

% CFENGINE_SEARCH_PATH=/tmp/1:/tmp/2 cf-agent -KI -f ~/Dropbox/cf/test/test_dynamic_paths.cf          [master] 
R: Searching path: /tmp/1
R: Searching path: /tmp/2
R: Found executables: /tmp/2/ping (basename ping)
R: Found executables: /tmp/1/ping (basename ping)
R: Path for ping =  /tmp/1/ping

IMO it's not necessary to check for executable bits.

This is fairly slow currently with large lists.

Also, IMVHO, the proper way to speed up this and many other use cases is to implement variable persistence, like we have classes persistence. It's a common need and writing C code just for dynamic paths seems suboptimal.

@nickanderson
Copy link
Member Author

ENV would have to be set in policy as well wouldn't it?
https://github.com/cfengine/masterfiles/blob/master/controls/cf_agent.cf#L35

Without persistent variables currently, couldn't we cache this list of files in $(workdir)/state/paths.cache or something only regenerating that file when the software cache has changed?

@tzz
Copy link
Contributor

tzz commented Jan 6, 2015

The user can choose to set the environment externally or from policy...

I think caching this list would work, just save it in a JSON file. It's not great but would work, especially if the bundlestate function is accepted (because you'd save the state of the whole paths bundle).

@nickanderson
Copy link
Member Author

If ENV can be modified from outside the agent I would not use that, at least by default. Perhaps a commented out example.

@tzz
Copy link
Contributor

tzz commented Jan 6, 2015

The choice of the environment variable is up to the user. If you don't want it to be externally configurable, just make it a static slist in def.cf. The operation of the code doesn't change.

@estenberg
Copy link
Contributor

Created a feature ticket for a native implementation here: https://dev.cfengine.com/issues/6915

Please continue the discussion there, as this pull request will not be merged anyway.

@estenberg estenberg closed this Jan 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
9 participants