Policy for checking for arbitrary file existence #500
Comments
Hi @wesley-dean-flexion, for the fileNames like other GitHub Actions, is there a reason to not support the "Typescripty" way with GitHub Action configurations that support recursive wildcards? So instead of supporting just the example below from your report.
It could also support this?
To look for specific files in specific subdirectories, you can use the original method. In either case, this looks like a nice recommendation and I would support this addition. I have similar needs for such functionality in Allstar. 😄
@xee5ch that's a really good thought. It would be really nice if it could accept recursive globs ( It appears -- and please correct me if I'm wrong as this is really foundational to the whole idea -- that Allstar does not download / clone / fetch / pull the repositories being scanned; instead, it makes calls using the GitHub client library for Go: allstar/pkg/policies/security/security.go Lines 98 to 101 in 964a34c
(Sidenote: Allstar is coded to use v59 of the library; v60 was released on February 29, 2024.) I'm looking through GitHub's GraphQL documentation to see if it's possible to use GraphQL to match files with globs.
On the GitHub forum, I found a post where someone wanted to fetch the content of So... I don't know. I'll write more as I learn more.
@wesley-dean-flexion This is great, thanks for digging into this. The I'd suggest a new policy and to not touch the current "SECURITY.md" Allstar policy at all.

As far as iterations, feel free to go at your own pace. If you want to start with a simple single-file policy, that is ok, or if you want to go all the way to multiple checks with either/or lists of files to find, that works as well.

One thing to consider with the multiple-file checks is what to do about org vs repo level config. For policies that have single-variable options, the repo level config can override what is set at the org level (if overrides are allowed). When you have a list, then it is hard to remove/override an org level requirement at the repo level. Usually you would just add both lists together.

For getting files, we have a cache-friendly wrapper around I agree that getting contents of files directly using the API is the most desired implementation, but that might not work well with globs. I'm not sure if there is a search API or something under the GraphQL side that will work better. In the end, the Scorecard policy downloads the tarball, so we can do that for this as well if we find a way to share the downloaded tarball with the Scorecard policy.
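To make the "add both lists together" behavior concrete, here is a minimal Go sketch of a union-style merge. The helper name `mergeFileChecks` and the flat string lists are assumptions made for illustration; they are not Allstar's actual configuration types.

```go
package main

import "fmt"

// mergeFileChecks returns the org-level entries followed by any repo-level
// entries that are not already present. Repo-level config can add
// requirements but cannot remove an org-level one.
// (Hypothetical helper for illustration, not Allstar code.)
func mergeFileChecks(org, repo []string) []string {
	seen := make(map[string]bool, len(org)+len(repo))
	merged := make([]string, 0, len(org)+len(repo))
	for _, list := range [][]string{org, repo} {
		for _, name := range list {
			if !seen[name] {
				seen[name] = true
				merged = append(merged, name)
			}
		}
	}
	return merged
}

func main() {
	org := []string{"SECURITY.md", ".pre-commit-config.yaml"}
	repo := []string{"CODEOWNERS", "SECURITY.md"}
	fmt.Println(mergeFileChecks(org, repo))
}
```

With a merge like this, a repo can only tighten the org-level requirements, which is why overriding a list is harder than overriding a single value.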
@jeffmendoza this is amazing feedback. Thank you so much!!
I suspected that as I couldn't see where or how a call for that constant was used. Reading this comment in #469 led me to my questioning myself (i.e., clearly I'm missing something):
Yes, absolutely. Agreed.
My thinking with the iterations was three-fold:
Oh, that's a really good point. I suppose one could add an
So, maybe instead of a boolean ( Something to think about.... 🤔
Excellent. This is extremely helpful. Thank you!
I haven't found evidence that globs would be supported. It would be great if they were, but I don't know how much effort would be involved. While it would be more resource-intensive, we could fetch the list of files and replicate a glob match (against the list in-memory) instead of using a GraphQL query.

I don't know what effort would be involved in fetching that list. Some examples showed how to perform a non-recursive query for the list of names of files in a repo and suggested that a recursive option may be available in future revisions of the API. However, those conversations were several years old. That said, running a query on the ossf/allstar repo's I just ran a test with a path of

So, if we were to build a list of files in a repository, that would likely result in several API calls (one per directory entry); however, if we just queried the files for a given policy check, we're talking about one query per file check. If we're looking for one, two, or three locations for a file, it's highly probable a full repository scan would result in more API calls. We could codify a check to see if a

I really, really appreciate your taking the time to review this, offer suggestions and corrections, and support the refinement of this idea.

Tangentially, I noticed the Allstar list of future policies includes Dependabot. Cool. A file existence checker could look for the presence of
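For reference, the in-memory approach (match a glob against an already-fetched list of paths) could look roughly like the Go sketch below. It relies only on the standard library's `path.Match` for single segments; the treatment of `**` as "zero or more directories" and the `globMatch` helper itself are assumptions, and the file list is hard-coded rather than fetched from GitHub.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchSegments compares pattern segments against path segments.
// A "**" segment is assumed to match zero or more whole path segments;
// every other segment is matched with path.Match.
func matchSegments(pat, name []string) bool {
	if len(pat) == 0 {
		return len(name) == 0
	}
	if pat[0] == "**" {
		for i := 0; i <= len(name); i++ {
			if matchSegments(pat[1:], name[i:]) {
				return true
			}
		}
		return false
	}
	if len(name) == 0 {
		return false
	}
	ok, err := path.Match(pat[0], name[0])
	if err != nil || !ok {
		return false
	}
	return matchSegments(pat[1:], name[1:])
}

// globMatch reports whether a repository-relative path matches pattern.
// (Hypothetical helper for illustration.)
func globMatch(pattern, name string) bool {
	return matchSegments(strings.Split(pattern, "/"), strings.Split(name, "/"))
}

func main() {
	// Hard-coded stand-in for a file list fetched from the repository.
	files := []string{"README.md", "docs/SECURITY.md", ".github/workflows/ci.yaml"}
	for _, f := range files {
		if globMatch("**/SECURITY.md", f) {
			fmt.Println("matched:", f)
		}
	}
}
```

The matching itself is cheap; the cost concern raised above is entirely in building the file list.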
@wesley-dean-flexion All this looks great. I support passing on globs for now and getting something helpful for many/most cases in sooner without them. The merge strategy you described sounds good. Most policies use the "allow override" from the OptConfig, so we will just need appropriate documentation on this policy that it follows the merge policy setting instead.
@jeffmendoza Terrific. Thank you very much. Agreed on passing on globs for now -- seems nice to have, possibly for a future iteration. Agreed re: documentation. Perfect. Again, I really appreciate your taking the time to read, review, and respond to this thread and for being open to the conversation. Thank you.
I was pointed to todogroup/repolinter earlier today. It looks like it can do what I wanted with this policy, so I'm having serious questions about whether or not it makes sense to duplicate it in Allstar.

One nice thing that I like about Allstar (among many!) is that it interacts with GitHub's API without cloning the repository being examined. Repolinter is able to accept URLs to repositories hosted on SCMs with the

Another nice thing I like about Allstar is that it includes the ability to be used on an organization level, not just on a repository level. I stumbled across a posting on organizationally required GitHub Actions that may be able to accomplish something similar. Based on that article -- and I haven't tested anything yet -- an opt-out model appears to be more effort than Allstar's model, especially for organizations with many repositories. It's entirely possible that I misinterpreted the source material or the functionality has changed since the article was posted.

A third nice thing that I like about Allstar is the ability to trigger actions based on findings, such as by creating issues in repositories that are missing, for example, SECURITY.md files. I believe it may be possible to create an issue if a Repolinter run failed (i.e.,

I'm curious, @jeffmendoza, if you have any thoughts. I have no ego in this so if you believe that it's not helpful to add this functionality to Allstar and would recommend this issue be closed / wont-do, that's totally fine with me.
Policy for checking for arbitrary file existence
User Story
As a Security Engineer, so that I can enforce organization-wide policies regarding arbitrary files, I would like a new File Check policy that I may configure to meet my organization's needs.
Overview
OSSF Allstar currently includes policies for checking the existence of various specific files. For example, consider SECURITY.md via pkg/policies/security/; this policy specifies a constant at security.md line 31, `polName`, with a value of `SECURITY.md`. In #469, @matias-jt asked if it was possible to verify if a file exists and @ArisBee responded with advice to update the `polName` variable. Similarly, @melmos asked in #450 if it would be possible for a policy to check for `security.json` instead of `SECURITY.md`, and it was suggested that the two issues may be duplicates, with #469 referencing a general use-case while #450 is more specific.

The suggested change in the value of `polName` would require recompiling the tool -- that is, the value is set at compile-time, not at configuration-time or run-time.

Current Use-Case
My organization has multiple repositories. We have an organizational policy (governance, not Allstar policy) of requiring repositories to include a configuration file for Pre-Commit, a tool that supports hooks for running detect-secrets, gitleaks, etc. That is, we want to iterate across all our repos and make sure they all have `.pre-commit-config.yaml` files -- basically exactly what Allstar does, but instead of `SECURITY.md`, we would look for `.pre-commit-config.yaml` files. While we could fork Allstar, we're thinking that it may be possible to help others who are in similar situations (e.g., the aforementioned issues) by generalizing a policy in Allstar that could be configured separately.

Alternative Implementation
High-level Goals
Without getting overly prescriptive, it may be worthwhile to break a new policy down into smaller goals and iterate across them instead of trying to do the whole thing at once.
Iterative Steps
Iteration 1: variable filenames
Allow `polName` to be set via configuration file. The configuration file would function just like the other tools' configuration files (e.g., security.md line 30 which specifies `configFile = "security.yaml"`). The sample quickstart file would be extended to support an additional parameter, such as `fileName`. At runtime, instead of referencing a constant, a variable would be extracted from the configuration file. Everything else would remain the same. That is, if the file in question exists, policy compliance is accepted; if it's missing, the repository is non-compliant and the usual remedies (block checks, create issues, etc.) are taken.

It is recommended that basic checks are run to only allow tests for files in the repository, not arbitrary files on the filesystem.

1. read the `fileName` string from the config file
2. if `fileName` exists, pass

Consider the following configuration file:
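As a minimal sketch of this iteration in Go: the `Config` struct, the `repoFiles` map, and the `cleanRepoPath` helper are all hypothetical stand-ins. In particular, a real policy would query the GitHub API rather than an in-memory map. The path check illustrates the recommendation to reject names that point outside the repository.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// Config is a hypothetical stand-in for the policy's configuration file;
// FileName is the only setting this iteration adds.
type Config struct {
	FileName string
}

// repoFiles is an in-memory stand-in for the repository contents,
// keyed by repository-relative path.
type repoFiles map[string]bool

// cleanRepoPath normalizes name to a repository-relative path and rejects
// anything that is empty or would escape the repository root.
func cleanRepoPath(name string) (string, error) {
	p := path.Clean(strings.TrimLeft(name, "/"))
	if p == "." || p == ".." || strings.HasPrefix(p, "../") {
		return "", errors.New("fileName must name a file inside the repository")
	}
	return p, nil
}

// Check reports whether the configured file exists in the repository.
func Check(cfg Config, files repoFiles) (bool, error) {
	p, err := cleanRepoPath(cfg.FileName)
	if err != nil {
		return false, err
	}
	return files[p], nil
}

func main() {
	files := repoFiles{"README.md": true, ".pre-commit-config.yaml": true}
	pass, err := Check(Config{FileName: ".pre-commit-config.yaml"}, files)
	fmt.Println(pass, err)
}
```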
Iteration 2: support file absence
Instead of having a policy check receive configuration via a string named `fileName`, the tool will accept a map with two keys, `fileName` and `state`, where `fileName` is the name of the target file (as before) and `state` is either `present` or `absent` (consider Ansible's ansible.builtin.file module, which allows `absent` and a variety of potential file types).

The most significant aspect of this iteration isn't supporting the absence of a file; it's changing the configuration approach from accepting a string to accepting a map.

1. read the `fileCheck` map from the config file
2. read the `fileName` string from `fileCheck`
3. read the `state` string from `fileCheck`
4. if `fileName` exists and `state` == `present`, pass
5. if `fileName` doesn't exist and `state` == `absent`, pass

Consider the following configuration file:
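The present/absent logic can be sketched in Go as follows. The `FileCheck` struct mirrors the `fileCheck` map described above, while the `exists` callback is a hypothetical stand-in for whatever lookup the policy would actually use. Rejecting unknown `state` values is an assumption, not something specified above.

```go
package main

import "fmt"

// FileCheck mirrors the proposed fileCheck map: a target file and the
// state it is expected to be in ("present" or "absent").
type FileCheck struct {
	FileName string
	State    string
}

// Evaluate reports whether the repository satisfies the check.
// exists is a stand-in for the real file lookup.
func Evaluate(fc FileCheck, exists func(string) bool) (bool, error) {
	found := exists(fc.FileName)
	switch fc.State {
	case "present":
		return found, nil
	case "absent":
		return !found, nil
	default:
		// Assumption: an unrecognized state is a configuration error.
		return false, fmt.Errorf("unknown state %q", fc.State)
	}
}

func main() {
	repo := map[string]bool{"SECURITY.md": true}
	exists := func(p string) bool { return repo[p] }

	fmt.Println(Evaluate(FileCheck{FileName: "SECURITY.md", State: "present"}, exists))
	fmt.Println(Evaluate(FileCheck{FileName: ".env", State: "absent"}, exists))
}
```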
Iteration 3: support multiple locations
Instead of having the policy check for a single specific location for a file (e.g., `/SECURITY.md`) specified as a string, accept an array of locations to check. In the case of checking for file existence, if any of the locations match, consider the check passed; in checking for file absence, if any of the locations match, consider the check failed. For example, the SECURITY.md file may be located at:

- `/SECURITY.md`
- `/.github/SECURITY.md`
- `/docs/SECURITY.md`

The most significant change here is having the location be specified as an array rather than a string and iterating across the array instead of just checking one location.

1. read the `fileCheck` map from the config file
2. read the `fileNames` array from `fileCheck`
3. read the `state` string from `fileCheck`
4. iterate across `fileNames`
   a. if `state` == `present` and any match, pass; otherwise fail
   b. if `state` == `absent` and any match, fail; otherwise pass

Consider the following configuration file:
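A Go sketch of the any-match rule follows. `FileCheck`, `anyExists`, and the `exists` callback are hypothetical names. The loop stops at the first match, so with a remote lookup the number of calls is at most the number of configured locations.

```go
package main

import "fmt"

// FileCheck now carries a list of candidate locations instead of one name.
type FileCheck struct {
	FileNames []string
	State     string // "present" or "absent"
}

// anyExists returns true as soon as one location is found, so later
// locations are not looked up unnecessarily.
func anyExists(names []string, exists func(string) bool) bool {
	for _, name := range names {
		if exists(name) {
			return true
		}
	}
	return false
}

// Evaluate applies the any-match rule: for "present" any match passes,
// for "absent" any match fails.
func Evaluate(fc FileCheck, exists func(string) bool) (bool, error) {
	found := anyExists(fc.FileNames, exists)
	switch fc.State {
	case "present":
		return found, nil
	case "absent":
		return !found, nil
	default:
		return false, fmt.Errorf("unknown state %q", fc.State)
	}
}

func main() {
	repo := map[string]bool{"/.github/SECURITY.md": true}
	exists := func(p string) bool { return repo[p] }

	check := FileCheck{
		FileNames: []string{"/SECURITY.md", "/.github/SECURITY.md", "/docs/SECURITY.md"},
		State:     "present",
	}
	fmt.Println(Evaluate(check, exists))
}
```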
Iteration 4: support multiple checks
Currently, there's a 1:1 correspondence between file checks and policies (e.g., the Security policy can't look for a CODEOWNERS file); this iteration supports multiple checks.
Instead of accepting a single map, `fileCheck`, accept an array of maps with each item representing the check for a single file (although it may be in multiple locations).
Consider the following configuration file:

1. read the `files` array from the config file
2. iterate across the `files` array
3. read the `fileNames` array from the iterant
4. read the `state` string from the iterant
5. iterate across `fileNames`
   a. if `state` == `present` and any match, pass; otherwise fail
   b. if `state` == `absent` and any match, fail; otherwise pass
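Putting the steps together, one possible shape for the multi-check loop in Go is shown below. The names `FileCheck`, `passes`, and `failingChecks` are hypothetical, and treating any `state` other than `absent` as `present` is a simplifying assumption for the sketch.

```go
package main

import "fmt"

// FileCheck is one entry of the proposed files array.
type FileCheck struct {
	FileNames []string
	State     string // "present" or "absent"
}

// passes applies the any-match rule to a single check.
// Assumption: any state other than "absent" is treated as "present".
func passes(fc FileCheck, exists func(string) bool) bool {
	found := false
	for _, name := range fc.FileNames {
		if exists(name) {
			found = true
			break
		}
	}
	if fc.State == "absent" {
		return !found
	}
	return found
}

// failingChecks iterates across all configured checks and returns the
// ones the repository does not satisfy, in configuration order.
func failingChecks(files []FileCheck, exists func(string) bool) []FileCheck {
	var failed []FileCheck
	for _, fc := range files {
		if !passes(fc, exists) {
			failed = append(failed, fc)
		}
	}
	return failed
}

func main() {
	repo := map[string]bool{"docs/SECURITY.md": true, ".env": true}
	exists := func(p string) bool { return repo[p] }

	files := []FileCheck{
		{FileNames: []string{"SECURITY.md", "docs/SECURITY.md"}, State: "present"},
		{FileNames: []string{"CODEOWNERS", ".github/CODEOWNERS"}, State: "present"},
		{FileNames: []string{".env"}, State: "absent"},
	}
	for _, fc := range failingChecks(files, exists) {
		fmt.Printf("non-compliant: %v should be %s\n", fc.FileNames, fc.State)
	}
}
```

Returning the failing checks, rather than a single boolean, would let the usual remedies (such as creating an issue) name the specific files involved.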