Refactor Pass on: Automate Terraform Platform Detection for Lockfile Hashes #5013
base: main

Conversation
`new_req` is already checked to be a provider in the calling function, so we can skip that check here and use it directly in the method.
```ruby
def hashes_object_regex
  /hashes\s*=\s*.*\]/m
end

def hashes_string_regex
  /(?<=\").*(?=\")/
end
```
These could be constants rather than methods
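The suggestion could look something like this; a sketch only, with constant names derived from the original method names rather than taken from the PR:

```ruby
# Frozen regex constants in place of the two helper methods above.
HASHES_OBJECT_REGEX = /hashes\s*=\s*.*\]/m.freeze
HASHES_STRING_REGEX = /(?<=\").*(?=\")/.freeze

# Example: pull the quoted value out of a single lockfile hash line.
line = '  "h1:abc123=",'
line.match(HASHES_STRING_REGEX).to_s  # => "h1:abc123="
```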
```ruby
possible_architectures = %w(
  linux_amd64
  darwin_amd64
  windows_amd64
  darwin_arm64
  linux_arm64
)
```
We could extract this to be a constant
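Sketched out, that extraction might look like this (the constant name is illustrative, not from the PR):

```ruby
# The architectures we probe for h1 hashes, as a frozen constant
# instead of a local built on every call.
POSSIBLE_ARCHITECTURES = %w(
  linux_amd64
  darwin_amd64
  windows_amd64
  darwin_arm64
  linux_arm64
).freeze
```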
```ruby
File.write(".terraform.lock.hcl", lockfile_hash_removed)

SharedHelpers.run_shell_command("terraform providers lock -platform=#{arch} #{provider_source} -no-color")

updated_lockfile = File.read(".terraform.lock.hcl")
updated_hashes = extract_provider_h1_hashes(updated_lockfile, declaration_regex)
next if updated_hashes.nil?

# Check if the architecture is present in the original lockfile
hashes.each do |hash|
  updated_hashes.select { |h| h.match?(/^h1:/) }.each do |updated_hash|
    architectures.append(arch.to_sym) if hash == updated_hash
  end
end

File.delete(".terraform.lock.hcl")
```
Maybe we could pull all of this out into a separate method?
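One candidate for extraction is the pure comparison in the middle of that block, which needs no file or shell access and is easy to unit test. The method name below is illustrative, not from the PR:

```ruby
# Returns true if any h1: hash regenerated for an architecture also
# appears in the original lockfile's hashes, i.e. that arch is in use.
def arch_used_in?(original_hashes, updated_hashes)
  updated_h1 = updated_hashes.select { |h| h.match?(/^h1:/) }
  original_hashes.any? { |hash| updated_h1.include?(hash) }
end

arch_used_in?(["h1:abc="], ["h1:abc=", "zh:123"])  # => true
arch_used_in?(["h1:abc="], ["h1:xyz=", "zh:123"])  # => false
```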
```ruby
def extract_provider_h1_hashes(content, declaration_regex)
  content.match(declaration_regex).to_s.
    match(hashes_object_regex).to_s.
    split("\n").map { |hash| hash.match(hashes_string_regex).to_s }.
    select { |h| h&.match?(/^h1:/) }
end
```
A few thoughts about this method:

- As @jurre commented below, `hashes_object_regex` should be a constant.
- It'd be great to capture the array of hashes directly in the regex.
- Instead of splitting the `to_s` of a match, use the `scan` method and enumerate over the results directly.
- `hashes_string_regex` should also be a constant, but it'd also be more useful if it baked in the match on `h1` used here.

Putting everything together, it'd be something like this:
```diff
-def extract_provider_h1_hashes(content, declaration_regex)
-  content.match(declaration_regex).to_s.
-    match(hashes_object_regex).to_s.
-    split("\n").map { |hash| hash.match(hashes_string_regex).to_s }.
-    select { |h| h&.match?(/^h1:/) }
-end
+HASHES_OBJECT_REGEX = /hashes\s*=\s*\[(?<hashes>.*)\]/m.freeze
+HASHES_H1_HASH_REGEX = /"h1:([\w\d=\/]+)"/.freeze
+
+def extract_provider_h1_hashes(content, declaration_regex)
+  content.match(declaration_regex)
+         .to_s
+         .match(HASHES_OBJECT_REGEX)[:hashes]
+         .scan(HASHES_H1_HASH_REGEX)
+         .flatten
+end
```
Also, reading the Terraform docs, it looks like `h1` is one of two hashing schemes. Do we need to handle `zh` as well?
Or a simpler version that doesn't check for the preceding `hashes` key:
```diff
-def extract_provider_h1_hashes(content, declaration_regex)
-  content.match(declaration_regex).to_s.
-    match(hashes_object_regex).to_s.
-    split("\n").map { |hash| hash.match(hashes_string_regex).to_s }.
-    select { |h| h&.match?(/^h1:/) }
-end
+H1_HASH_REGEX = /"h1:([\w\d=\/]+)"/.freeze
+
+def extract_provider_h1_hashes(content, declaration_regex)
+  content.match(declaration_regex)
+         .to_s
+         .scan(H1_HASH_REGEX)
+         .flatten
+end
```
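For illustration, here is the simpler regex run against a hypothetical lockfile fragment (provider name and hash values are made up). Two caveats worth noting: the capture group returns the hash without its `h1:` prefix, and real `h1` hashes are base64, so a value containing `+` would slip past this character class:

```ruby
# The regex from the simpler suggestion above.
H1_HASH_REGEX = /"h1:([\w\d=\/]+)"/.freeze

# A hypothetical .terraform.lock.hcl excerpt.
lockfile = <<~LOCK
  provider "registry.terraform.io/hashicorp/random" {
    version = "3.1.0"
    hashes = [
      "h1:aKw4NLrMEAflsl1OXCCz6Ewo4ay9dpgSpkNHujRXXO8=",
      "zh:2bbb3339f0643b5daa07480ef4397bd23a79963cc364cdfbb4e86354cb7725bc",
    ]
  }
LOCK

# scan returns only the captured groups; zh: entries are skipped.
h1_hashes = lockfile.scan(H1_HASH_REGEX).flatten
# => ["aKw4NLrMEAflsl1OXCCz6Ewo4ay9dpgSpkNHujRXXO8="]
```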
Those are great suggestions for cleaning this up, thanks!

The `h1` hashes are definitely platform-specific and there is only one per architecture, which is why we compare on them to determine which architectures are used.

The `zh` hashes might differ between architectures, but I'm not sure we can reliably detect which ones a project uses from them. There can be a varying number of `zh` hashes across providers/architectures (so we don't know whether we have detected all the architectures and can exit early), and I'm not sure they're always unique across architectures if they use the same `.zip` files.

Both `h1` and `zh` hashes are updated in the lockfile during `#update_lockfile_declaration`, though.
Giving this a bit more thought, it might be nice to create a proper abstraction for Terraform lockfiles. Moving all of the business logic around extracting information into methods on a new class would help keep `FileUpdater` small and focused, and make this functionality easier to test.
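A rough sketch of what such an abstraction might look like; the class name and API are hypothetical, not part of this PR:

```ruby
# Hypothetical value object wrapping a .terraform.lock.hcl's contents,
# so extraction logic lives outside FileUpdater and can be unit tested.
class TerraformLockfile
  H1_HASH_REGEX = /"(h1:[\w=\/+]+)"/.freeze

  def initialize(content)
    @content = content
  end

  # All h1: hashes found anywhere in the lockfile content.
  def h1_hashes
    @content.scan(H1_HASH_REGEX).flatten
  end
end

lockfile = TerraformLockfile.new('hashes = ["h1:abc=", "zh:123"]')
lockfile.h1_hashes  # => ["h1:abc="]
```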
I think that would be ideal. Perhaps in a follow-up PR though?
You have a working solution, which is a good first step. But the current implementation needs to disable `Metrics/AbcSize` in a few places to pass linting checks, so I think this PR would benefit from a quick refactoring pass before merging. I'd be happy to pair with you on that; just let me know! 😄
This is a refactoring pass on some code already merged into Core, so you're right that we should focus on "making it right" and address the linting checks in this pass.

I'll ping you to set up some time to pair!
Co-authored-by: Mattt <mattt@github.com>
```ruby
def lockfile_details(new_req)
  content = lock_file.content.dup
  provider_source = new_req[:source][:registry_hostname] + "/" + new_req[:source][:module_identifier]
```
I'm not sure what `new_req` is, but I'm concerned about whether `new_req[:source]` can be `nil` (or its nested values, for that matter).

This could be made safer by using `dig` to navigate the nested hashes, assigning `registry_hostname` and `module_identifier` to intermediate variables, and handling the `nil` cases.

Also, prefer interpolation (`"#{...}"`) to string concatenation.
```diff
-provider_source = new_req[:source][:registry_hostname] + "/" + new_req[:source][:module_identifier]
+registry_hostname = new_req.dig(:source, :registry_hostname)
+module_identifier = new_req.dig(:source, :module_identifier)
+# TODO: ??? if registry_hostname.nil? || module_identifier.nil?
+provider_source = "#{registry_hostname}/#{module_identifier}"
```
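A minimal demonstration of why `dig` is the safer access pattern here; the shape of `new_req` is assumed from the snippet above:

```ruby
# Hypothetical requirement hash matching the shape used in lockfile_details.
new_req = {
  source: {
    registry_hostname: "registry.terraform.io",
    module_identifier: "hashicorp/aws"
  }
}

# Interpolation plus dig instead of [] chaining and concatenation.
provider_source = "#{new_req.dig(:source, :registry_hostname)}/#{new_req.dig(:source, :module_identifier)}"
# => "registry.terraform.io/hashicorp/aws"

# With :source missing, dig returns nil instead of raising NoMethodError,
# which [] chaining on nil would do.
empty = {}
empty.dig(:source, :registry_hostname)  # => nil
```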
Refactors #4905 to lower the Rubocop `Metrics/AbcSize` and `Metrics/PerceivedComplexity` metrics. This branch enables those checks for `#lookup_hash_architecture`, since we previously had them disabled.