Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Pass on: Automate Terraform Platform Detection for Lockfile Hashes #5013

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Nishnha
Copy link
Member

@Nishnha Nishnha commented Apr 19, 2022

Refactors #4905 to lower the Rubocop Metrics/AbcSize and Metrics/PerceivedComplexity

This branch enables those checks for #lookup_hash_architecture, since we previously had them disabled.

new_req is already checked to be a provider in the calling function, so we can skip that check here and directly use it in the method
Comment on lines 260 to 266
def hashes_object_regex
/hashes\s*=\s*.*\]/m
end

def hashes_string_regex
/(?<=\").*(?=\")/
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could be constants rather than methods

Comment on lines 118 to 124
possible_architectures = %w(
linux_amd64
darwin_amd64
windows_amd64
darwin_arm64
linux_arm64
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could extract this to be a constant

Comment on lines 138 to 153
File.write(".terraform.lock.hcl", lockfile_hash_removed)

SharedHelpers.run_shell_command("terraform providers lock -platform=#{arch} #{provider_source} -no-color")

updated_lockfile = File.read(".terraform.lock.hcl")
updated_hashes = extract_provider_h1_hashes(updated_lockfile, declaration_regex)
next if updated_hashes.nil?

# Check if the architecture is present in the original lockfile
hashes.each do |hash|
updated_hashes.select { |h| h.match?(/^h1:/) }.each do |updated_hash|
architectures.append(arch.to_sym) if hash == updated_hash
end
end

File.delete(".terraform.lock.hcl")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could pull all of this out into a separate method?

@dependabot dependabot deleted a comment from anthillofoppies44 Apr 20, 2022
Comment on lines 97 to 102
def extract_provider_h1_hashes(content, declaration_regex)
content.match(declaration_regex).to_s.
match(hashes_object_regex).to_s.
split("\n").map { |hash| hash.match(hashes_string_regex).to_s }.
select { |h| h&.match?(/^h1:/) }
end
Copy link
Contributor

@mattt mattt Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts about this method:

  • As @jurre commented below, hashes_object_regex should be a constant.
  • It'd be great to capture the array of hashes directly in the regex.
  • Instead of splitting the to_s of a match, use the scan method and enumerate over the results directly
  • hashes_string_regex should also be a constant, but it'd also be more useful if it baked in the match on h1 used here.

Putting everything together, it'd be something like this:

Suggested change
def extract_provider_h1_hashes(content, declaration_regex)
content.match(declaration_regex).to_s.
match(hashes_object_regex).to_s.
split("\n").map { |hash| hash.match(hashes_string_regex).to_s }.
select { |h| h&.match?(/^h1:/) }
end
HASHES_REGEX = /hashes\s*=\s\[(?<hashes>.*)\]/m.freeze
HASHES_H1_HASH_REGEX = /"h1:([\w\d\=\/]+)\"/.freeze
def extract_provider_h1_hashes(content, declaration_regex)
content.match(declaration_regex)
.string
.match(HASHES_OBJECT_REGEX)[:hashes]
.scan(HASHES_H1_HASH_REGEX)
.flatten
end

Also, reading the Terraform docs, it looks like h1 is one of two hashing schemes. Do we need to handle zh as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or a simpler version that doesn't check the preceding hashes key:

Suggested change
def extract_provider_h1_hashes(content, declaration_regex)
content.match(declaration_regex).to_s.
match(hashes_object_regex).to_s.
split("\n").map { |hash| hash.match(hashes_string_regex).to_s }.
select { |h| h&.match?(/^h1:/) }
end
H1_HASH_REGEX = /\"h1:([\w\d\=\/]+)\"/.freeze
def extract_provider_h1_hashes(content, declaration_regex)
content.match(declaration_regex)
.string
.scan(H1_HASH_REGEX)
.flatten
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are great suggestions for cleaning this up, thanks!

The h1 hashes are definitely platform specific and there is only one per architecture, which is why we compare on them to determine which architectures are used.
The zh hashes might differer between architectures, but I'm not sure if we can reliably detect which ones are used in a project from them. It looks like there can be a varying number of zh hashes across providers/architectures (so we don't know if we have detected all the architectures and exit early), and I'm not sure they're always unique across architectures if they use the same .zip files.

Both h1 and zh hashes are updated in the lockfile during #update_lockfile_declaration, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving this a bit more thought, it might be nice to create a proper abstraction for Terraform lockfiles. Moving all of the business logic around extracting information and into methods in a new class would help keep FileUpdater small and focused, and make it easier to test this functionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be ideal. Perhaps in a follow-up PR though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a working solution, which is a good first step. But the current implementation needs to disable Metrics/AbcSize to pass linting checks in a few places, so I think this PR would benefit from a quick refactoring pass before merging. I'd be happy to pair with you on that — just let me know! 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a refactoring pass on some code already merged into Core, so you're right that we should focus on "making it right" and address linting checks this pass.

I'll ping you to set up some time to pair!

Co-authored-by: Mattt <mattt@github.com>

def lockfile_details(new_req)
content = lock_file.content.dup
provider_source = new_req[:source][:registry_hostname] + "/" + new_req[:source][:module_identifier]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what new_req is, but I'm concerned about whether new_req[:source] can be nil (or their nested values for that matter).

This could be made safer by using dig navigate the nested hashes and assigning registry_hostname and module_identifier to intermediate variables and handling nil cases.

Also, prefer interpolation "#{} to string concatenation.

Suggested change
provider_source = new_req[:source][:registry_hostname] + "/" + new_req[:source][:module_identifier]
registry_hostname = new_req.dig(:source, :registry_hostname])
module_identifier = new_req.dig(:source, :module_identifier])
# TODO: ??? if registry_hostname.nil? || module_identifier.nil?
provider_source = "#{registry_hostname}/#{module_identifier}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants