Add native support for updating .terraform.lock.hcl #90

Merged
merged 32 commits into from
Jul 4, 2023

@minamijoyo (Owner) commented Jun 29, 2023

This pull request addresses a performance issue for updating .terraform.lock.hcl at scale by introducing a new tfupdate lock command.

The dependency lock file (a.k.a. .terraform.lock.hcl) was introduced in Terraform v0.14. While it works fine for simple setups, updating hundreds of lock files scattered across multiple directories is still challenging.

At that time, I avoided implementing the feature for updating lock files in tfupdate because:

  • The lock file format is an implementation detail of Terraform.
  • Its format appears to be extensible for modules in the future.
  • It's easy to imagine supporting multiple Terraform versions would be hard.

Therefore, I worked around it by using the official terraform providers mirror and terraform providers lock commands. #32

I understood that the root cause of this complexity is that the official Terraform Registry doesn't return h1 hash values, so I proposed changing the Terraform Registry protocol upstream. hashicorp/terraform#27264

However, the situation became even worse while we were waiting for progress in the upstream:

  • With the rise of ARM chips, especially M1 Macs, the number of platforms to calculate hashes for has increased.
  • As the infrastructure I manage has grown, the number of directories to calculate hashes for has increased.
  • Starting from Terraform v1.4, managing the lock file is effectively required to enable provider caching. This means that we no longer have the last resort of throwing it into .gitignore. See "provider cache becomes ineffective with 1.4.0-alpha release" hashicorp/terraform#32205

After over two years of waiting for progress upstream, it's time to bite the bullet and reimplement it myself, knowing that the format is an implementation detail.

This PR adds a new tfupdate lock command which updates .terraform.lock.hcl without Terraform CLI.

$ tfupdate lock --help
Usage: tfupdate lock [options] <PATH>
Arguments
  PATH               A path of directory to update
Options:
      --platform     Specify a platform to update dependency lock files.
                     At least one or more --platform flags must be specified.
                     Use this option multiple times to include checksums for multiple target systems.
                     Target platform names consist of an operating system and a CPU architecture.
                     (e.g. linux_amd64, darwin_amd64, darwin_arm64)
  -r  --recursive    Check a directory recursively (default: false)
  -i  --ignore-path  A regular expression for path to ignore
                     If you want to ignore multiple directories, set the flag multiple times.

Note that unlike the terraform providers lock command, the --platform flag requires two hyphens (--).

The tfupdate lock command parses the required_providers block in your configuration, downloads provider packages, and calculates hash values under the hood. The most important point is that it caches calculated hash values in memory, giving us a huge performance advantage when updating multiple directories using the -r (--recursive) option.

$ tfupdate lock --platform=linux_amd64 --platform=darwin_amd64 --platform=darwin_arm64 -r ./
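Why the in-memory cache matters for recursive runs can be illustrated with a minimal sketch. This is not tfupdate's actual code: the `cacheKey` and `hashCache` types and the `fakeCompute` function are hypothetical names invented for this example, and `fakeCompute` merely stands in for "download the provider package and hash it".

```go
package main

import "fmt"

// cacheKey identifies one provider package (hypothetical type for this sketch).
type cacheKey struct {
	Provider string // e.g. "hashicorp/null"
	Version  string // e.g. "3.2.1"
	Platform string // e.g. "linux_amd64"
}

// hashCache memoizes hash values so that each package is processed only
// once, however many directories reference it.
type hashCache struct {
	entries map[cacheKey][]string
	misses  int // how many times compute was actually invoked
}

func newHashCache() *hashCache {
	return &hashCache{entries: make(map[cacheKey][]string)}
}

// get returns the cached hashes for key and calls compute only on a miss.
func (c *hashCache) get(key cacheKey, compute func(cacheKey) ([]string, error)) ([]string, error) {
	if hashes, ok := c.entries[key]; ok {
		return hashes, nil
	}
	hashes, err := compute(key)
	if err != nil {
		return nil, err
	}
	c.misses++
	c.entries[key] = hashes
	return hashes, nil
}

// fakeCompute is a placeholder for the expensive download-and-hash step.
func fakeCompute(k cacheKey) ([]string, error) {
	return []string{"h1:placeholder-" + k.Platform}, nil
}

func main() {
	cache := newHashCache()
	dirs := []string{"envs/dev", "envs/stg", "envs/prod"}
	platforms := []string{"linux_amd64", "darwin_arm64"}
	for _, dir := range dirs {
		for _, p := range platforms {
			key := cacheKey{Provider: "hashicorp/null", Version: "3.2.1", Platform: p}
			if _, err := cache.get(key, fakeCompute); err != nil {
				fmt.Println("error:", err)
				return
			}
		}
		fmt.Println("updated", dir)
	}
	// Three directories times two platforms, but only two computations.
	fmt.Println("computations:", cache.misses)
}
```

With three directories pinning the same provider version, the expensive step runs once per platform rather than once per directory and platform.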

To skip terraform init, we assume that all dependencies are pinned to a specific version (e.g. 3.2.1) in the required_providers block of the root module. Note that version constraint expressions (e.g. > 3.0, = 3.2.1) and indirect dependencies via modules are not supported and are ignored.
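The pinned-version rule can be sketched as a simple filter. This is only an illustration under assumptions: the `isPinned` helper and its regular expression are hypothetical, and whether tfupdate accepts pre-release suffixes or surrounding whitespace is not stated in this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// pinnedVersion matches a bare version such as "3.2.1". Accepting a
// pre-release suffix like "-beta1" is an assumption of this sketch.
// Anything containing an operator ("> 3.0", "= 3.2.1", "~> 3.2") fails.
var pinnedVersion = regexp.MustCompile(`^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$`)

// isPinned reports whether a required_providers version string is a
// simple constant that can be locked without resolving constraints.
func isPinned(constraint string) bool {
	return pinnedVersion.MatchString(constraint)
}

func main() {
	for _, c := range []string{"3.2.1", "> 3.0", "= 3.2.1", "~> 3.2"} {
		if isPinned(c) {
			fmt.Printf("%q: pinned, will be locked\n", c)
		} else {
			fmt.Printf("%q: not a simple constant, ignored\n", c)
		}
	}
}
```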

Closes #68

Currently the Terraform Registry returns the zh hash for all platforms,
but not the h1 hash, so the h1 hash has to be calculated separately.
We need to calculate the values for each platform and merge the results.
To avoid multiple downloads and recalculations for each directory, the
results are cached in memory.
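
The difference between the two hash schemes, and the merge step, can be sketched as follows. To my understanding, zh is the SHA-256 of the zip archive itself, while h1 follows the Hash1 scheme of golang.org/x/mod/sumdb/dirhash: a SHA-256 over a sorted list of per-file checksum lines, base64-encoded. The function names `zhHash`, `h1Hash`, and `mergeHashes` are hypothetical, the file contents are fake, and real code would read the file list from the downloaded zip rather than from a map.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"sort"
)

// zhHash is the hex-encoded SHA-256 of the zip archive bytes.
func zhHash(zipBytes []byte) string {
	sum := sha256.Sum256(zipBytes)
	return "zh:" + hex.EncodeToString(sum[:])
}

// h1Hash hashes package contents: a SHA-256 over sorted lines of the
// form "<hex sha256 of file>  <file name>\n", then base64-encoded.
func h1Hash(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)
	h := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name])
		fmt.Fprintf(h, "%x  %s\n", sum[:], name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(h.Sum(nil))
}

// mergeHashes returns the sorted, de-duplicated union of hash lists.
func mergeHashes(lists ...[]string) []string {
	seen := make(map[string]bool)
	var merged []string
	for _, list := range lists {
		for _, h := range list {
			if !seen[h] {
				seen[h] = true
				merged = append(merged, h)
			}
		}
	}
	sort.Strings(merged)
	return merged
}

func main() {
	// Fake package contents, one entry per target platform.
	linux := map[string][]byte{"terraform-provider-example": []byte("fake linux binary")}
	darwin := map[string][]byte{"terraform-provider-example": []byte("fake darwin binary")}

	var lists [][]string
	for _, files := range []map[string][]byte{linux, darwin} {
		lists = append(lists, []string{h1Hash(files)})
	}
	lists = append(lists, []string{zhHash([]byte("fake zip bytes"))})

	for _, h := range mergeHashes(lists...) {
		fmt.Println(h)
	}
}
```

Because h1 depends on the unpacked contents, it has to be computed once per platform package, which is why the per-platform results are merged afterwards.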

In order to update dependency lock files, we need to detect the version
constraints of the providers of the module, but the current Updater
interface can only handle a single file. Therefore, we introduce a new
concept called ModuleContext, which is supposed to provide the Updater
with information about the module that is currently being updated.
Changes to the Updater interface will be addressed later because they
require a massive rewrite.

Note that it does not actually re-implement the resolution of version
constraints in terraform init. It is very simplified for the use we
need. Version constraints only support simple constants and not
comparison operators. Anything that cannot be interpreted is ignored.

Until now, the Updater interface has been implemented statelessly, but
in order to add a feature for updating the dependency lock file, we
also need the module context. I've only adjusted the interface for now,
but this will also help us to achieve smarter rewriting in the
future. We also need to access the index to cache the hash value of the
provider, which is a global state between modules. I decided to access
the global context via the module context because it is always needed in
combination with the module context. In addition, the option and updater
instances were also moved to the global context, as they do not
change within the process.

The Terraform dependency lock file (a.k.a. .terraform.lock.hcl) was
introduced in v0.14. While this is fine for simple use cases, there are
many challenges at scale. This is because Terraform is based on
per-directory processing, and updating lock files using the official
terraform providers mirror and terraform providers lock commands is very
inefficient.

However, at that time, we avoided supporting lock file updates for the
following reasons:

- The underlying reason for the complexity of updating the lock file is
  a missing capability in the Terraform Registry: it does not return
  the h1 hash value.
- The Registry protocol could be improved in the future.
- The lock file format implies future support for modules as well as
  providers.
- The lock file format is an implementation detail.
- It was easy to imagine that supporting multiple Terraform versions
  would be hard.

It got worse as time went on:

- With the rise of ARM chips, the number of platforms required for
  locking has increased.
- As our infrastructure has grown, the number of directories has
  increased accordingly.

However, no progress has been made on this issue in the upstream. So it
is time to reinvent the wheel by ourselves. Calculate hash values in
.terraform.lock.hcl and cache them to be able to update multiple
directories at once.

Initially, we thought of using the platform as the key of the zh hash,
but we found out that the checksum list also includes metadata such as
manifest.json, so we decided to use the filename as it is.
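
Keying zh hashes by filename can be sketched as below. This assumes the checksum list uses the usual `<hex digest>  <filename>` layout of a SHA256SUMS file; the `parseSHA256Sums` helper is a hypothetical name, and the digests in the sample are shortened placeholders rather than real values.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseSHA256Sums maps each filename in a checksum list to its zh hash.
// Keying by filename keeps entries that belong to no platform, such as
// the manifest, instead of forcing every entry into a platform name.
func parseSHA256Sums(doc string) map[string]string {
	zh := make(map[string]string)
	for _, line := range strings.Split(doc, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		zh[fields[1]] = "zh:" + fields[0]
	}
	return zh
}

func main() {
	// Placeholder digests; real entries are 64 hex characters long.
	doc := `aaaa  terraform-provider-null_3.2.1_darwin_arm64.zip
bbbb  terraform-provider-null_3.2.1_linux_amd64.zip
cccc  terraform-provider-null_3.2.1_manifest.json
`
	hashes := parseSHA256Sums(doc)

	names := make([]string, 0, len(hashes))
	for name := range hashes {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name, hashes[name])
	}
}
```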
@minamijoyo changed the title from "[WIP] Add native support for updating .terraform.lock.hcl" to "Add native support for updating .terraform.lock.hcl" Jul 3, 2023
@minamijoyo minamijoyo merged commit ddd0715 into master Jul 4, 2023
5 checks passed
@minamijoyo minamijoyo deleted the lockfile branch July 4, 2023 01:50
@minamijoyo (Owner, Author) commented

This feature has been released in v0.7.0 🚀

Successfully merging this pull request may close these issues.

Support .terraform.lock.hcl files when upgrading providers