Real skip_download, download gcloud cli on-demand #37

Closed
MajorBreakfast opened this issue Mar 26, 2020 · 11 comments

Comments

@MajorBreakfast

We're using the google project factory in our project and we're creating multiple projects (currently 17). Each project factory module installs this module 3 times. The cache folder is 110 MB in size. Hence, we end up with a modules folder of 17 * 3 * 110 MB ≈ 5.6 GB.
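
As a quick check of that estimate, here is a minimal shell sketch using the figures quoted above. The variable names are illustrative and not part of the module or of Terraform.

```shell
# Figures taken from the report above; variable names are illustrative only.
projects=17
instances_per_project=3
cache_mb=110

# Total size of the duplicated cache folders, in MB.
total_mb=$((projects * instances_per_project * cache_mb))
echo "${total_mb} MB"
```

This prints 5610 MB, i.e. roughly 5.6 GB. On a real checkout, `du -sh .terraform/modules` reports the actual size of the modules folder.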

I would like to request that the cache/ folder be excluded from the artifact and downloaded on demand instead. We could then rely on the skip_download flag and use the gcloud CLI from our pipeline container image.

The situation has become really bad for us: since yesterday, we have been running into GitHub rate limits.

@morgante
Contributor

What's the rate limit you're encountering?

The cache folder is deliberately bundled with the module because this module was designed for locked down environments. In those environments, Terraform is not necessarily able to reach out to a random internet endpoint to download gcloud. Using the module allows us to avoid any requirement for internet connectivity.

@MajorBreakfast
Author

MajorBreakfast commented Mar 26, 2020

The error is:

Error: Failed to download module
 Could not download module "project" (../library/project-creator/main.tf:1)
 source code from
 "https://api.github.com/repos/terraform-google-modules/terraform-google-project-factory/tarball/v7.1.0//*?archive=tar.gz":
 bad response code: 403.

This is repeated more than 2000 times, for different Terraform modules, until the job fails. I cannot conclusively say that this module is at fault, other than that it is by far the largest and the one instantiated most often.

> The cache folder is deliberately bundled with the module because this module was designed for locked down environments.

Terraform needs to download from GitHub anyway. Would it be possible to make the CLI a separate artifact? Alternatively, could you provide a "lite edition" of the project factory that does not need this module? Since we're running our pipeline in a container with the gcloud CLI preinstalled, having the CLI bundled with this module is a big overhead for us. The container image approach, that we're using, is also viable for locked down environments.

Edit: I just read up on private Terraform registries, and I now see that with a private Terraform registry the module would not be downloaded from GitHub. Nevertheless, since Terraform duplicates the files for each module instance, the current approach scales badly with the number of projects.

@morgante
Contributor

morgante commented Mar 26, 2020

Got it. Unfortunately, I suspect the rate limit you're seeing is not related to the size of the module, so I'm not sure caching it would help.

> The container image approach, that we're using, is also viable for locked down environments.

My understanding is it's difficult/challenging for that to work with TFE.

> We're using the google project factory in our project and we're creating multiple projects (currently 17).

I just want to note this is probably an anti-pattern if they're all in the Terraform state. We encourage splitting your project resources across multiple workspaces/state configs because each project factory invocation does include many different resources. I'd strongly consider rearchitecting this.

@eripa

eripa commented Apr 7, 2020

Large objects are ideally managed using Git LFS.

@marcus-foobar
Contributor

marcus-foobar commented Apr 7, 2020

I traced my way back to this issue while exploring
https://github.com/terraform-google-modules/terraform-google-project-factory

Basically, neither terraform-google-gcloud nor any module that depends on it is usable.

$ cat main.tf
module "gcloud_disable" {
  source  = "terraform-google-modules/gcloud/google"
  version = "~> 0.5.0"
}
$ time terraform init
2020/04/07 09:22:26 [INFO] Terraform version: 0.12.18
2020/04/07 09:22:26 [INFO] Go runtime version: go1.13.4
2020/04/07 09:22:26 [INFO] CLI args: []string{"/usr/local/bin/terraform", "init"}
2020/04/07 09:22:26 [DEBUG] Attempting to open CLI config file: /Users/foobar/.terraformrc
2020/04/07 09:22:26 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2020/04/07 09:22:26 [DEBUG] checking for credentials in "/Users/foobar/.terraform.d/plugins"
2020/04/07 09:22:26 [DEBUG] checking for credentials in "/Users/foobar/.terraform.d/plugins/darwin_amd64"
2020/04/07 09:22:26 [INFO] CLI command args: []string{"init"}
Initializing modules...
2020/04/07 09:22:26 [TRACE] ModuleInstaller: installing child modules for . into .terraform/modules
2020/04/07 09:22:26 [DEBUG] Module installer: begin gcloud_disable
2020/04/07 09:22:26 [TRACE] ModuleInstaller: gcloud_disable is not yet installed
2020/04/07 09:22:26 [TRACE] ModuleInstaller: cleaning directory .terraform/modules/gcloud_disable prior to install of gcloud_disable
2020/04/07 09:22:26 [TRACE] ModuleInstaller: gcloud_disable is a registry module at terraform-google-modules/gcloud/google
2020/04/07 09:22:26 [DEBUG] gcloud_disable listing available versions of terraform-google-modules/gcloud/google at registry.terraform.io
2020/04/07 09:22:26 [DEBUG] Service discovery for registry.terraform.io at https://registry.terraform.io/.well-known/terraform.json
2020/04/07 09:22:26 [TRACE] HTTP client GET request to https://registry.terraform.io/.well-known/terraform.json
2020/04/07 09:22:27 [DEBUG] fetching module versions from "https://registry.terraform.io/v1/modules/terraform-google-modules/gcloud/google/versions"
2020/04/07 09:22:27 [TRACE] HTTP client GET request to https://registry.terraform.io/v1/modules/terraform-google-modules/gcloud/google/versions
2020/04/07 09:22:27 [DEBUG] found available version "0.3.0" for terraform-google-modules/gcloud/google
2020/04/07 09:22:27 [DEBUG] found available version "0.5.1" for terraform-google-modules/gcloud/google
2020/04/07 09:22:27 [DEBUG] found available version "0.5.0" for terraform-google-modules/gcloud/google
2020/04/07 09:22:27 [DEBUG] found available version "0.4.0" for terraform-google-modules/gcloud/google
2020/04/07 09:22:27 [DEBUG] found available version "0.1.0" for terraform-google-modules/gcloud/google
2020/04/07 09:22:27 [DEBUG] found available version "0.2.0" for terraform-google-modules/gcloud/google
2020/04/07 09:22:27 [DEBUG] looking up module location from "https://registry.terraform.io/v1/modules/terraform-google-modules/gcloud/google/0.5.1/download"
2020/04/07 09:22:27 [TRACE] HTTP client GET request to https://registry.terraform.io/v1/modules/terraform-google-modules/gcloud/google/0.5.1/download
Downloading terraform-google-modules/gcloud/google 0.5.1 for gcloud_disable...
2020/04/07 09:22:27 [TRACE] ModuleInstaller: gcloud_disable terraform-google-modules/gcloud/google 0.5.1 is available at "git::https://github.com/terraform-google-modules/terraform-google-gcloud?ref=v0.5.1"
2020/04/07 09:22:27 [DEBUG] will download "git::https://github.com/terraform-google-modules/terraform-google-gcloud?ref=v0.5.1" to .terraform/modules/gcloud_disable
2020/04/07 09:22:27 [TRACE] fetching "git::https://github.com/terraform-google-modules/terraform-google-gcloud?ref=v0.5.1" to ".terraform/modules/gcloud_disable"
2020/04/07 09:43:36 [TRACE] ModuleInstaller: gcloud_disable "git::https://github.com/terraform-google-modules/terraform-google-gcloud?ref=v0.5.1" was downloaded to .terraform/modules/gcloud_disable
2020/04/07 09:43:36 [TRACE] ModuleInstaller: gcloud_disable should now be at .terraform/modules/gcloud_disable
2020/04/07 09:43:36 [DEBUG] Module installer: gcloud_disable installed at .terraform/modules/gcloud_disable
2020/04/07 09:43:36 [TRACE] modsdir: writing modules manifest to .terraform/modules/modules.json
- gcloud_disable in .terraform/modules/gcloud_disable

Initializing the backend...
...
<REMOVED CONTENT>
...
* provider.null: version = "~> 2.1"
* provider.random: version = "~> 2.2"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

real	21m14.005s
user	0m18.385s
sys	0m18.321s

@MajorBreakfast
Author

MajorBreakfast commented Apr 7, 2020

@eripa This issue is about the artifact. While it's true that the SDK is checked into Git, that's a separate issue. Terraform does not clone the Git repo, but instead downloads the artifact from GitHub, so whether or not Git LFS is used does not come into play.

@marcus-foobar Could you provide additional explanation for your output snippet? I can't quite decide in what respect it relates to this issue and what parts of the output are of relevance. Maybe it's just the last three lines (the duration)?

@eripa

eripa commented Apr 7, 2020

@MajorBreakfast Thanks. I reported the binaries-in-the-repo problem in #40, but it was closed with a reference to this issue.

@marcus-foobar
Contributor

> @marcus-foobar Could you provide additional explanation for your output snippet? I can't quite decide in what respect it relates to this issue and what parts of the output are of relevance. Maybe it's just the last three lines (the duration)?

Correct. I should have highlighted what I was trying to show; sorry about that.

The relevant lines (see 09:22:27 -> 09:43:36) are:

2020/04/07 09:22:27 [TRACE] fetching "git::https://github.com/terraform-google-modules/terraform-google-gcloud?ref=v0.5.1" to ".terraform/modules/gcloud_disable"
2020/04/07 09:43:36 [TRACE] ModuleInstaller: gcloud_disable "git::https://github.com/terraform-google-modules/terraform-google-gcloud?ref=v0.5.1" was downloaded to .terraform/modules/gcloud_disable

and

real	21m14.005s

Waiting that long for terraform init to complete is not a good user experience. Although init does succeed when it doesn't time out, the delay makes the module unusable for some use cases. I've tried some other Terraform modules that depend on this module, and this issue makes those modules less attractive.

I don't see this as a bug; rather, I'm suggesting a refactor or redesign to get around the problem and improve the end-user experience.
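
The gap between the two log timestamps quoted in this comment can be checked with a small POSIX shell sketch. The helper name `to_seconds` is made up for illustration; only the two timestamps come from the log.

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
# ${x#0} strips one leading zero so that "09" is not parsed as octal.
to_seconds() {
  IFS=: read -r h m s <<EOF
$1
EOF
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

start=$(to_seconds "09:22:27")   # "fetching ..." log line
end=$(to_seconds "09:43:36")     # "... was downloaded" log line
elapsed=$((end - start))
echo "$((elapsed / 60))m$((elapsed % 60))s"
```

This prints 21m9s, so the module fetch alone accounts for nearly all of the 21m14s wall-clock time reported by `time`.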

@morgante
Contributor

morgante commented Apr 8, 2020

After seeing a number of issues reported about this, I'm willing to revisit removing gcloud from the module and downloading it dynamically at execution time instead.

If someone has a chance to open a PR adding this, I will review it.

marcus-foobar pushed a commit to marcus-foobar/terraform-google-gcloud that referenced this issue Apr 10, 2020
@marcus-foobar
Contributor

marcus-foobar commented Apr 10, 2020

@morgante I gave it a shot in #41.

To fully benefit from this redesign, we would have to rewrite the Git history; see @eripa's #40. If I understand correctly how Terraform fetches modules from Git, the size of this repository will still be an issue.

@morgante
Contributor

morgante commented Apr 16, 2020

This is now addressed by #41 and available in the 1.0.0 release.

I'm fairly confident we don't need to rewrite Git history or worry about that at all. When you reference a module from the Terraform registry, it just downloads the latest version, not the full Git history. I just did a test invocation and confirmed that the .terraform directory does not include a gcloud cache:

$ ls .terraform/modules/cli/terraform-google-gcloud-1.0.0/cache
README.md

If anyone invokes the module from the registry and finds otherwise, we can investigate further but for now I think this is resolved.
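
For the container-image setup described earlier in this thread, a minimal sketch of the intended usage might look like the following. It assumes the module exposes the `skip_download` flag discussed in this issue as an input of that name; the version constraint and the temporary directory are illustrative choices, and the script only writes the configuration without running Terraform.

```shell
# Sketch only: write a minimal configuration that relies on a gcloud CLI
# already present in the pipeline image instead of a downloaded one.
workdir="$(mktemp -d)"
cat > "${workdir}/main.tf" <<'EOF'
module "cli" {
  source  = "terraform-google-modules/gcloud/google"
  version = "~> 1.0"

  # Assumed input name, taken from the flag discussed in this issue.
  skip_download = true
}
EOF

# Check that gcloud is on PATH before relying on skip_download.
if command -v gcloud >/dev/null 2>&1; then
  echo "gcloud found: $(command -v gcloud)"
else
  echo "gcloud not found on PATH; install it in the image first" >&2
fi
```

Running `terraform init` in that directory and then `ls .terraform/modules` should make it possible to confirm, as in the check above, that no gcloud cache was pulled in.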
