
The archive_file resource now zips directories during terraform plan and ignores depends_on #78

ghost opened this issue Sep 1, 2020 · 19 comments


@ghost

ghost commented Sep 1, 2020

This issue was originally opened by @rluckey-tableau as hashicorp/terraform#26064. It was migrated here as a result of the provider split. The original body of the issue is below.


Terraform Version

Terraform v0.13.1

Terraform Configuration Files

resource "null_resource" "config_and_jar" {
  triggers = {
    on_every_apply = uuid()
  }

  provisioner "local-exec" {
    command = join(" && ", [
      "rm -rf ${var.zip_input_dir}",
      "mkdir -p ${var.zip_input_dir}/lib",
      "wget ${var.jar_url}",
      "mv './${var.jar_filename}' '${var.zip_input_dir}/lib'",
      "cp -r '${var.config_directory}'/* '${var.zip_input_dir}'",
    ])
  }
}

data "archive_file" "lambda" {
  depends_on  = [null_resource.config_and_jar]

  output_path = var.zip_output_filepath
  source_dir  = var.zip_input_dir
  type        = "zip"
}

Debug Output

Crash Output

Expected Behavior

   # module.samplelambda.data.archive_file.lambda will be read during apply
   # (config refers to values not yet known)
  <= data "archive_file" "lambda"  {
       + id                  = (known after apply)
       + output_base64sha256 = (known after apply)
       + output_md5          = (known after apply)
       + output_path         = "lambda/temp/samplelambda/src.zip"
       + output_sha          = (known after apply)
       + output_size         = (known after apply)
       + source_dir          = "lambda/samplelambda/zip-temp"
       + type                = "zip"
     }

    ...

Job succeeded

Actual Behavior

 Error: error archiving directory: could not archive missing directory: lambda/samplelambda/zip-temp
   on lambda/main.tf line 52, in data "archive_file" "lambda":
   52: data "archive_file" "lambda" {
ERROR: Job failed: exit code 1

Steps to Reproduce

  1. terraform init
  2. terraform plan

Additional Context

This is running on GitLab CI/CD runners. It has been working fine for at least the past year on various versions of Terraform 0.12 up through Terraform 0.12.21.

In case it is useful, here are the provider versions being pulled when using Terraform 0.13.1:

 - Installed -/aws v3.4.0 (signed by HashiCorp)
 - Installed hashicorp/null v2.1.2 (signed by HashiCorp)
 - Installed hashicorp/local v1.4.0 (signed by HashiCorp)
 - Installed -/null v2.1.2 (signed by HashiCorp)
 - Installed hashicorp/aws v3.4.0 (signed by HashiCorp)
 - Installed -/archive v1.3.0 (signed by HashiCorp)
 - Installed -/local v1.4.0 (signed by HashiCorp)
 - Installed hashicorp/archive v1.3.0 (signed by HashiCorp)

Edit: Tested the same TF code with a 0.12.29 image and terraform plan passed with no issues, even though it is pulling the same provider versions.

 - Downloading plugin for provider "aws" (hashicorp/aws) 3.4.0...
 - Downloading plugin for provider "local" (hashicorp/local) 1.4.0...
 - Downloading plugin for provider "null" (hashicorp/null) 2.1.2...
 - Downloading plugin for provider "archive" (hashicorp/archive) 1.3.0...

References

@rluckey-tableau

This does not seem likely to be an issue with the hashicorp/archive v1.3.0 provider since the problem occurs with that version of the provider on Terraform v0.13.1 but not with that version of the provider on Terraform v0.12.29.

@Mrg77

Mrg77 commented Sep 7, 2020

I have the same problem in my configuration when I try to create a Lambda:

Error: error archiving directory: could not archive missing directory: ./.terraform/temp/GTUKucTyx-0/lambda-response

on AWS-App_FE.tf line 115, in data "archive_file" "lambda_edge_function_files":
115: data "archive_file" "lambda_edge_function_files" {

Tried to do this with Terraform 0.13.2.
Everything works with Terraform 0.12.29.

@Robin-Walter

Same for me. Everything works with Terraform 0.12.29 but not with 0.13.x.

@rluckey-tableau

Any traction on this at all? Still blocking upgrade to v0.13.

@nicolas-lopez

Hello, any news?

@aaronsteers

Anyone find a workaround or solution to this? I'm blocked on the same.

@dparker2

This seems to be happening when Terraform is refreshing the state. So it works (for me at least, w/ v0.13.5) on the initial apply, but if the source_dir is deleted and terraform refresh is run, it complains with this error.

@aaronsteers

aaronsteers commented Nov 19, 2020

@dparker2 - confirmed, I'm seeing the same.

My only workaround at this point is to run terraform state rm ____ to forcibly delete the upstream object which is not getting properly waited for otherwise.

@aaronsteers

aaronsteers commented Nov 20, 2020

@apparentlymart - It looks like this might be the bug described in 0.14.0 release notes.

Can you confirm if this looks like it may be related to that issue?

...
core: Errors with data sources reading old data during refresh, failing to refresh, and not appearing to wait on resource dependencies are fixed by updates to the data source lifecycle and the merging of refresh and plan (hashicorp/terraform#26270)
...

... as listed in the 0.14.0-rc1 release notes, regarding the fix described in hashicorp/terraform#26270

Also:

@jbardin
Member

jbardin commented Nov 20, 2020

Thanks @aaronsteers, this type of issue will be resolved with 0.14.
There isn't anything to fix in the archive_file data source; if the external directory is removed during refresh, the data source is expected to fail. Eliminating the separate refresh phase avoids this possibility entirely.

@aaronsteers

Thanks, @jbardin ! By chance is there an expected release date for 0.14 or a ticket I can follow which would cover that timeline?

For my case, we've already upgraded to 0.13.4, including our state files, so rolling back to 0.12.x would be a very difficult option for us.

@aaronsteers

Update: I found the thread on Discuss; it looks like the ETA for 0.14.0 GA is next week.

https://discuss.hashicorp.com/t/terraform-v0-14-0-rc1-released/15752/14

I'll start testing with the RC1 with the expectation/hope we'll make a quick transition once released.

@antondemidov

Still have the problem with terraform 0.14

@aaronsteers

aaronsteers commented Dec 4, 2020

@antondemidov - I can't speak to any remnants of the 0.13.x behavior, because my project is basically working again...

However, I do know there was a workaround described in another ticket (long ago, I'm afraid), in which the recommendation was to pipe dependencies through a null_data_source. Based on my own research, that workaround may still be required. I'm still using it in my own projects, and if I remove it, I do lose the wait on dependencies you are describing.


UPDATE: this Stack Overflow answer seems to describe the workaround concisely: https://stackoverflow.com/a/58585612/4298208
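
A minimal sketch of that pattern, adapted to the resource and variable names from the original report (the adaptation is mine, not a verbatim copy of the linked answer), looks roughly like this:

data "null_data_source" "wait_for_provisioner" {
  inputs = {
    # Referencing the null_resource id means this data source cannot be
    # evaluated until the provisioner has actually run.
    provisioner_id = null_resource.config_and_jar.id

    # Re-exported so archive_file can depend on it by value rather than
    # via depends_on.
    source_dir = var.zip_input_dir
  }
}

data "archive_file" "lambda" {
  type        = "zip"
  source_dir  = data.null_data_source.wait_for_provisioner.outputs["source_dir"]
  output_path = var.zip_output_filepath
}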

And the related issues pointed to in the Stack Overflow answer are:

@antondemidov

Thanks, @aaronsteers. I'll take a look.

By the way, I've found a workaround: just use a null_resource to zip the file. Not very Terraform-like, but it's working :) Posting here for reference:

resource "null_resource" "libs_layer" {
  triggers = {
    policy_sha1 = sha1(file("${path.module}/../../../../../authentication-service/authservice/lambda/verifyoauth_dev/requirements.txt"))
  }

  provisioner "local-exec" {
    command = <<CMD
    virtualenv -p python3.7 files/venv-libs/python;
    files/venv-libs/python/bin/pip install -r ${path.module}/../../../../../authentication-service/authservice/lambda/verifyoauth_dev/requirements.txt;
    cd files/venv-libs;
    zip -r ../verify_oauth_lambda_layer.zip python/lib/python3.7/site-packages;
CMD

    interpreter = ["bash", "-c"]
  }
}
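
A hypothetical consumer of that zip, skipping archive_file entirely, could look like the sketch below (the layer name and runtime are placeholders, and source_code_hash is reused purely as a change trigger):

resource "aws_lambda_layer_version" "libs" {
  layer_name          = "verify-oauth-libs"                   # placeholder name
  filename            = "files/verify_oauth_lambda_layer.zip" # zip produced above
  compatible_runtimes = ["python3.7"]

  # Reused purely as a change trigger so the layer is replaced whenever the
  # requirements file (and therefore the null_resource) changes; it is not
  # the canonical base64 SHA-256 of the zip.
  source_code_hash = null_resource.libs_layer.triggers["policy_sha1"]

  depends_on = [null_resource.libs_layer]
}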

@nweatomv

@jbardin So other than the workaround mentioned by @antondemidov, how is one supposed to migrate from 0.12 to 0.14? The 0.14 migration docs specifically say that you need to do at least one apply with 0.13 before going to 0.14, so this seems like a blocker for anyone migrating from 0.12 who cannot migrate to 0.13. I tried the null_data_source workaround mentioned by @aaronsteers and it did not help.

@aaronsteers

aaronsteers commented Dec 27, 2020

@antondemidov - RE:

btw, I've found some workaround. Just to use null_resource for zipping file. Not very terraform way, but it's working :)

I was actually going in this direction as well when you posted this. In terms of the right 'Terraform way', I think the solution has merit.

New platform-agnostic solution for using native zip on any OS.

I followed the same general approach as @antondemidov - except that I needed it to work on Windows as well as Linux-based systems. (Caveat: Even though I've now tested this on both Ubuntu and Windows 10, I don't know that I have the icacls statements exactly correct yet.)

  1. Calculate local hashes of the source files, a hash of those hashes, and an abbreviation of the hash of hashes:
locals {
  source_files_hash = join(",", [
    for filepath in local.source_files :
    filebase64sha256("${var.local_metadata_path}/${filepath}")
  ])
  unique_hash   = md5(local.source_files_hash)
  unique_suffix = substr(local.unique_hash, 0, 4)
}
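
(local.source_files is not defined in the snippets here; a hypothetical definition, assuming every file under var.local_metadata_path should contribute to the hash, could be:)

locals {
  # Hypothetical: enumerate every file under the metadata path so each one
  # contributes to source_files_hash above.
  source_files = fileset(var.local_metadata_path, "**")
}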
  2. Declare some needed paths and detect the OS:
locals {
  is_windows              = substr(pathexpand("~"), 0, 1) == "/" ? false : true
  temp_artifacts_root     = "${path.root}/.terraform/tmp"
  temp_build_folder       = "${local.temp_artifacts_root}/${var.name_prefix}lambda-zip-${local.unique_suffix}"
  local_requirements_file = fileexists("${var.lambda_source_folder}/requirements.txt") ? "${var.lambda_source_folder}/requirements.txt" : null
}
  3. All in one null_resource step:
    1. Copy source data to a temp folder (using a new tmp subfolder under the local .terraform directory).
    2. Optionally pip-install dependencies into the folder if requirements.txt exists.
    3. Make sure the execute permissions are set on files before zipping.
    4. Compress the contents.
resource "null_resource" "create_dependency_zip" {
  triggers = {
    version_increment = 1.1 # can be bumped to manually force a refresh
    source_files_hash = local.source_files_hash
  }

  provisioner "local-exec" {
    interpreter = local.is_windows ? ["Powershell", "-Command"] : ["/bin/bash", "-c"]
    command = join(local.is_windows ? "; " : " && ", flatten(
      # local.local_requirements_file == null ? [] :
      local.is_windows ?
      [
        [
          "echo \"Creating target directory '${abspath(local.temp_build_folder)}'...\"",
          "New-Item -ItemType Directory -Force -Path ${abspath(local.temp_build_folder)}",
          "echo \"Copying directory contents from '${abspath(var.lambda_source_folder)}/' to '${abspath(local.temp_build_folder)}/'...\"",
          "Copy-Item -Force -Recurse -Path \"${abspath(var.lambda_source_folder)}/*\" -Destination \"${abspath(local.temp_build_folder)}/\"",
          "echo \"Granting execute permissions on temp folder '${local.temp_build_folder}'\"",
          "icacls ${local.temp_build_folder} /grant Everyone:F",
          "icacls ${local.temp_build_folder}/* /grant Everyone:F",
        ],
        local.local_requirements_file == null ? [] : !fileexists(local.local_requirements_file) ? [] :
        [
          "echo \"Running pip install from requirements '${abspath(local.local_requirements_file)}'...\"",
          "${local.pip_path} install --upgrade -r ${abspath(local.local_requirements_file)} --target ${local.temp_build_folder}",
        ],
        [
          "sleep 3",
          "echo \"Changing working directory to temp folder '${abspath(local.temp_build_folder)}'...\"",
          "cd ${abspath(local.temp_build_folder)}",
          "echo \"Zipping contents of ${abspath(local.temp_build_folder)} to '${abspath(local.local_dependencies_zip_path)}'...\"",
          "ls",
          "tar -acf ${abspath(local.local_dependencies_zip_path)} *",
        ]
      ] :
      [
        [
          "echo \"Creating target directory '${abspath(local.temp_build_folder)}'...\"",
          "set -e",
          "mkdir -p ${local.temp_build_folder}",
          "echo \"Copying directory contents from '${abspath(var.lambda_source_folder)}/' to '${abspath(local.temp_build_folder)}/'...\"",
          "cp ${var.lambda_source_folder}/* ${local.temp_build_folder}/",
        ],
        local.local_requirements_file == null ? [] : !fileexists(local.local_requirements_file) ? [] :
        [
          "echo \"Running pip install from requirements '${abspath(local.local_requirements_file)}'...\"",
          "${local.pip_path} install --upgrade -r ${local.temp_build_folder}/requirements.txt --target ${local.temp_build_folder}",
        ],
        [
          "sleep 3",
          "echo \"Granting execute permissions on temp folder '${local.temp_build_folder}'\"",
          "chmod -R 755 ${local.temp_build_folder}",
          "cd ${abspath(local.temp_build_folder)}",
          "echo \"Zipping contents of '${abspath(local.temp_build_folder)}' to '${abspath(local.local_dependencies_zip_path)}'...\"",
          "zip -r ${abspath(local.local_dependencies_zip_path)} *",
        ]
      ]
    ))
  }
}
  4. Force a new S3 upload by embedding the unique_suffix (derived from the hash of all source files, excluding pip artifacts) as part of the S3 key name.
    • Note:
      • The normal way this resource detects changes is the etag property, which we cannot use for our purposes because the etag (that is, the zip file's hash) cannot be known at plan time.
      • The S3 key change also triggers an update to the Lambda function, which doesn't seem to happen if the S3 file is updated in place.
resource "aws_s3_bucket_object" "s3_source_uploads" {
  bucket = var.dest_bucket
  key = "${var.dest_folder}/lambda-dependencies-${local.unique_suffix}.zip"
    ]
  )
  source   = "${local.local_dependencies_zip_path}"
}
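
For completeness, a hypothetical Lambda function wired to that object might look like the following (the function name, role, handler, and runtime are placeholder assumptions, not part of the module above):

resource "aws_lambda_function" "this" {
  function_name = "${var.name_prefix}function" # placeholder name
  role          = var.lambda_role_arn          # assumed input variable
  handler       = "main.handler"               # placeholder
  runtime       = "python3.7"

  # Because the key embeds local.unique_suffix, any source change yields a
  # new key, which in turn forces an update of the function.
  s3_bucket = aws_s3_bucket_object.s3_source_uploads.bucket
  s3_key    = aws_s3_bucket_object.s3_source_uploads.key
}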

Quick note regarding native zip capabilities on Windows and Linux-based systems.

  • Apparently Windows has been shipping tar.exe natively in the OS for a while, and it's also able to create zips (as demonstrated in the snippet above).
  • Ironically, Linux is the one that needs an additional prerequisite in this scenario, which is simply apt-get install zip. (Runs in <4 seconds in my CI/CD environment.)
  • Windows also has the PowerShell Compress-Archive cmdlet, but this cannot be used in our scenario because it mangles the file attributes (aka file permissions) that are needed on Linux-based systems. Without proper execute permissions on the zipped-up Python files, they cannot be run in AWS Lambda. (Lambda will throw an error.)

Problems with using data.archive_file in combination with null_resource and/or null_data_source:

In retrospect, I think the archive data source needs additional enhancement before it's ready to support these use cases natively. For instance, if that provider were to take the equivalent of a pre-run local provisioner, we might be close to accomplishing the use case without extensive hackery. But in the absence of native support for pre-zip steps, I think the above solution is the best option because it has a short dependency tree and all side effects live in a single null_resource, paired with a rename at the S3-key level to force a simultaneous update of both if either is modified.

In Summary:

I don't know if others have gotten the older (pre-tf-0.13) solution working again, but I wanted to post here that the above method is working for me, and after a successful apply it also produces a clean plan in CI/CD environments where no local files are cached between executions.

@joeltio

joeltio commented Apr 6, 2021

Still have the problem with terraform 0.14

I don't seem to have the problem in 0.14. I'm using v0.14.9 with this smaller test:

# main.tf
resource "null_resource" "zip" {
  triggers = {
    # Not the best to use here but works well enough
    create_file = fileexists("${path.module}/folder_to_zip/new_file")
  }
  provisioner "local-exec" {
    working_dir = "folder_to_zip"
    command = "touch new_file"
  }
}

data "archive_file" "zip" {
  depends_on = [null_resource.zip]
  type        = "zip"
  source_dir  = "folder_to_zip"
  output_path = "output.zip"
}

Commands I ran:

mkdir folder_to_zip
terraform init
terraform apply

The output.zip contained new_file. Stopping the terraform apply also produced no zip file, and the command was not run.

Unless maybe I misunderstood the problem here.

@johnbarney

(Replying to @joeltio's v0.14.9 test above.)

I think this simply highlights a race condition. Touching a file is incredibly fast; running pip install isn't. Not to mention your directory already exists before you start, whereas some of us are trying to copy source files into a new folder, build dependencies, and then zip. local-exec and archive_file run in parallel on different threads, and because the touch finishes so quickly you did not hit the problem. You can test this by replacing command = "touch new_file" with command = "sleep 20; touch new_file".
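
For clarity, the modified test would look like this (only the command differs from @joeltio's version; the sleep is just there to make the provisioner slow enough to lose the race):

resource "null_resource" "zip" {
  triggers = {
    create_file = fileexists("${path.module}/folder_to_zip/new_file")
  }
  provisioner "local-exec" {
    working_dir = "folder_to_zip"
    # Slow the provisioner down so archive_file can read the directory
    # before new_file has been created.
    command = "sleep 20; touch new_file"
  }
}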

Seems sad, but the only way I see to solve this issue is to do the zipping inside local-exec, which makes cross-platform support quite painful. Overall it seems to me to be a problem with how archive_file is implemented. A resource is an object you wish Terraform to create and manage; a data source is an object that already exists and that you want Terraform to refer to. In my mind it's obvious that if we are creating an archive file, that concept should be a resource to Terraform. I'm sure there are good deeper-level reasons why the team decided to change this provider in the way they did. These guys are smart. But it seems to violate the design principles of Terraform itself.
