
Error when instance changed that has EBS volume attached #2957

@bloopletech

Description

This is the specific error I get from Terraform:

aws_volume_attachment.admin_rundeck: Destroying...
aws_volume_attachment.admin_rundeck: Error: 1 error(s) occurred:

* Error waiting for Volume (<vol id>) to detach from Instance: <instance id>
Error applying plan:

3 error(s) occurred:

* Error waiting for Volume (<vol id>) to detach from Instance: <instance id>
* aws_instance.admin_rundeck: diffs didn't match during apply. This is a bug with Terraform and should be reported.
* aws_volume_attachment.admin_rundeck: diffs didn't match during apply. This is a bug with Terraform and should be reported.

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

We are building out some infrastructure in EC2 using Terraform (v0.6.0). I'm currently working out our persistent storage setup. The strategy I'm planning is to have the root volume of every instance be ephemeral, and to move all persistent data to a separate EBS volume (one persistent volume per instance). We want this to be as automated as possible, of course.

Here is a relevant excerpt from our terraform config:

resource "aws_instance" "admin_rundeck" {
  ami = "${var.aws_ami_rundeck}"
  instance_type = "${var.aws_instance_type}"
  subnet_id = "${aws_subnet.admin_private.id}"
  vpc_security_group_ids = ["${aws_security_group.base.id}", "${aws_security_group.admin_rundeck.id}"]
  key_name = "Administration"

  root_block_device {
    delete_on_termination = false
  }

  tags {
    Name = "admin-rundeck-01"
    Role = "rundeck"
    Application = "rundeck"
    Project = "Administration"
  }
}

resource "aws_ebs_volume" "admin_rundeck" {
  size = 500
  availability_zone = "${var.default_aws_az}"
  snapshot_id = "snap-66fc2258"
  tags = {
    Name = "Rundeck Data Volume"
  }
}

resource "aws_volume_attachment" "admin_rundeck" {
  device_name = "/dev/xvdf"
  instance_id = "${aws_instance.admin_rundeck.id}"
  volume_id = "${aws_ebs_volume.admin_rundeck.id}"

  depends_on = "aws_route53_record.admin_rundeck"

  connection {
    host = "admin-rundeck-01.<domain name>"
    bastion_host = "${aws_instance.admin_jumpbox.public_ip}"
    timeout = "1m"
    key_file = "~/.ssh/admin.pem"
    user = "ubuntu"
  }

  provisioner "remote-exec" {
    script = "mount.sh"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo mkdir -m 2775 /data/rundeck",
      "sudo mkdir /data/rundeck/data /data/rundeck/projects && sudo chown -R rundeck:rundeck /data/rundeck",
      "sudo service rundeckd restart"
    ]
  }
}

And mount.sh:

#!/bin/bash

# Wait for the attached EBS volume's device node to appear
while [ ! -e /dev/xvdf ]; do sleep 1; done

# Append the fstab entry only if it isn't already present
fstab_string='/dev/xvdf /data ext4 defaults,nofail,nobootwait 0 2'
if ! grep -q -F "$fstab_string" /etc/fstab; then
  echo "$fstab_string" | sudo tee -a /etc/fstab
fi

sudo mkdir -p /data && sudo mount -t ext4 /dev/xvdf /data

As you can see, this:

  • Provisions an instance to run Rundeck (http://rundeck.org/)
  • Provisions an EBS volume based off of a snapshot. The snapshot in this case is just an empty ext4 partition.
  • Attaches the volume to the instance
  • Mounts the volume inside the instance, and then creates some directories to store data in (see the quick check after this list)
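
For reference, a quick way to confirm the result on the instance after the first apply (just standard commands run over SSH, assuming the volume shows up as /dev/xvdf as above):

# Confirm the device exists, is mounted on /data, and has an fstab entry
lsblk /dev/xvdf
df -h /data
grep -F '/dev/xvdf' /etc/fstab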

This works fine the first time it's run. But any time we:

  • make a change to the instance configuration (e.g. change the value of var.aws_ami_rundeck), or
  • make a change to the provisioner config of the volume attachment resource

Terraform then tries to detach the extant volume from the instance, and this step fails every time. I believe this is because you are meant to unmount the EBS volume from inside the instance before detaching it. The problem is, I can't work out how to get Terraform to unmount the volume inside the instance before it tries to detach the volume.

It's almost like I need a provisioner to run before the resource is created, or a provisioner to run on destroy (obviously #386 comes to mind).
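
Something along these lines is what I'm imagining (a sketch only; destroy-time provisioners don't exist in 0.6, though later Terraform versions added a when = "destroy" option on provisioners, which would presumably run before the attachment is torn down, assuming the instance is still reachable over SSH at that point):

resource "aws_volume_attachment" "admin_rundeck" {
  # ... as above ...

  # Hypothetical: stop the service and unmount the filesystem
  # before Terraform detaches the volume
  provisioner "remote-exec" {
    when = "destroy"
    inline = [
      "sudo service rundeckd stop",
      "sudo umount /data"
    ]
  }
}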

This feels like it would be a common problem for anyone working with persistent EBS volumes using Terraform, but my googling hasn't really turned up anyone else having this problem.

Am I simply doing it wrong? I'm not worried about how I get there specifically; I'd just like to be able to provision persistent EBS volumes, and then attach and detach them from my instances in an automated fashion.
