aws_eip — automatically disassociate eip when changing instance #3429

Closed
little-arhat opened this issue Oct 7, 2015 · 20 comments

@little-arhat

If you create an aws_eip in Terraform attached to an instance and then try to change the instance argument of that aws_eip, you get the following error:

* aws_eip.serv: Failure associating EIP: Resource.AlreadyAssociated: resource eipalloc-* is already associated with associate-id eipassoc-*
    status code: 400, request id: []

Shouldn't Terraform automatically disassociate the EIP in order to apply the change?

@phinze
Contributor

phinze commented Oct 12, 2015

Thanks for the report, @little-arhat. This looks like just an oversight to me - tagged as a bug and we'll get it looked at!

@george-makerbot

Got hit by this today while trying to resize NAT instances that had the lifecycle policy create_before_destroy = true. Terraform created the new instances correctly but failed to reassociate the EIPs.

Luckily, this was discovered during testing; otherwise a 5-minute maintenance could have become a 30-minute outage.

This bug makes it impossible to quickly resize instances that have EIPs associated with them.

@moneill

moneill commented Jan 31, 2016

Just encountered the same issue. It would be great if Terraform could disassociate the EIP from an EC2 instance with create_before_destroy = true in advance of creating the new instance.

As it stands, if you have an EIP linked to an EC2 instance with create_before_destroy = true and make a change that would force a new EC2 resource, you end up with (1) a new instance being created, (2) the error @little-arhat noted, and (3) the old instance hanging around with the EIP attached to it.

@jn9999

jn9999 commented Feb 1, 2016

What would be a good workaround when encountering this error? The EIP must not change.

@pdecat
Contributor

pdecat commented May 12, 2016

I just encountered the same issue with 0.6.16 and an instance with create_before_destroy = true.

My quick workaround was to detach the EIP by hand, then rerun the apply.

Then, I also had to destroy the old EC2 instance by hand.

@kalbasit

I encountered this with v0.7.0. @pdecat's workaround worked. @phinze, any news regarding this?

@cmusser

cmusser commented Dec 15, 2016

Still happens with 0.8.1. My specific use case is changing the user_data of an instance, which causes that instance to be destroyed and recreated. The plan phase indicates that it wants to change the EIP, terminate and recreate the instance, and change an associated Route 53 record. Same result as reported above.

@jmvbxx

jmvbxx commented Feb 2, 2017

This happened to me today on the first terraform apply against a new infrastructure, using version 0.8.5. Is this going to be addressed in a future release? What's the proper workaround that doesn't require me to disassociate IPs manually?

@billmoritz

I am getting this error as well on 0.7.9. We are using create_before_destroy and this issue broke instance replacement. Is there a workaround?

Right now we have the old instances alive with the EIP attached and the new instances sitting in place. Can I manually move the EIP to the new instance? Will TF terminate the old instance after another plan/apply?

@catsby
Contributor

catsby commented Mar 23, 2017

Hey everyone –

I've tried reproducing this on Terraform v0.7.9 and the current version in the master branch but with no luck. Here's the config I'm using:

provider "aws" {
  region = "us-west-2"
}

variable "server_count" {
  default = 4
}

resource "aws_eip" "ip" {
  count    = "${var.server_count}"
  instance = "${element(aws_instance.example.*.id, count.index)}"
  vpc      = true
}

resource "aws_instance" "example" {
  count                       = "${var.server_count}"
  ami                         = "ami-dfc39aef"
  instance_type               = "t2.nano"
  associate_public_ip_address = true

  tags {
    Name = "tf-issue-3429-repro"
  }

  lifecycle {
    create_before_destroy = true
  }
}

(gist here https://gist.github.com/catsby/0e278432486dfc572134d3ac71ca1758)

I've tried tainting random instances and even all of them, but no luck triggering the error. Can anyone take this example and modify it to demonstrate the error? Thanks!

catsby added the waiting-response label Mar 23, 2017

@billmoritz

@catsby I was able to reproduce with the following code. We are pointing a DNS record at the instance private IP.

provider "aws" {
  region = "us-west-2"
}

variable "server_count" {
  default = 4
}

resource "aws_eip" "ip" {
  count    = "${var.server_count}"
  instance = "${element(aws_instance.example.*.id, count.index)}"
  vpc      = true
}

resource "aws_instance" "example" {
  count                       = "${var.server_count}"
  ami                         = "ami-dfc39aef"
  instance_type               = "t2.nano"
  associate_public_ip_address = true
  subnet_id                   = "${aws_subnet.us-west-2b-public.id}"
  availability_zone           = "${aws_subnet.us-west-2b-public.availability_zone}"

  tags {
    Name = "tf-issue-3429-repro"
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_route53_record" "ec2_instance_fqdn" {
  count   = "${var.server_count}"
  zone_id = "${var.hosted_zone_id}"
  name    = "${format("%v-%02d.%v", var.role, count.index+1, var.domain)}"
  type    = "CNAME"
  ttl     = "60"
  records = ["${element(aws_instance.example.*.private_dns, count.index)}"]
}

variable "role" {
  default = "example"
}

variable "domain" {
  default = "example.com"
}

variable "hosted_zone_id" {
  default = "XXXXXXXXXXXXX"
}

resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_internet_gateway" "example" {
  vpc_id = "${aws_vpc.example.id}"
}

resource "aws_subnet" "us-west-2b-public" {
  vpc_id            = "${aws_vpc.example.id}"
  cidr_block        = "10.0.0.0/24"
  availability_zone = "us-west-2b"
}

resource "aws_route_table" "us-west-2-public" {
  vpc_id = "${aws_vpc.example.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.example.id}"
  }
}

resource "aws_route_table_association" "us-west-2b-public" {
  subnet_id      = "${aws_subnet.us-west-2b-public.id}"
  route_table_id = "${aws_route_table.us-west-2-public.id}"
}

https://gist.github.com/billmoritz/b843faf1e413ba74e231e1eab9d5df85

@billmoritz

Added the log output to the gist as well https://gist.github.com/billmoritz/b843faf1e413ba74e231e1eab9d5df85#file-output-log

catsby removed the waiting-response label Apr 13, 2017

@catsby
Contributor

catsby commented May 10, 2017

Hey @billmoritz, can you confirm whether this is still happening for you on more recent versions of Terraform? Are you still on v0.7.9? I'm still stuck here: I see the error clearly in your log, but several of us have failed to reproduce it.

@billmoritz

This was failing on v0.8.8. I recently moved to 0.9.5 and verified this was still failing.

I did, however, find a workaround: removing the instance parameter from the aws_eip resource and creating an aws_eip_association resource to tie the EIP and the instance together.

+resource "aws_eip_association" "eip_assoc" {
+  count         = "${var.server_count}"
+  instance_id   = "${element(aws_instance.example.*.id, count.index)}"
+  allocation_id = "${element(aws_eip.ip.*.id, count.index)}"
+}

 resource "aws_eip" "ip" {
   count    = "${var.server_count}"
-  instance = "${element(aws_instance.example.*.id, count.index)}"
   vpc      = true
}

After that, the aws_eip_association was marked for recreation:

-/+ module.stargate.aws_eip_association.eip_assoc.0
    allocation_id:        "eipalloc-556d3d64" => "eipalloc-556d3d64"
    instance_id:          "i-03ba93821e80b6f1a" => "${element(aws_instance.ec2_instance.*.id, count.index)}" (forces new resource)
    network_interface_id: "eni-b3215c68" => "<computed>"
    private_ip_address:   "10.1.12.79" => "<computed>"
    public_ip:            "redacted" => "<computed>"

-/+ module.stargate.aws_eip_association.eip_assoc.1
    allocation_id:        "eipalloc-816c3cb0" => "eipalloc-816c3cb0"
    instance_id:          "i-0d3a713fcb4e780df" => "${element(aws_instance.ec2_instance.*.id, count.index)}" (forces new resource)
    network_interface_id: "eni-32c975fb" => "<computed>"
    private_ip_address:   "10.1.15.84" => "<computed>"
    public_ip:            "redacted" => "<computed>"
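
Put together, the diff in this comment would leave the two resources looking roughly like the sketch below. It uses the same interpolation syntax as the rest of this thread and assumes the aws_instance.example and var.server_count definitions from the repro config above; it is assembled from the diff and has not been verified beyond what is reported here.

```hcl
# The EIP is allocated on its own and no longer references an instance.
resource "aws_eip" "ip" {
  count = "${var.server_count}"
  vpc   = true
}

# The link between EIP and instance lives in a separate resource.
# Per the plan output above, replacing an instance forces a new
# association while the allocation_id stays the same.
resource "aws_eip_association" "eip_assoc" {
  count         = "${var.server_count}"
  instance_id   = "${element(aws_instance.example.*.id, count.index)}"
  allocation_id = "${element(aws_eip.ip.*.id, count.index)}"
}
```

With this layout only the association is recreated when an instance is replaced, which matches the -/+ entries in the plan output above.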

@catsby
Contributor

catsby commented Jun 2, 2017

Hey @billmoritz, can you give me an indication of how reproducible this is for you? I'm using Terraform v0.9.5. I've tainted each instance, individually and all at once, and I've adjusted the server count, and I still can't reproduce this.

At this point I'm not sure what I'm missing. Are there any other steps that I'm not doing? If you can still reproduce this, please try to capture the full debug output by running the apply command with the TF_LOG=1 environment variable set. Ex:

$ TF_LOG=1 terraform apply <...>

Be sure to scrub that output of any secrets before sharing of course. If you need to share privately we can arrange that via another channel.

Thanks!

@catsby
Contributor

catsby commented Jun 2, 2017

Also can you let us know what awsu exec is/does?

@billmoritz

awsu exec is a shim for adding AWS assumed-role credentials as environment variables.

@catsby
Contributor

catsby commented Jun 15, 2017

Hey all – this issue has been moved to hashicorp/terraform-provider-aws#42, and I've opened a PR to fix it here: hashicorp/terraform-provider-aws#878

@catsby
Contributor

catsby commented Jun 16, 2017

I just merged hashicorp/terraform-provider-aws#878 which should address this

@ghost

ghost commented Apr 8, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

ghost locked and limited conversation to collaborators Apr 8, 2020