Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot associate a public IP to a new isolated network (4.12.0.0) #3321

Closed
svanharmelen opened this issue May 9, 2019 · 32 comments · Fixed by #3489
Closed

Cannot associate a public IP to a new isolated network (4.12.0.0) #3321

svanharmelen opened this issue May 9, 2019 · 32 comments · Fixed by #3489

Comments

@svanharmelen
Copy link

ISSUE TYPE
  • Bug Report
COMPONENT NAME
Networking
CLOUDSTACK VERSION
4.12.0.0
CONFIGURATION

I'm running my tests against the simulator container which I configured with an advanced datacenter setup:
python /root/tools/marvin/marvin/deployDataCenter.py -i /root/setup/dev/advanced.cfg

OS / ENVIRONMENT

N/A

SUMMARY

When creating an isolated network and then try to assign a public IP to the network, I get the following error:

{"errorcode":431,"errortext":"Network with UUID:a45c36c1-c9c4-458e-838d-1c7c7eea80a6 is in allocated and needs to be implemented first before acquiring an IP address"}

This worked fine with the 4.11.2.0 simulator container, but now fails with 4.12.0.0.

STEPS TO REPRODUCE

Create a new isolated (DefaultIsolatedNetworkOfferingWithSourceNatService) network using the createNetwork API call. Then try to assign a public IP using the associateIpAddress API call.

EXPECTED RESULTS

Don't get any errors and get a network with a public IP that will be it's source NAT IP.

ACTUAL RESULTS

A network without a public IP and this error:

{"errorcode":431,"errortext":"Network with UUID:a45c36c1-c9c4-458e-838d-1c7c7eea80a6 is in allocated and needs to be implemented first before acquiring an IP address"}
@svanharmelen
Copy link
Author

@DaanHoogland as discussed previously, here is an issue to track this problem.

@DaanHoogland
Copy link
Contributor

tnx @svanharmelen I will try to reproduce on master to have a look see
@GabrielBrascher @wido I know there has been some networking stuff going into 4.12. Would that explain this?

@DaanHoogland
Copy link
Contributor

@svanharmelen can you add the API you use here (if you don't use the UI), please?

@DaanHoogland
Copy link
Contributor

i assumed associate ipaddress networkid=5edca120-c785-441c-88c3-bad2565f5ebb and get the same result indeed.

@svanharmelen
Copy link
Author

Cool, yeah that is indeed the call that I'm trying to make...

@DaanHoogland
Copy link
Contributor

@svanharmelen
Copy link
Author

Yep, that seems to be it. Good find 👍

So now the question is if this is actually the desired behavior, or that it was an oversight that this will make it harder (if not impossible) to configure a network properly before using it now?

@DaanHoogland
Copy link
Contributor

DaanHoogland commented May 10, 2019

#3125 is a fix which has this (for you undesired) result. Not sure which way to go with this, yet.
Good point above. it would seem that a more explicit forSourceNat flag would be in order....?

@rohityadavcloud rohityadavcloud added this to the 4.13.0.0 milestone May 10, 2019
@rohityadavcloud
Copy link
Member

This is a regression, thanks for reporting @svanharmelen we'll attempt to fix it soon for the next major/minor release but that does mean we cannot do anything about 4.12.0.0.

@svanharmelen
Copy link
Author

svanharmelen commented May 10, 2019

Ok... And is there a chance we can get a 4.12.1.0 or something? As this means both Terraform and Packer will not be able to play nice with 4.12.x.x (hashicorp/packer#7557).

I can fix that one specific issue, but updating to fully support 4.12.x.x will then not be possible.

@DaanHoogland
Copy link
Contributor

@svanharmelen i had a look at several solution paths and came across the old persistent flag. Is it feasible for you to use a network offering with the persistent flag on? It will be implemented even without instances.
I realise this may come down to terraform's requirements, but if it is only a test-bed requirement you have, this may work around the issue.

@svanharmelen
Copy link
Author

@DaanHoogland I will try to test with this today and let you know how it worked out.

@DaanHoogland
Copy link
Contributor

ok, if this workaround doesn't work, we'll have to tag ipadresses with their intended usage somehow in order to know which are for static and which for (one) for source NATting.

@DaanHoogland
Copy link
Contributor

@svanharmelen did you try that workaround?

@svanharmelen
Copy link
Author

Yes I did! And sorry for the late reply... But t doesn't seem to make a difference. I can do some more tests early next week (I hope).

@DaanHoogland
Copy link
Contributor

np, I hope we get to a solution that serves both use cases.

@svanharmelen
Copy link
Author

@DaanHoogland I tested it again and I think I didn't test it correctly last time, as now it worked as expected when I updated the network offering to be persistent.

So the good thing is that I could now test all the changes and make sure the changes work. The bad thing is that if I release an updated provider using this code, people using 4.12.0.0 will still hit this error.

As workaround I can tell them to use a persistent network offering, but a much better solution would be a bug fix release to mitigate this issue. Could that be an option?

@DaanHoogland
Copy link
Contributor

@svanharmelen, /me discussing a solution. will keep you informed.

@DaanHoogland
Copy link
Contributor

@svanharmelen I had a discussion about the problem. there are two possible fixes but the underlying problem will remain in both cases.

  1. start the network with a persistent offering and make sure you wait till it started before configuring further. as this is an automated test in a simulator there might be issues with timing which also might have caused the test to pass earlier while no real network traffic was possible, before (in theory, not confirmed!).
  2. implement a global configuration item that disables the check for 'allocated' vs. 'implemented' state, which will allow the old behaviour. This will fix your automarted test in simulator but most probably won't work in real hypervisors.

Can you test solution 1? I think it is more viable in production environments than solution 2.

of course there is a third solution, refactoring the way IpAddressManager works, but this will require the most time. I cannot even start planning that.

/cc @rhtyd @PaulAngus

@svanharmelen
Copy link
Author

@DaanHoogland as already mentioned, option 1 works and allows me to test all the logic using the simulator. So that is good news!

But my remaining issue (worry) is that when people are using the updated code against a real environment they will encounter the same issue. And then asking everyone to update their network offerings seems like as poor solution for them.

@DaanHoogland
Copy link
Contributor

DaanHoogland commented Jun 5, 2019

Yes, @svanharmelen.
I agree a real live environment behaves different, but the premise is that adding a static NAT address to an unimplemented network would not work anyway as the IP would be used for source-NAT. Hence the solution in #3125 containing the line https://github.com/apache/cloudstack/blob/master/server/src/main/java/com/cloud/network/IpAddressManagerImpl.java#L1433 . It may be that this fix would have only be needed on some hypervisors/network configurations in which case it is only half a fix.
Updating offerings is a poor solution as you say, unless we do it in an automated way.

@ustcweizhou
Copy link
Contributor

@DaanHoogland
if we add a new parameter issourcenat in AssociateIPAddrCmd to indicate whether the ip is used for source nat (default value is false), will the issue be solved ?

@DaanHoogland
Copy link
Contributor

DaanHoogland commented Jun 5, 2019

@ustcweizhou , I am not sure.
Would it make sense to ever set it to true? usually the source nat address is the one automatically added. Not sure if the use-case where the next one is specifically for source nat as well is really useful. I have been thinking about this kind of solution as well. Of course we then need to handle the value as well; This is somewhat I meant with "refactoring the way IpAddressManager works".
Another solution I was thinking of is to check whether there already is an address on the interface and first add an extra one if there is none yet.

@svanharmelen
Copy link
Author

Another solution I was thinking of is to check whether there already is an address on the interface and first add an extra one if there is none yet.

That could also work. In the Terraform network resource I added a source_nat_ip option that adds the NAT IP right after creating the network. This abstracts the problem for our users a little, but now fails with 4.12.x.x

@nlindblo
Copy link

@svanharmelen am I right in assuming that this will be fixed at next Cloudstack release 4.13 if none of the above options will move forward ?

@svanharmelen
Copy link
Author

@nlindblo please define this will be fixed? Not sure what you mean with this here 😉

@nlindblo
Copy link

@svanharmelen Sorry for being unclear, basically I mean if Packer & Terraform will work again after the 4.13 Cloudstack release or if I need to consider rolling back to 4.11 to continue using the Hashicorp tools

@svanharmelen
Copy link
Author

svanharmelen commented Jun 12, 2019

@nlindblo no problem! I think Packer should work as expected (it already updated to support 4.12. Terraform should work as well, except for creating certain network types.

If you encounter any specific errors that seems other than mentioned here, please open an issue with the related tool. Thanks!

@nlindblo
Copy link

@svanharmelen I'm still getting "cannot unmarshal string into Go struct field DeployVirtualMachineResponse.ostypeid of type int64" when running packer 1.4.1 as per hashicorp/packer#7557

@svanharmelen
Copy link
Author

If you look at the CHANGELOG you'll see that the CloudStack fix is not part of 1.4.1 but is merged into master and will be part of the 1.4.2 release.

@rohityadavcloud
Copy link
Member

Actual issue reference: https://issues.apache.org/jira/browse/CLOUDSTACK-4045 cc @shwstppr

@rohityadavcloud
Copy link
Member

  • Still allow IPs to be associated for network in allocated state, but allow to disassociate such IPs (when network is allocated)
  • Don't throw exeception

rohityadavcloud pushed a commit that referenced this issue Jul 14, 2019
Fixes #3321

This changes removes exception throwing while associating an IP address to a new isolated network which is in Allocated state. And it allows disassociating an IP address when it is used for source NAT purpose but network is in allocated state.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants