Redeploy and monitor NAT forwarders #1

Open
choksi81 opened this issue May 29, 2014 · 9 comments

@choksi81

There are no NAT forwarders left announcing. These must be redeployed ASAP -- we are holding up a bunch of people who are relying on our systems to work!

We also need to monitor the forwarders' availability and be notified of problems. They have become critical infrastructure.

@choksi81

choksi81 commented Jun 2, 2014

The NAT Forwarders have been redeployed and are advertising under these two keys: 'NAT_AFFIX_FORWARDER' and 'NAT_AFFIX_FORWARDER--post-7209'.
There was an issue with the forwarders' advertising, which has been addressed in r7272.
We still need some sort of monitoring system.

@choksi81

choksi81 commented Jun 2, 2014

Thanks! I see 10 forwarders for each key. I can write a very simple monitoring test to make sure that there are at least 10 visible on the advertise service. I'll get this up first.
As for testing the actual NAT traversal, we can first set up a nodemanager to run on a machine behind NAT. We'll then try to contact that node from blackbox.
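
The availability check proposed above (at least 10 forwarders visible per key) could be sketched roughly as follows. This is only an illustration: `lookup` is a hypothetical callable standing in for whatever advertise lookup the monitoring script ends up using, and the threshold of 10 is taken from the count observed in this comment.

```python
# Keys the forwarders advertise under, as reported earlier in this thread.
FORWARDER_KEYS = ['NAT_AFFIX_FORWARDER', 'NAT_AFFIX_FORWARDER--post-7209']

# Assumption: 10 is the number of forwarders currently seen per key.
MIN_FORWARDERS = 10


def find_underpopulated_keys(lookup, keys=FORWARDER_KEYS, minimum=MIN_FORWARDERS):
    """Return {key: count} for each key with fewer than `minimum` entries.

    `lookup` is a hypothetical callable: it takes an advertise key and
    returns an iterable of advertised values (for example address strings).
    """
    problems = {}
    for key in keys:
        # Drop empty values and duplicates before counting.
        entries = set(value for value in lookup(key) if value)
        if len(entries) < minimum:
            problems[key] = len(entries)
    return problems
```

An empty result means every key met the threshold; a non-empty result is what a monitoring script would report.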

@choksi81

choksi81 commented Jun 2, 2014

I have attached a patch that modifies the existing NAT test we have in trunk: seattle/trunk/integrationtests/nat/test_nat_servers_running.py. The patch is attached as test_nat_servers_running.py.patch.
The patch does the following:
- Updates the obsolete natlayer_rpc forwarding to use Affixes.
- Disables the old method of checking for server responsiveness. This remains a TODO for the time being.

@choksi81

choksi81 commented Jun 2, 2014

I have reviewed the patch and it seems fine to me.

@choksi81

choksi81 commented Jun 2, 2014

I have slightly modified the patch reviewed by Monzur to match the new format of nat_forwarder_common_lib.NAT_FORWARDER_KEY (see r7274). This is committed in r7275.
Now what remains is to add verification that the forwarders are properly functioning.

@choksi81

choksi81 commented Jun 2, 2014

What was the specific issue with r7253? I don't see an issue discussed, apart from what's on the patch description. (BTW, the patch in r7272 essentially removes randomization: shuffle permutes the list in-place, sample returns a new list).
Also, when was consensus reached that the NAT forwarders should advertise multiple keys (with the side effect of requiring band-aids like r7288)? Why couldn't we run two forwarder instances listening on different ports and advertising different keys?
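
The parenthetical about shuffle and sample can be illustrated with plain Python. This is a sketch of the general pitfall, not a reproduction of the code in r7272; the function names are made up for the example.

```python
import random


def swap_discarding_result(items):
    """Illustrates the pitfall: sample() builds a new list, so discarding
    its return value leaves `items` in its original order."""
    random.sample(items, len(items))  # result thrown away
    return items


def swap_using_result(items):
    """sample() with k == len(items) returns a randomly ordered copy."""
    return random.sample(items, len(items))


def shuffle_in_place(items):
    """shuffle() permutes the list itself and returns None."""
    random.shuffle(items)
    return items
```

Code that swaps one call for the other needs to use the returned list, otherwise the ordering is never randomized.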

@choksi81

choksi81 commented Jun 2, 2014

Originally I was going to deploy multiple forwarders with different keys. However, I thought you and I had discussed it and agreed that it wouldn't be a big change for a forwarder to advertise under multiple keys (when we needed the two different keys for testing). So I modified the forwarder to take a list of keys rather than a single key string.
As for the issue with r7253, it was a problem with random.shuffle(). It uses the built-in function reversed(), which is unsafe in Repy. If you try to use random.shuffle() (which was being used by sockettimeout and udpcentralizedadvertise), the program crashes with an error. This had essentially caused our advertise service to break and return a false TimeoutError.
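
A minimal sketch of what "a list of keys rather than a single key string" might look like, keeping single-string callers working. The names `normalize_keys`, `advertise_under_keys`, and the `announce` callable are hypothetical and not taken from the forwarder source.

```python
def normalize_keys(keys):
    """Accept either one key string or an iterable of key strings."""
    if isinstance(keys, str):
        return [keys]
    return list(keys)


def advertise_under_keys(announce, keys, value):
    """Announce `value` once under every key.

    `announce` is a hypothetical callable taking (key, value); it stands
    in for the real advertise call used by the forwarder.
    """
    for key in normalize_keys(keys):
        announce(key, value)
```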

@choksi81

choksi81 commented Jun 2, 2014

Okay, I see. Sorry if I was unclear on how to achieve the advertisement of multiple keys. My setup has been multiple instances of the forwarder, running on different ports, advertising different keys. Zero changes required (other than local ones in the instances I started).
Thanks for the explanation of r7253's problem. I'll move the relevant pieces of information over to #1388, "Improve random.repy performance". It shouldn't be hard to go without reversed(), I guess.
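
For reference, an in-place shuffle that avoids reversed() can be written as a Fisher-Yates loop with an explicit countdown index. This is a sketch in standard Python and is not necessarily how random.repy was changed; the `randint` parameter is an assumption added so that a Repy-safe random integer source could be substituted.

```python
import random


def shuffle_without_reversed(items, randint=random.randint):
    """Fisher-Yates shuffle of `items` in place, without calling reversed().

    `randint(a, b)` must return an integer in the inclusive range [a, b].
    """
    i = len(items) - 1
    while i > 0:
        j = randint(0, i)  # pick a position from the not-yet-fixed prefix
        items[i], items[j] = items[j], items[i]
        i -= 1
```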

@choksi81

choksi81 commented Jun 2, 2014

FYI, r7318 fixes the reversed() issue.
