
upgrade/ retire psychz exit node #28

Open
jhpoelen opened this issue Apr 11, 2018 · 22 comments

@jhpoelen
Contributor

As discussed in #8 (comment) and related threads, the current installation of the psychz exit node has to run an older version of Debian (without security patches) in order to allow unpatched home nodes to create exit node tunnels.

At the time of writing (11 April 2018), 15 home nodes are connecting to the psychz exit node. These nodes need to be upgraded before this exit node can be upgraded or retired.

If the exit node is upgraded/ retired before that, the home nodes will be unable to create tunnels without physical access to the nodes. This means that the home nodes would not be able to provide open internet access unless they mesh with neighboring nodes that have a route to the internet.

@jhpoelen jhpoelen changed the title upgrade/ retire psychz exit node/ tunnel digger broker upgrade/ retire psychz exit node Apr 11, 2018
@bennlich
Collaborator

@Juul is planning not to renew the psychz subscription very soon, so it's time to retire the exitnode.

IPs connected to psychz at time of posting:

  • 100.64.55.192/26
  • 100.64.63.0/26
  • 100.65.14.64/26
  • 100.65.140.128/26
  • 100.65.20.0/26
  • 100.65.6.0/26

@paidforby

@jhpoelen is there anything that should be saved/rescued from the psychz server before it goes away forever? Not sure if there are any old configurations that might be worth backing up.

@jhpoelen
Contributor Author

jhpoelen commented Oct 17, 2018

great to hear that the psychz exit node is getting retired - it took us quite some work to be able to do this without bringing the network down and losing the connection to our existing nodes.

Some suggestions:

  1. I believe that @bennlich made some recent updates to the exitnode configuration to quickly recover from suspected babeld bugs; however, I saw no changes to https://github.com/sudomesh/exitnode . I suggest including the config changes in the repository.
  2. Using the https://github.com/sudomesh/exitnode script, instantiate at least one extra exitnode in addition to the HE exitnode, using an IP address that is under our control and has been installed as default on all existing home nodes. In my opinion, history has shown that having only a single exit node increases the risk of bricking home nodes due to a malfunctioning, or no longer available (due to blacklisting), exit node.

@paidforby

I think work might have been done on that in the upgrade-babeld branch, which has been an open PR for some time (sudomesh/exitnode#15) and also includes the ability to redeploy the exitnode by re-running create_exitnode.sh.

Also, maybe before psychz is turned off, we should test rebuilding HE by running create_exitnode.sh, just in case it goes horribly, irreparably wrong?

@paidforby

Also, good idea about deploying the exitnode script on at least one of the backup servers. For reference, here are the potential servers that could be used: https://github.com/sudomesh/sudowrt-firmware/blob/master/files/opt/mesh/templates/etc/config/tunneldigger . I'm not sure about the configuration on any of them other than builds.sudomesh.org, which already has a ton of things running/stored on it.

@jhpoelen
Contributor Author

re: builds.sudomesh.org - I imagine that the DNS would be relatively easy to point to another machine to make room for a secondary exit node, compared to the scenario in which each node needs patching. On a related topic - have you considered moving the build / web server stuff to a dedicated non-DO/non-Heroku server? Perhaps worth noting that I recently rented a 16GB / 1TB server for $20 /mo with Hetzner.de , so there are much cheaper alternatives to the likes of Heroku / Digital Ocean, especially for continuously running / heavy processes.

re: pending pull requests - I am used to a more continuous integration / deployment workflow, so I am not in the habit of checking the pending pull requests. Thanks for pointing out this pending request.

@paidforby

Good points about builds.sudomesh.org! Committing to moving builds.sudomesh.org would encourage us to clean up that server and make sure everything on it is up to date, documented, and re-deployable. I like that idea so much that I'm going to open a new bug to document the services currently running on it.

And agreed on PRs. I almost forgot that one existed; I encourage @bennlich to go ahead and merge it.

@paidforby

So after copying the sudomesh server to a new droplet with a new IP address, 107.170.221.27, I've realized that there is no way to decrease the size of the old sudomesh server (which is way bigger than it needs to be for a backup exit node) while keeping the old IP address, 107.170.219.5. @jhpoelen or @bennlich, any ideas on how to preserve our precious IP address and downsize the droplet? I can't seem to find any way of doing this in Digital Ocean.

@bennlich
Collaborator

Yep, I think you're right. From https://www.digitalocean.com/docs/droplets/how-to/resize/

CPU and RAM only resizes can be reversed if you want to return to using a smaller Droplet. Disk, CPU, and RAM resizes cannot be reversed because decreasing the size of the disk poses data integrity issues. Any time you resize a Droplet, be sure to test that all your services are running as expected.

In retrospect, we should have created a "digitalocean floating ip", which is a static ip address that you can move around between droplets, and use that IP in our tunneldigger configs. Woops.

Let's do this and patch some nodes? Doesn't really make sense to leave ourselves tied to a droplet with specs we don't really want IMO.

@jhpoelen
Contributor Author

From my patching experience last year, I've learned that patching nodes is a risky, time-intensive process that I wouldn't wish on anyone to have to repeat. Rather than spending time on this, I'd favor keeping the droplet around, installing tunneldigger on it, and spending the time working on ways to either get rid of exit nodes or figure out a more dynamic approach to connecting to exit nodes. Meanwhile, would it be worth trying to convince Digital Ocean to turn our existing IP into one of these floating IPs?

@paidforby

I opened a ticket with DO asking them to convert the static IP to a floating IP. We'll see what they say.

Here's a question, is there any reason we couldn't run all the sudomesh server and exit node services on the same droplet? I know it's not ideal, but it seems like the only solution at this point. I second @jhpoelen on exploring new ideas of how to host exit nodes, or if they are even necessary.

@bennlich
Collaborator

I opened a ticket with DO asking them to convert the static IP to a floating IP. We'll see what they say.

Thank you!

Here's a question, is there any reason we couldn't run all the sudomesh server and exit node services on the same droplet?

Yeah, it's really not ideal. We want to be able to bring down the services server without negatively affecting the mesh.

I know it's not ideal, but it seems like the only solution at this point.

Can't we just pay for two droplets--one for services, one for exitnode-ing?

--

I don't see the need to patch nodes going away. Might be worth figuring out how to make this less of a headache.

--

A dynamic but centralized solution would be to point tunneldigger to some kind of load balancer IP that routed to different registered exitnodes.

@paidforby

Digital Ocean has been giving me the run around. I would like to stop using their service. We should put more energy into acquiring rack space that we have physical access to and actual control of. We can keep the old server up temporarily as a backup exit node, since it is cheaper than psychz.

Ultimately, we should work more on #39 and redeploy the sudomesh server once we have a local rack space.

Likewise, exploring the possibilities of an exitnode-less mesh, alternatives to tunneldigger, and an easier node patching system seem more useful than attempting to communicate with Digital Ocean.

@paidforby

Hey, I just thought of something! Can tunneldigger take a domain name instead of an IP address? If it can't by default, it would be fairly easy to write a hook script that retrieves the IP address of, say, "builds.sudomesh.org" and enters it into the tunneldigger conf before attempting to open a tunnel. This still locks us into owning that domain, but it seems more flexible than an IP address and would be good as a backup option.

@jhpoelen
Contributor Author

in responding to #28 (comment) -

@paidforby thanks for taking the initiative to try and get digital ocean to convert the static ip into a "floating" ip. I can imagine it can be frustrating to deal with (probably overworked) customer service folks.

I see how a long-term plan to get more agency and control over our server hardware / IP addresses, as well as simplifying the architecture, is in line with sudomesh's values.

I also agree with @bennlich's short-term solution: for the time being, run a dedicated exit node droplet behind the IP address baked into most home nodes. I imagine we first move the existing services to a second, temporary droplet until we move them to a place that better suits our long-term goals, like the rack-mounted hardware you suggested.

@jhpoelen
Contributor Author

re: #28 (comment) - Nice hack!

@jhpoelen
Contributor Author

We could even get a list of IPs, 'cause you can associate more than one IP with a domain name.
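A hook script could walk that list and stop at the first broker that answers; here's a rough sketch (the hostname is again a placeholder, and the actual tunnel attempt is left as a comment since its success can't be checked without a live broker):

```shell
#!/bin/sh
# Sketch: a domain can publish several A records, so a home node could try
# each address in turn and fall back to the next exit node on failure.
# BROKER_HOST is a placeholder, not a real sudomesh hostname.

BROKER_HOST="exit.example.org"
BROKER_PORT="8942"

# Collect every distinct IPv4 address behind the name.
for ip in $(getent ahostsv4 "$BROKER_HOST" | awk '{print $1}' | sort -u); do
    echo "trying broker $ip:$BROKER_PORT"
    # A real script would attempt the tunnel here and break on success, e.g.:
    # tunneldigger -f -b "$ip:$BROKER_PORT" ... && break
done
```

Note that plain DNS round-robin gives no health information; the loop's try-and-fall-back behavior is what makes a dead exit node survivable.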

@paidforby

paidforby commented Oct 26, 2018

@paidforby

re: #28 (comment)
all sudomesh services have been moved to a second temporary droplet at 107.170.221.27 and are up and running. I have yet to wipe the old server at 107.170.219.5 and turn it into an exit node.

@paidforby

Confirmed! tunneldigger does support hostnames instead of IP addresses. I used create_exitnode.sh and the tunneldigger lab to test this. The following commands work equally well:

sudo $PWD/tunneldigger/client/tunneldigger -f -b 206.189.172.184:8942 -u f28cbac6-1670-4fbd-bd27-a49114e5f573 -i l2tp0 -s $PWD/tunnel_hook.sh 

sudo $PWD/tunneldigger/client/tunneldigger -f -b paidforby.me:8942 -u f28cbac6-1670-4fbd-bd27-a49114e5f573 -i l2tp0 -s $PWD/tunnel_hook.sh

jhpoelen pushed a commit to sudomesh/monitor that referenced this issue Oct 26, 2018
@bennlich
Collaborator

Right before turning 107.170.219.5 into an exitnode, @paidforby and I remembered #23 ... :-/

We'll need to decommission psychz at the same time that we bring up 107.170.219.5, and transfer psychz's mesh IP to the new exitnode.

@jhpoelen
Contributor Author

@bennlich nice catch. Do you need help turning off the psychz tunneldigger?
