Brad's homelab setup
Latest commit b1244c2 Jul 4, 2018

Brad's Homelab

Hi, I'm Brad, @bradfitz on Twitter, etc.

This page describes my home server & networking setup.

Discussion

Questions welcome!

Goals

The primary goals of this project are...

  • to have a highly-available home Internet setup, with no SPOF (Single Point of Failure)

  • to learn and have fun.

In summary

I have 3 physical machines plugged into 3 switches, with all switches connected to each other. I don't have a physical router/gateway. Instead, a Linux virtual machine handles the IPv4 NAT, IPv6 announcements, DHCP, DNS, etc, and that Linux VM floats between the 3 machines as needed, including live migration during maintenance.

My 4 Wi-Fi APs are PoE-powered from the two PoE switches. I have two ISPs.

I have two UPSes and two PDUs powering separate halves of the gear, and separate ISPs, giving me about 35-45 minutes of runtime (and thus Internet) during a power outage. The whole house might be dark, but the battery-powered wifi will work.

In photos

Higher quality photos at https://photos.app.goo.gl/Y5Ah6AeGekVkf3tY9.

[Photos: closed, top, switches, bottom]

Gear

Servers

Switches

  • 2 x UniFi Switch 24 PoE-250W: 24x Power-over-Ethernet 1Gbps ports
  • 1 x UniFi Switch 16 XG: 10Gbps Aggregation Switch, primarily for Ceph (but part of same LAN). I only have one of these, but if it fails the Linux bond fails over to the 1Gbps switches.

Wi-Fi APs

Other

  • UniFi Cloud Key to run the UniFi controller. This isn't necessary to boot the cluster. It just runs the pretty UI and is needed to add new devices. I could run the software on a VM too, I suppose. But I had it from earlier, so I'm still using it.
  • misc Raspberry Pis for monitoring

Power

The whole setup including all APs and switches draws about 220 watts idle. Power is pretty cheap in Seattle. Washington State (as of April 2018) has the cheapest electricity in the United States, at $0.0974/kWh.
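As a rough worked example of what those two figures imply, the sketch below turns the 220 W idle draw and the $0.0974/kWh rate into a yearly cost. It assumes the load stays at the idle figure all year and a 365-day year, so treat the result as a lower-bound estimate rather than a measured bill.

```python
# Figures taken from the text above; everything else is simple arithmetic.
IDLE_WATTS = 220
USD_PER_KWH = 0.0974


def yearly_kwh(watts: float) -> float:
    """Energy used in one 365-day year by a constant load, in kWh."""
    return watts / 1000 * 24 * 365


def yearly_cost_usd(watts: float, usd_per_kwh: float) -> float:
    """Cost of running a constant load for one year at a flat rate."""
    return yearly_kwh(watts) * usd_per_kwh


if __name__ == "__main__":
    kwh = yearly_kwh(IDLE_WATTS)
    cost = yearly_cost_usd(IDLE_WATTS, USD_PER_KWH)
    print(f"{kwh:.1f} kWh/year, ${cost:.2f}/year, ${cost / 12:.2f}/month")
```

Under those assumptions this works out to roughly 1,927 kWh and a bit under $190 per year.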

ISPs

Software

  • Proxmox VE is the Debian-based base OS on the servers, and Proxmox is a nice UI for managing qemu VMs and Ceph. I previously tried VMware for about a year; both are annoying in different ways. Proxmox might be a little rough in places, but I prefer it.
  • Ceph for storage. I love Ceph so much and discovering it makes this whole adventure worth it. Still much to learn, though.
  • ISC DHCP for the DHCP server. I auto-generate its config from a Go program that has a map of most of my important devices' MAC addresses.
  • CoreDNS for the DNS server on the gateway VM, which lets me encrypt all upstream DNS so ISPs can't see or mess with it (even though they can still see IPs and SNI).
  • tcpproxy that Dave Anderson and I wrote. I use it on an HA VM to route ingress traffic to various VMs & services.
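To illustrate the idea of generating the ISC DHCP config from a device map, here is a minimal sketch. It is not the Go program mentioned above: it is written in Python for brevity, and the device names, MAC addresses, and IPs are made up. The only part taken from ISC DHCP itself is the `host` declaration with `hardware ethernet` and `fixed-address` statements.

```python
import re

# Hypothetical device table: every name, MAC, and IP here is illustrative.
DEVICES = {
    "switch-a": ("00:11:22:33:44:55", "10.0.6.2"),
    "switch-b": ("00:11:22:33:44:56", "10.0.6.3"),
}

MAC_RE = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}")


def host_stanza(name: str, mac: str, ip: str) -> str:
    """Render one dhcpd.conf host declaration pinning a MAC to an IP."""
    mac = mac.lower()
    if not MAC_RE.fullmatch(mac):
        raise ValueError(f"bad MAC address for {name}: {mac}")
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"}}\n"
    )


def render_hosts(devices: dict) -> str:
    """Render all devices, sorted by name so the output is stable."""
    return "\n".join(
        host_stanza(name, mac, ip) for name, (mac, ip) in sorted(devices.items())
    )


if __name__ == "__main__":
    print(render_hosts(DEVICES))
```

Sorting by name keeps the generated file diff-friendly, which is handy if the output is checked into version control.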

Config

Network config

  • The LAN is 10.0.0.0/16.
  • Untrusted VLAN is 10.2.0.0/16. The LAN can connect to it, but the untrusted machines can't initiate connections back to the LAN.
  • Gateway, DHCP at 10.0.0.1 (and 10.2.0.1 for untrusted)
  • DHCP range is 10.0.100-199.x so dynamically assigned addresses are easy to recognize. Likewise for the untrusted VLAN.
  • Networking gear has static IPs in 10.0.6.x (6 is above the letter N on the keyboard, which is how I map letters to numbers usually)
  • ...
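The addressing plan above can be sketched as a small classifier using Python's standard `ipaddress` module. The label strings and the function names `classify` and `may_initiate` are invented for this example. `may_initiate` models only the one rule stated above (untrusted machines can't open connections to the LAN) and allows everything else, which is an assumption, not a description of the real firewall.

```python
import ipaddress

LAN = ipaddress.ip_network("10.0.0.0/16")
UNTRUSTED = ipaddress.ip_network("10.2.0.0/16")


def classify(addr: str) -> str:
    """Label an address according to the plan in the list above."""
    ip = ipaddress.ip_address(addr)
    if ip in LAN:
        zone = "lan"
    elif ip in UNTRUSTED:
        zone = "untrusted"
    else:
        return "external"
    third, fourth = ip.packed[2], ip.packed[3]
    if (third, fourth) == (0, 1):
        return f"{zone}/gateway"        # 10.0.0.1 and 10.2.0.1
    if 100 <= third <= 199:
        return f"{zone}/dhcp-dynamic"   # 10.x.100-199.y
    if zone == "lan" and third == 6:
        return "lan/network-gear"       # 10.0.6.x
    return f"{zone}/other"


def may_initiate(src: str, dst: str) -> bool:
    """Only the stated rule: untrusted hosts may not connect to the LAN."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    return not (s in UNTRUSTED and d in LAN)


if __name__ == "__main__":
    for a in ("10.0.0.1", "10.0.150.7", "10.0.6.10", "10.2.120.9", "8.8.8.8"):
        print(a, classify(a))
```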

Proxmox/host config

...

Device config

...

Firewall config

  • Ferm for simplifying writing iptables rules

Monitoring

  • Not enough yet. WIP. Plan is to use Prometheus more.
  • A Raspberry Pi has USB connections to the two UPSes.

Home Automation

Testing

TODO: link to program with dependency graph of all devices, services, and connections, and to simulate failures to validate there are no hidden SPOFs.
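Until that program is linked, here is a minimal sketch of the general technique: model devices as nodes in an undirected graph, fail each node in turn, and check whether one endpoint can still reach another. The two topologies are deliberately tiny and hypothetical (the names `host`, `sw1`, `isp1`, and so on are placeholders), not a model of the real setup.

```python
from collections import deque


def reachable(links, src, dst, failed=frozenset()):
    """Breadth-first search over undirected links, skipping failed nodes."""
    if src in failed or dst in failed:
        return False
    adjacency = {}
    for a, b in links:
        if a in failed or b in failed:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def single_points_of_failure(links, src, dst):
    """Nodes (other than src/dst) whose failure alone cuts src off from dst."""
    nodes = {n for link in links for n in link} - {src, dst}
    return sorted(n for n in nodes if not reachable(links, src, dst, failed={n}))


# Hypothetical toy topologies for illustration only.
REDUNDANT = [
    ("host", "sw1"), ("host", "sw2"), ("sw1", "sw2"),
    ("sw1", "isp1"), ("sw2", "isp2"),
    ("isp1", "internet"), ("isp2", "internet"),
]
ONE_ISP = [
    ("host", "sw1"), ("host", "sw2"), ("sw1", "sw2"),
    ("sw1", "isp1"), ("isp1", "internet"),
]

if __name__ == "__main__":
    print(single_points_of_failure(REDUNDANT, "host", "internet"))
    print(single_points_of_failure(ONE_ISP, "host", "internet"))
```

This brute-force approach only catches single-node failures. A fuller version would also model links, power feeds, and services as nodes, and could test pairs of simultaneous failures.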

Past failures

  • I used to use a Soekris net6501 as my home gateway, but its CPU maxes out NAT'ing at about 300 Mbps, sadly, so I started looking at alternatives when I got CenturyLink fiber.
  • A truck once clipped the fiber running to our house. It's nice having a second WAN link.
  • I used to use a UniFi Security Gateway Pro but it failed one day and wouldn't power on any more. Dave had a backup handy for me, but the UniFi controller software wedged itself and wouldn't let me remove the old (dead) one, so I couldn't add the replacement: you can only have one gateway in a site at a time. I was not amused, and that was the final straw that made me realize I wanted a highly-available setup.
  • I used to use VMware with a highly-available vCenter setup, but the whole thing felt bloated and slow and enterprisey, and I couldn't stand the Flash UI, which was still required for many operations. That's increasingly going away and being replaced with HTML5, but I also couldn't stand the VMware enterprise-targeted documentation. And I wanted to use something Open Source, too.

Thanks

Many thanks to Dave Anderson for helping with a lot of this. He has a very similar setup at his home and we enjoy watching each other both succeed and fail at trying new things.