Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/wireguard: add options to punch-through non-symmetric NATs #128014

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ju1m
Copy link
Contributor

@ju1m ju1m commented Jun 24, 2021

This PR is ready for reviewing.

Motivation for this change

Having peer-to-peer VPN with WireGuard by having all peers connecting to the same peer (usually publicly reachable),
hence giving it their own public endpoint, and then initiating a periodic TCP connection to that peer (within the VPN)
to receive the endpoints of other peers, in order to punch-through non-symmetric NATs.

Note that it can also work with two hosts behind the same NAT if that NAT can reflect (to its internal network) connections from its internal network to its external IP address, which may require persistentKeepalives as low as 1 and/or redirecting at least one host's WireGuard port, eg. with UPnP.

This is a much simpler alternative to WireGuard Endpoint Discovery and NAT Traversal using DNS-SD or natpunch-go.

Things done
  • Add per WireGuard interface options: peersAnnouncing.{enable,port}
    creating a systemd socket triggering a wg show ${iface} dump to send the known endpoints.
  • Add per WireGuard peer option: endpointsUpdater.{enable,addr,port,refreshSeconds}
    running a netcat to the (announcing) peer to receive the endpoints.
  • Adding myself as a maintainer of the networking.wireguard NixOS module.
  • Hardening of announcing and receiving services, and basic firewalling (by using BindToDevice, IPAddressAllow and MaxConnectionsPerSource):
# systemd-analyze security 'wireguard-wg-intra-peers-announcing@'
  NAME                                                        DESCRIPTION                                                                    EXPOSURE
✗ PrivateNetwork=                                             Service has access to the host's network                                            0.5
✓ User=/DynamicUser=                                          Service runs under a transient non-root user identity                                  
✓ CapabilityBoundingSet=~CAP_SET(UID|GID|PCAP)                Service cannot change UID/GID identities/capabilities                                  
✓ CapabilityBoundingSet=~CAP_SYS_ADMIN                        Service has no administrator privileges                                                
✓ CapabilityBoundingSet=~CAP_SYS_PTRACE                       Service has no ptrace() debugging abilities                                            
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                               0.3
✓ RestrictNamespaces=~CLONE_NEWUSER                           Service cannot create user namespaces                                                  
✗ RestrictAddressFamilies=~…                                  Service may allocate exotic sockets                                                 0.3
✓ CapabilityBoundingSet=~CAP_(CHOWN|FSETID|SETFCAP)           Service cannot change file ownership/access mode/capabilities                          
✓ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)         Service cannot override UNIX file/IPC permission checks                                
✗ CapabilityBoundingSet=~CAP_NET_ADMIN                        Service has network configuration privileges                                        0.2
✓ CapabilityBoundingSet=~CAP_SYS_MODULE                       Service cannot load kernel modules                                                     
✓ CapabilityBoundingSet=~CAP_SYS_RAWIO                        Service has no raw I/O access                                                          
✓ CapabilityBoundingSet=~CAP_SYS_TIME                         Service processes cannot change the system clock                                       
✗ DeviceAllow=                                                Service has a device ACL with some special devices                                  0.1
✓ IPAddressDeny=                                              Service blocks all IP address ranges                                                   
✓ KeyringMode=                                                Service doesn't share key material with other services                                 
✓ NoNewPrivileges=                                            Service processes cannot acquire new privileges                                        
✓ NotifyAccess=                                               Service child processes cannot alter service state                                     
✓ PrivateDevices=                                             Service has no access to hardware devices                                              
✓ PrivateMounts=                                              Service cannot install system mounts                                                   
✓ PrivateTmp=                                                 Service has no access to other software's temporary files                              
✗ PrivateUsers=                                               Service has access to other users                                                   0.2
✓ ProtectClock=                                               Service cannot write to the hardware clock or system clock                             
✓ ProtectControlGroups=                                       Service cannot modify the control group file system                                    
✓ ProtectHome=                                                Service has no access to home directories                                              
✓ ProtectKernelLogs=                                          Service cannot read from or write to the kernel log ring buffer                        
✓ ProtectKernelModules=                                       Service cannot load or read kernel modules                                             
✓ ProtectKernelTunables=                                      Service cannot alter kernel tunables (/proc/sys, …)                                    
✓ ProtectProc=                                                Service has restricted access to process tree (/proc hidepid=)                         
✓ ProtectSystem=                                              Service has strict read-only access to the OS file hierarchy                           
✗ RestrictAddressFamilies=~AF_PACKET                          Service may allocate packet sockets                                                 0.2
✓ RestrictSUIDSGID=                                           SUID/SGID file creation by service is restricted                                       
✓ SystemCallArchitectures=                                    Service may execute system calls only with native ABI                                  
✓ SystemCallFilter=~@clock                                    System call allow list defined for service, and @clock is not included                 
✓ SystemCallFilter=~@debug                                    System call allow list defined for service, and @debug is not included                 
✓ SystemCallFilter=~@module                                   System call allow list defined for service, and @module is not included                
✓ SystemCallFilter=~@mount                                    System call allow list defined for service, and @mount is not included                 
✓ SystemCallFilter=~@raw-io                                   System call allow list defined for service, and @raw-io is not included                
✓ SystemCallFilter=~@reboot                                   System call allow list defined for service, and @reboot is not included                
✓ SystemCallFilter=~@swap                                     System call allow list defined for service, and @swap is not included                  
✓ SystemCallFilter=~@privileged                               System call allow list defined for service, and @privileged is not included            
✓ SystemCallFilter=~@resources                                System call allow list defined for service, and @resources is not included             
✗ AmbientCapabilities=                                        Service process receives ambient capabilities                                       0.1
✓ CapabilityBoundingSet=~CAP_AUDIT_*                          Service has no audit subsystem access                                                  
✓ CapabilityBoundingSet=~CAP_KILL                             Service cannot send UNIX signals to arbitrary processes                                
✓ CapabilityBoundingSet=~CAP_MKNOD                            Service cannot create device nodes                                                     
✓ CapabilityBoundingSet=~CAP_NET_(BIND_SERVICE|BROADCAST|RAW) Service has no elevated networking privileges                                          
✓ CapabilityBoundingSet=~CAP_SYSLOG                           Service has no access to kernel logging                                                
✓ CapabilityBoundingSet=~CAP_SYS_(NICE|RESOURCE)              Service has no privileges to change resource use parameters                            
✓ RestrictNamespaces=~CLONE_NEWCGROUP                         Service cannot create cgroup namespaces                                                
✓ RestrictNamespaces=~CLONE_NEWIPC                            Service cannot create IPC namespaces                                                   
✓ RestrictNamespaces=~CLONE_NEWNET                            Service cannot create network namespaces                                               
✓ RestrictNamespaces=~CLONE_NEWNS                             Service cannot create file system namespaces                                           
✓ RestrictNamespaces=~CLONE_NEWPID                            Service cannot create process namespaces                                               
✓ RestrictRealtime=                                           Service realtime scheduling access is restricted                                       
✓ SystemCallFilter=~@cpu-emulation                            System call allow list defined for service, and @cpu-emulation is not included         
✓ SystemCallFilter=~@obsolete                                 System call allow list defined for service, and @obsolete is not included              
✗ RestrictAddressFamilies=~AF_NETLINK                         Service may allocate netlink sockets                                                0.1
✗ RootDirectory=/RootImage=                                   Service runs within the host's root directory                                       0.1
✓ SupplementaryGroups=                                        Service has no supplementary groups                                                    
✓ CapabilityBoundingSet=~CAP_MAC_*                            Service cannot adjust SMACK MAC                                                        
✓ CapabilityBoundingSet=~CAP_SYS_BOOT                         Service cannot issue reboot()                                                          
✓ Delegate=                                                   Service does not maintain its own delegated control group subtree                      
✓ LockPersonality=                                            Service cannot change ABI personality                                                  
✓ MemoryDenyWriteExecute=                                     Service cannot create writable executable memory mappings                              
✓ RemoveIPC=                                                  Service user cannot leave SysV IPC objects around                                      
✓ RestrictNamespaces=~CLONE_NEWUTS                            Service cannot create hostname namespaces                                              
✓ UMask=                                                      Files created by service are accessible only by service's own user by default          
✓ CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE                  Service cannot mark files immutable                                                    
✓ CapabilityBoundingSet=~CAP_IPC_LOCK                         Service cannot lock memory into RAM                                                    
✓ CapabilityBoundingSet=~CAP_SYS_CHROOT                       Service cannot issue chroot()                                                          
✓ ProtectHostname=                                            Service cannot change system host/domainname                                           
✓ CapabilityBoundingSet=~CAP_BLOCK_SUSPEND                    Service cannot establish wake locks                                                    
✓ CapabilityBoundingSet=~CAP_LEASE                            Service cannot create file leases                                                      
✓ CapabilityBoundingSet=~CAP_SYS_PACCT                        Service cannot use acct()                                                              
✓ CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG                   Service cannot issue vhangup()                                                         
✓ CapabilityBoundingSet=~CAP_WAKE_ALARM                       Service cannot program timers that wake up the system                                  
✗ RestrictAddressFamilies=~AF_UNIX                            Service may allocate local sockets                                                  0.1
✓ ProcSubset=                                                 Service has no access to non-process /proc files (/proc subset=)                       

→ Overall exposure level for wireguard-wg-intra-peers-announcing@test-instance.service: 1.8 OK 🙂
# systemd-analyze security 'wireguard-wg-intra-endpoints-updater-XbTEP2X71LBTjmdmySdiOpQJ\x2buIomcXvg1aiQGUtWBI\x3d.service'
  NAME                                                        DESCRIPTION                                                                    EXPOSURE
✗ PrivateNetwork=                                             Service has access to the host's network                                            0.5
✓ User=/DynamicUser=                                          Service runs under a transient non-root user identity                                  
✓ CapabilityBoundingSet=~CAP_SET(UID|GID|PCAP)                Service cannot change UID/GID identities/capabilities                                  
✓ CapabilityBoundingSet=~CAP_SYS_ADMIN                        Service has no administrator privileges                                                
✓ CapabilityBoundingSet=~CAP_SYS_PTRACE                       Service has no ptrace() debugging abilities                                            
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                               0.3
✓ RestrictNamespaces=~CLONE_NEWUSER                           Service cannot create user namespaces                                                  
✓ RestrictAddressFamilies=~…                                  Service cannot allocate exotic sockets                                                 
✓ CapabilityBoundingSet=~CAP_(CHOWN|FSETID|SETFCAP)           Service cannot change file ownership/access mode/capabilities                          
✓ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)         Service cannot override UNIX file/IPC permission checks                                
✗ CapabilityBoundingSet=~CAP_NET_ADMIN                        Service has network configuration privileges                                        0.2
✓ CapabilityBoundingSet=~CAP_SYS_MODULE                       Service cannot load kernel modules                                                     
✓ CapabilityBoundingSet=~CAP_SYS_RAWIO                        Service has no raw I/O access                                                          
✓ CapabilityBoundingSet=~CAP_SYS_TIME                         Service processes cannot change the system clock                                       
✗ DeviceAllow=                                                Service has a device ACL with some special devices                                  0.1
✗ IPAddressDeny=                                              Service defines IP address allow list with non-localhost entries                    0.1
✓ KeyringMode=                                                Service doesn't share key material with other services                                 
✓ NoNewPrivileges=                                            Service processes cannot acquire new privileges                                        
✓ NotifyAccess=                                               Service child processes cannot alter service state                                     
✓ PrivateDevices=                                             Service has no access to hardware devices                                              
✓ PrivateMounts=                                              Service cannot install system mounts                                                   
✓ PrivateTmp=                                                 Service has no access to other software's temporary files                              
✗ PrivateUsers=                                               Service has access to other users                                                   0.2
✓ ProtectClock=                                               Service cannot write to the hardware clock or system clock                             
✓ ProtectControlGroups=                                       Service cannot modify the control group file system                                    
✓ ProtectHome=                                                Service has no access to home directories                                              
✓ ProtectKernelLogs=                                          Service cannot read from or write to the kernel log ring buffer                        
✓ ProtectKernelModules=                                       Service cannot load or read kernel modules                                             
✓ ProtectKernelTunables=                                      Service cannot alter kernel tunables (/proc/sys, …)                                    
✓ ProtectProc=                                                Service has restricted access to process tree (/proc hidepid=)                         
✓ ProtectSystem=                                              Service has strict read-only access to the OS file hierarchy                           
✓ RestrictAddressFamilies=~AF_PACKET                          Service cannot allocate packet sockets                                                 
✓ RestrictSUIDSGID=                                           SUID/SGID file creation by service is restricted                                       
✓ SystemCallArchitectures=                                    Service may execute system calls only with native ABI                                  
✓ SystemCallFilter=~@clock                                    System call allow list defined for service, and @clock is not included                 
✓ SystemCallFilter=~@debug                                    System call allow list defined for service, and @debug is not included                 
✓ SystemCallFilter=~@module                                   System call allow list defined for service, and @module is not included                
✓ SystemCallFilter=~@mount                                    System call allow list defined for service, and @mount is not included                 
✓ SystemCallFilter=~@raw-io                                   System call allow list defined for service, and @raw-io is not included                
✓ SystemCallFilter=~@reboot                                   System call allow list defined for service, and @reboot is not included                
✓ SystemCallFilter=~@swap                                     System call allow list defined for service, and @swap is not included                  
✓ SystemCallFilter=~@privileged                               System call allow list defined for service, and @privileged is not included            
✓ SystemCallFilter=~@resources                                System call allow list defined for service, and @resources is not included             
✗ AmbientCapabilities=                                        Service process receives ambient capabilities                                       0.1
✓ CapabilityBoundingSet=~CAP_AUDIT_*                          Service has no audit subsystem access                                                  
✓ CapabilityBoundingSet=~CAP_KILL                             Service cannot send UNIX signals to arbitrary processes                                
✓ CapabilityBoundingSet=~CAP_MKNOD                            Service cannot create device nodes                                                     
✓ CapabilityBoundingSet=~CAP_NET_(BIND_SERVICE|BROADCAST|RAW) Service has no elevated networking privileges                                          
✓ CapabilityBoundingSet=~CAP_SYSLOG                           Service has no access to kernel logging                                                
✓ CapabilityBoundingSet=~CAP_SYS_(NICE|RESOURCE)              Service has no privileges to change resource use parameters                            
✓ RestrictNamespaces=~CLONE_NEWCGROUP                         Service cannot create cgroup namespaces                                                
✓ RestrictNamespaces=~CLONE_NEWIPC                            Service cannot create IPC namespaces                                                   
✓ RestrictNamespaces=~CLONE_NEWNET                            Service cannot create network namespaces                                               
✓ RestrictNamespaces=~CLONE_NEWNS                             Service cannot create file system namespaces                                           
✓ RestrictNamespaces=~CLONE_NEWPID                            Service cannot create process namespaces                                               
✓ RestrictRealtime=                                           Service realtime scheduling access is restricted                                       
✓ SystemCallFilter=~@cpu-emulation                            System call allow list defined for service, and @cpu-emulation is not included         
✓ SystemCallFilter=~@obsolete                                 System call allow list defined for service, and @obsolete is not included              
✗ RestrictAddressFamilies=~AF_NETLINK                         Service may allocate netlink sockets                                                0.1
✗ RootDirectory=/RootImage=                                   Service runs within the host's root directory                                       0.1
✓ SupplementaryGroups=                                        Service has no supplementary groups                                                    
✓ CapabilityBoundingSet=~CAP_MAC_*                            Service cannot adjust SMACK MAC                                                        
✓ CapabilityBoundingSet=~CAP_SYS_BOOT                         Service cannot issue reboot()                                                          
✓ Delegate=                                                   Service does not maintain its own delegated control group subtree                      
✓ LockPersonality=                                            Service cannot change ABI personality                                                  
✓ MemoryDenyWriteExecute=                                     Service cannot create writable executable memory mappings                              
✓ RemoveIPC=                                                  Service user cannot leave SysV IPC objects around                                      
✓ RestrictNamespaces=~CLONE_NEWUTS                            Service cannot create hostname namespaces                                              
✓ UMask=                                                      Files created by service are accessible only by service's own user by default          
✓ CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE                  Service cannot mark files immutable                                                    
✓ CapabilityBoundingSet=~CAP_IPC_LOCK                         Service cannot lock memory into RAM                                                    
✓ CapabilityBoundingSet=~CAP_SYS_CHROOT                       Service cannot issue chroot()                                                          
✓ ProtectHostname=                                            Service cannot change system host/domainname                                           
✓ CapabilityBoundingSet=~CAP_BLOCK_SUSPEND                    Service cannot establish wake locks                                                    
✓ CapabilityBoundingSet=~CAP_LEASE                            Service cannot create file leases                                                      
✓ CapabilityBoundingSet=~CAP_SYS_PACCT                        Service cannot use acct()                                                              
✓ CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG                   Service cannot issue vhangup()                                                         
✓ CapabilityBoundingSet=~CAP_WAKE_ALARM                       Service cannot program timers that wake up the system                                  
✓ RestrictAddressFamilies=~AF_UNIX                            Service cannot allocate local sockets                                                  
✓ ProcSubset=                                                 Service has no access to non-process /proc files (/proc subset=)                       

→ Overall exposure level for wireguard-wg-intra-endpoints-updater-XbTEP2X71LBTjmdmySdiOpQJ\x2buIomcXvg1aiQGUtWBI\x3d.service: 1.4 OK 🙂
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • 21.11 Release Notes (or backporting 21.05 Relase notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.


# Query the peer announcing other peers,
# for setting the current endpoint of all configured peers.
script = ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some nontrivial level of bash and quoting.

Given that Wireguard is highly security relevant, would you be up for doing the same in a safer-by-default script like Python instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I am no Pythoner, I personnaly would not be able to produce something more secure than this Bash script, in which I see no obvious problem.

AFAIU, because the Bash script is initiating the connection, inside the WireGuard tunnel, and only to the first hop, the input validation is not needed from a security point of view (unless the announcing peer is itself compromised). If that's correct, then that validation is here only from a feature point of view (updating only the peers configured in the peers' NixOS and not all those known by the announcing peer).
But feel free to point out if I'm mistaken or forgetting something here.

I have added timeouting (-t) and input limiting (-n) flags to the read in order to be more resilient/sane, but this is likely useless.

@lheckemann
Copy link
Member

I don't think I really like the idea of logic like this, which is applicable outside the context of NixOS and even nixpkgs, living within nixpkgs. IMHO this deserves its own repository and a package, for better reusability e.g. on other distributions, rather than being inlined into the NixOS wireguard module.

What it does is pretty cool though :)

@ju1m
Copy link
Contributor Author

ju1m commented Oct 27, 2021

  • Fix endpoint updater : replace old (2004) and non-ported (on ARM) netcat-gnu by socat.

@ju1m
Copy link
Contributor Author

ju1m commented Nov 4, 2021

  • Increase services.wireguard.${iface}.peers.[].endpointsUpdater.refreshSeconds to 60s : no longer set it equal to persistentKeepalive which may be as low as 1s! but use a more sensible default likely responsive enough for most use cases, 60s is the default net/ipv4/tcp_fin_timeout so it saves a bit of space in the conntrack table of the announcing peer (ss --numeric -o state time-wait dst 192.168.42.0/24), and might also save a little bit more of bandwidth/battery for mobiles.
  • Improve descriptions

Copy link
Member

@lheckemann lheckemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, while the functionality is cool and the implementation looks fair enough at a glance, I don't think this should be in nixpkgs. Would you be willing to move this into its own repo instead?

@ju1m
Copy link
Contributor Author

ju1m commented Jan 23, 2022

@lheckemann, this PR adds:

  • the peersAnnouncing option to interfaceOpts which is used in an attrsOf: type = with types; attrsOf (submodule interfaceOpts);
  • and the endpointsUpdater option to peerOpts which is used in a listOf: type = with types; listOf (submodule peerOpts);

Do you know if I can "move this into its own repo", without copying/replacing the whole wireguard.nix?

@lheckemann
Copy link
Member

Yes, this should be possible by replicating the structure and introducing your extra options:

{ config, pkgs, lib, ... }:
{
  options.networking.wireguard.interfaces = lib.mkOption {
     type = lib.types.submodule ({
       options = {
         peersAnnouncing = lib.mkOption ...;
         peers = lib.mkOption { ... };
     });
  };
}

The NixOS module system is smart and merges these as appropriate.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jul 31, 2022
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants