This is a short presentation on the topic of network automation and abstraction of various network devices. The presentation format is designed to include slides of text and discussion, with bits of code to demonstrate the concepts along the way.






Part 1: Hiding devices behind a single layer of commands
Part 2: Extracting information from devices, connecting to external tooling
Part 3: Presenting data
Part 4: Extending everything

**"Hacking"**

This is a subjective term and it can mean a number of things. To the outsider or layperson, it's some evil nerd with a deep technical skill set and an attitude. In our world, it really just means being creative, curious, and thoughtful about the approaches you take to problems while working in a networked environment.

When working with one or even thousands of networked resources, it pays to be organized. It also pays to automate whatever takes up manual cycles of a network operator's time, since this cuts down on operational costs. This is one central reason networking jobs in the industry require some coding experience. Automation makes tasks *repeatable* as well, which means both you and your peer can set up, execute, and tear down changes identically at different times in different places.

Along the way in this tutorial, note any unfamiliar areas and ask yourself, *"How would I design this?"* or *"How would I learn more about this?"* Be thinking about the environments you've worked in and the open problems you'd like to solve.

possible solutions and the stack of tooling available

ansible, robot, salt, python unit test frameworks, build systems...

napalm, ipython, home-grown scripts...

Cisco, Juniper, Arista, F5, Citrix, Infinera, etc.

**Part 1 - Device Abstraction**

One common problem with network operations is that nobody really implements a protocol or interface *consistently*.

HTTP, HTTPS, telnet, SSH, NETCONF, JSON-RPC, XML-RPC, bare sockets, carrier pigeons[1](https://tools.ietf.org/html/rfc2549)

The problem is *they don't have to* implement their interfaces consistently. How could vendors be expected to know that we all want the exact same functions available on every network device? It's unreasonable to assume parity between the interfaces to these network OSes, so what are we going to do about it?

It turns out, there's (roughly) an 80/20 rule here. If you lump together all the network devices you control under one access layer, you can either engineer or locate a solution which provides about 80 percent of what you need. The remaining 20 percent is functionality you require, but it's either not implemented, not on their roadmap, or not even technically possible.

In [6]:
from pprint import pprint
from napalm import get_network_driver

routers = ['172.16.130.11',
           '172.16.130.12',
           '172.16.130.13',
           '172.16.130.14']

# Important pattern - They've hidden the underlying transport mechanism from us here.
# We only care about the endpoint we hit, the username, and password (at a minimum).
# 
driver = get_network_driver('eos')

def discover_devices(targets):
    for d in targets:
        print(d)
        device = driver(d, 'admin', 'admin')
        device.open()
        pprint(vars(device))
        pprint(device.get_facts()) 
        
        
discover_devices(targets=routers)

172.16.130.11
{'config_session': None,
 'device': Node(connection=EapiConnection(transport=https://172.16.130.11:443//command-api)),
 'enablepwd': '',
 'eos_autoComplete': None,
 'hostname': '172.16.130.11',
 'locked': False,
 'password': 'admin',
 'port': 443,
 'profile': ['eos'],
 'timeout': 60,
 'transport': 'https',
 'username': 'admin'}
{'fqdn': 'eos1',
 'hostname': 'eos1',
 'interface_list': ['Ethernet1',
                    'Ethernet2',
                    'Ethernet3',
                    'Loopback1',
                    'Loopback2',
                    'Management1'],
 'model': 'vEOS',
 'os_version': '4.15.5M-3054042.4155M',
 'serial_number': '',
 'uptime': 85006,
 'vendor': 'Arista'}
172.16.130.12
{'config_session': None,
 'device': Node(connection=EapiConnection(transport=https://172.16.130.12:443//command-api)),
 'enablepwd': '',
 'eos_autoComplete': None,
 'hostname': '172.16.130.12',
 'locked': False,
 'password': 'admin',
 'port': 443,
 'profile': ['eos'],
 'timeout': 60,

timed out
Traceback (most recent call last):
  File "/Users/bboothv/.pyenv/versions/3.6.3/envs/nanog/lib/python3.6/site-packages/pyeapi/eapilib.py", line 385, in send
    self.transport.endheaders(message_body=data)
  File "/Users/bboothv/.pyenv/versions/3.6.3/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/bboothv/.pyenv/versions/3.6.3/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/Users/bboothv/.pyenv/versions/3.6.3/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/Users/bboothv/.pyenv/versions/3.6.3/lib/python3.6/http/client.py", line 1392, in connect
    super().connect()
  File "/Users/bboothv/.pyenv/versions/3.6.3/lib/python3.6/http/client.py", line 936, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/Users/bboothv/.pyenv/versions/3.6.3/lib/python3.6/socket.py", line 724, in create_connection


ConnectionException: Socket error during eAPI connection: timed out

Notice that the function above includes a password in cleartext. That was a shortcut to get a connection to the box. Checking in code that contains sensitive data such as API passwords is a bad idea, because anyone with a copy of the script now has the password. You want to limit the *blast radius* of a security breach, so that if an outsider gets a copy of this script, they do not also get your super-secret internal passwords for your router APIs.

What other ways can we solve this problem?

* Shift to a model of using key-based authentication.
* Store passwords somewhere secure outside the script and call them when needed.
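
As one illustration of the second option, the sketch below reads a router password from an environment variable instead of from the script. The variable name `ROUTER_PASSWORD` and the helper `get_router_password` are hypothetical, chosen only for this example; a keyring or secrets manager would follow the same shape.

```python
import os


def get_router_password(env_var='ROUTER_PASSWORD', environ=None):
    """Return the password stored in an environment variable.

    `env_var` is a hypothetical name used for this sketch. `environ`
    defaults to os.environ and can be replaced with a plain dict in tests.
    """
    environ = os.environ if environ is None else environ
    password = environ.get(env_var)
    if not password:
        # Fail loudly rather than silently falling back to a default password.
        raise RuntimeError('set the %s environment variable first' % env_var)
    return password
```

With the variable exported in the shell that launches the notebook, the connection line could become `driver(d, 'admin', get_router_password())`, and no password appears in the code.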

**Secrets and Managing Sensitive Data**

Most network configurations contain very sensitive information. This includes but is not limited to:

* BGP secrets
* enable/configuration passwords
* SSH passwords or keys
* SSL private keys and private/public keypairs
* Configuration blocks which expose sensitive customer data

These can't just be stored in source control!

You can go as deep as you want to, but sometimes the simple/quick/iterative solution is the most effective _right now_. The passwords in this demo are in cleartext. Just throw them into a file that looks like this:

```
172.16.130.11 admin
172.16.130.12 admin
172.16.130.13 admin
172.16.130.14 admin
```

Then modify .gitignore to ignore this file completely so it is _never checked into source code repos_. That makes the passwords a little harder to find: they no longer show up in presentations or code reviews, and an attacker has to log in with enough privileges to read the file on your local filesystem. The file lives in the same directory as this notebook, but it can easily be moved to another directory and restricted so that only the owner has read/write permission.

In [18]:
with open('passwords.txt') as f:
    passwords = {}
    data = f.read()
    
    for line in data.splitlines():
        ip, _, password = line.partition(' ')
        passwords[ip] = password
    pprint(passwords)
    

{'172.16.130.11': 'admin',
 '172.16.130.12': 'admin',
 '172.16.130.13': 'admin',
 '172.16.130.14': 'admin'}
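
The owner-only permission advice can also be enforced in code. The following is a minimal sketch, assuming a POSIX filesystem and the same one-entry-per-line `ip password` layout; the function name `load_passwords` is made up for this example.

```python
import os
import stat


def load_passwords(path):
    """Parse 'ip password' lines, refusing files accessible by group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        # Any group/other permission bit means the file is too exposed.
        raise PermissionError(
            '%s has mode %o; run chmod 600 on it first' % (path, mode))

    passwords = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            ip, _, password = line.partition(' ')
            passwords[ip] = password
    return passwords
```

Calling `load_passwords('passwords.txt')` returns the same dictionary as the cell above, but only after the file has been locked down with `chmod 600 passwords.txt`.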


What are some other alternatives for storage of sensitive data?

* Netbox has a method of [storing secrets](http://netbox.readthedocs.io/en/latest/data-model/secrets/) with an API.
* All major relational databases 

**Configuration Change Management**

Device configurations have to be stored somewhere engineers, support personnel, and sometimes customers can access them. The storage solution depends on the structure of your organization, your security requirements, and your risk tolerance.

In [19]:
# Now we don't have to write our passwords in source code. Moving along...
device = driver(routers[0], 'admin', passwords[routers[0]])
device.open()

cfg = device.get_config()
pprint(cfg)

{'candidate': '',
 'running': '! Command: show running-config\n'
            '! device: eos1 (vEOS, EOS-4.15.5M)\n'
            '!\n'
            '! boot system flash:/vEOS-lab.swi\n'
            '!\n'
            'transceiver qsfp default-mode 4x10G\n'
            '!\n'
            'hostname eos1\n'
            '!\n'
            'spanning-tree mode mstp\n'
            '!\n'
            'no aaa root\n'
            '!\n'
            'username admin role network-admin secret 5 '
            '$1$RwN5EBy0$tKxGuIPPFzaMbq6VB8.VH0\n'
            '!\n'
            'interface Ethernet1\n'
            '!\n'
            'interface Ethernet2\n'
            '   no switchport\n'
            '   ip address 10.1.1.8/31\n'
            '!\n'
            'interface Ethernet3\n'
            '   no switchport\n'
            '   ip address 10.1.1.0/31\n'
            '!\n'
            'interface Loopback1\n'
            '   ip address 1.1.1.1/32\n'
            '!\n'
            'interface Loopback2\n'
      

In [5]:
# Notice how the device.open() had no corresponding device.close().
# Use their provided context manager to ensure connections to devices are torn down.

with driver(routers[0], 'admin', passwords[routers[0]]) as device:
    cfg = device.get_config()
    pprint(cfg)
    
# connection to the device is closed once you fall out of the context manager.

{'candidate': '',
 'running': '! Command: show running-config\n'
            '! device: eos1 (vEOS, EOS-4.15.5M)\n'
            '!\n'
            '! boot system flash:/vEOS-lab.swi\n'
            '!\n'
            'transceiver qsfp default-mode 4x10G\n'
            '!\n'
            'hostname eos1\n'
            '!\n'
            'spanning-tree mode mstp\n'
            '!\n'
            'no aaa root\n'
            '!\n'
            'username admin role network-admin secret 5 '
            '$1$RwN5EBy0$tKxGuIPPFzaMbq6VB8.VH0\n'
            '!\n'
            'interface Ethernet1\n'
            '!\n'
            'interface Ethernet2\n'
            '   no switchport\n'
            '   ip address 10.1.1.8/31\n'
            '!\n'
            'interface Ethernet3\n'
            '   no switchport\n'
            '   ip address 10.1.1.0/31\n'
            '!\n'
            'interface Loopback1\n'
            '   ip address 1.1.1.1/32\n'
            '!\n'
            'interface Loopback2\n'
      

show configuration change/staging
then show how a context manager can be used to control cleanup or rollback

There are two major classes of errors you can hit when you load a configuration: syntax errors and semantic errors.

Syntax errors are pretty simple to detect on most major network platforms. Think about the *commit check* command in JunOS: it verifies that the candidate configuration is **syntactically** valid. Each network OS might handle a syntax error slightly differently. On JunOS, *commit check* is essentially asking permission to load a configuration. On Arista's EOS we have *config sessions*, but they don't give us exactly the same abilities. The closest we can get on EOS is asking for forgiveness: try to commit the candidate to the running configuration, then look for fire and smoke after it loads.

Luckily, Python is designed primarily to ask for forgiveness, so why not **try**?
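
A minimal sketch of that ask-forgiveness pattern follows. It relies on the NAPALM driver methods `load_merge_candidate`, `commit_config`, and `discard_config`; the helper name `try_commit` is hypothetical, and catching a bare `Exception` is a simplification. In practice you would likely catch something narrower, such as `MergeConfigException` from `napalm.base.exceptions`.

```python
def try_commit(device, candidate):
    """Load and commit `candidate`; discard it and return False on any error.

    `device` is assumed to be an already-open NAPALM driver instance.
    """
    try:
        device.load_merge_candidate(config=candidate)
        device.commit_config()
    except Exception as exc:  # simplification: narrow this in real code
        print('commit failed: %s' % exc)
        # Throw away whatever part of the candidate was staged.
        device.discard_config()
        return False
    return True
```

Inside a `with driver(...) as device:` block, this could be called as `try_commit(device, 'hostname eos1\n')`, and the return value tells the caller whether the change went in.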

In [None]:
with driver(routers[0], 'admin', 'admin') as device:
    

NAPALM provides a rollback function, but what if the candidate config applies cleanly and then does something bad once loaded? That is the second class of error, a semantic error.

We can use a context manager here as well to protect against missing rollback.
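
One way this might look is sketched below, using `contextlib.contextmanager` around the NAPALM methods `load_merge_candidate`, `commit_config`, and `rollback`. The name `committed` is hypothetical, and the idea is an assumption about how you might structure it: commit on entry, run your own post-commit checks in the body, and roll back if any of them raises.

```python
from contextlib import contextmanager


@contextmanager
def committed(device, candidate):
    """Commit `candidate`, then roll back if the `with` body raises.

    `device` is assumed to be an already-open NAPALM driver instance.
    """
    device.load_merge_candidate(config=candidate)
    device.commit_config()
    try:
        # The body of the with-block is where post-commit checks belong.
        yield device
    except Exception:
        # A check failed after the commit: restore the previous config,
        # then re-raise so the caller still sees the original error.
        device.rollback()
        raise
```

A check in the body could be as simple as `assert device.get_facts()['hostname'] == 'eos1'`; if it fails, the rollback happens without anyone having to remember to call it.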

External libraries can be used to pull in quick functionality to make things look better. Progress bars have always been an issue in programming.

In [None]:
from tqdm import tqdm
import time

for x in tqdm(range(10), desc='configuring all devices'):
    time.sleep(0.1)


In [None]:
dir(driver)
