### Getting Started with Batfish

We will use the Python client for Batfish (pybatfish) to analyze a sample network. This notebook shows the commands needed to:

#### 1. Create a Network and a Snapshot
A network is the logical network: either the entire network or a subset of it. A snapshot is a collection of information (configuration files, etc.) that represents the network at a point in time. Snapshots can contain the actual configuration of network devices or candidate configurations.

#### 2. Extract Information
Batfish creates a comprehensive, vendor-neutral device and network model from which information such as the list of devices, interface state, and VRFs can be extracted.


In [1]:
# Importing packages and loading questions
%run startup.py

  "Pybatfish public API is being updated, note that API names and parameters will soon change.")


### Initializing our Network and Snapshot

In [2]:
NETWORK_NAME = "example_network"
SNAPSHOT_NAME = "example_snapshot"

# Update the SNAPSHOT_PATH to point to a directory containing your network snapshots (default is batfish/test_rigs)
SNAPSHOT_PATH = "../test_rigs/example"

bf_set_network(NETWORK_NAME)
bf_init_snapshot(SNAPSHOT_PATH, name=SNAPSHOT_NAME, overwrite=True)

'example_snapshot'

### Loading Questions from Batfish
Questions are the mechanism by which you query the Batfish service about the created network and snapshot(s). 

In [3]:
# Load questions from Batfish
load_questions()

In [4]:
# To see available questions use the tab auto-completion on the Batfish question module - bfq. -> press TAB key,
# uncomment and try on the following line
# bfq.

### Getting status of parsed files

To retrieve information about the files that were parsed to create the snapshot, use the `fileParseStatus` question.

In [5]:
parse_status = bfq.fileParseStatus().answer().frame()

`answer()` runs the question on the Batfish service and returns the result in a tabular JSON format.

`frame()` wraps the answer in a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

The DataFrame can then be post-processed, for example by filtering on the values in one or more columns or by reducing the number of columns.

Information on this can be found in the [pandas tutorial on filtering](http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.2/cookbook/Chapter%203%20-%20Which%20borough%20has%20the%20most%20noise%20complaints%20%28or%2C%20more%20selecting%20data%29.ipynb).

In [6]:
# Use a filter on the returned dataframe to see which files failed to parse
parse_status[parse_status['Status'] != 'PASSED']  # change '!=' to '==' to get the files which passed


Unnamed: 0,Filename,Status,Hosts


### Exploring "example" Snapshot and extracting information


Let's retrieve information about all of the <b>border</b> routers in the network.

In [7]:
node_properties = bfq.nodeProperties(nodeRegex=".*border.*").answer().frame()
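
The `nodeRegex` argument above is a regular expression over node names. As a rough illustration of why the pattern is written as `.*border.*` rather than just `border`, here is a minimal sketch using Python's `re` module. It assumes that the pattern has to match the whole node name; that is an assumption made for this example, and Python's regex engine is only a stand-in for whatever Batfish uses internally.

```python
import re

# Hostnames taken from the outputs elsewhere in this notebook.
hostnames = ["as1border1", "as1border2", "as2border1", "as3core1"]

# Assumption for illustration: the pattern must match the entire node name.
border_pattern = re.compile(r".*border.*")
border_nodes = [name for name in hostnames if border_pattern.fullmatch(name)]

# Under that assumption, a bare "border" matches no hostname in full.
bare_pattern = re.compile(r"border")
bare_matches = [name for name in hostnames if bare_pattern.fullmatch(name)]

print(border_nodes)
print(bare_matches)
```

Testing a pattern locally like this can be a quick sanity check before sending a question to the service.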

In [8]:
# To view what columns are present in the answer, run
node_properties.columns

Index(['node', 'domain-name', 'ip6-access-lists', 'tacacs-servers',
       'logging-servers', 'tacacs-source-interface', 'ipsec-vpns',
       'snmp-source-interface', 'hostname', 'ntp-source-interface',
       'configuration-format', 'routing-policies', 'dns-servers',
       'dns-source-interface', 'ike-policies', 'device-type',
       'route6-filter-lists', 'canonical-ip', 'route-filter-lists',
       'interfaces', 'ip-access-lists', 'logging-source-interface',
       'authentication-key-chains', 'ipsec-policies', 'zones',
       'community-lists', 'ip-spaces', 'default-cross-zone-action',
       'default-inbound-action', 'ipsec-proposals', 'snmp-trap-servers',
       'as-path-access-lists', 'ntp-servers', 'ike-gateways', 'vendor-family',
       'vrfs'],
      dtype='object')

In [9]:
# To only return the answer for a subset of columns, pass the propertySpec parameter in the question as shown below
node_properties_trunc = bfq.nodeProperties(nodeRegex=".*border.*", propertySpec="domain-name|ntp-servers|interfaces").answer().frame()

node_properties_trunc

Unnamed: 0,node,domain-name,interfaces,ntp-servers
0,as1border1,lab.local,"[GigabitEthernet0/0, GigabitEthernet1/0, Ethernet0/0, Loopback0]",[]
1,as1border2,lab.local,"[GigabitEthernet0/0, GigabitEthernet1/0, GigabitEthernet2/0, Ethernet0/0, Loopback0]","[18.18.18.18, 23.23.23.23]"
2,as2border1,lab.local,"[GigabitEthernet0/0, GigabitEthernet1/0, GigabitEthernet2/0, Ethernet0/0, Loopback0]","[18.18.18.18, 23.23.23.23]"
3,as2border2,lab.local,"[GigabitEthernet0/0, GigabitEthernet1/0, GigabitEthernet2/0, Ethernet0/0, Loopback0]",[18.18.18.18]
4,as3border1,lab.local,"[GigabitEthernet0/0, GigabitEthernet1/0, Ethernet0/0, Loopback0]","[18.18.18.18, 23.23.23.23]"
5,as3border2,lab.local,"[GigabitEthernet0/0, GigabitEthernet1/0, Ethernet0/0, Loopback0]","[18.18.18.18, 23.23.23.23]"
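
The `propertySpec` value used above reads like a regular-expression alternation over property names. The sketch below applies the same pattern to a handful of the column names listed earlier, using Python's `re` module. Treating `propertySpec` as a whole-name regex is an assumption made for this illustration, not a statement about how Batfish evaluates it.

```python
import re

# A few of the column names printed by node_properties.columns earlier.
columns = ["node", "domain-name", "ntp-servers", "ntp-source-interface",
           "interfaces", "vrfs"]

# Assumption for illustration: each alternative must match a whole property name.
property_spec = re.compile(r"domain-name|ntp-servers|interfaces")
selected = [name for name in columns if property_spec.fullmatch(name)]

print(selected)
```

Note that the `node` column still appears in the answer above even though the pattern does not select it.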


An alternative way to restrict the list of columns displayed is to use pandas column filtering, as described in this [pandas tutorial](http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.2/cookbook/Chapter%202%20-%20Selecting%20data%20%26%20finding%20the%20most%20common%20complaint%20type.ipynb).

In [10]:
# Let's remove the interfaces column from our result
node_properties_trunc = node_properties_trunc[["node", "domain-name", "ntp-servers"]]

node_properties_trunc

Unnamed: 0,node,domain-name,ntp-servers
0,as1border1,lab.local,[]
1,as1border2,lab.local,"[18.18.18.18, 23.23.23.23]"
2,as2border1,lab.local,"[18.18.18.18, 23.23.23.23]"
3,as2border2,lab.local,[18.18.18.18]
4,as3border1,lab.local,"[18.18.18.18, 23.23.23.23]"
5,as3border2,lab.local,"[18.18.18.18, 23.23.23.23]"


You can add further filters to restrict entries based on column values.
For example, to view only the nodes that have **23.23.23.23** as one of the configured ntp-servers, run the following on the above result:

In [11]:
node_properties_trunc[node_properties_trunc['ntp-servers'].apply(lambda x: '23.23.23.23' in x)]

Unnamed: 0,node,domain-name,ntp-servers
1,as1border2,lab.local,"[18.18.18.18, 23.23.23.23]"
2,as2border1,lab.local,"[18.18.18.18, 23.23.23.23]"
4,as3border1,lab.local,"[18.18.18.18, 23.23.23.23]"
5,as3border2,lab.local,"[18.18.18.18, 23.23.23.23]"


To retrieve information about the interfaces present and their properties, use the **interfaceProperties** question.

In [12]:
interfaces = bfq.interfaceProperties(nodeRegex=".*border.*", propertySpec="interface-type|bandwidth|vrf|primary-address").answer().frame()

To find only the interfaces whose primary IP address is in <b>10.12.0.0/16</b>, filter the results as shown below.

**na=False** is required in order to ignore interfaces without any configured IP addresses, such as Ethernet switchports.


In [13]:
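# Escape the dots so they match literally; the trailing "\." keeps addresses such as 10.120.x.x out
interfaces[interfaces['primary-address'].str.match(r"10\.12\.", na=False)]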

Unnamed: 0,interface,bandwidth,interface-type,vrf,primary-address
2,as1border1:GigabitEthernet1/0,1000000000.0,PHYSICAL,default,10.12.11.1/24
10,as2border1:GigabitEthernet0/0,1000000000.0,PHYSICAL,default,10.12.11.2/24
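
Matching on the text of the address only approximates prefix membership, and it gets awkward for prefix lengths that do not fall on an octet boundary. A more robust option is to parse the addresses with Python's standard `ipaddress` module. The helper below is a minimal sketch; `in_target_prefix` and `TARGET_PREFIX` are names made up for this example, and it assumes each value is either an `a.b.c.d/len` string or a missing value (`None`/`NaN`).

```python
import ipaddress

# The prefix of interest from the text above.
TARGET_PREFIX = ipaddress.ip_network("10.12.0.0/16")


def in_target_prefix(primary_address):
    """Return True if an 'a.b.c.d/len' string lies inside TARGET_PREFIX."""
    if not isinstance(primary_address, str):
        # Missing values (None or NaN) mean no configured IP address.
        return False
    return ipaddress.ip_interface(primary_address).ip in TARGET_PREFIX


print(in_target_prefix("10.12.11.1/24"))
print(in_target_prefix("10.120.1.1/24"))
```

With the `interfaces` DataFrame from the cell above, this could be applied as `interfaces[interfaces['primary-address'].apply(in_target_prefix)]`.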


### Exploring Routing and Forwarding Tables (RIBs and FIBs) in the Data Plane

First, let's generate a data plane for our network snapshot.

In [14]:
# This will trigger the computation of the routing and forwarding tables (aka dataplane)
bf_generate_dataplane()



This ensures that the dataplane is computed before running any questions that require it. If the dataplane is not already computed when a question is asked that needs it, Batfish will first compute the dataplane and then evaluate the question.



Now we can inspect the routes computed as a part of data plane generation. To get the routing table of all of the VRFs on all of the nodes in the network, run: 

In [15]:
routes_df = bfq.routes().answer().frame()

This can generate a lot of results. You can restrict the output by using a filter in the question itself. For example, to restrict the results to just the **border** routers, provide the argument **nodeRegex = ".*border.*"**.

You can also just filter the results. 

For example, to see all the routes on all the nodes/VRFs for the network **90.90.90.0/24** with an **Admin Distance of 0**, you can filter using multiple conditions in [pandas](http://pandas.pydata.org/pandas-docs/version/0.15/indexing.html#boolean-indexing).

In [16]:
routes_df[(routes_df['Network'] == "90.90.90.0/24") & (routes_df["AdminDistance"] == 0)]

Unnamed: 0,Node,VRF,Network,Protocol,Tag,NextHopIp,NextHop,AdminDistance,Metric
329,as3core1,default,90.90.90.0/24,connected,-1,AUTO/NONE(-1l),,0,0
330,as3core1,default,90.90.90.0/24,connected,-1,AUTO/NONE(-1l),,0,0
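
When combining conditions like this, each comparison needs its own parentheses because `&` binds more tightly than `==` in Python. Naming the masks, or using `DataFrame.query`, can make longer filters easier to read. The sketch below uses a small, made-up stand-in for `routes_df`; only the column names are taken from the output above, and the rows are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for routes_df; the rows are invented for illustration.
routes_demo = pd.DataFrame(
    {
        "Node": ["as3core1", "as3border1", "as3core1"],
        "Network": ["90.90.90.0/24", "90.90.90.0/24", "3.0.1.0/24"],
        "Protocol": ["connected", "ospf", "connected"],
        "AdminDistance": [0, 110, 0],
    }
)

# Option 1: name each boolean mask, then combine them with &.
is_target_network = routes_demo["Network"] == "90.90.90.0/24"
has_zero_distance = routes_demo["AdminDistance"] == 0
matching_routes = routes_demo[is_target_network & has_zero_distance]

# Option 2: express the same filter as a query string.
same_routes = routes_demo.query('Network == "90.90.90.0/24" and AdminDistance == 0')

print(matching_routes)
```

Both options select the same rows; which one reads better is largely a matter of taste.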


That's it for now! Feel free to explore some more by adding cells to the notebook.