### Getting started with Batfish

We will use pybatfish, the Python client for Batfish, to analyze a sample network. This notebook shows the commands needed to:
#### 1. Create a Network and a Snapshot
A network is the logical network under analysis. It can be the entire network or a subset of it. A snapshot is a collection of information (configuration files, etc.) that represents the network at a point in time. Snapshots can contain the actual configuration of network devices or candidate configurations.

#### 2. Extract information
Batfish creates a comprehensive, vendor-neutral device and network model from which information such as the list of devices, interface state, and VRFs can be extracted.

In [8]:
# Importing required libraries and setting up logging

from pybatfish.client.commands import (bf_set_network,
                                       bf_init_snapshot, bf_generate_dataplane, bf_logger)
from pybatfish.question import bfq, load_questions

import logging

bf_logger.setLevel(logging.WARN)

### Initializing our Network and Snapshot

In [9]:
NETWORK_NAME = "example_network"
SNAPSHOT_NAME = "example_snapshot"
SNAPSHOT_PATH = "test_rigs/example"

bf_set_network(NETWORK_NAME)
bf_init_snapshot(SNAPSHOT_PATH, name=SNAPSHOT_NAME)

'{\n  "answerElements" : [\n    {\n      "class" : "org.batfish.datamodel.answers.InitInfoAnswerElement",\n      "parseStatus" : {\n        "as1border1" : "PASSED",\n        "as1border2" : "PASSED",\n        "as1core1" : "PASSED",\n        "as2border1" : "PASSED",\n        "as2border2" : "PASSED",\n        "as2core1" : "PASSED",\n        "as2core2" : "PASSED",\n        "as2dept1" : "PASSED",\n        "as2dist1" : "PASSED",\n        "as2dist2" : "PASSED",\n        "as3border1" : "PASSED",\n        "as3border2" : "PASSED",\n        "as3core1" : "PASSED",\n        "host1" : "PASSED",\n        "host2" : "PASSED",\n        "iptables/host1.iptables" : "PASSED",\n        "iptables/host2.iptables" : "PASSED"\n      }\n    }\n  ],\n  "status" : "SUCCESS",\n  "summary" : {\n    "numFailed" : 0,\n    "numPassed" : 0,\n    "numResults" : 0\n  }\n}\n'

### Loading questions from Batfish
Questions are the commands that Batfish exposes through its clients for querying a network.

In [10]:
# Load questions from Batfish
load_questions()

In [11]:
# To see the available questions, use tab completion on the Batfish question module, bfq:
# uncomment the following line, place the cursor after the dot, and press TAB
# bfq.

To find out which files were not parsed completely during initialization, we can use the `fileParseStatus` question.

In [12]:
parse_status = bfq.fileParseStatus().answer().frame()

`answer()` runs a question at the service and returns the answer in a tabular JSON format. `frame()` further wraps the answer as a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html), which can be used for post-processing such as filtering rows or selecting a subset of columns. See this [pandas tutorial on filtering](http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.2/cookbook/Chapter%203%20-%20Which%20borough%20has%20the%20most%20noise%20complaints%20%28or%2C%20more%20selecting%20data%29.ipynb).

In [13]:
# Use a filter on the returned dataframe to see which files failed to parse
parse_status[parse_status['Status'] != 'PASSED']  # change '!=' to '==' to get the files which passed


Unnamed: 0,Filename,Status,Hosts
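Every file in this snapshot parsed, so the filter above returns an empty table. To see what the same filter does on a non-empty result, here is a minimal sketch on a hand-built DataFrame with the same column names. The file names, hosts, and the mix of statuses are invented for illustration and do not come from the example snapshot.

```python
import pandas as pd

# Hand-built stand-in for bfq.fileParseStatus().answer().frame().
# All rows are hypothetical; only the column names mirror the real answer.
parse_status_demo = pd.DataFrame(
    {
        "Filename": ["configs/r1.cfg", "configs/r2.cfg", "configs/r3.cfg"],
        "Status": ["PASSED", "FAILED", "PASSED"],
        "Hosts": [["r1"], [], ["r3"]],
    }
)

# Boolean indexing: keep only the rows whose Status is not PASSED.
not_passed = parse_status_demo[parse_status_demo["Status"] != "PASSED"]

# A quick summary of how many files ended up in each status.
status_counts = parse_status_demo["Status"].value_counts()

print(not_passed)
print(status_counts)
```

The comparison produces a boolean Series with one entry per row, and indexing the frame with it keeps the rows where the entry is `True`.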


### Exploring "example" network snapshot and extracting information


Getting information about the border nodes in the network

In [14]:
node_properties = bfq.nodeProperties(nodeRegex=".*border.*").answer().frame()
# To view what columns are present in the answer, run
# node_properties.columns
# To make Batfish return an answer with only some columns, pass the propertySpec parameter in the question
node_properties_trunc = bfq.nodeProperties(nodeRegex=".*border.*", propertySpec="hostname|domain-name|ntp-servers|interfaces").answer().frame()



We can also apply pandas column selection on top of the result. See this [pandas tutorial](http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.2/cookbook/Chapter%202%20-%20Selecting%20data%20%26%20finding%20the%20most%20common%20complaint%20type.ipynb).

In [15]:
# Let's remove the interfaces column from our result
node_properties_trunc = node_properties_trunc[["hostname", "domain-name", "ntp-servers"]]

We can also filter on column values. For example, to view only the nodes that have **23.23.23.23** as one of their ntp-servers, run the following on the above result:

In [16]:
node_properties_trunc[node_properties_trunc['ntp-servers'].apply(lambda x:'23.23.23.23' in x)]

Unnamed: 0,hostname,domain-name,ntp-servers
1,as1border2,lab.local,"[18.18.18.18, 23.23.23.23]"
2,as2border1,lab.local,"[18.18.18.18, 23.23.23.23]"
4,as3border1,lab.local,"[18.18.18.18, 23.23.23.23]"
5,as3border2,lab.local,"[18.18.18.18, 23.23.23.23]"
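The filter above works because each cell of the `ntp-servers` column holds a list, so `apply` can test membership row by row. The sketch below shows the same idea on a hand-built DataFrame. It assumes the column contains Python lists, and the hostnames and addresses are invented.

```python
import pandas as pd

# Hypothetical stand-in for node_properties_trunc; all values are invented.
nodes_demo = pd.DataFrame(
    {
        "hostname": ["r1", "r2", "r3"],
        "ntp-servers": [["192.0.2.1"], ["192.0.2.1", "192.0.2.2"], []],
    }
)

# apply() calls the lambda once per cell; each cell is a list of server addresses,
# so `in` is a list membership test and yields one boolean per row.
mask = nodes_demo["ntp-servers"].apply(lambda servers: "192.0.2.2" in servers)

matching = nodes_demo[mask]
print(matching)
```

If the cells were plain strings instead of lists, `in` would perform a substring test, which could also match longer addresses that merely contain the one we are looking for.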


To inspect the properties of interfaces in the network, we can use the **interfaceProperties** question.

In [17]:
interfaces = bfq.interfaceProperties(nodeRegex=".*border.*", propertySpec="interface-type|bandwidth|vrf|primary-address").answer().frame()



We can filter the returned table to find, for example, which interfaces have a primary address starting with **10.12.\*.\***. Passing `na=False` ignores the interfaces that do not have a primary address.

In [18]:
# Escape the dots so they match literally; an unescaped '.' matches any character
interfaces[interfaces['primary-address'].str.match(r"10\.12\.", na=False)]

Unnamed: 0,interface,bandwidth,interface-type,vrf,primary-address
2,as1border1:GigabitEthernet1/0,1000000000.0,PHYSICAL,default,10.12.11.1/24
10,as2border1:GigabitEthernet0/0,1000000000.0,PHYSICAL,default,10.12.11.2/24
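The pattern given to `str.match` is a regular expression matched from the start of each string, so how the dots are written affects which addresses are selected. A small sketch on invented addresses (not taken from the snapshot) shows the difference between a loose and a strict pattern, and how `na=False` treats a missing address.

```python
import pandas as pd

# Invented addresses for illustration; None stands for an interface with no primary address.
addresses = pd.Series(["10.12.11.1/24", "10.120.3.1/24", None, "10.13.22.1/24"])

# Loose pattern: '.' matches any character and nothing marks the end of the second octet,
# so 10.120.x.x is matched as well.
loose = addresses.str.match("10.12", na=False)

# Strict pattern: escaped dots match literally and the trailing dot closes the second octet.
strict = addresses.str.match(r"10\.12\.", na=False)

print(loose.tolist())
print(strict.tolist())
```

With `na=False`, the missing entry is reported as `False` rather than as a missing value, which keeps the result usable as a boolean mask.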


### Exploring the data-plane related aspects of the network

In [19]:
# Let's generate the data-plane for our network
bf_generate_dataplane()



This ensures that a data-plane exists before we start running our data-plane dependent questions. Otherwise, it would be generated implicitly by the first data-plane dependent question we run.

We can run the routes question to get all the routes present on all nodes/VRFs after the data-plane computation is done.

In [20]:
routes_df = bfq.routes().answer().frame()

We can filter on the returned fields to answer questions like "Tell me all the routes on all the nodes/VRFs which are going to the network **90.90.90.0/24** with an **Admin Distance of 0**". To do so, we combine multiple conditions using [pandas boolean indexing](http://pandas.pydata.org/pandas-docs/version/0.15/indexing.html#boolean-indexing).

In [21]:
routes_df[(routes_df['Network'] == "90.90.90.0/24") & (routes_df["AdminDistance"] == 0)]

Unnamed: 0,Node,VRF,Network,Protocol,Tag,NextHopIp,NextHop,AdminDistance,Metric
329,as3core1,default,90.90.90.0/24,connected,-1,AUTO/NONE(-1l),,0,0
330,as3core1,default,90.90.90.0/24,connected,-1,AUTO/NONE(-1l),,0,0
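Combining conditions this way relies on element-wise `&` between boolean Series. The sketch below spells out the two conditions separately on a hand-built DataFrame. The nodes, networks, and protocols are invented, and only the column names follow the routes answer.

```python
import pandas as pd

# Hypothetical stand-in for routes_df; all rows are invented.
routes_demo = pd.DataFrame(
    {
        "Node": ["r1", "r1", "r2"],
        "Network": ["198.51.100.0/24", "198.51.100.0/24", "203.0.113.0/24"],
        "Protocol": ["connected", "ospf", "connected"],
        "AdminDistance": [0, 110, 0],
    }
)

# Each comparison yields one boolean per row.
is_target_network = routes_demo["Network"] == "198.51.100.0/24"
is_admin_distance_zero = routes_demo["AdminDistance"] == 0

# '&' combines the two masks element-wise; a row is kept only if both are True.
selected = routes_demo[is_target_network & is_admin_distance_zero]
print(selected)
```

When the conditions are written inline, each comparison needs its own parentheses because `&` binds more tightly than `==` in Python.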


That's it for now!