## Introduction to Route Analysis using Batfish

Network engineers routinely need to validate routing and forwarding in the network. They often do so by connecting to multiple network devices and running a series of `show route` commands. This distributed debugging is complex even in a moderately sized network. Batfish simplifies the task by providing an easy-to-query, centralized view of the routing tables in the network.

In this notebook, we will look at how you can extract routing information from Batfish.


In [1]:
# Import packages and load questions
%run startup.py



### Initializing the Network and Snapshot

`SNAPSHOT_PATH` below can be updated to point to a custom snapshot directory; see the [Batfish instructions](https://github.com/batfish/batfish/wiki/Packaging-snapshots-for-analysis) for how to package data for analysis.<br>
More example networks are available in the [networks](https://github.com/batfish/batfish/tree/master/networks) folder of the Batfish repository.

In [2]:
# Initialize a network and snapshot
NETWORK_NAME = "example_network"
SNAPSHOT_NAME = "example_snapshot"

SNAPSHOT_PATH = "networks/example"

bf_set_network(NETWORK_NAME)
bf_init_snapshot(SNAPSHOT_PATH, name=SNAPSHOT_NAME, overwrite=True)

'example_snapshot'

The network snapshot that we initialized above is illustrated below. You can download/view devices' configuration files [here](https://github.com/batfish/pybatfish/tree/master/jupyter_notebooks/networks/example).

![example-network](https://raw.githubusercontent.com/batfish/pybatfish/master/jupyter_notebooks/networks/example/example-network.png)

All of the information we will show you in this notebook is dynamically computed by Batfish based on the configuration files for the network devices.

### Viewing Routing Tables
Batfish makes **all** routing tables in the network easily accessible.

In [3]:
# Get routing tables for all nodes and VRFs
routes_all = bfq.routes().answer().frame()

We are not going to print this table as it has a large number of entries. To extract a subset of the entries, we can run the `routes()` question with parameters of interest.

In [4]:
# Get the routing table for the 'default' VRF on border routers of as1
routes_as1border = bfq.routes(nodeRegex="as1border.*", vrfRegex="default").answer().frame()
routes_as1border

| | Node | VRF | Network | Protocol | NextHopIp | NextHop | AdminDistance | Metric | Tag |
|---|---|---|---|---|---|---|---|---|---|
| 0 | as1border1 | default | 10.12.11.1/32 | local | AUTO/NONE(-1l) | | 0 | 0 | |
| 1 | as1border2 | default | 10.13.22.1/32 | local | AUTO/NONE(-1l) | | 0 | 0 | |
| 2 | as1border2 | default | 1.0.1.0/24 | ospf | 1.0.2.2 | as1core1 | 110 | 2 | |
| 3 | as1border1 | default | 10.13.22.0/24 | ospfE2 | 1.0.1.2 | as1core1 | 110 | 20 | |
| 4 | as1border1 | default | 10.14.22.0/24 | ospfE2 | 1.0.1.2 | as1core1 | 110 | 20 | |
| 5 | as1border2 | default | 10.12.11.0/24 | ospfE2 | 1.0.2.2 | as1core1 | 110 | 20 | |
| 6 | as1border2 | default | 10.14.22.1/32 | local | AUTO/NONE(-1l) | | 0 | 0 | |
| 7 | as1border1 | default | 1.0.1.1/32 | local | AUTO/NONE(-1l) | | 0 | 0 | |
| 8 | as1border1 | default | 2.128.0.0/16 | bgp | 10.12.11.2 | as2border1 | 20 | 50 | |
| 9 | as1border1 | default | 1.0.1.0/24 | connected | AUTO/NONE(-1l) | | 0 | 0 | |
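
Because the answer is an ordinary Pandas DataFrame, a similar subset can also be carved out of an already-retrieved table on the client side. The following is a minimal sketch, not part of the original notebook: `sample_routes` is a hypothetical stand-in built from a few rows shown in this notebook, and the regex semantics of Pandas `str.match` (anchored at the start of the string) are assumed to be close enough to the `nodeRegex` parameter for this illustration.

```python
import pandas as pd

# Hypothetical stand-in for routes_all: a few rows copied from this notebook's outputs
sample_routes = pd.DataFrame(
    [
        ("as1border1", "default", "10.12.11.1/32", "local"),
        ("as1border2", "default", "1.0.1.0/24", "ospf"),
        ("as1border1", "default", "2.128.0.0/16", "bgp"),
        ("host1", "default", "0.0.0.0/0", "static"),
    ],
    columns=["Node", "VRF", "Network", "Protocol"],
)

# Keep rows whose Node starts with 'as1border' and whose VRF is exactly 'default'
is_as1border = sample_routes["Node"].str.match(r"as1border.*")
is_default_vrf = sample_routes["VRF"] == "default"
as1border_subset = sample_routes[is_as1border & is_default_vrf]

# Narrow the subset further, here to BGP-learned routes only
as1border_bgp = as1border_subset[as1border_subset["Protocol"] == "bgp"]
print(as1border_bgp)
```

Filtering in the question itself avoids transferring rows you do not need, while filtering in Pandas is convenient once the full table is already in memory.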


### Debugging Connectivity
To debug a connectivity issue, network engineers often need to analyze the routing entries for a prefix at multiple devices. The commands below show you how to use Batfish to debug a connectivity issue to a server in the subnet **1.0.2.0/24**, as an example.

One of the first things you might do is try to find the devices which do not have a route for the prefix in question. That is easy to do with Batfish and the Pandas [groupby](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html) and [filter](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.DataFrameGroupBy.filter.html) functions.

In [5]:
# Group all routes by Node and filter for those that don't have '1.0.2.0/24'
nodes_filtered = routes_all.groupby('Node').filter(lambda x: all(x['Network'] != '1.0.2.0/24'))

# Get the unique node names and sort the list
print(sorted(nodes_filtered["Node"].unique()))

['host1', 'host2']
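
To make the `groupby`/`filter` step easier to follow, here is a small self-contained sketch on a hypothetical `sample_routes` table (rows taken from this notebook's outputs, reduced to two columns). It also shows an alternative based on a set difference, which is an illustration rather than the notebook's method. Both approaches compare exact prefix strings, so a node that only has a covering route (such as a default route) still counts as missing the prefix.

```python
import pandas as pd

# Hypothetical stand-in for routes_all, reduced to the columns used here
sample_routes = pd.DataFrame(
    [
        ("as1border1", "1.0.2.0/24"),
        ("as1border1", "1.0.1.0/24"),
        ("as2dept1", "1.0.2.0/24"),
        ("host1", "0.0.0.0/0"),
        ("host1", "2.128.0.0/24"),
        ("host2", "0.0.0.0/0"),
    ],
    columns=["Node", "Network"],
)
prefix = "1.0.2.0/24"

# groupby/filter: keep only the groups (nodes) in which no row matches the prefix
via_filter = sample_routes.groupby("Node").filter(
    lambda group: (group["Network"] != prefix).all()
)
missing_via_filter = sorted(via_filter["Node"].unique())

# Alternative: all nodes minus the nodes that do have the prefix
all_nodes = set(sample_routes["Node"])
nodes_with_prefix = set(
    sample_routes.loc[sample_routes["Network"] == prefix, "Node"]
)
missing_via_sets = sorted(all_nodes - nodes_with_prefix)

print(missing_via_filter)
print(missing_via_sets)
```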


The only devices that do not have a route to **1.0.2.0/24** are the two hosts in the snapshot. This is expected, as they should just have a default route. Let's verify that.

In [6]:
# Show all routes on the host nodes
routes_all[routes_all['Node'].str.contains('host')]

| | Node | VRF | Network | Protocol | NextHopIp | NextHop | AdminDistance | Metric | Tag |
|---|---|---|---|---|---|---|---|---|---|
| 101 | host2 | default | 2.128.1.0/24 | connected | AUTO/NONE(-1l) | | 0 | 0 | |
| 167 | host1 | default | 2.128.0.101/32 | local | AUTO/NONE(-1l) | | 0 | 0 | |
| 207 | host2 | default | 2.128.1.101/32 | local | AUTO/NONE(-1l) | | 0 | 0 | |
| 210 | host2 | default | 0.0.0.0/0 | static | 2.128.1.1 | as2dept1 | 1 | 0 | |
| 229 | host1 | default | 2.128.0.0/24 | connected | AUTO/NONE(-1l) | | 0 | 0 | |
| 254 | host1 | default | 0.0.0.0/0 | static | 2.128.0.1 | as2dept1 | 1 | 0 | |


Both hosts have their connected routes, local routes, and a static default route (that is how host default gateways are modeled in Batfish).
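
The hosts lack an exact entry for **1.0.2.0/24**, yet their default route still covers it. As a quick sanity check, the sketch below uses Python's standard `ipaddress` module (not Batfish) to test which of host1's prefixes contain the target subnet. The list `host1_networks` is copied by hand from the table above, and `subnet_of` requires Python 3.7 or later.

```python
import ipaddress

target = ipaddress.ip_network("1.0.2.0/24")

# host1's prefixes, copied by hand from the routing table above
host1_networks = ["2.128.0.101/32", "2.128.0.0/24", "0.0.0.0/0"]

# Keep the prefixes that fully contain the target subnet
covering = [
    net for net in host1_networks
    if target.subnet_of(ipaddress.ip_network(net))
]

# Longest-prefix match: among covering routes, the most specific one wins
best_match = max(covering, key=lambda net: ipaddress.ip_network(net).prefixlen)

print(covering)
print(best_match)
```

Only `0.0.0.0/0` covers the target, so traffic from host1 to that subnet follows the static default route.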

The next thing to do is get the routing information for **1.0.2.0/24** on all of the other devices, to see if that may be the cause of the connectivity issue.

In [7]:
# Get routes for '1.0.2.0/24'
routes_filtered = routes_all[routes_all["Network"] == "1.0.2.0/24"]

# Display the results
routes_filtered

| | Node | VRF | Network | Protocol | NextHopIp | NextHop | AdminDistance | Metric | Tag |
|---|---|---|---|---|---|---|---|---|---|
| 24 | as2dept1 | default | 1.0.2.0/24 | bgp | 2.34.101.3 | as2dist1 | 20 | 50 | |
| 26 | as3core1 | default | 1.0.2.0/24 | ibgp | 10.13.22.1 | as1border2 | 200 | 50 | |
| 73 | as3border1 | default | 1.0.2.0/24 | ibgp | 10.13.22.1 | as1border2 | 200 | 50 | |
| 90 | as2dist2 | default | 1.0.2.0/24 | ibgp | 10.12.11.1 | as1border1 | 200 | 50 | |
| 91 | as2dist2 | default | 1.0.2.0/24 | ibgp | 10.12.11.1 | as1border1 | 200 | 50 | |
| 93 | as2dist1 | default | 1.0.2.0/24 | ibgp | 10.12.11.1 | as1border1 | 200 | 50 | |
| 94 | as2dist1 | default | 1.0.2.0/24 | ibgp | 10.12.11.1 | as1border1 | 200 | 50 | |
| 143 | as1border1 | default | 1.0.2.0/24 | ospf | 1.0.1.2 | as1core1 | 110 | 2 | |
| 144 | as2border2 | default | 1.0.2.0/24 | ibgp | 10.12.11.1 | as1border1 | 200 | 50 | |
| 145 | as2border2 | default | 1.0.2.0/24 | ibgp | 10.12.11.1 | as1border1 | 200 | 50 | |
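
With many rows per prefix, a per-node summary can be easier to scan than the raw table. The sketch below is one possible way to condense it with Pandas named aggregation. `sample` is a hypothetical stand-in holding a few rows from the output above, and the summary column names (`route_count`, `protocols`, `next_hops`) are invented for this illustration.

```python
import pandas as pd

# Hypothetical stand-in for routes_filtered: a few rows from the output above
sample = pd.DataFrame(
    [
        ("as2dept1", "bgp", "as2dist1"),
        ("as3core1", "ibgp", "as1border2"),
        ("as2dist2", "ibgp", "as1border1"),
        ("as2dist2", "ibgp", "as1border1"),
        ("as1border1", "ospf", "as1core1"),
    ],
    columns=["Node", "Protocol", "NextHop"],
)

# One row per node: number of entries, plus the distinct protocols and next hops
summary = (
    sample.groupby("Node")
    .agg(
        route_count=("Protocol", "size"),
        protocols=("Protocol", lambda s: ", ".join(sorted(s.unique()))),
        next_hops=("NextHop", lambda s: ", ".join(sorted(s.unique()))),
    )
    .reset_index()
)
print(summary)
```

A node with more than one entry for the same prefix, such as as2dist2 here, stands out through its `route_count`.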


If the route entries on each of the devices look correct, the next troubleshooting step would be to determine the path between the endpoint having issues and the server in the subnet **1.0.2.0/24**, and then to evaluate any ACLs or firewall rules configured on devices along that path.

Future notebooks will dive into these topics, so stay tuned!

To recap, in this notebook we covered the foundational tasks for route analysis:

1. How to get routes at all nodes in the network or only at a subset of them
2. How to find which nodes have an entry for a prefix or which ones do not

### Want to know more?

Reach out to us through [Slack](https://join.slack.com/t/batfish-org/shared_invite/enQtMzA0Nzg2OTAzNzQ1LTUxOTJlY2YyNTVlNGQ3MTJkOTIwZTU2YjY3YzRjZWFiYzE4ODE5ODZiNjA4NGI5NTJhZmU2ZTllOTMwZDhjMzA) or [GitHub](https://github.com/batfish/batfish) to learn more or send feedback.