NetGraf: Real-time Network Monitoring Tool that uses Machine Learning

Network performance monitoring collects heterogeneous data, such as network flow data, to give an overview of network performance and the other metrics needed to diagnose and optimize service quality. However, because the data sources are disparate and heterogeneous, engineers have to log into multiple dashboards to obtain metrics and visualize data from several devices. Here we present NetGraf: an end-to-end learning monitoring system that uses existing monitoring tools and libraries to analyze the data and perform real-time anomaly detection. It can learn network performance baselines, identify important data features, and capture a holistic view of the networking infrastructure from packet-, flow-, and device-level data by tapping into multiple open-source solutions. NetGraf uses automated deployment with Ansible to bring real-time visualizations of network health metrics from multiple monitoring sources into a single dashboard. With its machine learning libraries, NetGraf can learn baseline performance, with the eventual goal of optimizing a self-learning network.

Guide and Documentation

NetGraf API documentation:
NetGraf Example Tutorials: https://github.com/esnet/daphne/tree/master/NetGraf-Ansible

Explanations

Website
Blog
Video

Installation Guide

Please ensure you have the IP addresses of the network devices, switches, routers, servers, systems, or hosts you intend to monitor. NetGraf has been tested on Chameleon Cloud and DigitalOcean instances, but you can use any environment or provider of your choice.

Installation Prerequisites

Create a set of virtual machine (VM) instances or network devices that you intend to monitor, and ensure they can reach (ping) each other.
For example, you can spin up VMs using Chameleon Cloud, Amazon EC2, DigitalOcean, or any other cloud provider of your choice.
To reserve a node and launch an instance on Chameleon, follow the steps provided here.
Make one of your hosts or network devices the Control Node, and the others the Target Nodes.
Modify your hosts file to list the IP address of each intended Target Node.
Depending on your Control Node OS, install Ansible using the steps provided here.
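As an illustration of the hosts-file step, the following minimal Python sketch renders an Ansible INI inventory from a list of IP addresses. The group names `control` and `targets`, the default user `ubuntu`, and the helper name `render_inventory` are assumptions made for this example; the hosts file shipped with NetGraf may be organized differently, so treat the output as a starting point only.

```python
from ipaddress import ip_address


def render_inventory(control_ip, target_ips, user="ubuntu"):
    """Return Ansible INI inventory text for one control node and its targets.

    The group names "control" and "targets" and the default user are
    illustrative assumptions, not names taken from the NetGraf playbook.
    """
    # ip_address() raises ValueError on malformed input, so typos fail early.
    hosts = [str(ip_address(ip)) for ip in [control_ip, *target_ips]]
    lines = ["[control]", f"{hosts[0]} ansible_user={user}", "", "[targets]"]
    lines += [f"{ip} ansible_user={user}" for ip in hosts[1:]]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Documentation-range addresses (RFC 5737); replace with your own.
    print(render_inventory("192.0.2.10", ["192.0.2.11", "192.0.2.12"]), end="")
```

Redirecting the output to a file gives a draft inventory that you can compare against the hosts file in the repository before running any playbook.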

Ensure the Control Node can reach the Target Nodes over SSH: create a key pair using the instructions here, then copy your public key to each Target Node using:

ssh-copy-id user@0.0.0.0
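With more than a few Target Nodes, repeating the key copy by hand gets tedious. The sketch below, which is not part of NetGraf, builds one `ssh-copy-id` command per host and only prints them unless `dry_run` is switched off. The user name, the addresses, and the helper names `copy_key_commands` and `copy_keys` are hypothetical.

```python
import shlex
import subprocess


def copy_key_commands(user, hosts):
    """Build one ssh-copy-id argv list per target host."""
    return [["ssh-copy-id", f"{user}@{host}"] for host in hosts]


def copy_keys(user, hosts, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for argv in copy_key_commands(user, hosts):
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops at the first host that rejects the key copy.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical user and documentation-range addresses; replace with yours.
    copy_keys("ubuntu", ["192.0.2.11", "192.0.2.12"])
```

Keeping the dry run as the default makes it easy to review the host list before any key is actually copied.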

Once the installation is complete and your environment is set up, check your Ansible version:

ansible --version

From a terminal, clone the repository into your home directory and change into it:

git clone https://github.com/esnet/netgraf.git

cd netgraf

Test the connectivity between your nodes (Control Node and Target Nodes). The -o flag condenses the output to one line per host:

ansible all -m ping

ansible all -m ping -o

Install NetGraf with a single command. The second form times the run and prints verbose output (-vvv):

ansible-playbook playbook.yml

time ansible-playbook playbook.yml -vvv

Once the NetGraf installation is complete, check the PLAY RECAP for errors, then check the status of Prometheus and Grafana:

sudo systemctl status prometheus

sudo systemctl status grafana

To view all active target nodes and metrics, open the following URLs in a browser. Note that our controller IP address is 159.65.60.19; substitute your own. Please refer to our environment setup and IP address assignment here:

 http://159.65.60.19:9090/targets

 http://159.65.60.19:9090/graph
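The same target information is available programmatically through the Prometheus HTTP API at `/api/v1/targets`. The sketch below is an illustration rather than part of NetGraf: it reduces the API response to a mapping from scrape URL to health. The helper names `summarize_targets` and `fetch_targets` and the sample payload values are made up for the example.

```python
import json
from urllib.request import urlopen


def summarize_targets(payload):
    """Map each active target's scrape URL to its reported health."""
    active = payload["data"]["activeTargets"]
    return {target["scrapeUrl"]: target["health"] for target in active}


def fetch_targets(base_url, timeout=5):
    """Fetch the target list from the Prometheus HTTP API."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Sample payload in the shape the API returns; the values are invented.
    sample = {
        "status": "success",
        "data": {
            "activeTargets": [
                {"scrapeUrl": "http://192.0.2.11:9100/metrics", "health": "up"},
                {"scrapeUrl": "http://192.0.2.12:9100/metrics", "health": "down"},
            ],
            "droppedTargets": [],
        },
    }
    for url, health in summarize_targets(sample).items():
        print(f"{health:8} {url}")
```

Against a live controller you would call `summarize_targets(fetch_targets("http://159.65.60.19:9090"))`, replacing the address with that of your own Control Node.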

To view the all-in-one NetGraf Dashboard on the controller node:

 http://159.65.60.19:3000

To check the log and confirm that the collected metrics are streaming into the central database:

cd /var/log/promscale/

tail -n 30 -f promscale.log

To log in to the central database and inspect the stored metrics:

sudo su postgres -c psql

\l+

\c timescaledb_db

select * from metric;

\q

To extract specific network-related metrics and store them in the database for analysis:

 bash monitoring_script.sh

To export the collected data to a remote location (Google) using rclone:

rclone copy /opt/monitor_metrics netgraf_metrics:/metrics_data/ -v

To run the machine learning component, change to the ml-pipeline directory and follow the instructions here:

cd ml-pipeline

Network and System Monitoring tools

NetGraf currently supports the following network and system monitoring tools:

ntopng
netdata
collectl
prometheus
perfSONAR
Confluo
zabbix
node_exporter
grafana

Features

NetGraf currently includes the following features:

Machine Learning Models:

LSTM
SARIMA
Exponential smoothing
ARIMA
Facebook Prophet
FFT (Fast Fourier Transform)
DDCRNN
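To give a feel for how a baseline model can support anomaly detection, here is a minimal pure-Python sketch of simple exponential smoothing with a residual threshold. It is an illustration only and not the code in the ml-pipeline directory; the function names, the smoothing factor `alpha`, and the threshold `k` are assumptions chosen for the example.

```python
from statistics import pstdev


def smooth(values, alpha=0.3):
    """Simple exponential smoothing: return the one-step-ahead forecasts.

    forecasts[i] is the prediction for values[i] made from values[:i];
    the first forecast is seeded with the first observation.
    """
    forecasts = [values[0]]
    for x in values[:-1]:
        forecasts.append(alpha * x + (1 - alpha) * forecasts[-1])
    return forecasts


def find_anomalies(values, alpha=0.3, k=3.0):
    """Return indices whose forecast error exceeds k residual std deviations."""
    residuals = [x - f for x, f in zip(values, smooth(values, alpha))]
    spread = pstdev(residuals)
    # A perfectly flat series has zero spread, so nothing is flagged.
    return [i for i, r in enumerate(residuals) if spread and abs(r) > k * spread]


if __name__ == "__main__":
    # Invented throughput samples with one spike injected at index 10.
    series = [10.0] * 20
    series[10] = 50.0
    print(find_anomalies(series))
```

A fixed threshold on smoothing residuals is the simplest possible detector; the seasonal and neural models listed above exist to handle traffic patterns that this sketch would misread as anomalies.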

Tests

A Gradle setup works best inside a Python environment; the only requirement is to have pip installed for Python 3+.

To run all tests at once, run:

./gradlew test_all

Alternatively, you can run:

./gradlew unitTest_all # to run only unit tests
./gradlew coverageTest # to run coverage
./gradlew lint         # to run linter

Documentation

To build the documentation locally, run:

./gradlew buildDocs

After that, the docs will be available in the ./docs/build/html directory. Open ./docs/build/html/index.html in your favourite browser.

Contact Us

See the attached license (Lawrence Berkeley National Laboratory). Email: Mariam Kiran, mkiran@es.net
