This repository has been archived by the owner on Aug 24, 2022. It is now read-only.
NetMet is a networking tool that lets you track and analyze the network uptime of multi-data-center installations.

godaddy/netmet

Motivation

As a cloud provider, one must guarantee an SLA, for example 99.95% dataplane uptime (~43 seconds of downtime per day). This means a few things:

  • Cloud provider cannot rely on customer generated tickets for downtime measurements.
  • Cloud provider needs to be proactive:
    • Get alerts within seconds after any downtime occurs.
    • Have all required data for debugging in place
      • Which Data Centers (DC), Availability Zones (AZ) & Servers are affected
      • Is it an underlay or overlay network issue
      • When the event started and when it stopped

To verify uptime requirements we developed NetMet – a tool that constantly measures connectivity between all servers including those in different availability zones or even different data centers.
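
The downtime figure quoted for the 99.95% SLA can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name `downtime_budget_seconds` is hypothetical and not part of NetMet.

```python
def downtime_budget_seconds(sla_percent: float, period_seconds: float) -> float:
    """Return the maximum downtime allowed in a period for a given uptime SLA."""
    return period_seconds * (100.0 - sla_percent) / 100.0


DAY = 24 * 60 * 60          # seconds in one day
MONTH = 30 * DAY            # a 30-day month, used here only as an example period

# Rounded to one decimal to hide floating point noise.
print(round(downtime_budget_seconds(99.95, DAY), 1))         # 43.2 seconds per day
print(round(downtime_budget_seconds(99.95, MONTH) / 60, 1))  # 21.6 minutes per month
```

A budget this small is why alerts need to arrive within seconds rather than through customer tickets.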

Contributing

Everybody is welcome to contribute to the project. We use the standard GitHub process with Issues & PRs.

Architecture

[Diagram: Netmet Architecture]

The client-server architecture of NetMet was designed with a clear separation of concerns in mind:

  • Clients periodically perform connectivity checks between each other
  • Clients periodically perform Internet connectivity checks
  • Clients send data to the server
  • The server aggregates the data and stores it in Elasticsearch
  • The server exposes an API for retrieving the aggregated data to facilitate its visualization.
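
To make the first bullet concrete, here is a minimal sketch of how a full mesh of checks could be derived from a list of clients, assuming every client checks every other client. This is not NetMet's actual code: the `Client` type, the `full_mesh_targets` function and the example addresses are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Client(NamedTuple):
    host: str
    dc: str
    az: str


def full_mesh_targets(clients: List[Client]) -> Dict[str, List[str]]:
    """Map each client's host to the hosts of all other clients it should check."""
    return {
        c.host: [other.host for other in clients if other.host != c.host]
        for c in clients
    }


# Hypothetical clients spread over two data centers.
clients = [
    Client("10.0.0.1", "dc1", "az1"),
    Client("10.0.0.2", "dc1", "az2"),
    Client("10.0.1.1", "dc2", "az1"),
]
targets = full_mesh_targets(clients)
for host, peers in targets.items():
    print(host, "->", peers)
```

With n clients, a full mesh produces n * (n - 1) directed checks, so the amount of data sent to the server grows quadratically with the number of monitored servers.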

Deployment

Run all Netmet clients and servers:

  • Netmet Server: run a few Netmet server instances behind HAProxy or Nginx
  • Netmet Client: run one client on each server that should be monitored

Physical placement

To collect all the metrics needed to monitor a network of data centers, use the following scheme:

[Diagram: Netmet Deployment]

  • Run a few instances of the Netmet server in different regions
  • Run one instance of the Netmet client on each server
  • Run Elasticsearch in cluster mode

Logical placement

To avoid Netmet downtime, use the following scheme:

[Diagram: Netmet Placement]

  • Netmet servers should be run behind Nginx/HAProxy/LBaaS (for now)
  • A Netmet server may be given multiple Elasticsearch addresses (no need for HA)

Install & Run

To install NetMet from source, run the following command:

pip install .   # run it from root directory

After that, the netmet command should be available.

To run Netmet Server

APP=server NETMET_SERVER_URL="<url:port>" NETMET_OWN_URL="<url:port>" ELASTIC="<url:port>" PORT=5005 netmet

To run Netmet Client

APP=client PORT=5005 netmet

Configure & Upgrade

Netmet is meant to be very easy to configure. All configuration is done via the Netmet server API method POST /api/v2/config, which in the future is going to update the installation (remove/add clients), generate new client configurations and update clients.

Updates & Upgrades

To configure Netmet, use POST /api/v2/config:

cat > config.json <<- EOM
{
    "deployment": {
        "static": {
            "clients": [
                {
                    "az": "test-az",
                    "dc": "test-dc",
                    "host": "127.0.0.1",
                    "ip": "127.0.0.1",
                    "port": 5001
                }
            ]
        }
    },
    "external": [
        {"dest": "8.8.8.8", "period": 1, "protocol": "icmp", "timeout": 0.5}
    ],
    "mesher": {
        "full_mesh": {}
    }
}
EOM

curl -H "Content-Type: application/json" -X POST -d '@config.json' ${NETMET_SERVER_URL}/api/v2/config
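
The same request can also be prepared from Python, which may be convenient when the client list is generated by a script. This is a minimal sketch using only the standard library. The helper names `build_config` and `build_request` and the server address are assumptions for illustration; only the payload shape and the POST /api/v2/config endpoint come from the example above.

```python
import json
import urllib.request


def build_config(clients, external_dest="8.8.8.8"):
    """Build a config dict with the same shape as config.json above."""
    return {
        "deployment": {"static": {"clients": clients}},
        "external": [
            {"dest": external_dest, "period": 1, "protocol": "icmp", "timeout": 0.5}
        ],
        "mesher": {"full_mesh": {}},
    }


def build_request(server_url, config):
    """Prepare (but do not send) the POST request that the curl command issues."""
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/api/v2/config",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


config = build_config([
    {"az": "test-az", "dc": "test-dc", "host": "127.0.0.1",
     "ip": "127.0.0.1", "port": 5001},
])
# Hypothetical server address; use your own NETMET_SERVER_URL here.
request = build_request("http://127.0.0.1:5005", config)
print(request.get_method(), request.full_url)
# To actually send the configuration: urllib.request.urlopen(request)
```

The request is built separately from sending it so that the payload can be inspected or logged before it reaches a running server.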

Running Tests

Running the tests is very easy.

Install the tox tool

pip install tox

Run tox

tox             # runs all tests
tox -e pep8     # runs only pep8 code style checks
tox -e py27     # runs unit tests using python 2.7
