# Degree Algorithm with Neptune Analytics

This notebook demonstrates how to offload degree centrality computations to a remote Amazon Neptune Analytics instance.

## Setup and Imports

First, let's import the necessary libraries and set up logging.

In [None]:
import networkx as nx
from nx_neptune import NeptuneGraph
import logging
import os
import matplotlib.pyplot as plt
from nx_neptune.utils.utils import get_stdout_logger
import requests
import pandas as pd

In [None]:
logger = get_stdout_logger(__name__,[
                    'nx_neptune.algorithms.centrality.degree_centrality',
                    'nx_neptune.na_graph', 'nx_neptune.utils.decorators', __name__])

# Ignore cache warnings
nx.config.warnings_to_ignore.add("cache")

## Check for Neptune Analytics Graph ID

We need to ensure that the NETWORKX_GRAPH_ID environment variable is set. You can also set it directly in this notebook.

In [None]:
# Read and load graphId from environment variable
graph_id = os.getenv('NETWORKX_GRAPH_ID')

# If not set, you can set it here
if not graph_id:
    # Uncomment and set your Graph ID
    # %env NETWORKX_GRAPH_ID=your-neptune-analytics-graph-id
    # graph_id = os.getenv('NETWORKX_GRAPH_ID')
    print("Warning: Environment Variable NETWORKX_GRAPH_ID is not defined")
    print("You can set it using: %env NETWORKX_GRAPH_ID=your-neptune-analytics-graph-id")
else:
    print(f"Using Neptune Analytics Graph ID: {graph_id}")

## Download and configure the air route dataset

Next, download the OpenFlights air route dataset for testing purposes and load it into a directed graph.

In [None]:
# Download routes data
routes_url = "https://raw.githubusercontent.com/jpatokal/openflights/master/data/routes.dat"
routes_file = "resources/notebook_test_data_routes.dat"

# Ensure the directory exists
os.makedirs(os.path.dirname(routes_file), exist_ok=True)

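# Fetch the file, failing fast on network or HTTP errors
response = requests.get(routes_url, timeout=30)
response.raise_for_status()

with open(routes_file, "wb") as f:
    f.write(response.content)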

cols = [
    "airline", "airline_id", "source_airport", "source_airport_id",
    "dest_airport", "dest_airport_id", "codeshare", "stops", "equipment"
]

routes_df = pd.read_csv(routes_file, names=cols, header=None)

g = nx.DiGraph()  # use DiGraph for directed air routes

for _, row in routes_df.iterrows():
    src = row["source_airport"]
    dst = row["dest_airport"]
    if pd.notnull(src) and pd.notnull(dst):
        g.add_edge(src, dst)
print(f'Populated test dataset with nodes:{g.number_of_nodes()} and edges:{g.number_of_edges()}')
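
The row-by-row loop above can also be written with NetworkX's `from_pandas_edgelist` helper. The sketch below is a minimal illustration on a tiny made-up table; `sample_df`, `clean_df`, and `sample_g` are hypothetical names, and the three-letter codes are placeholders rather than rows from the real dataset.

```python
import networkx as nx
import pandas as pd

# Tiny hypothetical sample in the same shape as the routes table
sample_df = pd.DataFrame({
    "source_airport": ["AAA", "AAA", "BBB", None],
    "dest_airport": ["BBB", "CCC", "CCC", "AAA"],
})

# Drop rows with a missing endpoint, mirroring the pd.notnull checks
clean_df = sample_df.dropna(subset=["source_airport", "dest_airport"])

# Build the directed graph in a single call
sample_g = nx.from_pandas_edgelist(
    clean_df,
    source="source_airport",
    target="dest_airport",
    create_using=nx.DiGraph,
)
print(f"nodes:{sample_g.number_of_nodes()} edges:{sample_g.number_of_edges()}")
```

On a table the size of the routes file, this avoids the per-row overhead of `iterrows()`, although either approach should produce the same graph.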

### Example 1: Execute Degree Centrality Algorithm

Let's start by running the Degree Centrality algorithm against the air route data and listing the top 10 results.

In [None]:
r = nx.degree_centrality(g, backend="neptune")
logger.info("Algorithm execution - Neptune Analytics: ")
for key, value in sorted(r.items(), key=lambda x: (x[1], x[0]), reverse=True)[:10]:
    logger.info(f"{key}: {value}")
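
To make the reported values easier to interpret, here is a minimal pure-Python sketch of the degree centrality definition that NetworkX uses: each node's degree divided by `n - 1`, where a directed graph counts incoming plus outgoing edges. The function name `degree_centrality_sketch` and the sample edges are hypothetical, and this is an illustration of the formula rather than the code that runs on Neptune Analytics.

```python
from collections import Counter

def degree_centrality_sketch(edges):
    """Total degree (in + out) of each node divided by n - 1."""
    degree = Counter()
    nodes = set()
    # A DiGraph keeps a single edge per (src, dst) pair, so deduplicate first
    for src, dst in set(edges):
        degree[src] += 1
        degree[dst] += 1
        nodes.update((src, dst))
    # Guard against division by zero on a single-node graph
    scale = 1.0 / (len(nodes) - 1) if len(nodes) > 1 else 1.0
    return {node: degree[node] * scale for node in nodes}

# Hypothetical routes; the repeated AAA -> BBB pair is counted once
sample_edges = [
    ("AAA", "BBB"), ("AAA", "CCC"), ("BBB", "CCC"),
    ("CCC", "AAA"), ("AAA", "BBB"),
]
scores = degree_centrality_sketch(sample_edges)

# Same ordering as the cell above: by value, then by key, descending
ranked = sorted(scores.items(), key=lambda x: (x[1], x[0]), reverse=True)
for key, value in ranked:
    print(f"{key}: {value}")
```

Because a directed graph counts both directions, a busy hub can score above 1.0, as `AAA` and `CCC` do here with 1.5 each.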

### Example 2: Execute In Degree Centrality Algorithm

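Execute the In Degree Centrality algorithm against the air route dataset on the remote Neptune Analytics instance. This variant counts only incoming edges, so it ranks airports by the number of distinct origins that fly to them.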

In [None]:
r = nx.in_degree_centrality(g, backend="neptune")
logger.info("Algorithm execution - Neptune Analytics: ")
for key, value in sorted(r.items(), key=lambda x: (x[1], x[0]), reverse=True)[:10]:
    logger.info(f"{key}: {value}")

### Example 3: Execute Out Degree Centrality Algorithm

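Execute the Out Degree Centrality algorithm against the air route dataset on the remote Neptune Analytics instance. This variant counts only outgoing edges, so it ranks airports by the number of distinct destinations they serve.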

In [None]:
r = nx.out_degree_centrality(g, backend="neptune")
logger.info("Algorithm execution - Neptune Analytics: ")
for key, value in sorted(r.items(), key=lambda x: (x[1], x[0]), reverse=True)[:10]:
    logger.info(f"{key}: {value}")
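
As a quick sanity check on how the three measures relate, the sketch below runs them locally on a tiny hypothetical graph. No `backend` argument is passed, so nothing is sent to Neptune Analytics. In stock NetworkX, in-degree and out-degree centrality for a node sum to its degree centrality; whether the remote results match a local run to the last decimal is an assumption worth verifying on your own data.

```python
import math
import networkx as nx

# Small hypothetical directed graph; node names are placeholders
small_g = nx.DiGraph([("AAA", "BBB"), ("AAA", "CCC"), ("BBB", "CCC"), ("CCC", "AAA")])

# Local NetworkX runs (default backend)
total = nx.degree_centrality(small_g)
inbound = nx.in_degree_centrality(small_g)
outbound = nx.out_degree_centrality(small_g)

for node in sorted(small_g):
    # in + out should equal the total degree centrality for every node
    matches = math.isclose(inbound[node] + outbound[node], total[node])
    print(f"{node}: in={inbound[node]} out={outbound[node]} total={total[node]} consistent={matches}")
```

The same comparison can be applied to the dictionaries returned by the `backend="neptune"` calls above to spot discrepancies between local and remote execution.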