Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: Apache-2.0

![image.png](attachment:e3783a74-838b-4c3f-8cf1-8d8cce5ff7b2.png)
# Getting Started with Graph Algorithms
Neptune Analytics provides a set of optimized, in-database implementations of common graph algorithms, exposed to you as openCypher procedures. Graph algorithms provide insights into data based on inherent aspects of the underlying graph structure, such as connectedness (path finding), relative importance (centrality), community membership (community detection), and similarity.

Neptune Analytics currently supports algorithms in four main categories: path finding, centrality, community detection/clustering, and similarity. Each category is described below, with a link to a notebook that shows how to use its algorithms.

* [Path finding algorithms](./02-Path-Finding-Algorithms.ipynb) determine the existence, quality, or availability of one or more paths between two or more nodes in a graph, where a path is a connected sequence of nodes and edges.

When real-world systems such as road or social networks are modeled as interconnected nodes and edges, path-finding algorithms can efficiently determine the optimal route between two nodes. Finding shortest paths is crucial in applications such as GPS route planning and logistics optimization, and in solving complex problems in fields such as biology and engineering.
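As a conceptual illustration of weighted shortest-path search, here is a minimal Python sketch of Dijkstra's algorithm on a toy graph. It is not Neptune Analytics' implementation; the `shortest_path` helper, the airport codes, and the distances are made up for this example.

```python
import heapq


def shortest_path(graph, source, target):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if unreachable."""
    # Priority queue of (cost so far, node, path taken to reach the node).
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical routes; the edge weights are illustrative distances only.
routes = {
    "SEA": {"PDX": 130, "DEN": 1020},
    "PDX": {"SFO": 550},
    "DEN": {"SFO": 970},
    "SFO": {},
}

cost, path = shortest_path(routes, "SEA", "SFO")
print(cost, path)
```

The search always expands the cheapest known route first, so the first time it reaches the target it has found a lowest-cost path (assuming non-negative edge weights).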

* [Centrality algorithms](./03-Centrality-Algorithms.ipynb) are used to determine the absolute or relative importance or influence of a node or nodes within a graph. 

By identifying the most influential or important nodes within a network, centrality algorithms can provide insights about key players or critical points of interaction. This is valuable in social network analysis, where it helps pinpoint influential individuals, and in transportation networks, where it aids in identifying crucial hubs for efficient routing and resource allocation.

* [Community detection/clustering algorithms](./04-Community-Detection-Algorithms.ipynb) evaluate how nodes cluster into communities, which are closely knit groups of highly interconnected nodes.

Community-detection algorithms can identify meaningful groups or clusters of nodes in a network, revealing hidden patterns and structures that can provide insights into the organization and dynamics of complex systems. This is valuable in social network analysis, in biology for identifying functional modules in protein-protein interaction networks, and more generally for understanding information flow and influence propagation across many domains.
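One of the simplest ways to group nodes is to find connected components, treating every edge as undirected. The sketch below does this with a small union-find structure. It is a toy illustration on made-up data, not the way Neptune Analytics computes communities.

```python
def connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Merge the components of the two endpoints of every edge.
    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Sort for a stable, readable result.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical graph with two disconnected groups.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]

components = connected_components(nodes, edges)
print(components)
```

More sophisticated community-detection algorithms also split a single connected component into denser sub-groups, which this sketch does not attempt.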

* [Similarity algorithms](./05-Similarity-Algorithms.ipynb) measure how alike nodes are, for example by comparing the neighbors they have in common.

Graph similarity algorithms allow you to compare and analyze the similarities and dissimilarities between different graph structures, which can provide insight into relationships, patterns, and commonalities across diverse datasets. This is valuable in fields such as biology, for comparing molecular structures; social network analysis, for identifying similar communities; and recommendation systems, for suggesting similar items based on user preferences.
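One common similarity measure, Jaccard similarity, compares two nodes by the fraction of neighbors they share. The following Python sketch shows the idea on a toy dataset; the `jaccard_similarity` helper and the purchase data are hypothetical and only illustrate the concept.

```python
def jaccard_similarity(graph, a, b):
    """Return |N(a) & N(b)| / |N(a) | N(b)|, where N(x) is the neighbor set of x."""
    neighbors_a = set(graph.get(a, ()))
    neighbors_b = set(graph.get(b, ()))
    union = neighbors_a | neighbors_b
    if not union:
        # Two nodes without neighbors have nothing to compare.
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)


# Hypothetical customers and the items they are connected to.
purchases = {
    "alice": {"book", "lamp", "desk"},
    "bob": {"book", "lamp", "chair"},
    "carol": {"sofa"},
}

print(jaccard_similarity(purchases, "alice", "bob"))
print(jaccard_similarity(purchases, "alice", "carol"))
```

A score of 1.0 means the two nodes have identical neighbor sets, and 0.0 means they share none.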

Many of these algorithms need to touch most or all of the nodes and edges in a graph, often repeatedly over many iterations. This makes them computationally expensive to run with general-purpose analytics technologies. Neptune Analytics provides a set of highly optimized algorithms designed to perform these operations efficiently, even on large graphs.
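To see why such algorithms are expensive, consider a simplified PageRank computed by power iteration: every iteration visits every node and every edge. The sketch below is a textbook-style illustration on a made-up graph, with assumed defaults for `damping` and `iterations`; it is not Neptune Analytics' implementation.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Simplified PageRank; every edge target must also appear as a key in graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, targets in graph.items():
            if targets:
                # Split this node's rank evenly across its outgoing edges.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing edges spreads its rank over all nodes.
                share = damping * rank[node] / len(nodes)
                for n in nodes:
                    new_rank[n] += share
        rank = new_rank
    return rank


# Hypothetical directed graph: node -> list of nodes it links to.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}

ranks = pagerank(links)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(score, 3))
```

The cost of each iteration grows with the number of nodes plus edges, and the loop repeats many times, which is why optimized implementations matter on large graphs.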


## Using Graph Algorithms in Neptune Analytics

Graph algorithms are integrated into Neptune Analytics through the openCypher query language. You run an algorithm with the `CALL` clause, as illustrated below.

*Find the 10 most important airports in Washington*
```
MATCH (n:airport {region: 'US-WA'})
CALL neptune.algo.pageRank(n)
YIELD rank
RETURN n.code, rank
ORDER BY rank DESC LIMIT 10
```

### Invoking Algorithms
Algorithms can be run in a query by themselves, known as Standalone, or in queries with other clauses, known as Query Integrated, where their inputs can be further constrained or their outputs can be used with standard openCypher syntax and semantics. 

For example, a Standalone algorithm invocation might look like:

```
CALL neptune.algo.pageRank.mutate()
```

A Query Integrated algorithm invocation, by contrast, might look like:

```
MATCH (n:airport {code: 'SEA'})
CALL neptune.algo.pageRank(n)
YIELD rank
RETURN n.code, rank
```

### Algorithm Variations

Many of the algorithms available in Neptune Analytics come in several variations. A variation may return a different output for the same algorithm, such as `bfs.parents` and `bfs.levels`; use a different implementation, such as `sssp.bellmanFord` and `sssp.deltaStepping`; or manage results differently, such as `wcc` and `wcc.mutate`. In each case, check the documentation for the specific algorithm to get a more detailed understanding of the available options.
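To illustrate how one algorithm can yield different outputs, the Python sketch below runs a single breadth-first search and records both the level (hop distance from the source) and the parent of each visited node. This is a conceptual analogy to the `bfs.levels` and `bfs.parents` variations only; it does not reproduce Neptune Analytics' implementation or output format, and the graph is made up.

```python
from collections import deque


def bfs(graph, source):
    """Return (levels, parents) for all nodes reachable from source."""
    levels = {source: 0}      # hop distance from the source
    parents = {source: None}  # node from which each node was first reached
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in levels:
                levels[neighbor] = levels[node] + 1
                parents[neighbor] = node
                queue.append(neighbor)
    return levels, parents


# Hypothetical directed graph: node -> list of neighbors.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

levels, parents = bfs(graph, "a")
print(levels)
print(parents)
```

The traversal is identical in both cases; only the information kept about each visited node differs.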

## Next Steps
In this notebook, we introduced the categories of graph algorithms available in Neptune Analytics and showed how to invoke them from openCypher. Below are four notebook links, one for each category, that provide a deeper dive into its algorithms.

* [Path finding algorithms](./02-Path-Finding-Algorithms.ipynb)

* [Centrality algorithms](./03-Centrality-Algorithms.ipynb)

* [Community detection/clustering algorithms](./04-Community-Detection-Algorithms.ipynb)

* [Similarity algorithms](./05-Similarity-Algorithms.ipynb)