# Neptune Analytics Instance Management With S3 Tables Projections

This notebook uses `SessionManager` to create a projection from an S3 Tables data lake and load that projection into Neptune Analytics through S3. It then uses the Louvain algorithm to find potentially fraudulent nodes and exports the mutated graph back to S3 for the data lake.

This notebook demonstrates how to:
1. Create a projection from an S3 bucket.
2. Import the projection into Neptune Analytics.
3. Run the Louvain algorithm on the provisioned instance.
4. Export the graph back to an S3 bucket.

## Setup

Import the necessary libraries and set up logging.

In [1]:
import logging
import sys
import os
import dotenv

dotenv.load_dotenv()

from nx_neptune.session_manager import SessionManager

In [2]:
# Configure logging to see detailed information about the instance creation process
logging.basicConfig(
    level=logging.INFO,
    format='%(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
    stream=sys.stdout  # Explicitly set output to stdout
)
# Enable debug logging for the instance management module
for logger_name in ['nx_neptune.instance_management', 'nx_neptune.session_manager']:
    logging.getLogger(logger_name).setLevel(logging.DEBUG)
logger = logging.getLogger(__name__)

## Configuration

Check for environment variables and configure the NetworkX backend for Neptune Analytics.

In [4]:
def check_env_vars(var_names):
    values = {}
    for var_name in var_names:
        value = os.getenv(var_name)
        if not value:
            print(f"Warning: Environment Variable {var_name} is not defined")
            print(f"You can set it using: %env {var_name}=your-value")
        else:
            print(f"Using {var_name}: {value}")
        values[var_name] = value
    return values
    
# Check for optional environment variables
env_vars = check_env_vars([
    'NETWORKX_S3_IMPORT_BUCKET_PATH',
    'NETWORKX_S3_EXPORT_BUCKET_PATH',
])

# Get environment variables
s3_location_import = os.getenv('NETWORKX_S3_IMPORT_BUCKET_PATH')  # Optional: for importing data after creation
s3_location_export = os.getenv('NETWORKX_S3_EXPORT_BUCKET_PATH')  # Optional: for exporting data after the computation

Using NETWORKX_S3_IMPORT_BUCKET_PATH: s3://nx-fraud-detection/projection/
Using NETWORKX_S3_EXPORT_BUCKET_PATH: s3://nx-fraud-detection
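
`check_env_vars` only warns when a variable is missing, so a malformed path would surface later, during import or export. As a minimal sketch (the helper name `split_s3_uri` is hypothetical and not part of `nx_neptune`), the two paths printed above could be validated and split into bucket and prefix using only the standard library:

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/prefix URI into (bucket, prefix).

    Hypothetical helper for illustration; raises ValueError on anything
    that is not an s3:// URI with a bucket name.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # The path keeps its leading slash; strip it to get the key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


# Values taken from the cell output above.
import_bucket, import_prefix = split_s3_uri("s3://nx-fraud-detection/projection/")
export_bucket, export_prefix = split_s3_uri("s3://nx-fraud-detection")
print(import_bucket, repr(import_prefix))
print(export_bucket, repr(export_prefix))
```

Failing early on a bad URI is cheaper than waiting for a provisioned instance to reject the import.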


## Create a New Neptune Analytics Instance

Provision a new Neptune Analytics instance on demand. This process may take several minutes to complete.

In [5]:
session = SessionManager.Session("example_fraud_detection")
graph_list = session.list_graphs()
logger.info(f"The following graphs are available: {graph_list}")

graph = await session.get_or_create_graph()
logger.info(f"Created graph: {graph}")

graph_list = session.list_graphs()
logger.info(f"The following graphs are available: {graph_list}")

INFO - Found credentials in environment variables.
INFO - The following graphs are available: []
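
Because provisioning may take several minutes, it can help to bound how long the notebook waits. The sketch below wraps any awaitable in `asyncio.wait_for`. `slow_provision` is a hypothetical stand-in for the real provisioning call and `with_timeout` is an illustrative helper; neither is part of `nx_neptune`.

```python
import asyncio


async def slow_provision(delay_s: float) -> str:
    """Hypothetical stand-in for a long-running provisioning call."""
    await asyncio.sleep(delay_s)
    return "AVAILABLE"


async def with_timeout(awaitable, timeout_s: float, default=None):
    """Await `awaitable`; return `default` if it takes longer than `timeout_s`."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_s)
    except asyncio.TimeoutError:
        return default


# Finishes well inside the limit.
fast = asyncio.run(with_timeout(slow_provision(0.01), timeout_s=1.0))
# Exceeds the limit, so the default is returned instead.
slow = asyncio.run(
    with_timeout(slow_provision(1.0), timeout_s=0.05, default="TIMED_OUT")
)
print(fast, slow)
```

Inside a notebook cell, where an event loop is already running, you would `await with_timeout(...)` directly rather than calling `asyncio.run`. Note that a local timeout only stops the wait; it does not necessarily cancel provisioning on the service side.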


## Import Data from S3

Import data from S3 into the Neptune Analytics graph and wait for the operation to complete. <br>
IAM permissions required for import: <br>
 - s3:GetObject, kms:Decrypt, kms:GenerateDataKey, kms:DescribeKey

In [None]:
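import networkx as nx
# Assumed import locations; adjust to your installed nx_neptune version
from nx_neptune import NeptuneGraph
from nx_neptune.instance_management import import_csv_from_s3

# graph_id must hold the ID of the graph created above
os.environ['NETWORKX_GRAPH_ID'] = graph_id
na_graph = NeptuneGraph.from_config(graph=nx.DiGraph())
future = import_csv_from_s3(na_graph, s3_location_import)
import_blocking_status = await future
print(f"Import completed with status: {import_blocking_status}")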
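
The import permissions listed above can be written as an IAM policy document. The following is a sketch only: the helper name, the statement IDs, and the KMS key ARN are placeholders, and the resource scoping is an assumption about how you might restrict access. The four actions are the ones this notebook names; your setup may need others.

```python
import json


def build_import_policy(bucket: str, kms_key_arn: str) -> dict:
    """Build an IAM policy granting the import permissions listed above.

    Hypothetical helper: the resource scoping here is an assumption.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadImportObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Sid": "UseImportKey",
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
                "Resource": kms_key_arn,
            },
        ],
    }


policy = build_import_policy(
    bucket="nx-fraud-detection",
    # Placeholder ARN; substitute the key that encrypts your bucket.
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)
print(json.dumps(policy, indent=2))
```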

## Execute BFS Algorithm

Create a NetworkX graph and run breadth-first search (BFS) through the `neptune` backend, which dispatches the traversal to the Neptune Analytics instance.

In [None]:
g = nx.DiGraph()
# BFS on the air routes dataset, starting from node "48"
r = list(nx.bfs_edges(g, source="48", backend="neptune"))
print('BFS search on Neptune Analytics with source=48 (Vancouver international airport): ')
print(f"Total size of the result: {len(r)}")
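
To see the shape of the result without a Neptune Analytics instance, here is a plain-Python sketch of breadth-first search that yields `(parent, child)` tree edges in visit order, the same form of result `nx.bfs_edges` returns. It is an illustration of the technique, not the backend's implementation, and the node IDs and routes are made up.

```python
from collections import deque


def bfs_edges(adjacency, source):
    """Yield (parent, child) edges in the order BFS first discovers each child."""
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
                yield node, neighbor


# Hypothetical directed routes keyed by node ID.
routes = {"48": ["1", "2"], "1": ["3"], "2": ["3", "48"], "3": []}
edges = list(bfs_edges(routes, "48"))
print(edges)  # [('48', '1'), ('48', '2'), ('1', '3')]
```

Each reachable node other than the source appears exactly once as a child, so `len(edges)` equals the number of nodes reachable from the source.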


## Delete the Neptune Analytics Instance

Delete the Neptune Analytics instance after the computation. This process may take several minutes to complete.

In [None]:
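from nx_neptune.instance_management import delete_na_instance  # assumed import location

delete_status = await delete_na_instance(graph_id)
logger.info(f"Instance delete completed with status: {delete_status}")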


## Conclusion

This notebook demonstrated the complete lifecycle of a Neptune Analytics instance:

1. **Creation**: We created a new Neptune Analytics instance on demand.
2. **Import**: We imported the demo Air Route dataset from S3.
3. **Usage**: We ran graph algorithms (BFS) on the instance
4. **Deletion**: We deleted the on-demand instance after the computation finished.

The `create_na_instance()` and `delete_na_instance()` functions make it easy to provision and destroy Neptune Analytics resources when needed, so an instance only has to exist for the duration of a graph computation.