## Create a new Neptune Analytics graph with a vertex index

With embeddings generated for every Transaction, you can now load this data back into Neptune Analytics.

Here you will run a single import task that loads the original graph data together with the enriched predictions and embeddings.

In [None]:
import json
import os

# Read information about the graph from the JSON file you created in notebook 0
with open("task-info.json", "r") as f:
    task_info = json.load(f)

BUCKET = task_info["BUCKET"]
GRAPH_NAME = task_info["GRAPH_NAME"]
EXPORTED_GRAPH_S3 = task_info["EXPORTED_GRAPH_S3"]
# Build the S3 URI with "/" explicitly; os.path.join would use the OS path separator
ENRICHED_S3 = f"{EXPORTED_GRAPH_S3.rstrip('/')}/enriched"
GRAPH_ID = task_info["GRAPH_ID"]
AWS_REGION = task_info["AWS_REGION"]

# Neptune Analytics bulk loader role needs:
# - AmazonS3ReadOnlyAccess (or more restricted S3 access)
# - AWS KMS permissions if using encrypted S3 buckets
# For detailed permissions and trust policy requirements, see:
# https://docs.aws.amazon.com/neptune-analytics/latest/userguide/bulk-import-create-from-s3.html#create-iam-role-for-s3-access
# Replace the placeholder below with your own role ARN (AWS account IDs have 12 digits)
NEPTUNE_IMPORT_ROLE = "arn:aws:iam::123456789012:role/<Your-NeptuneAnalytics-Import-Role>"
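
Because the role ARN above is a placeholder, a quick format check before starting the import can catch a value that was never filled in. The following is a minimal sketch under stated assumptions: `looks_like_role_arn` and its regular expression are hypothetical and not part of any AWS SDK. It only checks the shape of an IAM role ARN in the standard `aws` partition, not whether the role exists or carries the required permissions.

```python
import re

# Hypothetical helper, not an AWS API.
# Expected shape: arn:aws:iam::<12-digit account ID>:role/<optional path/><role name>
_ROLE_ARN_RE = re.compile(r"arn:aws:iam::\d{12}:role/[\w+=,.@/-]+")


def looks_like_role_arn(arn: str) -> bool:
    """Return True if `arn` has the shape of an IAM role ARN (standard partition only)."""
    return _ROLE_ARN_RE.fullmatch(arn) is not None
```

You could call `looks_like_role_arn(NEPTUNE_IMPORT_ROLE)` and raise a `ValueError` when it returns `False`, so that an unedited placeholder fails in this notebook rather than inside the import task.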

In [3]:
import boto3
from botocore.config import Config

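# Shared boto3 config: a single attempt per call (no automatic retries)
# and no read timeout. It is reused for the SageMaker client later on.
config = Config(
    retries={"total_max_attempts": 1, "mode": "standard"},
    read_timeout=None,
)

# Create the Neptune Analytics client in the same region as the graph
neptune_graph = boto3.client("neptune-graph", region_name=AWS_REGION)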

### Move the training masks to a different path

Because Neptune Analytics expects only graph data to be present under the import S3 prefix, you will need to move the data split files you created for training, validation, and test sets:

In [None]:
OLD_MASKS_S3 = f"{EXPORTED_GRAPH_S3}/data_splits/"
NEW_MASKS_S3 = f"s3://{BUCKET}/neptune-input/{GRAPH_NAME}/data_splits/"

In [None]:
!aws s3 mv --recursive $OLD_MASKS_S3 $NEW_MASKS_S3
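
To preview which objects the move affects, the prefix rewrite performed by a recursive move can be sketched in plain Python. This is an illustration only: `plan_moves` is a hypothetical helper, the URIs in the usage note are made up, and the function computes source and destination pairs without touching S3.

```python
def plan_moves(object_uris, old_prefix, new_prefix):
    """Pair each S3 URI under old_prefix with the same relative path under new_prefix."""
    moves = []
    for uri in object_uris:
        # Objects outside old_prefix are left where they are
        if uri.startswith(old_prefix):
            relative_path = uri[len(old_prefix):]
            moves.append((uri, new_prefix + relative_path))
    return moves
```

For example, with `old_prefix="s3://my-bucket/export/data_splits/"` and `new_prefix="s3://my-bucket/neptune-input/my-graph/data_splits/"`, an object named `train.csv` under the old prefix maps to `s3://my-bucket/neptune-input/my-graph/data_splits/train.csv`, while graph files outside `data_splits/` are not listed. The AWS CLI also accepts `--dryrun` on `aws s3 mv`, which prints the planned operations without performing them.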

### Start the import task

The data on S3 are now ready to be imported into Neptune Analytics.

In [4]:
# Start the import task
import_response = neptune_graph.start_import_task(
    graphIdentifier=GRAPH_ID,
    format="CSV",
    source=EXPORTED_GRAPH_S3,
    roleArn=NEPTUNE_IMPORT_ROLE,
)
# Get the task ID from the response
task_id = import_response["taskId"]

In [None]:
# Wait for import task to complete
import_waiter = neptune_graph.get_waiter("import_task_successful")
import_waiter.wait(taskIdentifier=task_id)
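
The waiter blocks without output until the task finishes. If you would rather see progress, a polling loop over `get_import_task` is one alternative. This is a sketch rather than the notebook's own method: `poll_import_task` is a hypothetical helper, the set of terminal status names is an assumption based on the import task statuses reported by the Neptune Analytics API, and the client is passed in as an argument so the function can be exercised with a stub.

```python
import time

# Assumed terminal states for an import task; any other status is treated as in progress.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELLED", "DELETED"}


def poll_import_task(client, task_id, delay_seconds=60, max_polls=60):
    """Poll get_import_task until a terminal status is seen or max_polls is exhausted."""
    status = None
    for _ in range(max_polls):
        status = client.get_import_task(taskIdentifier=task_id)["status"]
        print(f"Import task {task_id}: {status}")
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(delay_seconds)
    raise TimeoutError(f"Import task {task_id} still {status} after {max_polls} polls")
```

Calling `poll_import_task(neptune_graph, task_id)` returns the final status string, so the caller can decide how to handle a `FAILED` or `CANCELLED` task instead of relying on a waiter exception.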

The data import should take about 10 minutes to complete.

In notebook `1 - SageMaker Setup`, you launched a graph notebook. Let's ensure the notebook instance is available before moving on to the final notebook, which you will run on the graph notebook instance itself.

In [None]:
sm_client = boto3.client("sagemaker", region_name=AWS_REGION, config=config)

try:
    waiter = sm_client.get_waiter("notebook_instance_in_service")
    waiter.wait(
        NotebookInstanceName=f"{GRAPH_NAME}-{GRAPH_ID}-notebook",
    )
except Exception as e:
    print(f"Error waiting for notebook instance: {str(e)}")
    raise

If the notebook instance is available, you will be able to log in to it using the URL generated below:

In [None]:
from IPython.display import Markdown, display

try:
    # Get the presigned URL for the notebook instance
    response = sm_client.create_presigned_notebook_instance_url(
        NotebookInstanceName=f"{GRAPH_NAME}-{GRAPH_ID}-notebook"
    )

    # Display as a clickable link with custom text
    display(Markdown(f"[Open JupyterLab Instance]({response['AuthorizedUrl']})"))
except Exception as e:
    print(f"Error creating presigned URL: {str(e)}")
    raise