# Lab 4 - Integrate Neptune with OpenSearch
In this lab we will configure the existing OpenSearch index to integrate with the Neptune cluster created in Lab 2. This lab expects that you have already completed Workshop 0, or that the OpenSearch index was enabled by default on deployment.

To integrate Neptune with OpenSearch, you can either use an existing OpenSearch Service cluster that has been populated according to the Neptune data model for OpenSearch data, or create an OpenSearch Service domain linked with Neptune using an AWS CloudFormation stack. In this lab, we will be using an existing cluster.

## Neptune data model for OpenSearch data
Each document in OpenSearch corresponds to one entity and stores the relevant information for that entity. In Gremlin, the entities are vertices and edges. This means that the OpenSearch documents need to carry the information about our vertices and edges in the form of labels and properties.

When we set up Neptune in Lab 2, we included labels and properties so our data would fit with this format.
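
As a rough illustration, a vertex could be mapped to an OpenSearch document along the following lines. The field names `entity_id`, `document_type`, `entity_type`, and `predicates` follow the Neptune data model for OpenSearch as we understand it from the AWS documentation, and real documents may carry additional fields. The helper `vertex_to_document` and the sample product vertex are hypothetical and not part of the lab.

```python
def vertex_to_document(vertex_id, labels, properties):
    """Build an OpenSearch-style document for one Gremlin vertex (illustrative only)."""
    return {
        "entity_id": vertex_id,        # the Gremlin vertex id
        "document_type": "vertex",     # would be "edge" for an edge
        "entity_type": list(labels),   # the vertex labels
        # Each property name maps to a list of value objects
        "predicates": {
            name: [{"value": value} for value in values]
            for name, values in properties.items()
        },
    }


# Hypothetical sample vertex, not taken from the lab data
doc = vertex_to_document(
    "product-1",
    ["product"],
    {"name": ["Trail Shoe"], "category": ["footwear"]},
)
print(doc)
```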

### Import dependencies
The following library is needed for this lab.

In [1]:
import boto3

### Create Clients

In [2]:
neptune = boto3.client('neptune')
ec2 = boto3.client('ec2')
batch = boto3.client('batch')

### Load variables saved in Lab 2
At the end of Lab 2 we saved some variables that we'll need in this lab. The following cell will load those variables into this lab environment.

In [None]:
%store -r

### Enable Neptune streams
In order to add the data in Neptune to our OpenSearch index, Neptune streams need to be enabled.

Cluster parameters such as `neptune_streams` are part of a parameter group. We cannot change the default parameter group, so we first need to create a new one with streams enabled and then update our cluster to use it instead of the default one.

In [4]:
# Create the new parameter group. All parameters start with their default values
parameter_group_response = neptune.create_db_cluster_parameter_group(
    DBClusterParameterGroupName='retail-demo-store-neptune-opensearch-parameter-group',
    DBParameterGroupFamily='neptune1.2',
    Description='Parameter group for Neptune OpenSearch integration'
)

# Enable streams
neptune.modify_db_cluster_parameter_group(
    DBClusterParameterGroupName=parameter_group_response['DBClusterParameterGroup']['DBClusterParameterGroupName'],
    Parameters=[
        {
            'ParameterName': 'neptune_streams',
            'ParameterValue': '1',
            'ApplyMethod': 'pending-reboot'
        },
    ]
)

# Use the newly created parameter group with our existing cluster
neptune.modify_db_cluster(
    DBClusterIdentifier=db_cluster_identifier,
    ApplyImmediately=True,
    DBClusterParameterGroupName=parameter_group_response['DBClusterParameterGroup']['DBClusterParameterGroupName'],
)

{'DBClusterParameterGroup': {'DBClusterParameterGroupName': 'retail-demo-store-neptune-opensearch-parameter-group',
  'DBParameterGroupFamily': 'neptune1.2',
  'Description': 'Parameter group for Neptune OpenSearch integration',
  'DBClusterParameterGroupArn': 'arn:aws:rds:eu-west-1:827561349713:cluster-pg:retail-demo-store-neptune-opensearch-parameter-group'},
 'ResponseMetadata': {'RequestId': 'f9ad0048-3104-46b8-bd28-0d58ac801e0b',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'f9ad0048-3104-46b8-bd28-0d58ac801e0b',
   'strict-transport-security': 'max-age=31536000',
   'content-type': 'text/xml',
   'content-length': '809',
   'date': 'Thu, 11 May 2023 15:19:08 GMT'},
  'RetryAttempts': 0}}
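
Because `neptune_streams` is set with `ApplyMethod` `pending-reboot`, the change presumably only takes effect once the cluster instances have restarted. The sketch below shows one way to list the instances that have not yet picked up the new parameter group. It assumes the `DBClusterMembers` entries returned by `describe_db_clusters` carry a `DBClusterParameterGroupStatus` field. The helper name and the stub client are hypothetical and only there so the example runs without AWS access.

```python
def members_needing_reboot(neptune_client, cluster_id):
    """Return ids of cluster instances whose parameter group is not yet in sync."""
    response = neptune_client.describe_db_clusters(DBClusterIdentifier=cluster_id)
    members = response["DBClusters"][0]["DBClusterMembers"]
    return [
        member["DBInstanceIdentifier"]
        for member in members
        if member.get("DBClusterParameterGroupStatus") != "in-sync"
    ]


class StubNeptune:
    """Stand-in for boto3.client('neptune') so the sketch runs offline."""

    def describe_db_clusters(self, DBClusterIdentifier):
        return {"DBClusters": [{"DBClusterMembers": [
            {"DBInstanceIdentifier": "writer-1",
             "DBClusterParameterGroupStatus": "pending-reboot"},
            {"DBInstanceIdentifier": "reader-1",
             "DBClusterParameterGroupStatus": "in-sync"},
        ]}]}


pending = members_needing_reboot(StubNeptune(), "demo-cluster")
print(pending)
```

In the notebook this would be called as `members_needing_reboot(neptune, db_cluster_identifier)`, and each returned instance could then be restarted with `neptune.reboot_db_instance(DBInstanceIdentifier=...)`.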

## Amazon Neptune-to-OpenSearch replication setup
Amazon Neptune supports full-text search in Gremlin and SPARQL queries using Amazon OpenSearch Service (OpenSearch Service). You can use an AWS CloudFormation stack to link an OpenSearch Service domain to Neptune.

We will be following the instructions in [this repository](https://github.com/awslabs/amazon-neptune-tools/tree/master/export-neptune-to-elasticsearch) to index existing data in an Amazon Neptune database in ElasticSearch before enabling Neptune's full-text search integration.

### Create keypair for stack creation
A keypair is required as a parameter for the stack.

In [None]:
keypair = ec2.create_key_pair(KeyName='retail-demo-store-neptune-opensearch')
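
The private key is only returned once, in the `KeyMaterial` field of the `create_key_pair` response, so it may be worth saving it if you ever need to connect to resources created with this keypair. The following is a minimal sketch of doing so. The helper `save_private_key`, the sample response, and the target directory are hypothetical.

```python
import os
import tempfile


def save_private_key(key_pair_response, directory):
    """Write the private key to <KeyName>.pem, requesting owner-only permissions."""
    path = os.path.join(directory, key_pair_response["KeyName"] + ".pem")
    # Request mode 0o600 at creation time rather than tightening it afterwards
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(key_pair_response["KeyMaterial"])
    return path


# Hypothetical response shape; a real call returns actual key material
sample_response = {"KeyName": "demo-key", "KeyMaterial": "PLACEHOLDER-KEY-MATERIAL\n"}
saved_path = save_private_key(sample_response, tempfile.mkdtemp())
print(saved_path)
```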

### Launch Stack
#### Stack parameters

In [None]:
keypair['KeyName']


#### The stack


| Region | Stack |
| ---- | ---- |
|US East (N. Virginia) |  [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|US East (Ohio) |  [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-2.console.aws.amazon.com/cloudformation/home?region=us-east-2#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|US West (Oregon) |  [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Europe (Ireland) |  [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://eu-west-1.console.aws.amazon.com/cloudformation/home?region=eu-west-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Europe (London) |  [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://eu-west-2.console.aws.amazon.com/cloudformation/home?region=eu-west-2#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Europe (Frankfurt) |  [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://eu-central-1.console.aws.amazon.com/cloudformation/home?region=eu-central-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Europe (Stockholm) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://eu-north-1.console.aws.amazon.com/cloudformation/home?region=eu-north-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Asia Pacific (Mumbai) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://ap-south-1.console.aws.amazon.com/cloudformation/home?region=ap-south-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Asia Pacific (Seoul) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://ap-northeast-2.console.aws.amazon.com/cloudformation/home?region=ap-northeast-2#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Asia Pacific (Singapore) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://ap-southeast-1.console.aws.amazon.com/cloudformation/home?region=ap-southeast-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Asia Pacific (Sydney) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://ap-southeast-2.console.aws.amazon.com/cloudformation/home?region=ap-southeast-2#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
|Asia Pacific (Tokyo) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://ap-northeast-1.console.aws.amazon.com/cloudformation/home?region=ap-northeast-1#/stacks/create/review?templateURL=https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/export-neptune-to-elasticsearch/export-neptune-to-elasticsearch.json&stackName=neptune-index) |
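
After launching the stack from one of the links above, its outputs can also be read from the notebook instead of the console. The sketch below collects the outputs of a stack into a dictionary. It assumes the stack keeps the name `neptune-index` from the launch links. The helper `stack_outputs`, the stub client, and the sample output value are hypothetical.

```python
def stack_outputs(cfn_client, stack_name):
    """Return the outputs of a CloudFormation stack as a {key: value} dict."""
    stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {
        output["OutputKey"]: output["OutputValue"]
        for output in stack.get("Outputs", [])
    }


class StubCloudFormation:
    """Stand-in for boto3.client('cloudformation') so the sketch runs offline."""

    def describe_stacks(self, StackName):
        return {"Stacks": [{
            "StackName": StackName,
            "StackStatus": "CREATE_COMPLETE",
            "Outputs": [{"OutputKey": "ExampleKey", "OutputValue": "example value"}],
        }]}


outputs = stack_outputs(StubCloudFormation(), "neptune-index")
print(outputs)
```

With a real client, `boto3.client('cloudformation').get_waiter('stack_create_complete').wait(StackName='neptune-index')` blocks until creation has finished, after which `stack_outputs` can be called with that client.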


### Create batch job to update index
Once the stack has been created successfully, we can invoke the Lambda function that starts the batch job. You can find and copy the command as the `Test` property in the stack output.


In [21]:
# Insert command below
!

{"jobName": "export-neptune-to-kinesis-22148b50-1683820647036", "jobId": "6671137f-ab01-4162-a583-9429d51351de"}{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}
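
The output above consists of two JSON objects printed back to back: the job details returned by the Lambda function, followed by the invocation result. Rather than copying the job id by hand, it can be extracted programmatically. This is a small sketch that uses `json.JSONDecoder.raw_decode` to read only the first object. The helper name `first_json_object` is hypothetical, and the sample string is the output shown above.

```python
import json


def first_json_object(text):
    """Decode the first JSON object in text and ignore anything after it."""
    obj, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return obj


raw_output = (
    '{"jobName": "export-neptune-to-kinesis-22148b50-1683820647036", '
    '"jobId": "6671137f-ab01-4162-a583-9429d51351de"}{\n'
    '    "StatusCode": 200,\n'
    '    "ExecutedVersion": "$LATEST"\n'
    '}'
)

parsed_job_id = first_json_object(raw_output)["jobId"]
print(parsed_job_id)
```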


#### Wait for batch job to complete
Insert the job id from the output above into the following cell to observe the job status.

In [22]:
import time

# Insert job id here
job_id = ''

status = None
max_time = time.time() + 15*60  # give up after 15 minutes
while time.time() < max_time:
    response = batch.describe_jobs(jobs=[job_id])
    status = response['jobs'][0]['status']

    print("Status: {}".format(status))

    # SUCCEEDED and FAILED are the terminal states of a batch job
    if status in ('SUCCEEDED', 'FAILED'):
        break

    time.sleep(30)

ClientError: An error occurred (AccessDeniedException) when calling the DescribeJobs operation: User: arn:aws:sts::827561349713:assumed-role/retaildemostore-Base-13L7GWHV7GHLM-N-ExecutionRole-1OZ31UKGHN2YD/SageMaker is not authorized to perform: batch:DescribeJobs on resource: *
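
The error above indicates that the notebook's execution role is not allowed to call `batch:DescribeJobs`. One possible remedy, assuming you are permitted to modify that role, is to attach an inline policy such as the one sketched below. The variable names are hypothetical. The resource is `*` to match the resource named in the error message.

```python
import json

# Minimal inline policy granting only the action named in the error message
describe_jobs_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "batch:DescribeJobs",
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(describe_jobs_policy, indent=2)
print(policy_json)
```

The resulting JSON could be attached to the role in the IAM console, or with `boto3.client('iam').put_role_policy(RoleName=..., PolicyName=..., PolicyDocument=policy_json)`, where the role name is the one shown in the error and the policy name is your choice.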

## Lab 4 Summary
In this lab we added the data stored in Neptune to our OpenSearch index.