
Export Neptune to ElasticSearch

The Neptune Full-Text Search CloudFormation templates provide a mechanism for indexing all new data that is added to an Amazon Neptune database in ElasticSearch. However, there are situations in which you may want to index existing data in a Neptune database prior to enabling the full-text search integration.

This solution allows you to index existing data in an Amazon Neptune database in ElasticSearch before enabling Neptune's full-text search integration.

Once you have populated ElasticSearch with your existing Neptune data, you can remove this solution from your account.

Prerequisites

Before provisioning the solution, ensure the following conditions are met:

  • You have an existing Neptune cluster and an existing ElasticSearch cluster in the same VPC
  • ElasticSearch is version 7.1 or above
  • You have at least one subnet with a route to the internet:
    • Either a subnet with Auto-assign public IPv4 address set to Yes, and a route table with a 0.0.0.0/0 route whose target is an internet gateway (for example, igw-1a2b3c4d).
    • Or a subnet with Auto-assign public IPv4 address set to No, and a route table with a 0.0.0.0/0 route whose target is a NAT gateway (for example, nat-12345678901234567). For more details, see Routing.
  • You have VPC security groups that can be used to access your Neptune and ElasticSearch clusters.

This solution uses neptune-export to export data from your Neptune database. We recommend using neptune-export against a static version of your data. Either suspend writes to your database while the export is taking place, or run the export against a snapshot or clone of your database.

neptune-export uses long-running queries to get data from Neptune. You may need to increase the neptune_query_timeout DB parameter in order to run the export solution against large datasets.
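As a sketch of what that change involves, the snippet below builds the `Parameters` argument for a Neptune `modify-db-parameter-group` call; `neptune_query_timeout` is expressed in milliseconds, and the parameter group name shown in the comment is a placeholder, not one created by this solution.

```python
# Sketch: raising the neptune_query_timeout DB parameter (value in milliseconds).

def query_timeout_params(timeout_ms):
    """Build the Parameters argument for a Neptune modify-db-parameter-group call."""
    return [{
        "ParameterName": "neptune_query_timeout",
        "ParameterValue": str(timeout_ms),
        "ApplyMethod": "immediate",  # assuming the parameter is dynamic, as documented
    }]

# Two hours, expressed in milliseconds:
params = query_timeout_params(2 * 60 * 60 * 1000)
print(params[0]["ParameterValue"])  # 7200000

# With boto3 (not executed here), the call would look like:
#   boto3.client("neptune").modify_db_parameter_group(
#       DBParameterGroupName="my-neptune-db-params",  # placeholder name
#       Parameters=params,
#   )
```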

The export process uses SSL to connect to Neptune. It currently supports IAM Database Authentication for Gremlin, but not SPARQL.

Installation

  1. Launch the Neptune-to-ElasticSearch CloudFormation stack for your Region from the table below.

  2. Once the stack has been provisioned, open a terminal and run the StartExportCommand AWS Command Line Interface (CLI) command from the CloudFormation output. For example:

    aws lambda invoke \
      --function-name arn:aws:lambda:eu-west-1:000000000000:function:export-neptune-to-kinesis-xxxx \
      --region eu-west-1 \
      /dev/stdout
    

    The function returns the name and ID of an AWS Batch job that begins the export from Neptune.

  3. Once you have successfully populated ElasticSearch with existing data in your Neptune database, you can remove this solution from your account by deleting the CloudFormation stack.
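The invoke command in step 2 writes the function's JSON payload to /dev/stdout. A minimal sketch of pulling the AWS Batch job details out of that payload is shown below; the README only says the function returns the job's name and ID, so the exact key names used here are an assumption.

```python
import json

# Hypothetical payload shape: the function returns the name and ID of an
# AWS Batch job, but the key names below are assumptions for illustration.
payload = '{"jobName": "export-neptune-to-kinesis-job", "jobId": "a1b2c3d4-5678-90ab-cdef-EXAMPLE11111"}'

job = json.loads(payload)
print(job["jobName"], job["jobId"])
```

You can then monitor the job in the AWS Batch console using that name and ID.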

| Region | Stack |
| --- | --- |
| US East (N. Virginia) | |
| US East (Ohio) | |
| US West (Oregon) | |
| Europe (Ireland) | |
| Europe (London) | |
| Europe (Frankfurt) | |
| Europe (Stockholm) | |
| Asia Pacific (Mumbai) | |
| Asia Pacific (Seoul) | |
| Asia Pacific (Singapore) | |
| Asia Pacific (Sydney) | |
| Asia Pacific (Tokyo) | |

Solution overview

(Architecture diagram: Export Neptune to ElasticSearch)

  1. You trigger the export process via an AWS Lambda function.
  2. The export process uses AWS Batch to host and execute neptune-export, which exports data from Neptune and publishes it to an Amazon Kinesis Data Stream in the Neptune Streams format.
  3. A second AWS Lambda function polls the Kinesis Stream and publishes records to your Amazon ElasticSearch cluster. This function uses the same parsing and publishing code as the Neptune Streams ElasticSearch integration solution.
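To make step 3 concrete, here is a simplified sketch of the kind of transformation the polling Lambda performs: turning one Neptune Streams (Gremlin) change record into an ElasticSearch bulk action. The record's field names follow the Streams change format, but the index name and document mapping below are assumptions for illustration, not the solution's actual schema.

```python
import json

# A simplified Neptune Streams (Gremlin) change record, as it might arrive
# on the Kinesis stream: an ADD of one vertex property.
record = {
    "op": "ADD",
    "data": {
        "id": "person-1",    # vertex id
        "type": "vp",        # vertex property change
        "key": "name",       # property key
        "value": {"value": "alice", "dataType": "String"},
    },
}

def to_bulk_action(rec, index="amazon_neptune"):
    """Turn one ADD vertex-property change into an ES bulk update action pair.

    The index name and doc shape are illustrative assumptions.
    """
    data = rec["data"]
    header = {"update": {"_index": index, "_id": data["id"]}}
    body = {"doc": {data["key"]: data["value"]["value"]}, "doc_as_upsert": True}
    return header, body

header, body = to_bulk_action(record)
print(json.dumps(header))
print(json.dumps(body))
```

Using an upsert means property changes for the same vertex accumulate into a single ElasticSearch document, regardless of the order in which the records are processed.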

Monitoring and troubleshooting

To diagnose issues with the export from Neptune to Kinesis, consult the Amazon CloudWatch logs for your AWS Batch export-neptune-to-kinesis-job. When reviewing the logs, ensure that:

  • neptune-export has been successfully downloaded to the Batch compute instance
  • neptune-export has successfully exported nodes and relationships from Neptune and published them to Kinesis

If your job is stuck in a RUNNABLE state, you may need to review the network and security settings for your AWS Batch compute environment. See Verify the network and security settings of the compute environment in this knowledge article.

To diagnose issues with the indexing of data in Amazon ElasticSearch, consult the Amazon CloudWatch logs for your kinesis-to-elasticsearch AWS Lambda function. These logs will show the Lambda connecting to ElasticSearch, and will indicate how many records from the Kinesis Stream have been processed.

Example performance

| Neptune | ElasticSearch | Vertices | Edges | Concurrency | Kinesis Shards | Batch Size | Duration |
| --- | --- | --- | --- | --- | --- | --- | --- |
| r4.2xlarge | 5.large | 21,932 | 66,622 | 2 | 8 | 100 | 47 seconds |
| r5.12xlarge | r5.4xlarge | 281,707,103 | 1,770,726,703 | 4 | 32 | 200 | 4 hours |
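To put the larger run in perspective, the arithmetic below derives the implied sustained throughput, treating each vertex and edge as one exported record (a simplifying assumption, since property counts per element vary).

```python
# Rough throughput implied by the second row of the table above.
vertices = 281_707_103
edges = 1_770_726_703
duration_s = 4 * 60 * 60  # 4 hours

records_per_second = (vertices + edges) / duration_s
print(round(records_per_second))  # ~142,530 records/second
```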