Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: Apache-2.0

# Neptune Bulk Data Loader for Manufacturing Digital Thread

## Introduction

Manufacturing organizations have vast amounts of knowledge dispersed across the product lifecycle, which can result in limited visibility, knowledge gaps, and the inability to continuously improve. A digital thread offers an integrated approach to combine disparate data sources across enterprise systems to drive traceability, accessibility, collaboration, and agility.

In this sample project, you learn how to create an intelligent manufacturing digital thread by combining knowledge graph and generative AI technologies, using data generated throughout the product lifecycle and the relationships within it. Explore use cases and discover actionable steps to start your intelligent digital thread journey.

This introductory notebook walks you through the most common steps you will perform when working on these projects:

1. Loading data into a Neptune knowledge graph
2. Visualizing the results
3. Running openCypher queries and algorithms

## Check Connection to the Graph

Run the commands below one at a time. Start by validating the connection: check the status API endpoint of your graph.

In [None]:
%status

In the response, the graph status should be `healthy`, alongside metadata such as version information and the start time of the cluster.
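
If you want to check the status programmatically rather than by eye, the sketch below shows one way to do it. It assumes the status response is already available as a Python dict with a `status` key, as described above. The helper name `is_graph_healthy` and the sample payload are hypothetical and are used for illustration only.

```python
def is_graph_healthy(status_response: dict) -> bool:
    """Return True when the status payload reports a healthy graph."""
    # A missing "status" key is treated as unhealthy.
    return str(status_response.get("status", "")).lower() == "healthy"


# Hypothetical payload for illustration; a real response carries more fields.
sample_response = {
    "status": "healthy",
    "startTime": "Mon Jan 01 00:00:00 UTC 2024",
}

print(is_graph_healthy(sample_response))
```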


<details>
    
You can get help at any time using the `--help` option.

```
%status --help
```

**Note:** If you are using a cell magic, the cell body needs at least one character in it for `--help` to work.

```
%%oc --help
x
```
    
</details>

## Set the data source S3 bucket
The cell below lists your Amazon S3 buckets.

In [None]:
# Locate the data source bucket in Amazon S3 (e.g. mfg-digitalthread-data-<account_id> for sample_data)
!aws s3 ls

Before running the cell below, replace `<account_id>` with your AWS account ID. Use the bucket name shown in the output of the previous list command.

In [None]:
# Set your Neptune bulk import data source in Amazon S3 (e.g. mfg-digitalthread-data-<account_id>). Make sure to replace <account_id>.
s3_bucket = "mfg-digitalthread-data-<account_id>"
s3_source = f"s3://{s3_bucket}/sample_data"
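
A common mistake at this step is leaving the `<account_id>` placeholder in the bucket name. The sketch below builds the same S3 URI while rejecting anything that is not a 12-digit AWS account ID. The helper name `build_s3_source` is hypothetical, and the bucket naming pattern is taken from the cell above.

```python
import re


def build_s3_source(account_id: str, prefix: str = "sample_data") -> str:
    """Build the S3 URI for the sample data, validating the account ID first."""
    # AWS account IDs are 12-digit numbers; this also catches an unreplaced placeholder.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError(f"Expected a 12-digit AWS account ID, got {account_id!r}")
    bucket = f"mfg-digitalthread-data-{account_id}"
    return f"s3://{bucket}/{prefix}"


# Example with a made-up account ID.
print(build_s3_source("123456789012"))
```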

In [None]:
# List the vertex and edge files
!aws s3 ls {s3_source} --recursive --human-readable --summarize
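
For orientation, the sketch below shows roughly what the vertex and edge files look like in the Gremlin CSV load format, which is what `-f csv` selects in the `%load` magic. Vertex files use the `~id` and `~label` system columns, and edge files use `~id`, `~from`, `~to`, and `~label`. The IDs and the `read` access value are made up; the labels and property names are borrowed from the sample queries in this notebook, and the actual sample files may use different columns.

```python
import csv
import io


def to_csv(header, rows):
    """Serialize a header and rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Vertex file: system columns ~id and ~label, then typed property columns.
vertices_csv = to_csv(
    ["~id", "~label", "name:String"],
    [
        ["proj-1", "Project", "Turbo-Project"],  # hypothetical IDs
        ["emp-1", "Employee", "Emily"],
    ],
)

# Edge file: ~from and ~to reference vertex IDs from the vertex file.
edges_csv = to_csv(
    ["~id", "~from", "~to", "~label", "access:String"],
    [["e-1", "proj-1", "emp-1", "team_member", "read"]],
)

print(vertices_csv)
print(edges_csv)
```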

## Load data
The cells below load the sample digital thread data into your Neptune cluster. Running them installs the `mfg_digital_thread` dataset into your graph, which takes a few seconds.

In [None]:
# bulk import - vertices
%load -f csv -s {s3_source}/vertices --run 

Wait until the vertices have loaded successfully before continuing.

In [None]:
# bulk import - edges
%load -f csv -s {s3_source}/edges --run

Wait until the edges have loaded successfully before continuing.
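
If you script the load instead of watching the cell output, the waiting step can be expressed as a polling loop. This is a minimal sketch, not the notebook's own mechanism. It assumes a caller-supplied `fetch_status` function (hypothetical) that returns the overall loader status as a string such as `LOAD_IN_PROGRESS` or `LOAD_COMPLETED`. The example uses a stand-in status source so that it runs without a cluster.

```python
import time
from typing import Callable

# Statuses treated as "still running"; anything else except LOAD_COMPLETED is a failure.
IN_PROGRESS = {"LOAD_NOT_STARTED", "LOAD_IN_PROGRESS"}


def wait_for_load(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll fetch_status until the load completes, fails, or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status == "LOAD_COMPLETED":
            return status
        if status not in IN_PROGRESS:
            raise RuntimeError(f"Bulk load ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("Bulk load did not complete within the polling budget")


# Stand-in status source for illustration; replace with a real status lookup.
fake_statuses = iter(["LOAD_NOT_STARTED", "LOAD_IN_PROGRESS", "LOAD_COMPLETED"])
result = wait_for_load(lambda: next(fake_statuses), poll_seconds=0)
print(result)
```

Passing the status lookup in as a function keeps the loop independent of how the status is actually retrieved.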

In [None]:
# Refreshing statistics is required so that the graph statistics reflect the newly loaded data
%statistics --mode refresh

## Verify data
Wait about 2 minutes before running the summary command. The summary command lists the nodes and edges imported into the Neptune graph database.

In [None]:
%summary

## Visualize the graph
The cell below displays the vertices and edges of the graph along with their properties. Click the "Graph" tab to view the graph.

In [None]:
%%gremlin -p v,oute,inv
g.V().outE().inV().path().
by(valueMap(true)).
by().
by(valueMap(true))

## Query the graph
The openCypher queries below are samples only. Running them is not required for this workshop.

1. Who can access the project Turbo-Project?

In [None]:
%%oc
MATCH (p:Project {name: 'Turbo-Project'})-[r:team_member]->(e:Employee) 
RETURN e.name AS employee_name, 
r.access AS access

2. Can Emily access the project Turbo-Project?

In [None]:
%%oc
MATCH (p:Project {name: 'Turbo-Project'})-[r:team_member]->(e:Employee {name: 'Emily'}) 
RETURN r.access

3. Who are the suppliers for the part Turbo-Motor-11234?

In [None]:
%%oc
MATCH (p:Part {name:"Turbo-Motor-11234"})-[:supplied_by]->(s:Supplier) 
RETURN s.name

4. Which supplier is recommended for part Turbo-Motor-11234 based on quality score?

In [None]:
%%oc
MATCH (p:Part {name:"Turbo-Motor-11234"})-[:supplied_by]->(s:Supplier)
WITH s, s.qualityscore AS score
ORDER BY score DESC
LIMIT 1
RETURN s.name AS RecommendedSupplier
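
The query above sorts the suppliers of the part by `qualityscore` in descending order and keeps the first row. The same selection logic in plain Python looks like the sketch below. The supplier names and scores are made up for illustration and do not come from the sample dataset.

```python
from typing import Optional


def recommend_supplier(suppliers: list) -> Optional[str]:
    """Return the name of the supplier with the highest quality score, or None if empty."""
    if not suppliers:
        return None
    # Equivalent to ORDER BY score DESC LIMIT 1 in the query.
    best = max(suppliers, key=lambda s: s["qualityscore"])
    return best["name"]


# Hypothetical rows, shaped like the Supplier nodes matched by the query.
sample_suppliers = [
    {"name": "Supplier A", "qualityscore": 87},
    {"name": "Supplier B", "qualityscore": 92},
    {"name": "Supplier C", "qualityscore": 75},
]

print(recommend_supplier(sample_suppliers))
```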

5. What is the lead time and corrective action response time for Max Holdings?

In [None]:
%%oc
MATCH (s:Supplier {name:"Max Holdings"})-[:supplier_kpi]->(k:supplierkpi) 
RETURN k.leadtime AS lead_time, 
k.correctiveactionresponsetime AS response_time