
Neptune client #104

Merged

merged 16 commits into main on Apr 7, 2021

Conversation

austinkline
Contributor

Neptune Client

  • Refactor all modules that call into various API endpoints to coalesce into one
    `client` object.
  • Add a builder object to facilitate creating the client with various options.
  • Remove specification of `iam_credentials_provider_type` and instead use the
    default boto3 session for obtaining AWS credentials (as we do for the SageMaker integration).
  • Organize all tests using pytest to more easily filter which tests should run.

Client

The Neptune client can be built directly with its constructor:

```python
from graph_notebook.neptune.client import Client
c = Client(host=foo)
c.status()
```

It can also be created using our builder class:

```python
from botocore.session import get_session
from graph_notebook.neptune.client import ClientBuilder

builder = ClientBuilder() \
        .with_host(config.host) \
        .with_port(config.port) \
        .with_region(config.aws_region) \
        .with_tls(config.ssl) \
        .with_iam(get_session())

c = builder.build()
c.status()
```
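The fluent chaining used by the builder can be sketched in isolation. The `ToyClientBuilder` below (with only host and port, for brevity) is a simplified illustration of the pattern, not the real `ClientBuilder`:

```python
# Toy fluent builder illustrating the pattern: each with_* method records
# an option and returns self so calls can chain, and build() assembles the
# final object. This class is hypothetical, not the real ClientBuilder.

class ToyClient:
    def __init__(self, host: str, port: int):
        self.host = host
        self.port = port

class ToyClientBuilder:
    def __init__(self):
        self.args = {}

    def with_host(self, host: str) -> 'ToyClientBuilder':
        self.args['host'] = host
        return self

    def with_port(self, port: int) -> 'ToyClientBuilder':
        self.args['port'] = port
        return self

    def build(self) -> ToyClient:
        return ToyClient(**self.args)

c = ToyClientBuilder().with_host('example.com').with_port(8182).build()
```

Because each `with_*` call returns the builder itself, options can be supplied in any order and omitted entirely when a default is acceptable.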

The `Client` object has some components which are Neptune-specific, and some which are not:

Not Neptune Specific

  • `sparql` - takes any SPARQL query and determines whether it should be issued as type `query` or type `update`
  • `sparql_query` - sends a query request to the configured SPARQL endpoint with the payload `{'query': 'YOUR QUERY'}`
  • `sparql_update` - sends an update request to the configured SPARQL endpoint with the payload `{'update': 'YOUR QUERY'}`
  • `do_sparql_request` - submits the given payload to the configured SPARQL endpoint
  • `get_gremlin_connection` - returns a WebSocket connection to the configured Gremlin endpoint
  • `gremlin_query` - obtains a new Gremlin connection and submits the given query. The opened connection is closed
    after query results are obtained
  • `gremlin_http_query` - executes the given Gremlin query via HTTP(S) instead of WebSocket
  • `gremlin_status` - returns the status of running Gremlin queries on the configured Neptune endpoint. Takes an optional
    `query_id` input to obtain the status of a specific query
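The query/update dispatch described for `sparql` can be illustrated with a standalone sketch. The keyword list and the `build_sparql_payload` helper below are assumptions based on the payload shapes described above, not the client's actual implementation:

```python
# Hypothetical sketch of routing a SPARQL statement as either a 'query'
# or an 'update' payload, mirroring the sparql / sparql_query /
# sparql_update split described above. Not the actual client code.

SPARQL_UPDATE_KEYWORDS = ('INSERT', 'DELETE', 'LOAD', 'CLEAR', 'CREATE', 'DROP')

def build_sparql_payload(statement: str) -> dict:
    """Return {'update': ...} for update operations, else {'query': ...}."""
    first_keyword = statement.lstrip().split(None, 1)[0].upper()
    if first_keyword in SPARQL_UPDATE_KEYWORDS:
        return {'update': statement}
    return {'query': statement}

build_sparql_payload("SELECT ?s WHERE { ?s ?p ?o }")   # routed as a query
build_sparql_payload("INSERT DATA { <a> <b> <c> }")    # routed as an update
```

A real implementation would also need to handle leading `PREFIX` declarations and comments before the first keyword; the sketch only shows the basic dispatch idea.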

Neptune specific

  • `sparql_explain` - obtains an explain query plan for the given SPARQL query (can be of type `update` or `query`)
  • `sparql_status` - returns the status of running SPARQL queries on the configured Neptune endpoint. Takes an optional
    `query_id` input to obtain the status of a specific query
  • `sparql_cancel` - cancels the running SPARQL query with the provided `query_id`
  • `gremlin_cancel` - cancels the running Gremlin query with the provided `query_id`
  • `gremlin_explain` - obtains an explain query plan for a given Gremlin query
  • `gremlin_profile` - obtains a profile query plan for a given Gremlin query
  • `status` - retrieves the status of the configured Neptune endpoint
  • `load` - submits a new bulk load job with the provided parameters
  • `load_status` - obtains the status of the bulk loader. Takes an optional `query_id` to obtain the status of a specific loader job
  • `cancel_load` - cancels the bulk loader job with the provided job id
  • `initiate_reset` - obtains a token needed to execute a fast reset of the configured Neptune endpoint
  • `perform_reset` - takes a token obtained from `initiate_reset` and performs the reset
  • `dataprocessing_start` - starts a NeptuneML dataprocessing job with the provided parameters
  • `dataprocessing_job_status` - obtains the status of a given dataprocessing job id
  • `dataprocessing_status` - obtains the status of the configured Neptune dataprocessing endpoint
  • `dataprocessing_stop` - stops the given dataprocessing job id
  • `modeltraining_start` - starts a NeptuneML modeltraining job with the provided parameters
  • `modeltraining_job_status` - obtains the status of a given modeltraining job id
  • `modeltraining_status` - obtains the status of the configured Neptune modeltraining endpoint
  • `modeltraining_stop` - stops the given modeltraining job id
  • `endpoints_create` - creates a NeptuneML endpoint with the provided parameters
  • `endpoints_status` - obtains the status of a given endpoint job
  • `endpoints_delete` - deletes a given endpoint id
  • `endpoints` - obtains the status of all endpoints on the configured Neptune database
  • `export` - helper function to call the Neptune exporter for NeptuneML. Note that this is not a Neptune endpoint
  • `export_status` - obtains the status of the configured exporter endpoint
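The two-step fast reset handshake (`initiate_reset` to obtain a token, then `perform_reset` to apply it) can be sketched with an in-memory stand-in. The `FakeNeptune` class and its token handling below are illustrative assumptions, not the real service or client:

```python
import secrets

# Illustrative stand-in for the two-step fast reset handshake described
# above: initiate_reset hands out a one-time token, and perform_reset
# only succeeds when given that token. The FakeNeptune class is
# hypothetical; it only demonstrates the shape of the protocol.

class FakeNeptune:
    def __init__(self):
        self._reset_token = None
        self.data = ['some', 'vertices']

    def initiate_reset(self) -> str:
        """Issue a fresh token that authorizes a single reset."""
        self._reset_token = secrets.token_hex(8)
        return self._reset_token

    def perform_reset(self, token: str) -> bool:
        """Clear the data only if the presented token matches; tokens are single-use."""
        if self._reset_token is None or token != self._reset_token:
            return False
        self.data.clear()
        self._reset_token = None
        return True

db = FakeNeptune()
token = db.initiate_reset()
db.perform_reset(token)   # clears the data; wrong or reused tokens are rejected
```

Requiring a freshly issued token makes a destructive reset a deliberate two-step action rather than a single accidental call.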

Kline added 6 commits April 1, 2021 11:06
```python
query_check_for_airports = "g.V('3684').outE().inV().has(id, '3444')"
res = do_gremlin_query(query_check_for_airports, self.host, self.port, self.ssl, self.client_provider)
res = self.client.gremlin_query(query_check_for_airports)
```
Contributor
It would be better not to use explicit ID values here in case the data set ever changes and that route gets deleted. I am not sure what is needed but a different test might be more future proof.

Contributor Author

Yeah, this was put in place to ensure that all airports were added (by checking for the last one). We could instead rewrite it to look for the content.

src/graph_notebook/neptune/client.py (resolved)
src/graph_notebook/neptune/client.py (outdated; resolved)
@austinkline
Contributor Author

Looks like this PR could fix one reported bug:
#101

Contributor

@krlawrence left a comment

Looks good to me.

@austinkline austinkline merged commit 997ace3 into main Apr 7, 2021