Neptune client (#104)
- Refactor all modules calling into various API endpoints to coalesce into one `client` object.
- Add a builder object to facilitate creating the client with various options.
- Remove specification of `iam_credentials_provider_type`; instead, use the default boto3 session to obtain AWS credentials (as we do for the SageMaker integration).
- Organize all tests using pytest to more easily filter which tests should be run.

The Neptune client can be built directly with its constructor:
```python
from graph_notebook.neptune.client import Client
c = Client(host='your-neptune-endpoint')
c.status()
```

It can also be created using our builder class:
```python
from botocore.session import get_session
from graph_notebook.neptune.client import ClientBuilder

builder = ClientBuilder() \
        .with_host(config.host) \
        .with_port(config.port) \
        .with_region(config.aws_region) \
        .with_tls(config.ssl) \
        .with_iam(get_session())

c = builder.build()
c.status()
```
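A fluent builder like this typically records each option and returns `self` from every `with_*` method so calls can be chained. A minimal sketch of the pattern (the real `ClientBuilder` internals may differ; `Client` here is a stand-in for illustration):

```python
class Client:
    """Minimal stand-in for the Neptune client (illustration only)."""
    def __init__(self, host, port=8182, ssl=True, session=None):
        self.host = host
        self.port = port
        self.ssl = ssl
        self.session = session


class ClientBuilder:
    def __init__(self):
        self.args = {}

    def with_host(self, host):
        self.args['host'] = host
        return self  # returning self enables method chaining

    def with_port(self, port):
        self.args['port'] = port
        return self

    def with_tls(self, ssl):
        self.args['ssl'] = ssl
        return self

    def build(self):
        # Unpack the collected options into the client constructor.
        return Client(**self.args)


c = ClientBuilder().with_host('localhost').with_port(8182).with_tls(True).build()
```

The advantage over a long constructor call is that optional settings (region, IAM session) can be added or omitted per call site without positional-argument churn.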

The `Client` object has some components which are Neptune-specific, and some which are not:

- `sparql` - takes any SPARQL query and interprets whether it should be issued as type `query` or type `update`
- `sparql_query` - sends a query request to the configured SPARQL endpoint with the payload `{'query': 'YOUR QUERY'}`
- `sparql_update` - sends an update request to the configured SPARQL endpoint with the payload `{'update': 'YOUR QUERY'}`
- `do_sparql_request` - submits the given payload to the configured SPARQL endpoint
- `get_gremlin_connection` - returns a websocket connection to the configured gremlin endpoint.
- `gremlin_query` - obtains a new gremlin connection and submits the given query. The opened connection will be closed
  after obtaining query results
- `gremlin_http_query` - executes the given gremlin query via http(s) instead of websocket.
- `gremlin_status` - returns the status of running gremlin queries on the configured Neptune endpoint. Takes an optional
  `query_id` input to obtain the status of a specific query
- `sparql_explain` - obtains an explain query plan for the given SPARQL query (can be of type update or query)
- `sparql_status` - returns the status of running SPARQL queries on the configured Neptune endpoint. Takes an optional
  `query_id` input to obtain the status of a specific query
- `sparql_cancel` - cancels the running SPARQL query with the provided `query_id`
- `gremlin_cancel` - cancels the running Gremlin query with the provided `query_id`
- `gremlin_explain` - obtains an explain query plan for a given Gremlin query
- `gremlin_profile` - obtains a profile query plan for a given Gremlin query
- `status` - retrieves the status of the configured Neptune endpoint
- `load` - submits a new bulk load job with the provided parameters.
- `load_status` - obtains the status of the bulk loader. Takes an optional `query_id` to obtain the status of a specific loader job
- `cancel_load` - cancels the provided bulk loader job id
- `initiate_reset` - obtains a token needed to execute a fast reset of your configured Neptune endpoint
- `perform_reset` - takes a token obtained from `initiate_reset` and performs the reset
- `dataprocessing_start` - starts a NeptuneML dataprocessing job with the provided parameters
- `dataprocessing_job_status` - obtains the status of a given dataprocessing job id
- `dataprocessing_status` - obtains the status of the configured Neptune dataprocessing endpoint
- `dataprocessing_stop` - stops the given dataprocessing job id
- `modeltraining_start` - starts a NeptuneML modeltraining job with the provided parameters
- `modeltraining_job_status` - obtains the status of a given modeltraining job id
- `modeltraining_status` - obtains the status of the configured Neptune modeltraining endpoint
- `modeltraining_stop` - stops the given modeltraining job id
- `endpoints_create` - creates a NeptuneML endpoint with the provided parameters
- `endpoints_status` - obtains the status of a given endpoint job
- `endpoints_delete` - deletes a given endpoint id
- `endpoints` - obtains the status of all endpoints for the configured Neptune database
- `export` - helper function to call the Neptune exporter for NeptuneML. Note that this is not a Neptune endpoint.
- `export_status` - obtains the status of the configured exporter endpoint.
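The `sparql` dispatch described above — deciding whether a request is of type `query` or `update` — can be sketched as follows. This is an illustration only, assuming the distinction is made by inspecting the first operation keyword after any prologue; the client's actual detection logic may differ:

```python
# SPARQL 1.1 Update operation keywords; anything else is treated as a read query.
UPDATE_VERBS = {'INSERT', 'DELETE', 'LOAD', 'CLEAR', 'CREATE',
                'DROP', 'COPY', 'MOVE', 'ADD', 'WITH'}


def classify_sparql(query: str) -> str:
    """Return 'update' if the first operation keyword is an update verb, else 'query'."""
    for line in query.strip().splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and prologue declarations, which
        # precede both query and update forms.
        if not stripped or stripped.upper().startswith(('PREFIX', 'BASE', '#')):
            continue
        first_word = stripped.split()[0].upper()
        return 'update' if first_word in UPDATE_VERBS else 'query'
    return 'query'
```

A `SELECT`, `ASK`, `CONSTRUCT`, or `DESCRIBE` body would then be sent as `{'query': ...}`, while an `INSERT DATA` or `DELETE WHERE` body would be sent as `{'update': ...}`.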
austinkline committed Apr 7, 2021
1 parent ea1be84 commit 997ace3
Showing 107 changed files with 1,969 additions and 2,276 deletions.
32 changes: 11 additions & 21 deletions .github/workflows/integration.yml
@@ -81,37 +81,27 @@ jobs:
- name: Give user /etc/hosts permission
run: |
sudo chmod 777 /etc/hosts
- name: Run Basic Tests
- name: Generate test configuration
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*without_iam.py" \
python test/integration/NeptuneIntegrationWorkflowSteps.py generate-config \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
- name: Run Networkx Tests
- name: Run Basic Tests
env:
GRAPH_NOTEBOK_CONFIG: /tmp/graph_notebook_config_integration_test.json
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*network*.py" \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
- name: Run Notebook Tests
pytest test/integration/without_iam
- name: Generate iam test configuration
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*graph_notebook.py" \
python test/integration/NeptuneIntegrationWorkflowSteps.py generate-config \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
--aws-region ${{ secrets.AWS_REGION }} \
--iam
- name: Run IAM Tests
env:
GRAPH_NOTEBOK_CONFIG: /tmp/graph_notebook_config_integration_test.json
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*with_iam.py" \
--iam \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
pytest test/integration/iam
- name: Cleanup
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
2 changes: 1 addition & 1 deletion .github/workflows/unit.yml
@@ -35,4 +35,4 @@ jobs:
python -m graph_notebook.notebooks.install
- name: Test with pytest
run: |
pytest
pytest test/unit
11 changes: 6 additions & 5 deletions ChangeLog.md
@@ -4,14 +4,15 @@ Starting with v1.31.6, this file will contain a record of major features and updates

## Upcoming

- Add support for Mode, queueRequest, and Dependencies parameters when running %load command
- Add support for list and dict as map keys in Python Gremlin
- Add support for Mode, queueRequest, and Dependencies parameters when running %load command ([Link to PR](https://github.com/aws/graph-notebook/pull/91))
- Add support for list and dict as map keys in Python Gremlin ([Link to PR](https://github.com/aws/graph-notebook/pull/100))
- Refactor modules that call to Neptune or other SPARQL/Gremlin endpoints to use a unified client object ([Link to PR](https://github.com/aws/graph-notebook/pull/104))

## Release 2.0.12 (Mar 25, 2021)

- Add default parameters for `get_load_status`
- Add ipython as a dependency in `setup.py` ([Link to PT](https://github.com/aws/graph-notebook/pull/95))
- Add parameters in `load_status` for `details`, `errors`, `page`, and `errorsPerPage`
- Add default parameters for `get_load_status` ([Link to PR](https://github.com/aws/graph-notebook/pull/96))
- Add ipython as a dependency in `setup.py` ([Link to PR](https://github.com/aws/graph-notebook/pull/95))
- Add parameters in `load_status` for `details`, `errors`, `page`, and `errorsPerPage` ([Link to PR](https://github.com/aws/graph-notebook/pull/88))

## Release 2.0.10 (Mar 18, 2021)

11 changes: 11 additions & 0 deletions pytest.ini
@@ -0,0 +1,11 @@
[pytest]
markers =
neptune: tests which have to run against neptune
iam: tests which require iam authentication
gremlin: tests which run against a gremlin endpoint
sparql: tests which run against SPARQL1.1 endpoint
neptuneml: tests which run Neptune ML workloads
jupyter: tests which run against ipython/jupyter frameworks
reset: test which performs a fast reset against Neptune, running this will wipe your database!


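The markers defined above support the test reorganization described in the commit message: subsets of the suite can be selected with pytest's `-m` option. A few illustrative invocations (paths assume the repository layout shown in the workflow changes above):

```shell
# Run only the unit tests (no Neptune endpoint required)
pytest test/unit

# Run integration tests that exercise Gremlin but do not need IAM auth
pytest -m "gremlin and not iam" test/integration

# Exclude the destructive fast-reset test from a full integration run
pytest -m "not reset" test/integration
```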
3 changes: 2 additions & 1 deletion requirements.txt
@@ -9,8 +9,9 @@ jupyter-contrib-nbextensions
widgetsnbextension
gremlinpython
requests==2.24.0
ipython==7.16.1

# requirements for testing
boto3==1.15.15
botocore==1.18.18
ipython==7.16.1
pytest==6.2.2
3 changes: 3 additions & 0 deletions setup.py
@@ -93,4 +93,7 @@ def get_version():
'License :: OSI Approved :: Apache Software License'
],
keywords='jupyter neptune gremlin sparql',
tests_require=[
'pytest'
]
)
