Neptune client (#104)
- Refactor all modules calling into various API endpoints to coalesce into one `client` object.
- Add a builder object to facilitate creating the client with various options.
- Remove specification of `iam_credentials_provider_type`; instead, use the default boto3 session to obtain AWS credentials (as we do for the SageMaker integration).
- Organize all tests using pytest to more easily filter which tests should be run.

The Neptune client can be built directly with its constructor:
```python
from graph_notebook.neptune.client import Client
c = Client(host='your-neptune-endpoint')
c.status()
```

It can also be created using our builder class:
```python
from botocore.session import get_session
from graph_notebook.neptune.client import ClientBuilder

builder = ClientBuilder() \
        .with_host(config.host) \
        .with_port(config.port) \
        .with_region(config.aws_region) \
        .with_tls(config.ssl) \
        .with_iam(get_session())

c = builder.build()
c.status()
```
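A fluent builder like this typically records each option and returns `self` from every `with_*` method so calls can be chained. A minimal sketch of the pattern (the real `ClientBuilder` internals may differ; `Client` here is a stand-in for illustration):

```python
class Client:
    """Minimal stand-in for the Neptune client (illustration only)."""
    def __init__(self, host, port=8182, ssl=True, session=None):
        self.host = host
        self.port = port
        self.ssl = ssl
        self.session = session


class ClientBuilder:
    def __init__(self):
        self.args = {}

    def with_host(self, host):
        self.args['host'] = host
        return self  # returning self enables method chaining

    def with_port(self, port):
        self.args['port'] = port
        return self

    def with_tls(self, ssl):
        self.args['ssl'] = ssl
        return self

    def build(self):
        # Unpack the collected options into the client constructor.
        return Client(**self.args)


c = ClientBuilder().with_host('localhost').with_port(8182).with_tls(True).build()
```

The advantage over a long constructor call is that optional settings (region, IAM session) can be added or omitted per call site without positional-argument churn.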

The `Client` object has some components which are Neptune-specific, and some which are not:

- `sparql` - takes any SPARQL query and interprets whether it should be issued as type `query` or type `update`
- `sparql_query` - sends a query request to the configured SPARQL endpoint with the payload `{'query': 'YOUR QUERY'}`
- `sparql_update` - sends an update request to the configured SPARQL endpoint with the payload `{'update': 'YOUR QUERY'}`
- `do_sparql_request` - submits the given payload to the configured SPARQL endpoint
- `get_gremlin_connection` - returns a websocket connection to the configured gremlin endpoint.
- `gremlin_query` - obtains a new gremlin connection and submits the given query. The opened connection will be closed
  after obtaining query results
- `gremlin_http_query` - executes the given gremlin query via http(s) instead of websocket.
- `gremlin_status` - returns the status of running gremlin queries on the configured Neptune endpoint. Takes an optional
  `query_id` input to obtain the status of a specific query
- `sparql_explain` - obtains an explain query plan for the given SPARQL query (can be of type update or query)
- `sparql_status` - returns the status of running SPARQL queries on the configured Neptune endpoint. Takes an optional
  `query_id` input to obtain the status of a specific query
- `sparql_cancel` - cancels the running SPARQL query with the provided `query_id`
- `gremlin_cancel` - cancels the running Gremlin query with the provided `query_id`
- `gremlin_explain` - obtains an explain query plan for a given Gremlin query
- `gremlin_profile` - obtains a profile query plan for a given Gremlin query
- `status` - retrieves the status of the configured Neptune endpoint
- `load` - submits a new bulk load job with the provided parameters.
- `load_status` - obtains the status of the bulk loader. Takes an optional `query_id` to obtain the status of a specific loader job
- `cancel_load` - cancels the provided bulk loader job id
- `initiate_reset` - obtains a token needed to execute a fast reset of your configured Neptune endpoint
- `perform_reset` - takes a token obtained from `initiate_reset` and performs the reset
- `dataprocessing_start` - starts a NeptuneML dataprocessing job with the provided parameters
- `dataprocessing_job_status` - obtains the status of a given dataprocessing job id
- `dataprocessing_status` - obtains the status of the configured Neptune dataprocessing endpoint
- `dataprocessing_stop` - stops the given dataprocessing job id
- `modeltraining_start` - starts a NeptuneML modeltraining job with the provided parameters
- `modeltraining_job_status` - obtains the status of a given modeltraining job id
- `modeltraining_status` - obtains the status of the configured Neptune modeltraining endpoint
- `modeltraining_stop` - stops the given modeltraining job id
- `endpoints_create` - creates a NeptuneML endpoint with the provided parameters
- `endpoints_status` - obtains the status of a given endpoint job
- `endpoints_delete` - deletes a given endpoint id
- `endpoints` - obtains the status of all endpoints for the configured Neptune database
- `export` - helper function to call the Neptune exporter for NeptuneML. Note that this is not a Neptune endpoint.
- `export_status` - obtains the status of the configured exporter endpoint.
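The `sparql` dispatch described above — deciding whether a request is of type `query` or `update` — can be sketched as follows. This is an illustration only, assuming the distinction is made by inspecting the first operation keyword after any prologue; the client's actual detection logic may differ:

```python
# SPARQL 1.1 Update operation keywords; anything else is treated as a read query.
UPDATE_VERBS = {'INSERT', 'DELETE', 'LOAD', 'CLEAR', 'CREATE',
                'DROP', 'COPY', 'MOVE', 'ADD', 'WITH'}


def classify_sparql(query: str) -> str:
    """Return 'update' if the first operation keyword is an update verb, else 'query'."""
    for line in query.strip().splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and prologue declarations, which
        # precede both query and update forms.
        if not stripped or stripped.upper().startswith(('PREFIX', 'BASE', '#')):
            continue
        first_word = stripped.split()[0].upper()
        return 'update' if first_word in UPDATE_VERBS else 'query'
    return 'query'
```

A `SELECT`, `ASK`, `CONSTRUCT`, or `DESCRIBE` body would then be sent as `{'query': ...}`, while an `INSERT DATA` or `DELETE WHERE` body would be sent as `{'update': ...}`.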
austinkline committed Apr 7, 2021
1 parent ea1be84 commit 997ace3
Showing 107 changed files with 1,969 additions and 2,276 deletions.
32 changes: 11 additions & 21 deletions .github/workflows/integration.yml
@@ -81,37 +81,27 @@ jobs:
- name: Give user /etc/hosts permission
run: |
sudo chmod 777 /etc/hosts
- name: Run Basic Tests
- name: Generate test configuration
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*without_iam.py" \
python test/integration/NeptuneIntegrationWorkflowSteps.py generate-config \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
- name: Run Networkx Tests
- name: Run Basic Tests
env:
GRAPH_NOTEBOK_CONFIG: /tmp/graph_notebook_config_integration_test.json
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*network*.py" \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
- name: Run Notebook Tests
pytest test/integration/without_iam
- name: Generate iam test configuration
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*graph_notebook.py" \
python test/integration/NeptuneIntegrationWorkflowSteps.py generate-config \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
--aws-region ${{ secrets.AWS_REGION }} \
--iam
- name: Run IAM Tests
env:
GRAPH_NOTEBOK_CONFIG: /tmp/graph_notebook_config_integration_test.json
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
run-tests \
--pattern "*with_iam.py" \
--iam \
--cfn-stack-name ${{ needs.generate-stack-name.outputs.stack-name }} \
--aws-region ${{ secrets.AWS_REGION }}
pytest test/integration/iam
- name: Cleanup
run: |
python test/integration/NeptuneIntegrationWorkflowSteps.py \
2 changes: 1 addition & 1 deletion .github/workflows/unit.yml
@@ -35,4 +35,4 @@ jobs:
python -m graph_notebook.notebooks.install
- name: Test with pytest
run: |
pytest
pytest test/unit
11 changes: 6 additions & 5 deletions ChangeLog.md
@@ -4,14 +4,15 @@ Starting with v1.31.6, this file will contain a record of major features and updates

## Upcoming

- Add support for Mode, queueRequest, and Dependencies parameters when running %load command
- Add support for list and dict as map keys in Python Gremlin
- Add support for Mode, queueRequest, and Dependencies parameters when running %load command ([Link to PR](https://github.com/aws/graph-notebook/pull/91))
- Add support for list and dict as map keys in Python Gremlin ([Link to PR](https://github.com/aws/graph-notebook/pull/100))
- Refactor modules that call to Neptune or other SPARQL/Gremlin endpoints to use a unified client object ([Link to PR](https://github.com/aws/graph-notebook/pull/104))

## Release 2.0.12 (Mar 25, 2021)

- Add default parameters for `get_load_status`
- Add ipython as a dependency in `setup.py` ([Link to PT](https://github.com/aws/graph-notebook/pull/95))
- Add parameters in `load_status` for `details`, `errors`, `page`, and `errorsPerPage`
- Add default parameters for `get_load_status` ([Link to PR](https://github.com/aws/graph-notebook/pull/96))
- Add ipython as a dependency in `setup.py` ([Link to PR](https://github.com/aws/graph-notebook/pull/95))
- Add parameters in `load_status` for `details`, `errors`, `page`, and `errorsPerPage` ([Link to PR](https://github.com/aws/graph-notebook/pull/88))

## Release 2.0.10 (Mar 18, 2021)

11 changes: 11 additions & 0 deletions pytest.ini
@@ -0,0 +1,11 @@
[pytest]
markers =
neptune: tests which have to run against neptune
iam: tests which require iam authentication
gremlin: tests which run against a gremlin endpoint
sparql: tests which run against SPARQL1.1 endpoint
neptuneml: tests which run Neptune ML workloads
jupyter: tests which run against ipython/jupyter frameworks
reset: test which performs a fast reset against Neptune, running this will wipe your database!


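The markers defined above support the test reorganization described in the commit message: subsets of the suite can be selected with pytest's `-m` option. A few illustrative invocations (paths assume the repository layout shown in the workflow changes above):

```shell
# Run only the unit tests (no Neptune endpoint required)
pytest test/unit

# Run integration tests that exercise Gremlin but do not need IAM auth
pytest -m "gremlin and not iam" test/integration

# Exclude the destructive fast-reset test from a full integration run
pytest -m "not reset" test/integration
```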
3 changes: 2 additions & 1 deletion requirements.txt
@@ -9,8 +9,9 @@ jupyter-contrib-nbextensions
widgetsnbextension
gremlinpython
requests==2.24.0
ipython==7.16.1

# requirements for testing
boto3==1.15.15
botocore==1.18.18
ipython==7.16.1
pytest==6.2.2
3 changes: 3 additions & 0 deletions setup.py
@@ -93,4 +93,7 @@ def get_version():
'License :: OSI Approved :: Apache Software License'
],
keywords='jupyter neptune gremlin sparql',
tests_require=[
'pytest'
]
)
