Generalising collections and adding ElasticsearchCollection #660

Merged
merged 5 commits into from
Mar 15, 2021
46 changes: 38 additions & 8 deletions .github/workflows/deps_lint.yml
Expand Up @@ -141,7 +141,7 @@ jobs:

services:
mongo:
image: mongo:4.2
image: mongo:4
ports:
- 27017:27017
postgres:
Expand All @@ -156,6 +156,14 @@ jobs:
--health-retries 5
ports:
- 5432:5432
elasticsearch:
image: elasticsearch:6.8.13
ports:
- 9200:9200
- 9300:9300
env:
discovery.type: single-node


steps:
- uses: actions/checkout@v2
Expand All @@ -176,15 +184,21 @@ jobs:
pip install -r requirements.txt
pip install -r requirements-dev.txt

- name: Run all tests (using `mongomock`)
run: pytest -rs -vvv --cov=./optimade/ --cov-report=xml tests/
env:
OPTIMADE_DATABASE_BACKEND: 'mongomock'


- name: Run server tests (using a real MongoDB)
run: pytest -rs -vvv --cov=./optimade/ --cov-report=xml tests/server
run: pytest -rs -vvv --cov=./optimade/ --cov-report=xml --cov-append tests/server
env:
OPTIMADE_CI_FORCE_MONGO: 1
OPTIMADE_DATABASE_BACKEND: 'mongodb'

- name: Run all tests (using `mongomock`)
run: pytest -rs -vvv --cov=./optimade/ --cov-report=xml --cov-append tests/
- name: Run server tests (using Elasticsearch)
run: pytest -rs -vvv --cov=./optimade/ --cov-report=xml --cov-append tests/server
env:
OPTIMADE_CI_FORCE_MONGO: 0
OPTIMADE_DATABASE_BACKEND: 'elastic'

- name: Install adapter conversion dependencies
run: |
Expand All @@ -200,9 +214,25 @@ jobs:
- name: Run previously skipped tests for adapter conversion
run: pytest -rs -vvv --cov=./optimade/ --cov-report=xml --cov-append tests/adapters/

- name: Run tests for validator only to assess coverage
- name: Run tests for validator only to assess coverage (mongomock)
if: matrix.python-version == 3.8
run: pytest -rs --cov=./optimade/ --cov-report=xml:validator_cov.xml tests/server/test_server_validation.py
run: pytest -rs --cov=./optimade/ --cov-report=xml:validator_cov.xml --cov-append tests/server/test_server_validation.py
env:
OPTIMADE_DATABASE_BACKEND: 'mongomock'

- name: Run tests for validator only to assess coverage (Elasticsearch)
if: matrix.python-version == 3.8
run: pytest -rs --cov=./optimade/ --cov-report=xml:validator_cov.xml --cov-append tests/server/test_server_validation.py
env:
OPTIMADE_DATABASE_BACKEND: 'elastic'
OPTIMADE_INSERT_TEST_DATA: false # Must be specified as previous steps will have already inserted the test data

- name: Run tests for validator only to assess coverage (MongoDB)
if: matrix.python-version == 3.8
run: pytest -rs --cov=./optimade/ --cov-report=xml:validator_cov.xml --cov-append tests/server/test_server_validation.py
env:
OPTIMADE_DATABASE_BACKEND: 'mongodb'
OPTIMADE_INSERT_TEST_DATA: false # Must be specified as previous steps will have already inserted the test data

- name: Upload coverage to Codecov
if: matrix.python-version == 3.8 && github.repository == 'Materials-Consortia/optimade-python-tools'
Expand Down
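The workflow steps above choose a backend solely through the `OPTIMADE_DATABASE_BACKEND` environment variable. As a rough illustration of that selection, here is a standard-library sketch. The helper `backend_from_env` is hypothetical and not part of the package; the server itself presumably resolves this value through its settings class. The enum values and the `mongomock` default are taken from the `config.py` changes in this PR.

```python
import os
from enum import Enum


class SupportedBackend(Enum):
    # Values mirror the enum added to optimade/server/config.py in this PR.
    ELASTIC = "elastic"
    MONGODB = "mongodb"
    MONGOMOCK = "mongomock"


def backend_from_env(environ=None):
    """Hypothetical helper: map OPTIMADE_DATABASE_BACKEND to a SupportedBackend.

    Falls back to mongomock, the default backend set in this PR.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("OPTIMADE_DATABASE_BACKEND", SupportedBackend.MONGOMOCK.value)
    try:
        return SupportedBackend(raw)
    except ValueError:
        # Report the allowed values instead of the bare enum lookup error.
        allowed = ", ".join(backend.value for backend in SupportedBackend)
        raise ValueError(
            f"Unknown backend {raw!r}; expected one of: {allowed}"
        ) from None
```

Passing a mapping explicitly, as in `backend_from_env({"OPTIMADE_DATABASE_BACKEND": "elastic"})`, keeps the lookup testable without touching the real process environment.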
2 changes: 1 addition & 1 deletion INSTALL.md
Expand Up @@ -44,7 +44,7 @@ py.test
# Install pre-commit environment (e.g., auto-formats code on `git commit`)
pre-commit install

# Optional: Install MongoDB (and set `use_real_mongo = true`)
# Optional: Install MongoDB (and set `database_backend = mongodb`)
# Below method installs in conda environment and
# - starts server in background
# - ensures and uses ~/dbdata directory to store data
Expand Down
6 changes: 2 additions & 4 deletions README.md
Expand Up @@ -11,10 +11,8 @@ The aim of OPTIMADE is to develop a common API, compliant with the [JSON API 1.0
This is to enable interoperability among databases that contain calculated properties of existing and hypothetical materials.

This repository contains a library of tools for implementing and consuming [OPTIMADE](https://www.optimade.org) APIs using Python.
It also contains a server validator tool, which may be called from the shell or used as a GitHub Action from [optimade-validator-action](https://github.com/Materials-Consortia/optimade-validator-action).

_Disclaimer_: While the package supports `elasticsearch-dsl` v6 & v7 and `django` v2 & v3, all tests are performed with the latest supported version.
If you experience any issues with the older versions, you are most welcome to contribute to the repository (see below under [Contributing](#contributing)).
Server implementations can make use of the supported MongoDB (v4) and Elasticsearch (v6) database backends, or plug in a custom backend implementation.
The package also contains a server validator tool, which may be called from the shell (`optimade-validator`) or used as a GitHub Action from [optimade-validator-action](https://github.com/Materials-Consortia/optimade-validator-action).

## Status

Expand Down
2 changes: 1 addition & 1 deletion docs/static/default_config.json
@@ -1,7 +1,7 @@
{
"config_file": "~/.optimade.json",
"debug": false,
"use_real_mongo": false,
"insert_test_data": true,
"mongo_database": "optimade",
"mongo_uri": "localhost:27017",
"links_collection": "links",
Expand Down
2 changes: 1 addition & 1 deletion optimade/filtertransformers/django.py
@@ -1,7 +1,7 @@
import warnings

warnings.warn(
"Django functionality is deprecated and will be removed in later versions (unless support is requested).",
"Django functionality is deprecated and will be removed in a later version (unless support is requested).",
DeprecationWarning,
stacklevel=2,
)
Expand Down
80 changes: 45 additions & 35 deletions optimade/filtertransformers/elasticsearch.py
Expand Up @@ -2,7 +2,6 @@

from lark import v_args
from elasticsearch_dsl import Q, Text, Keyword, Integer, Field
from optimade.models import CHEMICAL_SYMBOLS, ATOMIC_NUMBERS
from optimade.filtertransformers import BaseTransformer
from optimade.server.exceptions import BadRequest

Expand All @@ -14,7 +13,7 @@
_has_operators = {"ALL": "must", "ANY": "should"}
_length_quantities = {
"elements": "nelements",
"elements_rations": "nelements",
"elements_ratios": "nelements",
"dimension_types": "dimension_types",
}

Expand Down Expand Up @@ -130,42 +129,50 @@ def _has_query_op(self, quantities, op, predicate_zip_list):
# in elastic search. Only supported for elements, where we can construct
# an anonymous "formula" based on elements sorted by order number and
# can do a = comparision to check if all elements are contained
if len(quantities) > 1:
raise NotImplementedError("HAS ONLY is not supported with zip")
quantity = quantities[0]

if quantity.has_only_quantity is None:
raise NotImplementedError(
"HAS ONLY is not supported by %s" % quantity.name
)

def values():
for predicates in predicate_zip_list:
if len(predicates) != 1:
raise NotImplementedError("Tuples not supported in HAS ONLY")
op, value = predicates[0]
if op != "=":
raise NotImplementedError(
"Predicated not supported in HAS ONLY"
)
if not isinstance(value, str):
raise NotImplementedError("Only strings supported in HAS ONLY")
yield value

try:
order_numbers = list([ATOMIC_NUMBERS[element] for element in values()])
order_numbers.sort()
value = "".join(
[CHEMICAL_SYMBOLS[number - 1] for number in order_numbers]
)
except KeyError:
raise NotImplementedError(
"HAS ONLY is only supported for chemical symbols"
)
# @ml-evs: Disabling this HAS ONLY workaround as tests are not passing
raise NotImplementedError(
"HAS ONLY queries are not currently supported by the Elasticsearch backend."
)

return Q("term", **{quantity.has_only_quantity.name: value})
# from optimade.models import CHEMICAL_SYMBOLS, ATOMIC_NUMBERS

# if len(quantities) > 1:
# raise NotImplementedError("HAS ONLY is not supported with zip")
# quantity = quantities[0]

# if quantity.has_only_quantity is None:
# raise NotImplementedError(
# "HAS ONLY is not supported by %s" % quantity.name
# )

# def values():
# for predicates in predicate_zip_list:
# if len(predicates) != 1:
# raise NotImplementedError("Tuples not supported in HAS ONLY")
# op, value = predicates[0]
# if op != "=":
# raise NotImplementedError(
# "Predicated not supported in HAS ONLY"
# )
# if not isinstance(value, str):
# raise NotImplementedError("Only strings supported in HAS ONLY")
# yield value

# try:
# order_numbers = list([ATOMIC_NUMBERS[element] for element in values()])
# order_numbers.sort()
# value = "".join(
# [CHEMICAL_SYMBOLS[number - 1] for number in order_numbers]
# )
# except KeyError:
# raise NotImplementedError(
# "HAS ONLY is only supported for chemical symbols"
# )

# return Q("term", **{quantity.has_only_quantity.name: value})
else:
raise NotImplementedError
raise NotImplementedError(f"Unrecognised operation {op}.")

queries = [
self._has_query(quantities, predicates) for predicates in predicate_zip_list
Expand Down Expand Up @@ -320,12 +327,15 @@ def set_zip_op_rhs(self, args):
return lambda quantity: self._has_query_op([quantity] + add_on, op, values)

def property_zip_addon(self, args):
raise NotImplementedError("Correlated list queries are not supported.")
return args

def value_zip(self, args):
raise NotImplementedError("Correlated list queries are not supported.")
return self.value_list(args)

def value_zip_list(self, args):
raise NotImplementedError("Correlated list queries are not supported.")
return args

def value_list(self, args):
Expand Down
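For reference, the disabled `HAS ONLY` workaround relied on comparing an "anonymous formula": the requested element symbols are sorted by atomic number, joined into one string, and matched exactly against a precomputed field. The sketch below reproduces that idea with the standard library only. The lookup table is a truncated, hypothetical stand-in for `ATOMIC_NUMBERS` from `optimade.models`, the field name `elements_only` is an assumption, and the query is written as a plain dict rather than with `elasticsearch_dsl.Q`.

```python
# Hypothetical, truncated lookup table; the original code imported
# ATOMIC_NUMBERS and CHEMICAL_SYMBOLS from optimade.models.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8, "Si": 14, "Fe": 26}


def anonymous_formula(elements):
    """Join element symbols after sorting them by atomic number."""
    try:
        ordered = sorted(elements, key=lambda symbol: ATOMIC_NUMBERS[symbol])
    except KeyError as exc:
        # Same failure mode as the commented-out code in the diff.
        raise NotImplementedError(
            "HAS ONLY is only supported for chemical symbols"
        ) from exc
    return "".join(ordered)


def has_only_term_query(field, elements):
    """Build an exact-match term query on the precomputed formula field."""
    return {"term": {field: anonymous_formula(elements)}}


# The order of the requested elements does not matter.
print(has_only_term_query("elements_only", ["Si", "O"]))
```

Because the comparison is an exact string match, the approach only works if the index stores the same sorted string for every entry, which is one reason it is restricted to element lists.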
8 changes: 4 additions & 4 deletions optimade/filtertransformers/mongo.py
Expand Up @@ -61,11 +61,11 @@ def value_list(self, arg):

def value_zip(self, arg):
# value_zip: [ OPERATOR ] value ":" [ OPERATOR ] value (":" [ OPERATOR ] value)*
raise NotImplementedError
raise NotImplementedError("Correlated list queries are not supported.")

def value_zip_list(self, arg):
# value_zip_list: value_zip ( "," value_zip )*
raise NotImplementedError
raise NotImplementedError("Correlated list queries are not supported.")

def expression(self, arg):
# expression: expression_clause ( OR expression_clause )
Expand Down Expand Up @@ -158,11 +158,11 @@ def length_op_rhs(self, arg):
def set_zip_op_rhs(self, arg):
# set_zip_op_rhs: property_zip_addon HAS ( value_zip | ONLY value_zip_list | ALL value_zip_list |
# ANY value_zip_list )
raise NotImplementedError
raise NotImplementedError("Correlated list queries are not supported.")

def property_zip_addon(self, arg):
# property_zip_addon: ":" property (":" property)*
raise NotImplementedError
raise NotImplementedError("Correlated list queries are not supported.")

def _recursive_expression_phrase(self, arg):
"""Helper function for parsing `expression_phrase`. Recursively sorts out
Expand Down
48 changes: 45 additions & 3 deletions optimade/server/config.py
Expand Up @@ -36,6 +36,13 @@ class LogLevel(Enum):
CRITICAL = "critical"


class SupportedBackend(Enum):

ELASTIC = "elastic"
MONGODB = "mongodb"
MONGOMOCK = "mongomock"


class ServerConfig(BaseSettings):
"""This class stores server config parameters in a way that
can be easily extended for new config file types.
Expand All @@ -46,11 +53,28 @@ class ServerConfig(BaseSettings):
None, description="File to load alternative defaults from"
)
debug: bool = Field(
False, description="Turns on Debug Mode for the OPTIMADE Server implementation"
False,
description="Turns on Debug Mode for the OPTIMADE Server implementation",
)
use_real_mongo: bool = Field(
False, description="Use a real Mongo server rather than MongoMock"

insert_test_data: bool = Field(
True,
description="Insert test data into each collection on server initialisation. If true, the configured backend will be populated with test data on server start. Should be disabled for production usage.",
)

use_real_mongo: Optional[bool] = Field(
None, description="DEPRECATED: force usage of MongoDB over any other backend."
)

database_backend: SupportedBackend = Field(
SupportedBackend.MONGOMOCK,
description="Which database backend to use out of the supported backends.",
)

elastic_hosts: Optional[List[Dict]] = Field(
None, description="Host settings to pass through to the `Elasticsearch` class."
)

mongo_database: str = Field(
"optimade", description="Mongo database for collection data"
)
Expand Down Expand Up @@ -146,6 +170,24 @@ def set_implementation_version(cls, v):
res.update(v)
return res

@root_validator(pre=True)
def use_real_mongo_override(cls, values):
"""Overrides the `database_backend` setting with MongoDB and
raises a deprecation warning.

"""
use_real_mongo = values.pop("use_real_mongo", None)
if use_real_mongo is not None:
warnings.warn(
"'use_real_mongo' is deprecated, please set the appropriate 'database_backend' instead.",
DeprecationWarning,
)

if use_real_mongo:
values["database_backend"] = SupportedBackend.MONGODB

return values

@root_validator(pre=True)
def load_settings(cls, values):
"""
Expand Down
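The `use_real_mongo_override` validator above keeps old configurations working while steering users towards `database_backend`. Its logic can be tried in isolation with the plain-function sketch below. The function `apply_use_real_mongo_override` is a hypothetical stand-in that operates on an ordinary dict, not the pydantic validator itself.

```python
import warnings
from enum import Enum


class SupportedBackend(Enum):
    # Values mirror the enum added in this PR.
    ELASTIC = "elastic"
    MONGODB = "mongodb"
    MONGOMOCK = "mongomock"


def apply_use_real_mongo_override(values):
    """Hypothetical stand-in for the validator: translate the deprecated flag."""
    values = dict(values)  # work on a copy so the caller's dict is untouched
    use_real_mongo = values.pop("use_real_mongo", None)
    if use_real_mongo is not None:
        warnings.warn(
            "'use_real_mongo' is deprecated, please set the appropriate "
            "'database_backend' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if use_real_mongo:
            # A truthy flag wins over any explicitly configured backend.
            values["database_backend"] = SupportedBackend.MONGODB
    return values
```

Note the asymmetry: `use_real_mongo = true` overrides whatever `database_backend` was set to, while `use_real_mongo = false` only triggers the warning and leaves the configured backend alone.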
3 changes: 3 additions & 0 deletions optimade/server/data/__init__.py
Expand Up @@ -15,6 +15,9 @@
try:
with open(Path(__file__).parent / path) as f:
globals()[var] = bson.json_util.loads(f.read())

if var == "structures":
globals()[var] = sorted(globals()[var], key=lambda x: x["task_id"])
except FileNotFoundError:
if var != "providers":
raise
5 changes: 2 additions & 3 deletions optimade/server/entry_collections/__init__.py
@@ -1,4 +1,3 @@
from .entry_collections import EntryCollection
from .mongo import MongoCollection, client, CI_FORCE_MONGO
from .entry_collections import EntryCollection, create_collection

__all__ = ["EntryCollection", "MongoCollection", "client", "CI_FORCE_MONGO"]
__all__ = ("EntryCollection", "create_collection")
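The package now exports a `create_collection` factory in place of the concrete `MongoCollection`, so callers no longer import a backend-specific class. The signature of `create_collection` is not shown in this diff, so the sketch below uses a hypothetical `make_collection` and placeholder classes to show the general dispatch-on-backend pattern. Mapping both `mongodb` and `mongomock` to the same class is an assumption.

```python
from enum import Enum


class SupportedBackend(Enum):
    # Values mirror the enum added in this PR.
    ELASTIC = "elastic"
    MONGODB = "mongodb"
    MONGOMOCK = "mongomock"


class CollectionSketch:
    """Placeholder base class; not the package's EntryCollection."""

    def __init__(self, name):
        self.name = name


class MongoSketch(CollectionSketch):
    pass


class ElasticSketch(CollectionSketch):
    pass


# Assumption: real and mocked MongoDB share one collection class.
_BACKEND_CLASSES = {
    SupportedBackend.MONGODB: MongoSketch,
    SupportedBackend.MONGOMOCK: MongoSketch,
    SupportedBackend.ELASTIC: ElasticSketch,
}


def make_collection(backend, name):
    """Hypothetical factory: return a collection object for the backend."""
    try:
        collection_cls = _BACKEND_CLASSES[backend]
    except KeyError:
        raise NotImplementedError(f"Unsupported backend: {backend!r}") from None
    return collection_cls(name)
```

With this pattern, calling code such as `make_collection(SupportedBackend.ELASTIC, "structures")` stays the same whichever backend is configured.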