Integrate unified backends #416
Merged
quitterie-lcs merged 30 commits into master on Oct 12, 2023
Leobouloc
reviewed
Sep 12, 2023
Here is a partial review (up to statements.py)
SergioSim
reviewed
Sep 19, 2023
SergioSim
reviewed
Sep 21, 2023
wilbrdt
commented
Oct 2, 2023
Leobouloc
reviewed
Oct 3, 2023
This is neat! Looking forward to this being merged.
We add the OpenStack Swift data backend implementation. With the `data` parameter changed to an Iterable, we can no longer use the high-level `SwiftService` API to upload files (it needs a file object source, not an iterable). Switching to the lower-level `Connection` API, which is more flexible.
We add the LDP data backend implementation that is mostly taken from the existing LDPStorage backend.
Add S3 backend under the new common `data` interface
Add ClickHouse backend under the new common `data` interface. With ClickHouse under the new data interface, tests are updated as well. Storage and Database backends had similar interfaces and usage, so a new Data Backend interface has been created.
We want to simplify our tests that mock the `requests` package. Therefore we choose to use the `requests_mock` library.
We add the ES data backend implementation that is mostly taken from the existing ESDatabase backend.
We want to provide an LRS backend that could be added to Ralph's LRS without any additional dependencies.
Methods `read_raw` and `parse_bytes_to_dict` are generic and used by multiple backends. Moving them to file `utils.py`.
Add asynchronous base interface for async backends such as async_es or async_mongo
We add the async Elasticsearch data backend, mostly taken from the sync backend but using the async Elasticsearch methods.
We want to improve the current mongo data backend implementation to align it more with other unified data backends.
We want to provide an async version of our MongoDatabase backend.
- The `get_query` method for Elasticsearch would be better namespaced under the ESLRSBackend. Changing it to a static method instead of a global function.
- At initialization, data backends can either take settings or None. Setting `settings_class` to Optional to anticipate mypy warnings once mypy is added.
- Piping `x | None` is preferred since Python 3.10. Changing from `Optional[x]` to `Union[x, None]` for backends, as it will be easier to switch to pipes.
- Changes to backend method docstrings
- Rename variable `new_documents` to be more explicit
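As a minimal sketch of the typing point above: `Optional[x]` and `Union[x, None]` describe the same type, so the change is purely notational. `ExampleSettings` and `ExampleBackend` are hypothetical names used for illustration only, not classes from this pull request.

```python
from typing import Optional, Union


class ExampleSettings:
    """Hypothetical settings object (illustration only)."""

    def __init__(self, host: str = "localhost") -> None:
        self.host = host


class ExampleBackend:
    """Hypothetical backend accepting either settings or None."""

    def __init__(self, settings: Union[ExampleSettings, None] = None) -> None:
        # Fall back to default settings when none are provided.
        self.settings = settings if settings is not None else ExampleSettings()


# Both spellings compare equal: they are the same type to the typing module.
same_type = Optional[ExampleSettings] == Union[ExampleSettings, None]
default_host = ExampleBackend().settings.host
```

On Python 3.10 and later, the same annotation can be written `ExampleSettings | None`.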
Synchronous backends such as Elasticsearch or Mongo need their connection to be closed when finished. This commit adds an abstract method `close` to the BaseDataBackend interface, and implements it for backends that need it.
The update to a recent version of `motor` highlighted a bug on our side when listing collections. We now iterate asynchronously over the collections list.
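The idea of an abstract `close` method could be sketched as follows. This is a simplified stand-in, not the actual Ralph interface: the `BaseDataBackend` below only carries the `close` method, and `FakeClient` and `ExampleDataBackend` are hypothetical names introduced for the example.

```python
from abc import ABC, abstractmethod


class BaseDataBackend(ABC):
    """Simplified stand-in for the data backend interface (illustration only)."""

    @abstractmethod
    def close(self) -> None:
        """Close the backend's client connection."""


class FakeClient:
    """Hypothetical client that only records whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class ExampleDataBackend(BaseDataBackend):
    """Hypothetical backend that owns a client connection."""

    def __init__(self) -> None:
        self.client = FakeClient()

    def close(self) -> None:
        # Release the underlying connection once the backend is done.
        self.client.close()


backend = ExampleDataBackend()
try:
    pass  # read or write data here
finally:
    # Always close the connection, even if reading or writing failed.
    backend.close()
```

Because `close` is abstract, a backend class that does not implement it cannot be instantiated, which makes the omission visible early.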
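The general pattern of consuming an asynchronous iterator with `async for` might look like the sketch below. It does not call `motor`: `fake_list_collections` is a hypothetical stand-in async generator, and the collection names are made up for the example.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_list_collections() -> AsyncIterator[dict]:
    """Hypothetical stand-in yielding collection descriptions one at a time."""
    for name in ("statements", "archive"):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield {"name": name}


async def collection_names() -> List[str]:
    """Collect names by iterating asynchronously over the collections."""
    names = []
    async for collection in fake_list_collections():
        names.append(collection["name"])
    return names


names = asyncio.run(collection_names())
```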
With the new data backend interface, settings are now close to each backend and not under general conf.py. Unifying stream backend WS to have the same architecture as data backends.
With the new data backend interface, settings are now close to each backend and not under general conf.py. Unifying HTTP backends to have the same architecture as data backends.
After unifying database and storage backends under a common interface, backends settings are now handled directly alongside the backends classes. Modifying the CLI to support new settings and new backends interfaces.
With the addition of new asynchronous backends, it could be useful to be able to use them in the CLI.
Adding a default value for the ClickHouse client option `allow_experimental_object_type` highlights a pydantic validation error with type `Literal[0,1]`. Switching to `conint`.
With the addition of unified backends and changes to the conf files, the API router needs some changes to be able to get the backend instances.
With the addition of unified backends, the API router needs some changes to be able to use asynchronous backends.
Tests using the filesystem failed with pyfakefs in the CI, as pyfakefs does not succeed in creating the requested files in the default directory path. The latter is now defined specifically for these tests and forced to be used in the ralph command.
Environment variables `RALPH_BACKENDS__DATABASE__ES__*` have been renamed to `RALPH_BACKENDS__DATA__ES__*`. Changing them in the `tray`.
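A minimal sketch of the rename, for anyone migrating an existing environment mapping. Only the two prefixes come from the commit message above; the helper `rename_es_variables` and the variable suffixes and values in the example are hypothetical.

```python
OLD_PREFIX = "RALPH_BACKENDS__DATABASE__ES__"
NEW_PREFIX = "RALPH_BACKENDS__DATA__ES__"


def rename_es_variables(env: dict) -> dict:
    """Return a copy of `env` with the old ES prefix replaced by the new one."""
    renamed = {}
    for name, value in env.items():
        if name.startswith(OLD_PREFIX):
            # Keep the suffix, swap only the prefix.
            name = NEW_PREFIX + name[len(OLD_PREFIX):]
        renamed[name] = value
    return renamed


old_env = {
    "RALPH_BACKENDS__DATABASE__ES__EXAMPLE": "value",  # hypothetical suffix
    "UNRELATED_VARIABLE": "unchanged",  # hypothetical variable
}
new_env = rename_es_variables(old_env)
```

Variables that do not start with the old prefix pass through unchanged.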
Changed
- Refactor `database` and `storage` backends under the unified `data` backend interface [BC]
- Refactor LRS `query_statements` and `query_statements_by_ids` backends methods under the unified `lrs` backend interface [BC]
Following the `pylint` upgrade to version > 3.0, a false negative was corrected, resulting in many warnings about methods having too many arguments. Escaping these warnings.


Purpose
After unifying `database` and `storage` backends under a common `data` interface, backends settings and methods have changed.
Proposal
- `Stream` backends settings alongside backend class
- `HTTP` backends settings alongside backend class
- `database` and `storage` backends
- `conf.py` to `backends/conf.py` to avoid a circular import (e.g. `conf.py` -> `es.py` -> `conf.py`)
- `_class_path` attributes from backends settings
- `utils.py` methods to get backend type and backend instance
- `cli.py` to use new unified backends
- `cli.py` to be able to use async backends
- `unify-data-backends`
- `utils` functions