
Groundwork for porting the Plugins core entity to kong.db #3739

Merged
hishamhm merged 30 commits into next from feat/plugins-new-db-groundwork on Sep 11, 2018

Conversation

@hishamhm (Member) commented Sep 3, 2018

This is the first half of the work porting the Plugins core entity to kong.db. Plugins is the last pending core entity still using the old DAO; after this, the only entities left on the old DAO will be the deprecated Apis core entity and the non-core entities defined by plugins, which are on their way to being ported as well.

This PR contains a number of features and fixes needed to port Plugins to kong.db, the new DAO infrastructure. It contains all the work that can be merged prior to "flicking the switch": deleting the entity from kong.dao and adding it to kong.db.

It also contains the minimal changes to the test suite needed to keep tests passing. Tests exercising the new behaviors will appear in the second PR, in which the Plugins entity itself is ported over.

@thibaultcha (Member) left a comment

Went over commit by commit - the biggest blocker for me right now is the regression in the migrations that were updated to not use the DAO anymore. Mind having a look?

if not _constraints[field.reference] then
_constraints[field.reference] = {}
end
_constraints[field.reference][schema.name] = {

@thibaultcha (Member) Sep 4, 2018

It'd be nicer to use an array here instead of a dictionary, so we don't have to use pairs() in delete()
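For illustration, a minimal sketch of the array-based variant (field names assumed from the excerpt above):

if not _constraints[field.reference] then
  _constraints[field.reference] = {}
end
-- hypothetical: append entries so that delete() can walk them with
-- ipairs() instead of pairs()
local refs = _constraints[field.reference]
refs[#refs + 1] = {
  schema_name = schema.name,
  -- ...remaining constraint fields, as in the excerpt above
}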

@@ -53,6 +53,7 @@ local validation_errors = {
FUNCTION = "expected a function",
-- validations
EQ = "value must be %s",
NE = "value must not be %s",

@thibaultcha (Member) Sep 4, 2018

My 2 cents: neq might have been clearer for the reader.

SELECT *
FROM plugins
WHERE consumer_id = 00000000-0000-0000-0000-000000000000
]])

@thibaultcha (Member) Sep 4, 2018

Using db:query() like so will only select the first 1000 plugins, which results in an incomplete migration. Instead, we should use the iterate() API offered by the driver; see the sketch below.
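A sketch of the iterate() approach (paging behavior per lua-cassandra's coordinator API; the loop body is illustrative):

-- iterate() transparently pages through the whole result set instead
-- of stopping at the driver's default 1000-row page
for rows, err in coordinator:iterate([[
  SELECT * FROM plugins
  WHERE consumer_id = 00000000-0000-0000-0000-000000000000
]]) do
  if err then
    return nil, err
  end
  for _, plugin in ipairs(rows) do
    -- migrate each plugin row here
  end
end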

bottom = bottom or {}
for key, value in pairs(bottom) do
output[key] = value
end

@thibaultcha (Member) Sep 4, 2018

Do we really need the extra loop here, or couldn't output[key] = bottom[key] be done in the beginning of the next loop's body?

schemas[entity_name] = Entity.new(entity_schema)
local entity, err = Entity.new(entity_schema)
if not entity then
return nil, fmt("schema of entity '%s' is invalid: %s", entity_name,

@thibaultcha (Member) Sep 4, 2018

This would mean that there are errors that the MetaSchema won't catch, but Entity.new will? Is it only for the schema subclass of the DB entities? If so, maybe the MetaSchema should be responsible for validating such subclasses itself? This way we don't duplicate error handling here, which seems like a design flaw as it stands.

@hishamhm (Author, Member) Sep 5, 2018

This would mean that there are errors that the MetaSchema won't catch, but Entity.new will?

Yes. The MetaSchema checks that it's a valid Schema, not that it's a valid Entity. This was already the case before this PR; we just didn't catch those errors. (I hit this during development when I wrote a bad entity.)

If so, maybe the MetaSchema should be responsible for validating such subclasses itself? This way we don't duplicate error handling here, which seems like a design flaw as it stands.

My initial thought about this is that a MetaSchema-based validation of Entity structures would be doable with a "MetaEntity" subschema for the MetaSchema, in which the additional constraints are specified (which would be conceptually a pretty cool solution). That would require a little structural tweaking, though, so I would prefer to implement this improvement in a follow-up PR.

@thibaultcha (Member) Sep 10, 2018

My initial thought about this is that a MetaSchema-based validation of Entity structures would be doable with a "MetaEntity" subschema for the MetaSchema, in which the additional constraints are specified

Agreed! And yes, let's do this later.


cql = fmt(cql, ttl)
local cols = {}
local binds = {}

@thibaultcha (Member) Sep 4, 2018

Please initialize these variables below the args local declaration? And we should use the new_tab(fields_len) allocation (caching the #schema.fields value), even if we may end up using fewer slots than allocated.
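A sketch of that allocation, with the usual OpenResty fallback when table.new is unavailable:

local ok, new_tab = pcall(require, "table.new")
if not ok then
  new_tab = function() return {} end
end

-- preallocate to the (cached) number of schema fields; a few slots may
-- go unused, but we avoid rehashing on every insert()
local fields_len = #schema.fields
local args  = new_tab(fields_len, 0)
local cols  = new_tab(fields_len, 0)
local binds = new_tab(fields_len, 0)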

@@ -575,9 +561,19 @@ function _mt:insert(entity, options)
end

insert(args, serialize_arg(field, entity[field_name]))
insert(cols, field_name)
insert(binds, "?")

@thibaultcha (Member) Sep 4, 2018

Respecting the insert(cols); insert(binds); insert(args) order (as per the is_partitioned branch) might be helpful for the reader.

schema.name,
concat(cols, ", "),
concat(binds, ", "),
ttl and fmt(" USING TTL %d", ttl) or "")

@thibaultcha (Member) Sep 4, 2018

This function suddenly became more costly for all Cassandra versions. Was there no way to bind cassandra.unset instead?

@hishamhm (Author, Member) Sep 5, 2018

I haven't tried that approach; I just copied the old DAO approach, which worked on C* 2.2. Giving cassandra.unset a spin now; will update this PR if the tests pass.
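A sketch of what binding the sentinel might look like (lua-cassandra exposes cassandra.unset; it requires protocol v4, i.e. C* 2.2+):

local cassandra = require "cassandra"

local value = entity[field_name]
if value ~= nil then
  insert(args, serialize_arg(field, value))
else
  -- unset leaves the column untouched server-side, rather than
  -- writing a null (and therefore a tombstone)
  insert(args, cassandra.unset)
end
insert(cols, field_name)
insert(binds, "?")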

@thibaultcha (Member) Sep 10, 2018

Any success with cassandra.unset?

@thibaultcha (Member) Sep 10, 2018

It seems like this moved to query-build-time anyway

@hishamhm hishamhm force-pushed the feat/plugins-new-db-groundwork branch from 1d7e05e to 7632d69 Sep 5, 2018

hishamhm added some commits Aug 1, 2018

fix(db) return errors on iteration when using :each()
The implementation of `each` already considered returning
errors in the second argument, but that didn't work because
a `nil` first return value terminates the `for` iteration.

Returning `false, err` makes this work, and then client code
can check for the presence of `err` to determine an error
condition.
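Client code can then consume the iterator like this (a sketch of the pattern described above; `dao` is illustrative):

-- a `false` first value keeps the `for` loop alive, unlike `nil`,
-- so the error can be surfaced from inside the loop body
for row, err in dao:each() do
  if err then
    return nil, err
  end
  -- process `row` normally
end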
feat(api) add post-processing support for POST
This way we can perform post-processing operations (such as
reports) only when a POST succeeds. This allows us to keep the
post-processing semantics of the old DAO, and perform both
pre-actions (using `parent`) and post-actions.
fix(db) Postgres fixes for unique and foreign keys
* `collapse` expects a list of foreign keys
* better support for the case where a unique key is part of a key,
  such as ids in Plugins
fix(schema) subschema support improvements
Make more parts of the schema library aware of subschemas:

* avoid checking constraints of abstract fields when instantiating
  the parent entity (we rely on the checking of the subschemas)
* recursively process defaults of records declared via subschemas
* support passing an entity value to `schema:each()` so that
  we can use the subschema_key to check what subschema to use
  when iterating the fields. Concretely speaking, we'll need to
  know which plugin configuration to return in the `config` field
  of Plugins and for that we need to know the value of `name`.
feat(db) expose extract_pk_values at schema level
Move `extract_pk_values` from the Cassandra strategy to the schema library, so
it can be used in multiple places.
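Usage then becomes schema-level rather than strategy-level; a sketch (the strategy call is illustrative):

-- pull only the primary key fields out of a full entity, e.g.
-- { id = "..." } for a schema declaring primary_key = { "id" }
local pk = schema:extract_pk_values(entity)
local row, err = strategy:select(pk)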
feat(db) support on_delete annotation in schema and make Cassandra aware of it

* add support for specifying `on_delete` to schemas
* Postgres already supports "restrict" (default) and "cascade"
* Cassandra supports "restrict" (default), but as of this commit not
  unconditionally: if "cascade" is set, the Cassandra backend will for now
  do nothing, and the custom strategy implementation is responsible for
  implementing the cascade. (Eventually this should move into the Cassandra strategy.)
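In a schema definition, the annotation might look like this (hypothetical field, for illustration):

fields = {
  { service = { type = "foreign", reference = "services",
                on_delete = "cascade" } },  -- "restrict" is the default
}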
fix(db) use foreign key when generating foreign key methods
`generate_foreign_key_methods` was using the schema's own primary
key for validation instead of the proper foreign key.
That used to work because all keys happened to be the same,
`{ id }`. When a key is different (e.g. `{ id, name }`), the
mismatch becomes apparent.
refactor(db) expose deserialize_row for custom Cassandra strategy methods

This is in line with self.expand() in the Postgres strategy, which
is exposed and serves the same purpose.

(We should consider giving them the same name, perhaps?)
feat(schema) make metaschema fail if a field table is empty
Makes metaschema stricter, which is useful during development.
feat(schema) introduce legacy schemas
This feature is designed to support `api_id` in the database representations
of the Plugins entity using the new DAO with minimal coding.

Instead of adding to the new DAO support for relationships with old-DAO entities
like we did in the old DAO, we add support for creating "pseudo-schemas" for
old-DAO entities in the new DAO and marking them as `legacy`.

Legacy schemas only need to have their primary key defined, as they are only
used for foreign key validation in new-DAO entities. This allows using them
as `reference` entries in fields of type `"foreign"` in new-DAO entities.

The `legacy` marker allows us to skip these "pseudo-schemas" in places such as
the Admin API endpoint generator.
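A legacy pseudo-schema then needs little more than its primary key; a sketch (field details assumed):

-- pseudo-schema for an old-DAO entity: it exists only so that new-DAO
-- entities can declare fields of type = "foreign" against it
return {
  name        = "apis",
  primary_key = { "id" },
  legacy      = true,  -- skipped by e.g. the Admin API endpoint generator
  fields = {
    { id = { type = "string", uuid = true } },
  },
}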
feat(dao) add ttl, find_all and count with filtering in legacy wrapper
Forwards the ttl option to `:insert()`.

Filtering is low-performance compared to the old DAO, as it is done
in Lua land, but the legacy wrapper is a temporary measure anyway.
fix(db) Cassandra select_by_field argument needs to be serialized
A proper serialization of the input argument needs to take into account the
field type.
feat(db) generic cache key support in new DB
This combines the fast-path behavior of the old DAO's `cache_key`
and the generic behavior of the old DAO's `entity_cache_key`.
Having it in the new DAO allows us to handle cache keys in the
`crud` events propagation system in a general manner, while the
fast path avoids degrading performance for the plugin iterator.
feat(schema) add typedefs for no_consumer, no_route, no_service, no_api
* `no_consumer`, `no_route`, `no_service`, `no_api`, to be used in plugin
  overrides instead of `no_consumer = true` and the like in old-DAO schemas
feat(api) do not try to infer abstract records
Abstract records (i.e., plugin configuration) can only be inferred
after the plugin name is determined. For plugin PATCH, this needs
to be done by performing a read-before-write in its custom
endpoint operation.
feat(schema) adds a subschema_error field for custom error messages
Adds a subschema_error field for providing a custom, friendlier error
message when a subschema_key field does not resolve to a valid
subschema name.

This will be used in the Plugins entity, to provide a better error message
when a user attempts to use a plugin that is disabled, not installed, or
whose name simply contains a typo.
feat(schema) add "ne" validator for "not equal"
Add a counterpart to the `eq` validator, the `ne` validator.
Also, add special-case handling for when `eq` and `ne`
are set to `null`. Relax the types of both so that
they pass metaschema validation (the change isn't problematic,
since they only rely on raw equality).
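Declared in a field, the two validators look like this (hypothetical fields, for illustration):

fields = {
  { protocol = { type = "string", eq = "http" } },      -- must equal "http"
  { redirect = { type = "string", ne = "https" } },     -- must not equal "https"
  { legacy_opt = { type = "string", eq = ngx.null } },  -- special case: must be null
}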
fix(schema) fix error message of conditional validator
Fixes the nested error message (the driver of entity checkers
already invokes the `validation_errors` table), and improves
the message to refer to the field that triggered the conditional
validation.
fix(db) do not present "userdata: NULL" in error messages
Those were leaking all the way to the Admin API. This produces
much nicer error messages.
feat(db) add legacy schema for apis
This is used exclusively to build a foreign key in the upcoming
Plugins entity in the new DAO. Being a legacy schema, it does
not produce Admin API endpoints, nor is it otherwise directly
usable via the new DAO.
feat(api) do PUT with upsert semantics when using new DB entities via wrappers

This is needed by /apis/:apis/plugins, the last user of Plugins via
crud_helpers.

hishamhm and others added some commits Aug 7, 2018

feat(db) conditional read-before-write for partial update
When performing a partial update, some configurations of
a partial table (e.g. nested records) require a read-before-write
so that the rest of the record (internally stored as JSON)
is retrieved and can be correctly merged.

To achieve this, we add some coordination between the schema
library and the generic DAO:

* the schema library detects whether an update of an entity
  necessitates a read-before-write and informs the generic DAO
* the generic DAO performs read-before-write on update only
  if needed, and merges the old and new values with help of
  the schema
* the schema library provides a schema-aware merge function,
  so that records are merged, but array entities are not:
  for example, merging `{a = "foo"}` and `{b = "bar"}` should
  produce `{a = "foo", b = "bar"}`, but merging `{a = {1,2}}`
  and `{a = {4,5,6}, b = "bar"}` should produce
  `{a = {1,2}, b = "bar"}` (and not `a = {1,2,6}`, as a
  generic recursive table-merge function would do)

Co-authored-by: Enrique García Cota <kikito@gmail.com>
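A simplified sketch of the schema-aware merge described above (not the PR's exact code):

-- records merge key by key; arrays and scalars are replaced wholesale
-- by the value from `top`, matching the example above
local function merge_values(field, top, bottom)
  if top == nil then
    return bottom
  end
  if field.type ~= "record" or type(top) ~= "table"
                            or type(bottom) ~= "table" then
    return top
  end
  local output = {}
  for k, v in pairs(bottom) do output[k] = v end
  for k, v in pairs(top)    do output[k] = v end
  return output
end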
feat(schema) run entity checks on nested fields
This is needed so we can specify entity checks on plugin config
fields, such as `{ at_least_one_of = { "config.second", "config.minute",
"config.hour", "config.day", "config.month", "config.year" } }` for
rate-limiting.
fix(dao) ensure wrappers return errors with matching types
The old DAO returns errors as `nil, err_t`, and the new DAO
returns them as `nil, err, err_t`. This commit ensures the wrappers
return the table as the second argument. Since the table contents
are different and we make no efforts to reconcile this, we change
the `on_error` handler of the Admin API so that this function
can handle both old-style and new-style error tables, deferring
to the new-style handler when such a table is detected.
feat(db) manage composite cache_key values internally
The Plugins entity has a particular case of a composite uniqueness constraint
involving nullable values, which isn't natively supported by either Postgres
or Cassandra. The old DAO performed filtered searches on these composite-unique
entries in two places: when using the cache_key values to validate the
uniqueness of a plugin and in the plugins iterator. This had two problems:

1. To perform the composite uniqueness check, the old schema library gave full
   DAO access to the self-check function, which then performed a filtered search.
2. The old-DAO semantics of the filtered search were suboptimal, because it did
   an overly general search and then filtered out the undesired values in a loop.
   This added extra computation to the plugin iterator.

The new approach generates the unique cache_key string as an internal field in
the database, which can be unique-tested for in the usual way (solving problem 1)
and also be searched for precisely (solving problem 2).

For uniqueness check, in Postgres it uses native UNIQUE. For Cassandra,
the implementation is pretty much the same as the other uniqueness checks.
The management of a hidden column in those strategies is similar to that of
the ttl field for Postgres.

There is one interesting gotcha, which is how to keep it up to date on partial
updates. There is no alternative to doing a read-before-write, but we
need to do this for partial updates of the Plugins config field too, so this uses
the general mechanism between the schema and the generic DAO to determine
whether a given update needs a read-before-write (currently, the two situations
that trigger one at the generic DAO level are updates of nested records and
updates of composite cache-key fields).

At this point, this mechanism is only useful for the Plugins entity, and we
don't see it getting more widespread use in the future. We expect that if
the composite-uniqueness restriction is ever dropped from Plugins, this
composite-cache-key handling can be dropped from the DAO as well.
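Conceptually, the generated column just concatenates the unique fields into one deterministic string; a sketch (helper and field names hypothetical):

local concat = table.concat

-- nullable components become "" so absent values still produce an
-- unambiguous key, e.g. "plugins:rate-limiting:<route_id>::"
local function build_cache_key(schema, entity)
  local parts = { schema.name }
  for _, name in ipairs(schema.cache_key) do
    local value = entity[name]
    parts[#parts + 1] = value ~= nil and tostring(value) or ""
  end
  return concat(parts, ":")
end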

@hishamhm hishamhm force-pushed the feat/plugins-new-db-groundwork branch from 7632d69 to fc78985 Sep 10, 2018

@thibaultcha (Member) commented Sep 10, 2018

Builds succeeded even though the status is not updated in the PR - probably due to the current Travis issues.

@hishamhm hishamhm merged commit 4c53d38 into next Sep 11, 2018

0 of 2 checks passed:
continuous-integration/travis-ci/pr: The Travis CI build is in progress
continuous-integration/travis-ci/push: The Travis CI build is in progress

@hishamhm hishamhm deleted the feat/plugins-new-db-groundwork branch Sep 11, 2018

@thibaultcha (Member) commented Sep 11, 2018

BTW, it is nice to use a merge commit for such long-lived branches, as it makes the history somewhat more readable in those cases, and reverts get easier as well!
