feature: new data model first steps (Routes & Services) #3224

Merged
merged 69 commits into from
Feb 24, 2018

@thibaultcha thibaultcha commented Feb 13, 2018

Summary

This PR introduces the first steps of a new DAO implementation and a new data model. Two new entities that are part of the new data model have been implemented first: Routes and Services. It contains work conducted by @bungle, @hishamhm, @kikito, and myself.

The new entities, Routes and Services, are introduced to decouple responsibilities previously held solely by the "API" entity. The API entity is still supported (making this PR backwards-compatible) but considered deprecated. The new DAO proposes a better implementation of database abstraction for both PostgreSQL and Cassandra, as well as a better interface for specifying entity schemas and nicer Admin API endpoints, addressing issues with the previous implementations of those components.

Routes and Services concepts

By decoupling this entity, we hope to achieve a better separation of concerns in our model between what Kong considers as “downstream” and “upstream” in proxy jargon.

  • Routes describe how Kong matches a request. Routes are the entry-point for clients (downstream), and contain routing rules (hosts, HTTP methods, paths, possibly with regexes).
  • Services represent the user's upstream: these are the servers to which Kong proxies requests. A Service has a host, a port and possibly a path.

Or, with pictures, here is the previous model based on APIs:

[Figure: Routes and Services, old model based on APIs]

And how Routes and Services decouple responsibilities:

[Figure: Routes and Services, new model]

The major benefit of Routes and Services is the ability to specify plugins on a subset of endpoints without duplicating the API definition (and the upstream_url property) and the shared plugin instances.

Note how the new model gets rid of redundant information: in the old model, both API objects had the same upstream_url attribute, and the key-auth plugin had to be applied to both entities, with redundant settings.

In the new model, however, information pertaining to each route is stored in the Route entity, and upstream service information is stored in the Service entity. Note also that Plugin objects can be applied to either Routes (via route_id) or Services (via service_id). In the above example, the key-auth plugin attached to the Service will apply to all Routes that point to said Service.
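
The relationship described above can be illustrated with a minimal Python sketch. It is not Kong's Lua implementation: the class names, the `plugins_for` helper, and the sample identifiers are hypothetical, and Kong's real plugin resolution has more cases (Consumers, APIs, precedence) than shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Service:
    id: str
    host: str
    port: int = 80


@dataclass
class Route:
    id: str
    service_id: str  # foreign key to the Service this Route proxies to
    paths: List[str]


@dataclass
class Plugin:
    name: str
    route_id: Optional[str] = None    # set when attached to a Route
    service_id: Optional[str] = None  # set when attached to a Service


def plugins_for(route: Route, plugins: List[Plugin]) -> List[str]:
    """Names of plugins that apply to a request matched by `route`.

    A plugin applies if it is attached to the Route itself or to the
    Service the Route points to (assumed, simplified rule).
    """
    return [
        p.name
        for p in plugins
        if p.route_id == route.id or p.service_id == route.service_id
    ]
```

With two Routes pointing to the same Service, a single key-auth plugin on the Service covers both, and the upstream host is stored only once.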

Routes and Services schema

Routes
| Attribute | Description |
| --- | --- |
| id | (primary key) the route's UUID |
| created_at | (auto-generated, read-only) the creation timestamp |
| updated_at | (auto-generated, read-only) the update timestamp |
| protocols | a set of supported protocols, default is { "http", "https" } |
| methods | (semi-optional - see below) a set of accepted HTTP methods (e.g. GET, POST, etc.) |
| hosts | (semi-optional - see below) an array of hostnames, supporting a wildcard at the beginning or end (examples: foo.com, foo.*, *.foo.com) |
| paths | (semi-optional - see below) an array of path components, which may use regexes (examples: /request, /foo[0-9]*) |
| regex_priority | an integer priority for sorting routes containing regular expressions |
| strip_path | boolean (same as in the old model) |
| preserve_host | boolean (same as in the old model) |
| service | a foreign key linking it to a service entity |

At least one of methods, hosts or paths is required, in order to produce a routing rule for a Route.
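
As an illustration of the hosts attribute, here is a small Python sketch of matching a hostname against a pattern with a wildcard at the beginning or the end. The `host_matches` function is hypothetical and only mirrors the examples given for hosts (foo.com, foo.*, *.foo.com); Kong's router may treat edge cases differently.

```python
def host_matches(pattern: str, host: str) -> bool:
    """Match `host` against a pattern with an optional leading or trailing wildcard."""
    if pattern.startswith("*."):
        suffix = pattern[1:]  # ".foo.com"
        # The wildcard must stand for at least one character.
        return host.endswith(suffix) and len(host) > len(suffix)
    if pattern.endswith(".*"):
        prefix = pattern[:-1]  # "foo."
        return host.startswith(prefix) and len(host) > len(prefix)
    return host == pattern
```

For example, `host_matches("*.foo.com", "api.foo.com")` is true, while `host_matches("*.foo.com", "foo.com")` is false in this sketch because the bare domain has nothing for the wildcard to stand for.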

Services
| Attribute | Description |
| --- | --- |
| id | (primary key) the service's UUID |
| created_at | (auto-generated, read-only) the creation timestamp |
| updated_at | (auto-generated, read-only) the update timestamp |
| name | (must be unique) the service's name, giving it a friendlier identifier than the UUID |
| retries | integer (same as in the old model) |
| connect_timeout | integer (same as in the old model) |
| write_timeout | integer (same as in the old model) |
| read_timeout | integer (same as in the old model) |
| protocol | (required) the protocol used to connect to the service (default is "http") |
| host | (required) the service hostname or IP |
| port | the service port (default is 80) |
| path | the service's path component |
| url | (write-only, convenience pseudo-field) if this field is set when creating or updating a service, its value is split and used to set the protocol, host, port and path fields |

Note: the fields protocol, host, port and path replace the field upstream_url in the old API entity, storing each fragment of the service URL separately. For convenience, the url field can be used to set those values at once.
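
A rough Python sketch of what splitting the url pseudo-field could look like. The `split_service_url` helper is hypothetical, and the default port of 443 for https is an assumption (the description above only states a default of 80).

```python
from urllib.parse import urlsplit

# Port 80 comes from the schema description; 443 for https is an assumption.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_service_url(url: str) -> dict:
    """Split a service URL into the separately stored fields."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or None,  # no path component means the field is unset
    }
```

Calling `split_service_url("https://example.com:8443/v1")` yields the four fragments that replace the old upstream_url attribute.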

Full changelog

  • New Routes and Services entities with better separation of concerns
    • Plugins can be applied to Routes, Services, Consumers, or APIs (still) or a combination of all
    • Improved naming of our core entities and their attributes by using more generic/standard names
    • Routes with regex URIs have a priority field for modifying their evaluation order (APIs rely on their created_at field)
  • New DAO layer
    • Reduces (though does not entirely eliminate) some harmful patterns such as read-before-writes, various un-optimized queries, etc.
    • Nicer interface for developers (although the new DAO is not to be considered public/stable yet)
    • Better isolation between database-specific strategies: it will be easier for us to consider implementing support for new databases
    • Better extensibility for developers (although the new DAO is not to be considered public/stable yet)
  • New definitions for entity schemas
    • Enforcing purely declarative schemas, with more flexible rules that reduce the need for custom validation logic code
    • Schemas validate themselves via a "meta-schema": once this API is public, we will be able to catch developer errors when they write invalid schemas
  • New Admin API endpoints, with /routes and /services as their top-level prefixes. These new endpoints behave better:
    • Automatically generated CRUD endpoints for entities (with overriding capabilities)
    • Better response format (non-set fields are always visible via JSON NULL)
    • Better error messages: more consistent, more verbose, better suited for programmatic usage
    • Better support for compound types such as Arrays in both JSON and form-urlencoded content types (e.g. no more comma-separated arrays)
    • Ensure arrays in Lua do not get JSON-encoded as empty Objects {}, but empty Arrays []
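
To make the response-format points concrete, here is a minimal Python sketch of a serializer that always emits every schema field, using JSON null for unset ones, and keeps empty arrays as `[]`. The `render_entity` function and the field names passed to it are illustrative only, not part of the Admin API.

```python
import json


def render_entity(entity: dict, field_names: list) -> str:
    """Serialize an entity so that unset fields appear as JSON null."""
    # dict.get returns None for missing keys, which json.dumps encodes as null.
    body = {name: entity.get(name) for name in field_names}
    return json.dumps(body)
```

A client can then rely on every field being present in the response and on array-typed fields always decoding to arrays.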

Eventually, more entities will be moved to the new DAO layer and new Admin API/schemas. Those APIs will also be made available for public consumption by plugin developers as part of a later effort.

Issues resolved

TODO

@thibaultcha thibaultcha added this to the 0.13.0 milestone Feb 13, 2018
@thibaultcha thibaultcha changed the base branch from master to next February 13, 2018 23:16
@thibaultcha thibaultcha added the pr/ready (but hold merge) No more concerns, but do not merge yet (probably a conflict of interest with another PR or release) label Feb 14, 2018
@thibaultcha thibaultcha force-pushed the feat/model-routes-and-services branch from 337b878 to fb7afa8 Compare February 14, 2018 23:59
hishamhm and others added 19 commits February 19, 2018 15:50
A new schema validation engine, supporting various
data types and validations.

---

Adds a MetaSchema schema for validating schemas, to be used
especially when validating third-party plugin schemas.
Minor adjustments are made to the schema engine to support it.

---

Introduce a kong.db.schemas.typedefs module, storing common
type definitions. We use the name typedefs (and not types)
to make it explicit that these are type synonyms a la
typedefs in the C language, and not distinct types. In other
words, two type definitions that are structurally equal will
match identically as far as schema validation is concerned.

---

Turns the `fields` entry of a schema into an array, so that
each field is a one-key map. Adds a utility iterator,
`schema:each_field()` to make traversing fields easier.
Also updates tests accordingly.
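
A Python analogue of this layout, for illustration only: `fields` is a list of one-key mappings, which preserves declaration order, and a hypothetical `each_field` generator hides the unwrapping. The real iterator is the Lua method `schema:each_field()`; its exact return values are not shown in this commit message.

```python
def each_field(schema: dict):
    """Yield (name, definition) pairs from a list of one-key maps."""
    for entry in schema["fields"]:
        # Each entry must hold exactly one key; unpacking enforces that.
        (name, definition), = entry.items()
        yield name, definition


schema = {
    "fields": [
        {"id": {"type": "string"}},
        {"port": {"type": "integer"}},
    ]
}
```

Iterating with `for name, definition in each_field(schema)` visits `id` and then `port`, in declaration order.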

---

Much to Tony Hoare's dismay, we introduce in Kong schemas the
billion-dollar mistake [1] : nullable types.

We do it in full understanding of how terrible an idea this is,
and in full appreciation that we are interfacing with the world
of JSON, JavaScript and the web, a world which expects null
references to exist and which has a rather quaint view of type
systems [2].

In order to keep the beast contained as much as possible, we
shy away from performing any automatic coercions (at this
level, at least -- handling form-encoded inputs elsewhere in
Kong presents challenges of its own).

The semantics introduced in this commit are as follows:

* All fields are nullable by default
* A field can be made non-nullable by setting `nullable = false`
* Sub-values in compound fields (maps, arrays, sets and records)
  are NOT nullable.
* Foreign entries are subject to their own schema's nullability
  settings.
* No value compares as equal to ngx.null.
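
The first three rules can be sketched in Python as follows. This is an illustration under assumptions, not the schema engine's code: `NULL` stands in for `ngx.null`, the `null_error` function and its error strings are hypothetical, and the rules about foreign entries and equality comparisons are not modelled.

```python
NULL = object()  # stand-in for ngx.null

COMPOUND_TYPES = {"map", "array", "set", "record"}


def null_error(field_def: dict, value):
    """Return an error string if `value` violates the nullability rules, else None."""
    if value is NULL:
        # Fields are nullable unless `nullable` is explicitly false.
        return None if field_def.get("nullable", True) else "field cannot be null"
    if field_def.get("type") in COMPOUND_TYPES:
        items = value.values() if isinstance(value, dict) else value
        # Sub-values of compound fields are never nullable.
        if any(item is NULL for item in items):
            return "sub-values cannot be null"
    return None
```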

The humorous nature of this commit message is brought to
you by the fact that it will be eventually lost to the sands
of time when we squash-commit this branch.

[1] https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare
[2] https://www.bram.us/wordpress/wp-content/uploads/2015/01/js-equality.png

---

Strings cannot be empty unless said otherwise using `len_min = 0`.
This changes the behavior to effectively get rid of empty strings,
while leaving the door open for any future use-cases that may
eventually need them.
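
One way to picture this rule, as a Python sketch: treat `len_min` as defaulting to 1 when it is not declared. That default, the `check_string` name and the error text are assumptions made for illustration.

```python
def check_string(field_def: dict, value: str):
    """Reject empty strings unless the field declares len_min = 0."""
    len_min = field_def.get("len_min", 1)  # assumed default: non-empty
    if len(value) < len_min:
        return "length must be at least %d" % len_min
    return None
```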

---

Adds a `validate_update` function with proper semantics
for partial updates.

For symmetry, it also adds a `validate_insert` function, which
is equivalent to the current incarnation of `validate`.
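
The difference between the two modes can be sketched in Python. This is a simplified illustration, not the schema engine: the function bodies, the `required` and `type` keys and the error strings are assumptions. The insert check looks at every declared field; the update check looks only at the fields present in the partial payload.

```python
def validate_insert(fields: dict, entity: dict) -> dict:
    """Validate a full entity: every declared field is considered."""
    errors = {}
    for name, definition in fields.items():
        if name not in entity:
            if definition.get("required"):
                errors[name] = "required field missing"
        elif not isinstance(entity[name], definition["type"]):
            errors[name] = "wrong type"
    return errors


def validate_update(fields: dict, changes: dict) -> dict:
    """Validate a partial update: only the supplied fields are considered."""
    errors = {}
    for name, value in changes.items():
        if name not in fields:
            errors[name] = "unknown field"
        elif not isinstance(value, fields[name]["type"]):
            errors[name] = "wrong type"
    return errors
```

An update that touches only `port` is therefore not rejected for omitting a required `host`, whereas the same payload fails as an insert.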

Introduces `entity_checks` to schemas, which are declarative
checks designed to apply over multiple fields. Hopefully,
these will phase out `self.check`. Errors detected by
entity checks are available in `errors["@entity"]` and also
separately as a third argument.

Three entity checks are currently supported:

* `at_least_one_of` - at least one of the fields is given
* `only_one_of` - only one of the fields is given
* `conditional` - a case study of a more powerful entity
  check for testing the custom validator machinery: it
  works like "if/then": it tries a validation in one field,
  and if that passes, checks a validation in another field.
  Any one of the field validators can be composed with it.
  By capturing the notion of two-field interdependence,
  hopefully this will save us from creating a zillion
  similar validators. (And yes, I thought about adding an
  "else" mode -- will do it when the appropriate
  use-case arrives.)
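
The three entity checks can be approximated in Python as plain functions over an entity dictionary. The signatures below are hypothetical and chosen for readability; in the schema engine these checks are declared in the schema's `entity_checks` rather than called directly.

```python
def is_set(entity: dict, field: str) -> bool:
    return entity.get(field) is not None


def at_least_one_of(entity: dict, fields: list) -> bool:
    return any(is_set(entity, f) for f in fields)


def only_one_of(entity: dict, fields: list) -> bool:
    return sum(is_set(entity, f) for f in fields) == 1


def conditional(entity, if_field, if_check, then_field, then_check) -> bool:
    """If `if_check` passes on one field, `then_check` must pass on another."""
    if is_set(entity, if_field) and if_check(entity[if_field]):
        return is_set(entity, then_field) and then_check(entity[then_field])
    return True  # condition not triggered, nothing to enforce
```

A Route with only `paths` set passes `at_least_one_of(route, ["methods", "hosts", "paths"])`, which is the rule stated for Routes in the PR description.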

With all these features in hand, I uncommented the partial
schema validation from `update()` in the DAO, and adjusted some
tests accordingly.

Sorry about the long multi-topic commit, but due to the
interdependent nature of these features (no pun intended!)
it would be a lot of useless work to atomize this into multiple
commits in this branch. (I'd do it if these commits survived, though.)

---

An Entity is a restricted sub-type of Schema, which refers to
entities that can be persisted through the DAO.

An initial set of constraints is defined on Entity objects,
namely:

* no fields can be set to nil
* maps can only have strings as keys
* aggregate types only aggregate on basic types

These constraints can be later easily relaxed as we see fit
if we decide to give more power to DAO entities.

The general Schema feature set (including nil) is retained
as it is still useful for MetaSchema validation
(note that the MetaSchema is a more powerful sort of schema,
including for instance function fields).

The support for nil was renamed from `required = false` into
`nilable` (which is only available for the MetaSchema to use),
fixing the awkwardness of the fact that `required` was
actually a tri-state variable with different behaviors for
true, false and nil.

The Postgres strategy was adapted reflecting the change in the
routes schema.

---

feat(schema) allow sets to be indexed by their elements

When declaring a set, for set of strings,
you also get a my_set.my_entry shortcut syntax.
(We conservatively add this feature to strings only
because it cannot be applied to all types:
using this in sets of numbers would produce a conflict,
and some other types are not referentially transparent.)
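
A Python sketch of the idea, assuming the set is stored as a Lua-style table whose array part holds the elements at positions 1..n and whose extra keys are the elements themselves. That representation and the `make_indexed_set` helper are assumptions made for illustration.

```python
def make_indexed_set(elements: list) -> dict:
    """Build a table holding elements by position and by value."""
    table = {}
    for position, element in enumerate(elements, start=1):
        table[position] = element  # array part, 1-based as in Lua
        table[element] = True      # shortcut: table["http"] is True
    return table
```

With strings the two kinds of keys can never collide. With numbers they can: in a set containing 2 and then 1, key 1 would have to hold both the first element (2) and the membership flag for the element 1, which is the conflict the commit message mentions.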

---

feat(schema) `match_all`, `match_none`, `match_any` validators

Adds three new validators for lists of string patterns:
`match_all`, `match_none` and `match_any`.

This avoids introducing polymorphic validators to the schema engine,
making `match` and `not_match` operate on a single pattern only.
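
The semantics of the three list validators can be sketched in Python. Note the swapped-in technique: this sketch uses Python's `re` module, whereas the pattern syntax used by the schema engine is not stated here and is likely different. The function signatures are hypothetical.

```python
import re


def match_all(value: str, patterns: list) -> bool:
    """Every pattern must match."""
    return all(re.search(p, value) is not None for p in patterns)


def match_any(value: str, patterns: list) -> bool:
    """At least one pattern must match."""
    return any(re.search(p, value) is not None for p in patterns)


def match_none(value: str, patterns: list) -> bool:
    """No pattern may match."""
    return not match_any(value, patterns)
```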

---

feat(schema) always run validators in the same order

Counting this as a feature rather than a bugfix, since
it improves on the specification.
This makes our tests more reliable.

---

feat(schema) hard-code validators ordering

Instead of relying on their alphabetical sort, we order validators per
the following rules:

1. Most "narrow"/limiting validators first
2. Negative validators
3. Positive validators
4. Custom validators

This is to avoid having the error messages/validation of "simpler" rules
overridden by rules validating larger scopes.

We also avoid building the ordering of validators for each call to
`validate()`.
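
A Python sketch of a hard-coded ordering that is built once rather than on every call. The specific order below is illustrative only: the validator names are taken from elsewhere in these commit messages, `custom_validator` is a hypothetical name, and the real list may differ.

```python
# Defined once at module load, not rebuilt inside each validation call.
# Assumed order: narrow checks, then negative, then positive, then custom.
VALIDATOR_ORDER = [
    "len_min",
    "not_match",
    "match_none",
    "match",
    "match_all",
    "match_any",
    "custom_validator",
]


def ordered_validators(field_def: dict) -> list:
    """Return the validators declared on a field, in the fixed order."""
    return [name for name in VALIDATOR_ORDER if name in field_def]
```

Because the order no longer depends on table iteration or alphabetical sorting, the first error reported for a field is deterministic.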

---

Adds schemas for two entities of the new model:
routes and services.

----------------------------------------

fix(db) make service mandatory in route schema/specs
---

feat(db) basic __tostring in db.errors

---

tests(unit) fix mis-usage of busted
* [tmp] add 'spaced_newlines' arg to unindent()
* feat(db) luasocket support for postgres connector

    Used with init, init_worker, and CLI.
----------------------------------------

fix(cassandra) deserialize rows in `each()` API
Until our new DB module supports migrations, we will write such
migrations using the old DAO's migration mechanism. This means that
migrations need to be implemented manually for now.

This first migration implements our first two new entities from our new
model.
* tests(dao) add blueprints feature
* tests(dao) add blueprint:insert_n and remove http_route bp
* tests(blueprints) make existing tests pass
* tests(dao) implement basic blueprint support for all current entities
* refactor(utils) move deep_merge to utils
* tests(blueprints) move methods closer to their definitions
* tests(blueprints) add blueprints for all entities and plugins

---

refactor(blueprints) remove not needed fields from blueprints
---

fix(core) fix kong.init to call build_router with singletons.db instead of singletons.dao
fix(admin) paging and params parsing fixes

* proper foreign key api auto-generated endpoints and paging fixes
* parse_params fixes to pass tests again and small cleanups
* ngx.null is encoded/decoded as ""
* true and false as "true" and "false"
* arrays as x[1]=a&x[2]=b
* add Services placeholder for DB and strategies
* remove custom routes dao methods as they are auto-generated now
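
The form-encoding conventions listed above (null as an empty string, booleans as "true"/"false", arrays as `x[1]=a&x[2]=b`) can be sketched in Python. The `encode_form` helper and the `NULL` sentinel standing in for `ngx.null` are hypothetical, and URL-escaping is left out for brevity.

```python
NULL = object()  # stand-in for ngx.null


def _scalar(value) -> str:
    if value is NULL:
        return ""
    if value is True:
        return "true"
    if value is False:
        return "false"
    return str(value)


def encode_form(params: dict) -> str:
    """Encode parameters using 1-based bracketed indexes for arrays."""
    pairs = []
    for key, value in params.items():
        if isinstance(value, list):
            for index, item in enumerate(value, start=1):
                pairs.append(f"{key}[{index}]={_scalar(item)}")
        else:
            pairs.append(f"{key}={_scalar(value)}")
    return "&".join(pairs)
```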
---

feat(handler) add back the support for http to https upgrade headers

---

tests(router) fix tests using kong.db Routes and Services
hishamhm and others added 21 commits February 19, 2018 16:11
* add api_id support in Plugins schema
* update DB migrations to _not_ delete the api_id column
* add validation logic around specifying both of api_id and service_id
  or route_id
* appropriate unit tests
These tests were failing, due to the changes on how query strings are
handled in the new `feat/routes-and-services` branch.
Contains squashed changes to plugins from:

- Aapo Talvensaari <aapo.talvensaari@gmail.com>
- Enrique García Cota <kikito@gmail.com>
- Hisham Muhammad <hisham@gobolinux.org>
- Thibault Charbonnier <thibaultcha@me.com>

Signed-off-by: Thibault Charbonnier <thibaultcha@me.com>
Since they are tables in the old DAO, this would fail with a super-sed:
error "bad argument #2 to 'assert'"
…vices

This adds tests in the legacy test suite to check that anonymous reports
for plugins added to `apis` take place (using both `/plugins` and
`/apis/:api_id/plugins`), and tests to the main test suite
to check for similar functionality for `routes` and `services` (using
`/plugins`).

These tests run fine locally but intermittently fail on Travis.
They require port 61829 to be available, which is on the range of
ephemeral ports, they are thus marked as #flaky.
@thibaultcha thibaultcha force-pushed the feat/model-routes-and-services branch from fb7afa8 to 1fbfda2 Compare February 20, 2018 00:11
bungle and others added 4 commits February 23, 2018 20:30
Cassandra 2.x uses a different schema for its system schema - we need to
tweak the way we run some queries. Major version of the cluster is
retrieved in the `init()` phase of the DB connector.
@thibaultcha thibaultcha merged commit e828b6d into next Feb 24, 2018

kswen commented Mar 8, 2018

So how does the ring-balancer work with the routes/services model?

There's no targets node?

@thibaultcha

@KwSen The load balancer hasn't changed in behavior. It gets triggered for Services the same way it does for APIs as of 0.12: when the hostname (of the API or Service in question) is the same as that of one of the registered Upstreams.
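
A rough Python sketch of that rule, for illustration only. The `make_resolver` helper is hypothetical, and plain round-robin is used here as a stand-in for Kong's ring-balancer. If the Service's host equals the name of a registered Upstream, a target is picked from that Upstream; otherwise the host is used as-is.

```python
from itertools import cycle


def make_resolver(upstreams: dict):
    """`upstreams` maps an Upstream name to its list of target addresses."""
    rings = {name: cycle(targets) for name, targets in upstreams.items()}

    def resolve(host: str, port: int) -> str:
        if host in rings:
            # Host matches a registered Upstream: balance across its targets.
            return next(rings[host])
        # No matching Upstream: proxy to the host directly.
        return f"{host}:{port}"

    return resolve
```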

@bungle bungle deleted the feat/model-routes-and-services branch March 22, 2018 17:29