Cassandra schema possible improvements #18

Closed
thibaultcha opened this issue Feb 19, 2015 · 11 comments
Comments

@thibaultcha
Member

Mainly just writing thoughts down here, and trying to discuss the limitations of a future schema. Not a priority at the moment.

The current schema was built using different column families for accounts, applications, apis, plugins. We should probably handle relations the way Cassandra handles them:

Relations

CREATE TYPE applications(
  public_key text, -- This is the public key
  secret_key text, -- This is the secret key, it could be an apikey or basic password
  created_at timestamp
);

CREATE TABLE IF NOT EXISTS accounts(
  id uuid,
  provider_id text,
  applications set<frozen<applications>>, -- UDTs inside a collection must be frozen
  created_at timestamp,
  PRIMARY KEY (id)
);

CREATE INDEX ON accounts(applications);

Here, the index would allow us to query the accounts table by application's values (especially public_key, as it is the only value that will be queried), but it needs to happen like this:

SELECT * FROM accounts WHERE applications CONTAINS ('abcd'); -- 'abcd' being a public_key
  • This model is a better fit for relations in Cassandra, less data duplication
  • Not 100% sure about the efficiency of querying a User Defined Type column vs. a text field like currently. Also as mentioned, not sure if a set can be paginated if it has a lot of entities...
  • We still have to check the uniqueness of a public_key, as we do currently

The same applies for plugins. They are currently a table on their own, but a plugin is attached to an API, and optionally to an application.

Community plugins

Plugins from the community will have to use the value property and encode their data to store things. We could provide them with a way of creating a table, or a UDT.

@thibaultcha
Member Author

Such a schema change could also fix the ugly plugin selection problem that was "fixed" by bb83890

@subnetmarco
Member

Credentials

I am not happy with the way we handle credentials, and I think we should think more carefully about how we want to support them. Something like the following schema could make sense:

We do have the following generic core entities:

  • accounts, the base entity that owns zero or more applications.
  • applications, generic credential holder, that can hold different credential types

Each authentication plugin can create a credential type on the datastore, like:

  • query
  • basic
  • header
  • ldap
  • oauth

This means that we need to introduce the possibility for plugins to edit the datastore during their installation using a DSL.

Plugins DB DSL

If plugins can modify the datastore, that should be done through a DSL instead of plain SQL, to prevent illegal operations on the datastore. An example could be:

return { create = {
    {
      type = "table",
      name = "ldap",
      properties = [[
        id uuid,
        key text,
        created_at timestamp,
        PRIMARY KEY (id)
      ]]
    },
    {
      type = "datatype",
      name = "ldap_credential",
      properties = [[
        public text,
        secret text
      ]]
    }
  }
}

With the DSL we limit the operations a plugin can execute on the datastore, ruling out things like deleting other tables or modifying existing data.

We could also implement a rollback function to execute DELETE statements on whatever datastore entity has been created during the provisioning of the plugin (it needs to be explicit because it could cause loss of data).
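To make the idea concrete, here is a minimal sketch of how the `create` list above could be checked before anything touches the datastore, and how the list of entities to roll back could be derived from it. `ALLOWED_TYPES`, `validate` and `rollback_plan` are hypothetical names for this illustration, not existing Kong functions.

```lua
-- Hypothetical validator for the `create` list sketched above.
-- ALLOWED_TYPES and both function names are assumptions, not Kong APIs.
local ALLOWED_TYPES = { table = true, datatype = true }

-- Returns true, or nil plus an error message for the first invalid entry.
local function validate(definition)
  if type(definition) ~= "table" or type(definition.create) ~= "table" then
    return nil, "definition must contain a 'create' list"
  end
  for i, entity in ipairs(definition.create) do
    if not ALLOWED_TYPES[entity.type] then
      return nil, "entry " .. i .. ": unsupported type '" .. tostring(entity.type) .. "'"
    end
    if type(entity.name) ~= "string" or not entity.name:match("^[%a_][%w_]*$") then
      return nil, "entry " .. i .. ": invalid name"
    end
  end
  return true
end

-- Lists what an explicit rollback would have to remove, newest entity first.
local function rollback_plan(definition)
  local plan = {}
  for i = #definition.create, 1, -1 do
    local entity = definition.create[i]
    plan[#plan + 1] = { type = entity.type, name = entity.name }
  end
  return plan
end

local ldap = {
  create = {
    { type = "table", name = "ldap", properties = "id uuid, PRIMARY KEY (id)" },
    { type = "datatype", name = "ldap_credential", properties = "public text, secret text" },
  }
}

local ok, err = validate(ldap)
local rejected, reason = validate({ create = { { type = "drop", name = "accounts" } } })
```

Anything that is not a whitelisted entity type is rejected up front, and the rollback plan is only a list: actually executing it would stay an explicit operator action, since it can destroy data.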

@subnetmarco
Member

This is even nicer: the creation of those entities could be implicit, and we would not accept DB-specific constructs, so that other datastores can be supported in the future.

{
  entities = {
    { 
      name = "ldap",
      properties = {
        { name = "id", type = "id" },
        { name = "key", type = "string" },
        { name = "created_at", type = "timestamp" },
        { name = "type", type = "ldap_credential" },
        { primary = "id"}
      }
    },
    {
      name = "ldap_credential",
      properties = {
        { name = "public", type = "string", unique = true },
        { name = "secret", type = "string" }
      }
    }
  }
}

The example above is a quick demonstration, but a DSL like this could technically be ported to any datastore without having to update the plugins if a new DAO is being introduced. The DAO will take care of translating the DSL to an executable statement.

And as long as the DSL is verbose enough, the DAO can then decide to handle edge-cases (like treating child entities like ldap_credential as datatypes in Cassandra, or just another table in other datastores - it's up to the DAO).
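As a rough illustration of the DAO side, the sketch below turns one entity in the format above into a CQL statement. The type mapping in `CQL_TYPES`, the function name `to_cql`, and the choice to store child entities as frozen UDTs are all assumptions made for this example; another DAO would make its own choices.

```lua
-- Hypothetical mapping from DSL types to CQL types (an assumption for this sketch).
local CQL_TYPES = { id = "uuid", string = "text", timestamp = "timestamp" }

-- Builds a CREATE TABLE statement from one DSL entity.
-- Types missing from CQL_TYPES are assumed to name child entities,
-- which this sketch stores as frozen user-defined types.
local function to_cql(entity)
  local columns, primary = {}, nil
  for _, prop in ipairs(entity.properties) do
    if prop.primary then
      primary = prop.primary
    else
      local cql_type = CQL_TYPES[prop.type] or ("frozen<" .. prop.type .. ">")
      columns[#columns + 1] = prop.name .. " " .. cql_type
    end
  end
  if primary then
    columns[#columns + 1] = "PRIMARY KEY (" .. primary .. ")"
  end
  return "CREATE TABLE IF NOT EXISTS " .. entity.name ..
         "(" .. table.concat(columns, ", ") .. ");"
end

local ldap = {
  name = "ldap",
  properties = {
    { name = "id", type = "id" },
    { name = "key", type = "string" },
    { name = "type", type = "ldap_credential" },
    { primary = "id" },
  }
}

local statement = to_cql(ldap)
print(statement)
```

The plugin never sees CQL here; only the DAO knows that `string` becomes `text` or that a child entity becomes a UDT.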

@subnetmarco
Member

On a side note, the more I look at the DSL above, the more it resembles the schemas we already have. If we decide to implement a DSL-to-DB translation, I wonder if we could generate the migration files automatically by parsing all of the schemas.

@thibaultcha
Member Author

Relations

  • Edited original comment ^ to raise the question of paginating a set, i.e. the applications of an account.

Credentials

  • Each credential could be a UDT for Cassandra. If we have a DSL, we need to make sure we can handle this for other DBs.

DSL

I like the idea, just time consuming for DAOs to implement. Otherwise:

  • Could even skip the migration file creation. Migration files could be DAO agnostic and simply DSL files executed on the go.
  • If we are talking migrations, the DSL also needs to be able to ALTER or DROP, and that does not protect the DB against malicious plugins either as we discussed before.
  • What does protect against that is a plugins-only DSL, that just offers the possibility to create UDTs for the plugins table.
  • But at the end of the day, any Lua code executed under a Kong instance can simply require the factory and call the drop method @thefosk. It's up to us to distribute valid, trusted plugins.
  • Official, trusted plugins could be released and signed by PGP.
  • And our DSL could just do anything it wants, since an official plugin is trusted.

@subnetmarco
Member

Regarding the DSL, a few points:

  • We could automatically prefix any entity that is being created with the plugin name, like basicauth.keys. This is a good idea for two reasons:
    • Avoid name clashes if two plugins want to use the same table name.
    • Make sure that a plugin can't change anything that doesn't belong to itself.
  • There may be a way to limit the scope of drop or other reserved calls just to some special packages. Thus we can block every call that comes from the kong.plugins package.
  • We should still sign plugins and avoid running those that are not authorized.
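The prefixing idea from the first point could look roughly like the sketch below. Both helper names are hypothetical, and the `.` separator simply follows the `basicauth.keys` example.

```lua
-- Hypothetical helper: returns copies of the declared entities with the
-- plugin name prefixed to each entity name. The input is left untouched.
local function namespace_entities(plugin_name, entities)
  local namespaced = {}
  for i, entity in ipairs(entities) do
    local copy = {}
    for k, v in pairs(entity) do copy[k] = v end
    copy.name = plugin_name .. "." .. entity.name
    namespaced[i] = copy
  end
  return namespaced
end

-- Hypothetical ownership check: a plugin may only touch entities
-- whose name starts with its own prefix.
local function owns(plugin_name, entity_name)
  local prefix = plugin_name .. "."
  return entity_name:sub(1, #prefix) == prefix
end

local declared = { { name = "keys" }, { name = "sessions" } }
local created = namespace_entities("basicauth", declared)
print(created[1].name, owns("basicauth", created[1].name))
```

Two plugins that both declare a `keys` entity end up with distinct names, and the same prefix gives the DAO a cheap check before it runs any statement on a plugin's behalf.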

@thibaultcha
Member Author

Too bad we're not using PostgreSQL: http://leafo.net/lapis/reference/database.html#database-schemas

A great contribution would be implementing a Cassandra adapter for Lapis, as mentioned in #80.

@subnetmarco
Member

Cassandra 3.0 will support this: https://issues.apache.org/jira/browse/CASSANDRA-8473

@thibaultcha
Member Author

Following the discussion we had yesterday, here are the decisions we took:

  • accounts: renamed to consumers.
    • id: same
    • provider_id: renamed to custom_id (same purpose), required if no username
    • username: required if no provider_id
    • extra: maybe a field for extra information
  • apis: nothing new
  • plugins:
    • They can plug themselves into the lifecycle of a request (this hasn't changed)
    • They can access the consumers table
    • They can expand the DB to add tables and perform additional queries of their own
    • They can expand the API routes
    • Once installed, one can create configuration(s) for that plugin. Each configuration entry is linked to an api and, optionally, a consumer. This allows a plugin to be enabled on an API, and to be overridden for a specific consumer.
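
The override rule in the last point can be sketched as a lookup: a consumer-specific entry wins over the API-wide one. The field names (`plugin`, `api_id`, `consumer_id`, `value`) and the function name are assumptions for this illustration, and an in-memory list stands in for the datastore.

```lua
-- Hypothetical lookup over plugin configuration entries.
-- Returns the consumer-specific entry if one exists, otherwise the
-- API-wide entry (no consumer_id), otherwise nil.
local function resolve_configuration(configurations, plugin, api_id, consumer_id)
  local api_wide
  for _, conf in ipairs(configurations) do
    if conf.plugin == plugin and conf.api_id == api_id then
      if consumer_id ~= nil and conf.consumer_id == consumer_id then
        return conf -- consumer-specific entry wins
      elseif conf.consumer_id == nil then
        api_wide = conf
      end
    end
  end
  return api_wide
end

local configurations = {
  { plugin = "ratelimiting", api_id = "api-1", value = { limit = 100 } },
  { plugin = "ratelimiting", api_id = "api-1", consumer_id = "consumer-1", value = { limit = 1000 } },
}

local for_consumer = resolve_configuration(configurations, "ratelimiting", "api-1", "consumer-1")
local for_others = resolve_configuration(configurations, "ratelimiting", "api-1", "consumer-2")
print(for_consumer.value.limit, for_others.value.limit)
```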

We started talking about having a whitelist/blacklist for configuration entries (to be able to enable/disable a configuration entry for a lot of apis/consumers at once), but this ran into implementation issues, as illustrated in the following picture.

This discussion was related to #50 (Plugins system), #91 (refactor applications), #93 (Plugins API), #98 (Better API routing)

Here is a pic of the whiteboard:

img_4229

@thibaultcha
Member Author

The improvements described in the previous comment are implemented, apart from:

  • plugins expand the DB to add tables and perform additional queries of their own
  • plugins expand the API routes

Those things need to be done in order to provide a good development environment for plugins but will be part of another discussion: #93.

@subnetmarco
Member

Plugins do expand the API routes; this has been implemented: https://github.com/Mashape/kong/blob/master/kong/api/app.lua#L76

We are waiting for the DAO part to have complete separation.
