Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to define mapping policy for new fields #1257

Merged
merged 49 commits into from Apr 25, 2019

Conversation

@Aschen
Copy link
Contributor

commented Mar 7, 2019

What does this PR do ?

Adds the ability to define the dynamic property for collection mapping.
This property can take 3 values:

  • "true": Stores the document and updates the collection mapping with inferred type
  • "false": Stores the document and does not update the collection mapping (fields are not indexed)
  • "strict": Rejects document

See https://www.elastic.co/guide/en/elasticsearch/guide/current/dynamic-mapping.html

This is configurable in the kuzzlerc within the server.db.dynamic property.
The default value is true which allows the user to push arbitrary documents to Kuzzle and Elasticsearch will try to infer the corresponding mapping. (This is the current behaviour in Kuzzle)

Admin-Console (wip): kuzzleio/kuzzle-admin-console#502

Documentation: kuzzleio/documentation#271

Other changes

Fixed the usage of the _meta property for ES collection. This property must be passed to each call to putMapping.
See #1268

chmod 777 on node_modules/ because I'm tired of doing sudo chmod +x node_modules/.bin/* to run test locally

How should this be manually tested?

  • Step 1 : Create a collection with the dynamic: 'strict' and no mapping
  • Step 2 : Try to create a document with a random field
@codecov-io

This comment has been minimized.

Copy link

commented Mar 8, 2019

Codecov Report

Merging #1257 into 1-dev will increase coverage by <.01%.
The diff coverage is 98.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##            1-dev    #1257      +/-   ##
==========================================
+ Coverage   93.83%   93.84%   +<.01%     
==========================================
  Files          98       98              
  Lines        6814     6840      +26     
==========================================
+ Hits         6394     6419      +25     
- Misses        420      421       +1
Impacted Files Coverage Δ
lib/api/controllers/controller.js 100% <100%> (ø) ⬆️
lib/api/core/indexCache.js 97.64% <100%> (-0.09%) ⬇️
lib/services/internalEngine/index.js 82.55% <100%> (+0.99%) ⬆️
lib/api/controllers/collectionController.js 100% <100%> (ø) ⬆️
lib/api/core/janitor.js 88.94% <100%> (ø) ⬆️
lib/util/esWrapper.js 96.22% <100%> (ø) ⬆️
lib/services/elasticsearch.js 93.58% <96%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3258c0f...36907c0. Read the comment docs.

Aschen added some commits Mar 8, 2019

Merge branch 'add-ability-to-restrict-collection-mapping' of github.c…
…om:kuzzleio/kuzzle into add-ability-to-restrict-collection-mapping

@Aschen Aschen removed the wip label Mar 8, 2019

@scottinet
Copy link
Member

left a comment

Apart from my comments, I have a couple issues with that PR:

  1. It seems that the new parameter is applied only to new or updated collections: existing ones are left unaffected. This is very confusing and I'm not sure this is the expected behavior. You should probably have to check the mapping for all collections and apply the parameter to all collections with a different value at initialization.
  2. What is the exact use case? Because I wonder if it's a global parameter we need, or a per index one, or a mix of both
Show resolved Hide resolved .kuzzlerc.sample Outdated
Show resolved Hide resolved lib/services/elasticsearch.js Outdated
Show resolved Hide resolved lib/services/elasticsearch.js Outdated

@Aschen Aschen added the wip label Mar 12, 2019

Aschen added some commits Mar 13, 2019

Merge branch 'add-ability-to-restrict-collection-mapping' of github.c…
…om:kuzzleio/kuzzle into add-ability-to-restrict-collection-mapping

@Aschen Aschen removed the wip label Mar 26, 2019

Aschen added some commits Mar 26, 2019

nit
Show resolved Hide resolved .kuzzlerc.sample
Show resolved Hide resolved .kuzzlerc.sample
Show resolved Hide resolved .kuzzlerc.sample
Show resolved Hide resolved .kuzzlerc.sample
Show resolved Hide resolved lib/services/elasticsearch.js
Show resolved Hide resolved lib/services/internalEngine/index.js Outdated
Show resolved Hide resolved lib/util/esWrapper.js
Show resolved Hide resolved lib/util/esWrapper.js
Show resolved Hide resolved lib/services/internalEngine/index.js

xbill82 and others added some commits Mar 26, 2019

Apply suggestions from @xbill82
Co-Authored-By: Aschen <amaret93@gmail.com>
Merge branch 'add-ability-to-restrict-collection-mapping' of github.c…
…om:kuzzleio/kuzzle into add-ability-to-restrict-collection-mapping
@@ -799,7 +807,15 @@ class ElasticSearch extends Service {
getMapping(request) {
const esRequest = initESRequest(request);

return this.esWrapper.getMapping(esRequest);
const includeKuzzleInfo = (() => {

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 19, 2019

Member

... this might be one of the most complex boolean calculation I've ever seen 😲

Consider this replacement:

const includeKuzzleInfo = request.input.args.includeKuzzleInfo === true;

Also, we probably should enforce this option to a boolean value, and throw if another kind of value is set, instead of coercing the string "false" to the boolean false, and any other kind of non-falsey value to true.
If you go that route, then this should probably look like this:

if (!_.isNil(request.input.args.includeKuzzleInfo) && typeof request.input.args.includeKuzzleInfo !== 'boolean') {
  throw new BadRequestError(`Invalid includeKuzzleInfo value ${request.input.args.includeKuzzleInfo}: boolean expected`);
}

const includeKuzzleInfo = Boolean(request.input.args.includeKuzzleInfo);

This comment has been minimized.

Copy link
@Aschen

Aschen Apr 19, 2019

Author Contributor

Yes but when you use this parameter in an url with the HTTP protocol (eg: ?includeKuzzleInfo=false) you will never have a boolean value, only a string. This is why I have to do this complex boolean calculation.

By the way, we have the same misbehaviour with ?pretty=false

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 19, 2019

Member

I don't like dealing with HTTP specifics outside the HTTP protocol code.

Actually, you mention the pretty option, and it is a very good example:

$ curl 'http://localhost:7512/users/_me?pretty=false'                                                     
{
  "requestId": "b705dc66-c6b9-41ba-a951-38785b40c856",
  "status": 200,
  "error": null,
  "controller": "auth",
  "action": "getCurrentUser",
  "collection": null,
  "index": null,
  "volatile": null,
  "result": {
    "_id": "-1",
    "_source": {
      "profileIds": [
        "anonymous"
      ],
      "name": "Anonymous"
    },
    "_meta": {},
    "strategies": []
  }
}

So... it's pretty printed.

This is because in URLs we decided to deal with flags (what booleans are) as.. flags: if you put the option in the querystring, then it's set. If you want to unset the option, you remove it from the URL.
Kuzzle ignores any value you put to the pretty option. As long as it's there, Kuzzle pretty prints.

Also, my remark about your code being complicated is mainly about the anonymous function that you create and execute just after that, I don' get the reason behind that.

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 19, 2019

Member

After giving a long, hard think about it, I guess the only way to deal with this situation is to make the HTTP problem explicit:

// In HTTP, booleans are flags: if it's in the querystring, it's set, whatever
// its value.
// If a user needs to unset the option, they need to remove it from the querystring.
if (request.context.connection.protocol !== 'http' 
  && !_.isNil(request.input.args.includeKuzzleInfo) 
  && typeof request.input.args.includeKuzzleInfo !== 'boolean') {

  throw new BadRequestError(`Invalid includeKuzzleInfo value ${request.input.args.includeKuzzleInfo}: boolean expected`);
}

const includeKuzzleInfo = Boolean(request.input.args.includeKuzzleInfo);

This comment has been minimized.

Copy link
@Aschen

Aschen Apr 22, 2019

Author Contributor

I had to do slightly differently because an empty string is considered as a false value.

I found strange to deal with this kind of protocol specific code inside the elasticsearch service. I think that the elasticsearch service should not be aware of our Request object. The controllers should extract the different parameters and then pass them to the service.
getMapping (index, collection, includeKuzzleInfo = false) should be the method signature

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 24, 2019

Member

@Aschen > One last question before we merge this PR: I wonder if checking for HTTP specifics wouldn't be better made in the controllers invoking that method?

This is a last minute thought I just got. Tell me how you feel about it, so I can give my validation and finally get this much awaited feature a go ;-)

This comment has been minimized.

Copy link
@Aschen

Aschen Apr 24, 2019

Author Contributor

Yes I agree that this will be better but still I think we should refactor the way we extract param from the request and do this in the controllers

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 25, 2019

Member

This may warrant a workshop, there is a lot to say about this subject (I do, at least ;-) ).

// Preserve old version of kuzzle metadata mapping
if (oldMapping.properties && oldMapping.properties._kuzzle_info) {
Object.assign(
commonMapping._kuzzle_info.properties,

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 19, 2019

Member

This mutates either the in-memory copy of the default mapping from the configuration, or the one from the index cache.

Show resolved Hide resolved lib/services/elasticsearch.js
* @param {string} collection
* @returns {Promise.<object>} returns the updated default mapping
*/
applyDefaultMapping (index, collection, mapping) {

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 19, 2019

Member

the jsdoc is outdated

// Preserve old version of kuzzle metadata mapping
if (existingMapping.properties && existingMapping.properties._kuzzle_info) {
Object.assign(
defaultMapping._kuzzle_info.properties,

This comment has been minimized.

Copy link
@scottinet

scottinet Apr 19, 2019

Member

I checked, and the mapping parameter comes from the cache (index cache or configuration data).
You should not mutate it and, therefore, you have to change this Object.assign call

Aschen added some commits Apr 19, 2019

@Aschen Aschen force-pushed the add-ability-to-restrict-collection-mapping branch from 1787f65 to 8c550e5 Apr 22, 2019

Aschen added some commits Apr 22, 2019

Merge branch 'add-ability-to-restrict-collection-mapping' of github.c…
…om:kuzzleio/kuzzle into add-ability-to-restrict-collection-mapping

Aschen and others added some commits Apr 25, 2019

@kuzzle

This comment has been minimized.

Copy link
Contributor

commented Apr 25, 2019

SonarQube analysis reported 1 issue

Note: The following issues were found on lines that were not modified in the pull request. Because these issues can't be reported as line comments, they are summarized here:

  1. INFO index.js#L104: Complete the task associated to this TODO comment. rule

@Aschen Aschen merged commit df1e6b1 into 1-dev Apr 25, 2019

4 checks passed

LGTM analysis: JavaScript No new or fixed alerts
Details
codecov/project 93.84% (+<.01%) compared to 3258c0f
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
sonarqube SonarQube reported 1 issue, no criticals or blockers

@Aschen Aschen deleted the add-ability-to-restrict-collection-mapping branch Apr 25, 2019

@kuzzle kuzzle referenced this pull request Apr 29, 2019

Merged

Release 1.7.3 #1293

Aschen added a commit that referenced this pull request Apr 29, 2019

Release 1.7.3 (#1293)
# [1.7.3](https://github.com/kuzzleio/kuzzle/releases/tag/1.7.3) (2019-04-29)


#### Bug fixes

- [ [#1288](#1288) ] [bulk] fix an error when trying a non-partial bulk update   ([scottinet](https://github.com/scottinet))
- [ [#1286](#1286) ] [bugfix] allows bulk inserts on aliases   ([benoitvidis](https://github.com/benoitvidis))
- [ [#1282](#1282) ] [bugfix] scan keys on redis cluster   ([benoitvidis](https://github.com/benoitvidis))
- [ [#1279](#1279) ] Users must be authenticated to use auth:logout   ([scottinet](https://github.com/scottinet))

#### Enhancements

- [ [#1292](#1292) ] KZL 1032 - Throw an error when the realtime controller is invoked by plugin developers   ([benoitvidis](https://github.com/benoitvidis))
- [ [#1257](#1257) ] Add ability to define mapping policy for new fields   ([Aschen](https://github.com/Aschen))
- [ [#1291](#1291) ] [Kuzzle CLI] Fix --help on subcommands   ([Yoann-Abbes](https://github.com/Yoann-Abbes))
- [ [#1289](#1289) ] [WebSocket] Handle ping/pong packets   ([scottinet](https://github.com/scottinet))
- [ [#1273](#1273) ] Fix incomplete access logs   ([scottinet](https://github.com/scottinet))
---

alexandrebouthinon added a commit to kuzzleio/documentation that referenced this pull request Apr 29, 2019

Release 2.1.1 (#289)
* Add documentation page about mappings (#271)

Adds a documentation essentials page about mappings:

dynamic mapping policy
collection metadata
properties types definitions
Also update SDKs

See kuzzleio/kuzzle#1257

* Embedded protocols (#286)

Documentation for embedded protocols, notably for MQTT.

* Extending the JS SDK with controllers (#284)

Add a page about how to extend the JS SDK with custom controllers.  
https://deploy-preview-284--kuzzle-doc-v2.netlify.com/sdk-reference/js/6/extend-sdk/

Also document:
 - BaseController class
 - Kuzzle.useController method

See kuzzleio/sdk-javascript#388

* [KZL-907] Getting started dev plugin (#276)

## What does this PR do?

This PR responds to this [ticket](https://jira.kaliop.net/browse/KZL-907).
Also, it increases the readability of documentation's plugin part.

### How should this be manually tested?

[Netify](https://deploy-preview-276--kuzzle-doc-v2.netlify.com/plugins/1/essentials/introduction/)

### Other changes

- Re-arrange plugin documentation path
- Update dead link in boilerplate repository

### Boyscout

Prettier

* Release 2.1.1

@Aschen Aschen referenced this pull request Jun 14, 2019

Merged

Release 1.8.0 #1322

Aschen added a commit that referenced this pull request Jun 14, 2019

Merge pull request #1322 from kuzzleio/1.8.0-proposal
Release 1.8.0

Bug fixes

    [ #1311 ] Fix promise leaks (scottinet)
    [ #1298 ] Fix disabled protocol initialization (Aschen)
    [ #1297 ] Fix timeouts on plugin action returing the request (benoitvidis)
    [ #1288 ] Fix an error when trying a non-partial bulk update (scottinet)
    [ #1286 ] Allows bulk inserts on aliases (benoitvidis)
    [ #1282 ] Scan keys on redis cluster (benoitvidis)
    [ #1279 ] Users must be authenticated to use auth:logout (scottinet)

New features

    [ #1315 ] Add the new Vault module to handle encrypted application secrets (Aschen)
    [ #1302 ] Add write and mWrite (Aschen)
    [ #1305 ] Add pipes & hooks wildcard event (thomasarbona)

Enhancements

    [ #1318 ] Add a maximum ttl to auth:login (benoitvidis)
    [ #1301 ] Upgrade the WebSocket libraries (scottinet)
    [ #1308 ] Events triggering refactor (scottinet)
    [ #1300 ] Collection specifications methods cloisoned to a collection (thomasarbona)
    [ #1295 ] Improve validation error messages (benoitvidis)
    [ #1292 ] Throw an error when the realtime controller is invoked by plugin developers (benoitvidis)
    [ #1257 ] Add ability to define mapping policy for new fields (Aschen)
    [ #1291 ] Fix --help on subcommands (Yoann-Abbes)
    [ #1289 ] Handle ping/pong packets (scottinet)
    [ #1273 ] Fix incomplete access logs (scottinet)

Others

    [ #1317 ] Add ps dependency to plugin-dev Docker image for pm2 (benoitvidis)
    [ #1312 ] Check that .kuzzlerc.sample is well-formed (scottinet)
    [ #1299 ] Add Kuzzle Nightly & Redis 3 and 4 test (alexandrebouthinon)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.