Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to ignore filters when value is undefined #123

Closed
antonsamper opened this issue Mar 13, 2017 · 31 comments
Closed

Option to ignore filters when value is undefined #123

antonsamper opened this issue Mar 13, 2017 · 31 comments

Comments

@antonsamper
Copy link
Contributor

antonsamper commented Mar 13, 2017

This is a feature idea/enhancement as opposed to an issue.

I'm trying to build a query that will either match a specific value or allow any value. In my particular case I'm having to do this because the value is optional and therefore if its not set by the user, then it should not filter by any value.

My current solution is to firstly figure out if the value has been passed in, if so, then i will set .filter('match', 'message', 'value') and if it hasn't been set, then i will use a .filter('wildcard', 'message', '*'). This is working but I don't like that I'm having to work out what type of filter I should be using.

As a result, what do you think about allowing conditional filters to be applied? Have a look a builder below. To my knowledge, the current output from the builder doesn't do anything (although I could be wrong). However, my proposed solution would be that if undefined is passed in as the value, then the builder wont generate the code for that filter.

bodybuilder()
  .filter('match', 'message', undefined)
  .build()
Current output
{
  "query": {
    "bool": {
      "filter": {
        "match": {
          "field": "message"
        }
      }
    }
  }
}
Proposed output
{}

This could be applied to both .query() and .filter().

What do you think? Can you see any implications?

@antonsamper antonsamper changed the title Optioin to ignoring filters when value is undefined Option to ignore filters when value is undefined Mar 13, 2017
@johannes-scharlach
Copy link
Collaborator

The number one way I use bodybuilder is in this style:

const builder = bodybuilder()

if (author) builder.filter('term', 'author', author)
if (message) builder.filter('match', 'message', message)

const body = builder.build()

People reading this code have it easy figuring out what's going on. With your proposal it could sure be shortened.

My main concern is developer confusion: If your variable is accidentally undefined, you'll have a much harder time figuring out why you don't see the expected output. Currently the search would fail and you'll start investigating why you're passing in undefined.

@johannes-scharlach
Copy link
Collaborator

👋 @antonsamper Let me know if you think there is still need for this request and we can reopen the issue 💯

@antonsamper
Copy link
Contributor Author

@johannes-scharlach yeah ive been using a similar technique but it becomes taxing when having to conditionally check many fields in order to build complex queries.

Although, i agree that your solution makes the code easy to understand, if a developer doesnt code it that way and accidentally passes undefined, then the query output itself is quite confusing. I personally had to debug the query code and then check the ES docs to see what was happening.

@johannes-scharlach
Copy link
Collaborator

🤔 there is no reasonable query coming out of passing in undefined, so maybe it would be best to just throw an error? Anything else would be unintended...

@antonsamper
Copy link
Contributor Author

Although its not my preferred solution, i think throwing an error or even a warning would better than the current the current output. Worth noting though that throwing an error would be a breaking change where as a warning could pass as a version bump

@johannes-scharlach
Copy link
Collaborator

Well, the current solution throws an error when you actually use it, while throwing an error when you pass in undefined would throw immediately. Am I missing something?

If that's the case, then just throwing an error earlier doesn't sound like a major bump, it's actually improved developer experience and hence a patch?

@johannes-scharlach
Copy link
Collaborator

yeah ive been using a similar technique but it becomes taxing when having to conditionally check many fields in order to build complex queries.

sorry, I missed this part and only thought about the following section of your comment. I actually have a similar situation – but for me if a property is set I first need to modify it slightly and then I can add it to the query. This is turning into a real fun problem to think about and maybe there is more to it than just handling undefined – which wouldn't be enough in my case.

@antonsamper
Copy link
Contributor Author

hi @johannes-scharlach , the following code doesn't throw an error or output any warnings:

bodybuilder()
  .filter('match', 'message', undefined)
  .build()

Above, where you mention the current solution, what are you referring to? can you provide an example?

Also, could you share an example of the type of transformations you are having to do?

I wonder if it would be nice to build some kind of plugin feature. So we could leave bodybuilder as it is but also include the plugins:

bodybuilder({
  plugins: ['ignore-undefined']
})

Although this may be going above and beyond the scope of this project.

@johannes-scharlach
Copy link
Collaborator

Sorry, I thought

elasticsearchClient.search({
  index,
  body: bodybuilder().filter('match', 'message', undefined).build()
})

would throw. But instead it just returns no results. So you're right, it would be a breaking change.

I love the plugin idea. I'd actually like to work on a v3 of bodybuilder, and allowing for plugins should most likely be one of the features.

some examples on how I'm using bodybuilder:

// A
if (filters.tag) {
  builder.filter('term', 'tag.lowercase', filters.tag.toLowerCase())
}

// B
if (filters.network) builder.filter('term', 't', filters.network)

// C
if (filters.endDate || filters.startDate) {
  builder.filter('range', 'z', {
    gte: moment(filters.startDate || 0).startOf('day').toISOString(),
    lte: moment(filters.endDate).endOf('day').toISOString()
  })
}

This proposal would shorten B and since I can ensure the lowercasing elsewhere also A. But for C I currently wouldn't see a way. Maybe it's ok if some more complex cases still need to be handled outside of bodybuilder, but we'd simplify the usage of bodybuilder.

The idea of the feature is growing on me, I must say!

@johannes-scharlach
Copy link
Collaborator

Reopening the issue, as the discussion is ongoing

@ferronrsmith
Copy link
Collaborator

It's very easy to validate your parameters before passing them to bodybuilder. It's not really the libraries job to that. There are cases where passing an undefined/needed is necessary. e.g

bodybuilder()
        .filter('exists', 'user')
        .orFilter('term', 'user', 'johnny')
        .aggregation('terms', 'user')
        .build()

@antonsamper
Copy link
Contributor Author

hi @ferronrsmith, I agree in that the core library shouldn't handle validation but I wouldn't class this as a validation issue, I would argue that the current output is incorrect as it looks like the generated query is doing something, but it isn't.

When you say "there are cases for passing in undefined", can you give an example? I'm not too sure what output you would be expecting.

@ferronrsmith
Copy link
Collaborator

ferronrsmith commented Sep 8, 2017

The example is above.

bodybuilder()
        .filter('exists', 'user').build() // 

That's one case I can think of at the top of my head

{
  "query": {
    "bool": {
      "filter": {
        "exists": {
          "field": "user"
        }
      }
    }
  }
}

Having a search term would be invalid,

bodybuilder()
.filter('exists', 'user', 'as').build() //

{
  "query": {
    "bool": {
      "filter": {
        "exists": {
          "user": "as"
        }
      }
    }
  }
}

Like the vanilla DSL, you still have to know what you're doing. bodybuilder is facade for writing less verbose code.

@johannes-scharlach
Copy link
Collaborator

johannes-scharlach commented Sep 8, 2017

@ferronrsmith I think that's a slight misunderstanding of what we're talking about here. The use-case is

var foo

bodybuilder().filter('term', 'user', foo).build()

now for foo a value was explicitly passed in, but it's undefined. Currently that leads to the output

{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "field": "user"
        }
      }
    }
  }
}

but we all know that this is not what anybody wanted. (Similarly bodybuilder().filter('exists', undefined).build() can't lead to a good query)

The question is:

  1. should bodybuilder decide that this filter should be dropped
  2. should bodybuilder throw an error, because a user can easily make sure that no undefined value is passed to bodybuilder.
  3. Is there another reasonable way to handle this case?

Bodybuilder should improve the developer's experience. The status quo is pretty bad at that, so we're just wondering how to improve that.

@ferronrsmith
Copy link
Collaborator

If you throw an error this case would fail.

bodybuilder()
        .filter('exists', 'user', undefined).build() // 

Which is a valid query. I understand what you guys are saying but I don't see the real value. I might be missing something.

const builder = (args) => {
 // args schema val
// fail/continue
bodybuilder()
        .filter(args.type, args.name, args.value).build() // 
}

I like the fact that it's close to the DSL as possible, but not too safe. That might be for my own selfish reasons (learning XP). If you find the feature useful then go ahead.

@antonsamper
Copy link
Contributor Author

antonsamper commented Sep 8, 2017

@ferronrsmith when you say this is a valid query...

bodybuilder().filter('exists', 'user', undefined).build()

... do you mean the output is a valid JSON structure or that the query would potentially return a set of results? I haven't been able to find any information that suggests this is a valid elasticsearch query.

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html

@danpaz what are your thoughts on this?

@ferronrsmith
Copy link
Collaborator

ferronrsmith commented Sep 9, 2017

yes it will return results, if the field exists

bodybuilder().filter('exists', 'user').build()

https://www.elastic.co/guide/en/elasticsearch/reference/5.5/query-dsl-exists-query.html

The docs shows

bodybuilder().notQuery('exists', 'user').build()

=>

{
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "user"
          }
        }
      ]
    }
  }
}

There is actually test cases that you can look at.

https://github.com/danpaz/bodybuilder/blob/master/test/query-builder.js#L281
https://github.com/danpaz/bodybuilder/blob/master/test/index.js#L14

@johannes-scharlach
Copy link
Collaborator

// (1)
bodybuilder().filter('exists', 'user').build()
// (2)
bodybuilder().filter('exists', 'user', undefined).build()

these two currently return the same result, but bodybuilder could treat them differently. I understand that (1) is a valid & important query, I'm only wondering what behaviour you expect from bodybuilder in the case of (2).

// (3)
bodybuilder().filter('exists', undefined).build()

here is another use-case where bodybuilder currently doesn't really give you a very useful output. (technically there is bodybuilder().filter('match_all').build(), but again I wouldn't pass in a second argument at all)

@ferronrsmith
Copy link
Collaborator

// (3)
bodybuilder().filter('exists', undefined).build()

I see that more as common-sense. A query type must be executed against a field. Leaving this out in the DSL will return an error.

We can throw exception, but the problem with this is that. I'll have to cater for this in a try-catch block (might be a breaking change for some).

There's a lot of query combination/arguments that can actually generate errors. It does require to the user to have knowledge of what they are doing, which is fine.

e.g. The query is invalid, but no error is thrown, but when submitted to ES it does.

bodybuilder().filter('exists', 'score', 'something').build()

@johannes-scharlach
Copy link
Collaborator

The question we're trying to answer within this thread is how to react when undefined is actively passed in as an argument.

Do you think the current behaviour is particularly useful to anyone? What can bodybuilder do to provide the best possible experience to users?

Any other validation (or lack thereof) doesn't seem to be in the scope of this discussion.

@ferronrsmith
Copy link
Collaborator

ferronrsmith commented Sep 9, 2017

Current behavior is useful to me. Can't speak for no one else, but it might be because my use case is different from everyone else

Actually it doesn't matter. Ensuring type/field is present might be useful.

  1. should bodybuilder decide that this filter should be dropped
    Not a fan of this, simply because it doesn't really say why the filter was dropped
  2. should bodybuilder throw an error, because a user can easily make sure that no undefined value is passed to bodybuilder.
    Throwing an exception when these arguments are missing., Special they are required
  3. Leave as is. User handles validation.

Fine with 2, 3 or 2+3 works for me (Activate 2 via plugin mechanism, but still retain 3)

@antonsamper
Copy link
Contributor Author

In its current form, bodybuilder is assuming that by passing undefined that the user doesn't want to match a string and instead decides to match on field existence. I'm not fond of this behaviour as my original query has changed (i realise that not everyone may think about it the same way).

@ferronrsmith @johannes-scharlach In order to create minimal impact to current users, my favourite solution is therefore a plugin mechanism. How feasible is this?

@johannes-scharlach
Copy link
Collaborator

johannes-scharlach commented Sep 12, 2017

I think given the current status of pushQuery, it won't be straightforward to allow for plugins for generating queries. That's because right now the code to generate a new query and to add it to current queries is closely coupled.

It is one of the biggest pain-points in maintaining the current bodybuilder, so for v3 I'd expect to rewrite that section anyways. I can imagine a plugin system of the following form:

bodybuilder.registerSerializer({
  test (field, value, options) {
    // check if you want to handle the query arguments in your own way.
    // if falsy, fall back to standard serializer(s)
    return true
  },
  build (field, value, options) {
    // build the query however you like
    // return null/undefined if you want to drop the query
    return { foo: 'bar' }
  }
})

This would make it easy to register a custom serializer for a cummulative_sum query too, see #149

Such a plugin system would also make it easy to say that we only require value to be present for certain queries. (e.g. if it's an exist query, I wouldn't actively care if no value was passed or if it's just undefined, but for a terms query I would throw a validation error either way...). I think this gets close to what @ferronrsmith was suggesting above 🎉

If you have other ideas on how you'd like to define plugins, keep them coming! I was heavily inspired by jest's api to define snapshot serializers.

@antonsamper
Copy link
Contributor Author

I quite like that approach with the test and build methods. The only thing i would suggest is maybe allow an array to be passed in as opposed to an object, that way you could register multiple:

import plugin1 from 'bodybuilder-plugin1';
import plugin2 from 'bodybuilder-plugin2';

bodybuilder.registerSerializers([plugin1, plugin2])

What do you think?

@johannes-scharlach
Copy link
Collaborator

johannes-scharlach commented Sep 14, 2017

I think the standard will be to add only one serializer. Going into a little more detail:

import plugin1 from 'bodybuilder-plugin1';
import plugin2 from 'bodybuilder-plugin2';

bodybuilder.registerSerializer(plugin1)
bodybuilder.registerSerializer(plugin2)

bodybuilder.serializers
// returns
// [ /*plugin2*/, /*plugin1*/, /*default serializer*/]

bodybuilder.resetSerializers()
bodybuilder.serializers // [/*default serializer*/]

bodybuilder.serializers = [plugin1, plugin2]
bodybuilder.serializers // [plugin1, plugin2]

@chr4ss1
Copy link

chr4ss1 commented Sep 14, 2017

I too miss the ability to have conditional query output. It makes the resulting queries much harder to understand, and not neat at all.

The pattern I currently follow is to explictly say that we are dealing with conditional method (eg filterConditionally), and then you also have to pass in explictly the filter inclusion condition as well.

Here's an example:

 qb.andFilter('bool', (b) => b
            .orFilter('bool', (b1) =>
                b1
                    .filter('term', '_type', 'transferMoney')
                    .filter('term', 'from.businessId', args.businessId)

            .orFilter('bool', (b1) => 
                b1
                    .filter('term', '_type', 'transferMoney')
                    .filter('term', 'to.businessId', args.businessId)
                    .filterConditionally('term', 'to.cashAccount.currencyId', args.currencyId, !!args.currencyId)))));

the logic is simple, every last argument of *Conditionally method has the include/don't include boolean param, which then decides if to evaluate the query or just pass it through.

@ferronrsmith
Copy link
Collaborator

ferronrsmith commented Sep 14, 2017

Not that hard to monkey-patch

const customBuilder = () => {
  const builder = new Bodybuilder();

  Object.assign(builder, {
    filterConditionally: () => {
      const args = Array.prototype.slice.call(arguments, 0, arguments.length - 1);
      const condition = arguments[arguments.length - 1];

      return condition ? this.filter.apply(this, args) : this;
    }
  });
  return builder;
};

// using
var body = customBuilder();
body.filterConditionally('range', 'age', { lte: 60, gte: 18 }, !!whatever);
body.build();

@chr4ss1
Copy link

chr4ss1 commented Sep 14, 2017

I've already tried something similar however due to the nature how it's built, it's hard to monkey patch it.

I am using v2, what I tried is this:

import * as bodybuilder from 'bodybuilder';
const qb = bodybuilder();
Object.assign(qb, {
    filterConditionally: () => {
      const bargs = Array.prototype.slice.call(arguments, 0, arguments.length - 1);
      const condition = arguments[arguments.length - 1];
      return condition ? this.filter.apply(this, bargs) : this;
    },
  });

and how I use it:

qb.orFilter('bool', (b1) => {
                return b1
                    .filter('term', '_type', 'transferMoney')
                    .filterConditionally('term', 'from.cashAccount.currencyId', args.currencyId, !!args.currencyId)
                    .filter('term', 'from.businessId', args.businessId);
            })

message: 'b1.filter(...).filterConditionally is not a function',

do let me know if I misundertood your original code. Note how I am chaining it (while being nested). I believe it would work when I do it directly (qb.filterConditionally) however it wouldn't be of use.

(and note when I said the pattern I currently follow was meant that I use when using other query builders (eg knex) - I haven't found a way to do it for bodybuilder)

@ferronrsmith
Copy link
Collaborator

ferronrsmith commented Sep 15, 2017

var wrap = (builder) => {
    Object.assign(builder, {
        filterConditionally: () => {
            const args = Array.prototype.slice.call(arguments, 0, arguments.length - 1);
            const condition = arguments[arguments.length - 1];
            return condition ? this.filter.apply(this, args) : this;
        }
    });
    return builder;
};

// wrap if access to patches are needed in the outer scope
const qb = wrap(new Bodybuilder());

qb.orFilter('bool', (b1) => {
    // because the nestedCallback recreates query/filter builder
    // the nested callback has to be wrapped in order to access patches
   // NB : each nesting must be wrapped
    return wrap(b1)
        .filter('term', '_type', 'transferMoney')
        .filterConditionally('term', 'from.cashAccount.currencyId', args.currencyId, !!args.currencyId)
        .filter('term', 'from.businessId', args.businessId);
});

@johannes-scharlach
Copy link
Collaborator

I think it should be even simpler: Only out of convenience for the user is bodybuilder passing in an initialised builder in the callback. You can just do

function myBB () {
  return Object.assign(bodybuilder(), {
        filterConditionally: () => {
            const args = Array.prototype.slice.call(arguments, 0, arguments.length - 1);
            const condition = arguments[arguments.length - 1];
            return condition ? this.filter.apply(this, args) : this;
        }
    })
}

const qb = myBB().orFilter('bool', () => {
  return myBB()
    .filter('term', '_type', 'transferMoney')
    .filterConditionally(
      'term',
      'from.cashAccount.currencyId',
      args.currencyId,
      !!args.currencyId
    )
    .filter('term', 'from.businessId', args.businessId)
})

(after typing this up, I realised that there are almost no differences to @ferronrsmith's suggestion. This has been a conscious decision before though that you can always pass any bodybuilder/ish object as a return to the callback.)

@chr4ss1
Copy link

chr4ss1 commented Sep 15, 2017

cool, that should work indeed, I'll give it a go in the future.

I'll show you how I have defined conditional operators in the past for other libraries (knex specifically) maybe it'll spark some ideas if you ever think about incorporating it in to the library itself.

const sql = queryBuilder('product')
  .distinctOn('product.id')
  .select('product.id as id', 'productvariant.id as productvariantid')
  .join('productvariant', 'productvariant.productid', 'product.id')      
  .conditionalJoin('categoryproduct', 'product.id', '=', 'categoryproduct.productid', includeCategories)
  .conditionalJoin('category', 'category.id', '=', 'categoryproduct.categoryid', includeCategories)
  .conditionalJoin('autocolourtag', 'autocolourtag.productvariantid', '=', 'productvariant.id', includeAutoColourtag || tagged)
  .conditionalWhere('product.price', '>', fromPrice, fromPrice)
  .conditionalWhere('product.price', '<', toPrice, toPrice && toPrice != 100000)
  .conditionalWhere('product.name', '~*', searchText, !!searchText)
  .where(function() {
    if (tags && tags.length > 0) {
      tags.forEach((tag, index) => {     
        if (tag == 1) {
          this.orWhere('autocolourtag.temp', 1); // warm
        } else if (tag == 2) {
          this.orWhere('autocolourtag.temp', 2); // cool
        } else if (tag == 3) {
          this.orWhere('autocolourtag.sat', 1); // soft
        } else if (tag == 4) {
          this.orWhere('autocolourtag.sat', 2); // bright          
        } else if (tag == 5) {
          this.orWhere('autocolourtag.light', 4); // light          
        } else if (tag == 6) {
          this.orWhere('autocolourtag.light', 3); // medium          
        } else if (tag == 7) {
          this.orWhere('autocolourtag.light', 2); // dark
        }
      });
    }
  })
  .where(function () {
    if (neutrals != 1 && hsls && hsls.length > 0) {
      const radius = 10;
      hsls.forEach((hsl, index) => {
        this.where(function () {
          this.andWhereBetween('autocolourtag.h', [(parseFloat(hsl.h) - radius), (parseFloat(hsl.h) + radius)]);
          this.andWhereBetween('autocolourtag.s', [(parseFloat(hsl.s) - radius), (parseFloat(hsl.s) + radius)]);
          this.andWhereBetween('autocolourtag.l', [(parseFloat(hsl.l) - radius), (parseFloat(hsl.l) + radius)]);
        })
      });
    }
  })
  .where(function() {
    if (neutrals != 1 && colours && colours.length > 0) {
      colours.forEach((colour, index) => {
        this.orWhereRaw('LOWER(autocolourtag.name) = \'' + colour.toLowerCase() + '\'');
      });
    }
  })
  .where(function() {
    if (tagged) {
      this.andWhereBetween('autocolourtag.prop', [0, 100]);
    }
  })
  .where(function() {
    if (includeCategories) {
      this.orWhereIn('category.id', categories);
      this.orWhereIn('category.parentid', categories);
    }
  })
  .conditionalWhereRaw('LOWER(autocolourtag.name) IN (\'black\', \'white\', \'grey\')', neutrals === 1)
  .conditionalWhereRaw('LOWER(autocolourtag.name) NOT IN (\'black\', \'white\', \'grey\')', neutrals === 2)      
  //.conditionalWhereIn('category.id', categories, includeCategories)
  //.conditionalWhereIn('category.parentid', categories, includeCategories)
  .conditionalWhereIn('product.brandid', brands, brands && brands.length > 0)
  .orderBy('product.id', 'desc')
  .limit(limit)
  .offset(skip)
  .toString();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants