Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

any.label() significantly slows validation of large arrays #874

Closed
mikejpeters opened this issue Apr 25, 2016 · 7 comments
Closed

any.label() significantly slows validation of large arrays #874

mikejpeters opened this issue Apr 25, 2016 · 7 comments
Assignees
Labels
breaking changes Change that can breaking existing code feature New functionality or improvement
Milestone

Comments

@mikejpeters
Copy link

Context

  • node version: 4.3.0
  • joi version: 8.0.5
  • environment: node

What are you trying to achieve or the steps to reproduce?

Use any.label() on a schema within an array, or on the array schema directly.
We use Joi to validate our API responses, so it's common that we are validating large arrays.

const Joi = require('joi');

const users = getUsers();
const schema = Joi.array().items(
  Joi.object().keys({
    username: Joi.string().required()
  })
);

console.time('without label');
Joi.validate(users, schema);
console.timeEnd('without label');

console.time('with label');
Joi.validate(users, schema.label('Users'));
console.timeEnd('with label');

///

function getUsers() {
  const users = [];
  for (var i=0; i<10000; i++) {
    users.push({username: 'example'});
  }
  return users;
}

Which result you had?

A significant impact on validation performance:

without label: 163ms
with label: 448ms

What did you expect?

No noticeable impact on performance.

@Marsup
Copy link
Collaborator

Marsup commented Apr 26, 2016

Can you use proper benchmarking tools to do that ? (eg. bench)

@mikejpeters
Copy link
Author

'use strict';

const Benchmark = require('benchmark');
const Joi = require('joi');

const users = getUsers();

const schema = Joi.array().items(
  Joi.object().keys({
    username: Joi.string().required()
  })
);

const schemaWithInnerLabel = Joi.array().items(
  Joi.object().keys({
    username: Joi.string().required()
  }).label('User')
);

const schemaWithOuterLabel = Joi.array().items(
  Joi.object().keys({
    username: Joi.string().required()
  })
).label('Users');

new Benchmark.Suite()
  .add('no label', () => {
    Joi.validate(users, schema);
  })
  .add('inner label', () => {
    Joi.validate(users, schemaWithInnerLabel);
  })
  .add('outer label', () => {
    Joi.validate(users, schemaWithOuterLabel);
  })
  .on('cycle', (event) => {
    console.log(String(event.target));
  })
  .run();

///

function getUsers() {
  const users = [];
  for (var i=0; i<100; i++) {
    users.push({username: 'example'});
  }
  return users;
}
no label x 745 ops/sec ±2.24% (81 runs sampled)
inner label x 185 ops/sec ±2.23% (79 runs sampled)
outer label x 247 ops/sec ±2.26% (79 runs sampled)

I believe the cause of the speed difference is in lib/object.js#L150 the options object is cloned and then modified to prevent the label option from being inherited:

    // Children mustn't inherit the current label if it exists
    const childOptions = options.language && options.language.label ?
        Hoek.applyToDefaults(options, { language: { label: null } }, true) :
        options;

I'm not familiar enough with the code, but it seems strange to me that this must run for every element in the array when validating an array.

@glennjones
Copy link

Was just working on the same thing. Built a little github repo and used some more complex examples to try and see where the load is https://github.com/glennjones/benchmarking-joi-swagger. It does like adding any property label, meta etc slows things down more than would be expected.

@Marsup
Copy link
Collaborator

Marsup commented Apr 26, 2016

I can pretty much guarantee meta won"t slow anything down in the validation process. The childOptions might be computed too many times but it's the only sane way to deal with the propagation of options. 1st step would be to try to optimize this removal of label with a simpler version of it, we probably don't need a full clone (I didn't expect it to be so consuming). If anyone feels like working on it...

@hueniverse hueniverse added help wanted feature New functionality or improvement labels Apr 26, 2016
@Marsup
Copy link
Collaborator

Marsup commented Apr 26, 2016

I have another idea (that involves breaking more stuff) that I'm going to try but don't let it stop you ;)

@mikejpeters
Copy link
Author

Optimizing the line I referenced in lib/object.js (that clones the options to prevent children from inheriting labels), will improve the performance when validating large arrays where a label is applied to the array schema itself, but not where the label is applied to the schema within the array. I tried to figure out why that case runs slowly but I'm stumped unfortunately.

@Marsup
Copy link
Collaborator

Marsup commented Apr 26, 2016

My alternative strategy improves it a bit but still a 40% loss over baseline, I'm not yet explaining this. That's for another day if you don't mind.

@Marsup Marsup added this to the 9.0.0 milestone Jun 11, 2016
@Marsup Marsup self-assigned this Jun 11, 2016
@Marsup Marsup closed this as completed in 8c48da5 Jun 11, 2016
@Marsup Marsup added the breaking changes Change that can breaking existing code label Jun 11, 2016
@lock lock bot locked as resolved and limited conversation to collaborators Jan 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
breaking changes Change that can breaking existing code feature New functionality or improvement
Projects
None yet
Development

No branches or pull requests

4 participants