Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema definition is too slow #182

Closed
kirs opened this issue Jun 11, 2016 · 15 comments

Comments

@kirs
Copy link

commented Jun 11, 2016

First of all, thank you so much for making dry-validation. At @Shopify, we are looking forward to try it on our scale.

This is not particularly dry-validation issue, but for some reason the gem require time is quite slow:

require 'bundler/setup'
require 'benchmark'

puts Benchmark.measure {
  require 'dry-validation'
}
$ ruby bench.rb
0.240000   0.150000   0.390000 (  0.401198)

That's 400ms or almost half a second just to require the gem.
All credits to @casperisfine for discovering this problem.

@solnic @timriley

@kirs

This comment has been minimized.

Copy link
Author

commented Jun 11, 2016

Here is the output of the dependencies profile:

TOP: 6.4258 MiB
  dry-validation: 6.3789 MiB
    dry/validation: 6.375 MiB
      dry-configurable: 4.8516 MiB (Also required by: dry/container)
        dry/configurable: 4.8438 MiB
          concurrent: 4.8086 MiB (Also required by: dry/container, dry/types)
            concurrent/configuration: 1.8945 MiB (Also required by: concurrent/scheduled_task, concurrent/options, and 2 others)
              concurrent/delay: 1.4492 MiB (Also required by: concurrent/utility/processor_counter, concurrent)
                concurrent/concern/obligation: 0.8594 MiB (Also required by: concurrent/ivar)
                  concurrent/atomic/event: 0.6992 MiB (Also required by: concurrent/executor/immediate_executor, concurrent/atomics, and 3 others)
                    concurrent/synchronization: 0.625 MiB (Also required by: concurrent/executor/abstract_executor_service, concurrent/utility/at_exit, and 29 others)
                concurrent/executor/immediate_executor: 0.5234 MiB (Also required by: concurrent/configuration, concurrent/executors)
                  concurrent/executor/abstract_executor_service: 0.4609 MiB (Also required by: concurrent/executors, concurrent/executor/ruby_executor_service)
            concurrent/executors: 1.3594 MiB
            concurrent/atomics: 1.0313 MiB (Also required by: concurrent/executor/simple_executor_service)
              concurrent/atomic/reentrant_read_write_lock: 0.3477 MiB
      dry/validation/schema: 1.3047 MiB (Also required by: dry/validation/schema/form, dry/validation/schema/json)
        dry/validation/input_processor_compiler: 0.418 MiB
          dry/types: 0.3516 MiB
        dry/validation/messages: 0.3867 MiB
          dry/validation/messages/yaml: 0.3672 MiB
            yaml: 0.3281 MiB
              psych: 0.3281 MiB
@kirs

This comment has been minimized.

Copy link
Author

commented Jun 11, 2016

My two cents after looking a bit into dependencies:

  1. dry-validation doesn't need an extensive config management. Maybe we could avoid loading concurrent-ruby just for a few accessors?
  2. Maybe we could delay loading validation messages and I18n until they're required for compiling error messages? dry/validation/messages takes around 0.4 MiB
@solnic

This comment has been minimized.

Copy link
Member

commented Jun 11, 2016

This is pretty crazy :/ Could you tell me which ruby version do you use? Is it in a rails app?

We used dry-v in a non-rails app and we had cases where ie loading a single spec which requires dry-v + a bunch of other things + running a spec would take less than 400ms.

dry-validation doesn't need an extensive config management. Maybe we could avoid loading concurrent-ruby just for a few accessors?

We use concurrent-ruby for thread-safe caches too, but maybe we could improve that dep start-up time somehow too.

Maybe we could delay loading validation messages and I18n until they're required for compiling error messages? dry/validation/messages takes around 0.4 MiB

Yep, that's definitely a good idea. In fact originally we didn't even require i18n support by default, but since that's so slow I'll change that.

I'm on short holidays right now, going back home on Monday, so I'll look into this further later next week. Thanks for reporting this and trying out dry-v btw, we're wrapping up a major release this month, lots of improvements, polishing and bug fixes :)

@kirs

This comment has been minimized.

Copy link
Author

commented Jun 11, 2016

Could you tell me which ruby version do you use? Is it in a rails app?

My results are from MRI 2.3.0.
Yes, we use dry-v in a Rails app but all my benchmarks are taken outside of Rails app with a separate Gemfile.

Here is an update: I could isolate the problem and turns out the massive slowdown is coming not from require, but from the schema definition.

require 'benchmark'
puts Benchmark.measure { require 'dry-validation' }
puts Benchmark.measure {
    require_relative './order_schema_definition'
}
cold start:
require: 0.460174s
schema: 4.934889s

hot start:
require: 0.115078s
schema: 4.936531s

(require is still slow, but it's fast when compared to schema definition timing)

It takes 5 seconds to define the schema and build the AST.
Here is an example of our huge schema (I had to strip the field names): https://gist.github.com/kirs/f12ce793376d8f454c2f1a4ce997481b

@kirs

This comment has been minimized.

Copy link
Author

commented Jun 11, 2016

Here is a stackprof profile from the schema definition:

     TOTAL    (pct)     SAMPLES    (pct)     FRAME
     10598 (214.8%)        2802  (56.8%)     Dry::Validation::InputProcessorCompiler#visit_and
      4826  (97.8%)         345   (7.0%)     Dry::Types::Compiler#merge_with
     21150 (428.7%)         309   (6.3%)     Dry::Validation::InputProcessorCompiler#visit
      3845  (77.9%)         192   (3.9%)     Dry::Types::Compiler#visit_type
      6670 (135.2%)         159   (3.2%)     Dry::Types::Compiler#visit
       152   (3.1%)         113   (2.3%)     Dry::Validation::InputProcessorCompiler#visit_predicate
        59   (1.2%)          59   (1.2%)     Concurrent::Collection::NonConcurrentMapBackend#get_or_default
        48   (1.0%)          48   (1.0%)     Dry::Types::Options#initialize
       113   (2.3%)          44   (0.9%)     Dry::Types::Builder#|
        39   (0.8%)          39   (0.8%)     Dry::Validation::InputProcessorCompiler#type
      2180  (44.2%)          38   (0.8%)     Dry::Types::Compiler#visit_key

@rafaelfranca

This comment has been minimized.

Copy link

commented Jun 11, 2016

@kirs thank you for the reproduction steps. Maybe it is better to change the issue description and title since what is slow is not the gem require but the schema definition.

@solnic here is the issue that I told you in Gitter.

@kirs kirs changed the title Gem require is too slow Schema definition is too slow Jun 11, 2016

@kirs

This comment has been minimized.

Copy link
Author

commented Jun 11, 2016

Thanks Rafael, I renamed it.

One more thing I just discovered: all slowdowns apply only to Dry::Validation.Form. Dry::Validation.Schema works quite fast.

@solnic

This comment has been minimized.

Copy link
Member

commented Jun 12, 2016

Thanks for the detailed info, I'll make it much faster in 0.8.0. Stay tuned.

@solnic solnic self-assigned this Jun 12, 2016

@kirs

This comment has been minimized.

Copy link
Author

commented Jun 13, 2016

Thanks! I'll be looking forward to try it from master.

@solnic

This comment has been minimized.

Copy link
Member

commented Jun 16, 2016

I compared this with a plain schema w/o coercions and it's pretty clear that the bottleneck is coercion-inference. It takes ~0.05 second to define your huge schema w/o coercions, I'm pretty sure we can make it almost as fast with coercions too.

We are in the process of discussing a new explicit API for defining types inside a schema, which will remove the need to have this crazy input-processor compilers. For more info see #181

@solnic

This comment has been minimized.

Copy link
Member

commented Jun 17, 2016

There we go:

Schema defined in 0.075425
Form defined in 5.43349
Form (with type specs) defined in 0.077061
------------------------------------------------------------------------------
Warming up --------------------------------------
              Schema     1.000  i/100ms
                Form     1.000  i/100ms
Form with type specs     1.000  i/100ms
Calculating -------------------------------------
              Schema     15.248  (±13.1%) i/s -     75.000  in   5.045985s
                Form      0.174  (± 0.0%) i/s -      1.000  in   5.749034s
Form with type specs     14.810  (±20.3%) i/s -     71.000  in   5.000067s

Comparison:
              Schema:       15.2 i/s
Form with type specs:       14.8 i/s - same-ish: difference falls within error
                Form:        0.2 i/s - 87.66x slower

This means there will be no practical difference between schema with or without coercions (via type specs) :) This is in a branch, I gotta clean some things up and merge it to master. I'll close this issue once that's done.

@kirs

This comment has been minimized.

Copy link
Author

commented Jun 17, 2016

Thank you so much!
Looking forward to try the new version.

@solnic

This comment has been minimized.

Copy link
Member

commented Jul 1, 2016

Folks, I'm gonna close this one as with dry-v 0.8.0 you'll be able to use explicit type-specs which will be used to configure coercion separately from the rules. Good news is, as mentioned already, it's gazillion times faster, bad news is that it's gonna be more verbose syntax-wise, ie:

required(:age, [:nil, :int]).maybe(:int?)

There are various ways how this can be improved, but there are lots of considerations here so I decided to postpone this for another release; in the meantime people can experiment with their own methods to remove some duplication and we can see what comes out of that. I just added a public API for extending the DSL with custom methods, ie:

dsl_ext = Module.new do
  def maybe_int(name, *predicates, &block)
    required(name, [:nil, :int]).maybe(:int?, *predicates, &block)
  end
end

Dry::Validation::Schema.configure do |config|
  config.dsl_extensions = dsl_ext
end

schema = Dry::Validation.Schema do
  maybe_int(:age)
end

I'll release new versions of dry-validation/types/logic today so please check it out and lemme know if things work for you.

@solnic solnic closed this Jul 1, 2016

@kirs

This comment has been minimized.

Copy link
Author

commented Jul 18, 2016

Thank you for the new release!
I've checked it with our huge schema and results are pretty good. It takes around 7ms for the schema definition 🚀 🚀 🚀

@solnic

This comment has been minimized.

Copy link
Member

commented Jul 19, 2016

@kirs thanks for testing it out :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.