Rethink middleware #186

Closed
rmosolgo opened this Issue Jul 19, 2016 · 10 comments

Comments

2 participants
@rmosolgo
Owner

rmosolgo commented Jul 19, 2016

Middleware was designed with a simple, Rack-like API, but turns out that's a bug, not a feature:

Similar issues affect graphql-ruby. For example, if you use graphql-batch, the actual processing is done "out of bounds", not within next_middleware.call. With DeferredExecution, the problem is even worse: there's no way to detect the actual end of the query, since any deferred fields are executed outside the root node's middleware call.

So, we need a system that:

  • can intercept field resolves
  • can detect the start & end of queries (even when processing is done "out of bounds", eg graphql-batch)
  • supports middleware-local state (eg, timing data)
  • has a user-friendly API

Whatever happens, it will be important to support existing middleware somehow. I imagine that will be simple enough, wrapping an existing middleware to port it to the new API, perhaps

legacy_middleware = CustomMiddleware.new 
new_middleware = LegacyMiddlewareAdapter.new(legacy_middleware)
MySchema.middleware << new_middleware 

Some ideas

Make it like query analyzers

  • before_query returns an initial memo
  • subsequent hooks return a new memo
  • problem: Previous middlewares could affect field resolution by returning a value and not calling next_middleware.call. This implementation requires the return value to be memo, how can you intercept a field call?
MyCustomMiddleware = GraphQL::Middleware.define do 
  before_query -> (memo, env) { 
    { 
      counter: 0
     }
  }

  each_field -> (memo, env) {
    memo[:counter] += 1 
    env[:next_middleware].call 
    memo 
  } 

  after_query -> (memo, env) { 
    puts "Total fields: #{memo[:counter]}"
  }
end 

Make it an event listener

  • Execution strategy would be responsible for triggering events at the right time
  • easy to add new events in the future
  • problem: again, how to modify field resolutions? (I don't want to use exceptions for control flow)
MyCustomMiddleware = GraphQL::Middleware.define do 
  on(:begin_query) { ... } 
  on(:end_query) { ... } 
  on(:begin_field) { ... }
  on(:end_field) { ... }
end 

Extend the current definition

  • Add more hooks to middleware objects (a bit like query analyzer's initial_value etc)
  • problem: we have to keep the long list of random args
  • problem: how to track out-of-bound field duration?
  • problem: when we need more hooks, do we just add more methods? seems ... not elegant.
class MyCustomMiddleware 
  def call(*long_list_of_random_args)
    # ... 
  end 
  def before_query(*more_random_args)
  end 
  def after_query(*more_random_args)
  end 
end 

Make it like Rails instrumentation

In this API, code can be run inside an instrumentation block. Handlers can respond to incoming data in some way.

resolved_value = query.instrument("field.resolve", type_defn, field_defn) do 
  # call the resolve function 
end 

query.instrument("query.begin")
# ... do a bunch of deferred stuff 
query.instrument("query.end")

batch_result = query.instrument("batch.resolve", loader) do 
  # Run the batch
end 

Some questions here:

  • How are handlers identified (names, symbols) ? Is there a nesting structure? (I hope not, sounds hard)
  • Can handlers interfere with the block, or only the arguments to instrument?
@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

rmosolgo Sep 17, 2016

Owner

Here's an important point after chatting with the GitHub GraphQL team: middlewares add a lot of overhead. They get called for every field resolution, even for fields where they make no sense or where we could know ahead of time that they don't apply.

The next implementation needs to be less wasteful, for example, allow fields to opt in to certain wrappers?

Owner

rmosolgo commented Sep 17, 2016

Here's an important point after chatting with the GitHub GraphQL team: middlewares add a lot of overhead. They get called for every field resolution, even for fields where they make no sense or where we could know ahead of time that they don't apply.

The next implementation needs to be less wasteful, for example, allow fields to opt in to certain wrappers?

@cjoudrey

This comment has been minimized.

Show comment
Hide comment
@cjoudrey

cjoudrey Sep 17, 2016

Collaborator

We could have "instrumenters" that have an "attach" method and we attach
the "instrumenters" at schema definition time.

For instance, a "instrumenter" that instruments usage of deprecated fields
could have a "def attach(field)" method that can wrap the field's resolver
proc but it would only do so for fields that are deprecated.

We would need different instrumenter types for each instrumentable thing,
i.e field, argument, etc..

Would that work?

On Friday, 16 September 2016, Robert Mosolgo notifications@github.com
wrote:

Here's an important point after chatting with the GitHub GraphQL team:
middlewares add a lot of overhead. They get called for every field
resolution, even for fields where they make no sense or where we could know
ahead of time that they don't apply.

The next implementation needs to be less wasteful, for example, allow
fields to opt in to certain wrappers?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#186 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAXg9g4vm8SncWXXopkSq8hrURhwDUeNks5qqy5egaJpZM4JPwM3
.

Collaborator

cjoudrey commented Sep 17, 2016

We could have "instrumenters" that have an "attach" method and we attach
the "instrumenters" at schema definition time.

For instance, a "instrumenter" that instruments usage of deprecated fields
could have a "def attach(field)" method that can wrap the field's resolver
proc but it would only do so for fields that are deprecated.

We would need different instrumenter types for each instrumentable thing,
i.e field, argument, etc..

Would that work?

On Friday, 16 September 2016, Robert Mosolgo notifications@github.com
wrote:

Here's an important point after chatting with the GitHub GraphQL team:
middlewares add a lot of overhead. They get called for every field
resolution, even for fields where they make no sense or where we could know
ahead of time that they don't apply.

The next implementation needs to be less wasteful, for example, allow
fields to opt in to certain wrappers?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#186 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAXg9g4vm8SncWXXopkSq8hrURhwDUeNks5qqy5egaJpZM4JPwM3
.

@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

rmosolgo Sep 17, 2016

Owner

Oh, that's a really cool idea! If I understand correctly:

  • You define a bunch of instrumenters (maybe Ruby classes) and the terms for when they "hook themselves up"
  • At schema build time, check the provided instrumenters against members of the schema, and attach them where they belong
  • At query time, do nothing special: the resolve procs have already been wrapped with any modifications that the user wants

Does that sound right?

If so, it's a definite improvement over functional composition-style instrumentation, which I was leaning towards:

resolve_func = -> (obj, args, ctx) { ... } 
# ... 
# attach an instrumenter by hand:
field :something, resolve: instrument(:my_instrumenter, resolve_proc)

An approach like that ☝️ one is simple (and works already), but it leads to "shotgun surgery" when modifying your instrumentation. Better to have something that doesn't interfere with schema definition directly, but rather "wraps" items on an opt-in basis.

Owner

rmosolgo commented Sep 17, 2016

Oh, that's a really cool idea! If I understand correctly:

  • You define a bunch of instrumenters (maybe Ruby classes) and the terms for when they "hook themselves up"
  • At schema build time, check the provided instrumenters against members of the schema, and attach them where they belong
  • At query time, do nothing special: the resolve procs have already been wrapped with any modifications that the user wants

Does that sound right?

If so, it's a definite improvement over functional composition-style instrumentation, which I was leaning towards:

resolve_func = -> (obj, args, ctx) { ... } 
# ... 
# attach an instrumenter by hand:
field :something, resolve: instrument(:my_instrumenter, resolve_proc)

An approach like that ☝️ one is simple (and works already), but it leads to "shotgun surgery" when modifying your instrumentation. Better to have something that doesn't interfere with schema definition directly, but rather "wraps" items on an opt-in basis.

@cjoudrey

This comment has been minimized.

Show comment
Hide comment
@cjoudrey

cjoudrey Sep 17, 2016

Collaborator

Correct! That's exactly what I meant. :)

The advantage is that you don't need to define it on every field BUT you
only wrap the fields that need to be at schema build time. Best of both
worlds. :)

Would love to collaborate on adding this!

On Saturday, 17 September 2016, Robert Mosolgo <notifications@github.com
javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

Oh, that's a really cool idea! If I understand correctly:

  • You define a bunch of instrumenters (maybe Ruby classes) and the
    terms for when they "hook themselves up"
  • At schema build time, check the provided instrumenters against
    members of the schema, and attach them where they belong
  • At query time, do nothing special: the resolve procs have already
    been wrapped with any modifications that the user wants

Does that sound right?

If so, it's a simple improvement over functional composition-style
instrumentation, which I was leaning towards:

resolve_func = -> (obj, args, ctx) { ... } # ... # attach an instrumenter by hand:
field :something, resolve: instrument(:my_instrumenter, resolve_proc)

An approach like that ☝️ one is simple (and works already), but it leads
to "shotgun surgery" when modifying your instrumentation. Better to have
something that doesn't interfere with schema definition directly, but
rather "wraps" items on an opt-in basis.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#186 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAXg9miiV-15q_AgrT386f6AI44aA-Y_ks5qrDTAgaJpZM4JPwM3
.

Collaborator

cjoudrey commented Sep 17, 2016

Correct! That's exactly what I meant. :)

The advantage is that you don't need to define it on every field BUT you
only wrap the fields that need to be at schema build time. Best of both
worlds. :)

Would love to collaborate on adding this!

On Saturday, 17 September 2016, Robert Mosolgo <notifications@github.com
javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

Oh, that's a really cool idea! If I understand correctly:

  • You define a bunch of instrumenters (maybe Ruby classes) and the
    terms for when they "hook themselves up"
  • At schema build time, check the provided instrumenters against
    members of the schema, and attach them where they belong
  • At query time, do nothing special: the resolve procs have already
    been wrapped with any modifications that the user wants

Does that sound right?

If so, it's a simple improvement over functional composition-style
instrumentation, which I was leaning towards:

resolve_func = -> (obj, args, ctx) { ... } # ... # attach an instrumenter by hand:
field :something, resolve: instrument(:my_instrumenter, resolve_proc)

An approach like that ☝️ one is simple (and works already), but it leads
to "shotgun surgery" when modifying your instrumentation. Better to have
something that doesn't interfere with schema definition directly, but
rather "wraps" items on an opt-in basis.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#186 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAXg9miiV-15q_AgrT386f6AI44aA-Y_ks5qrDTAgaJpZM4JPwM3
.

@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

rmosolgo Sep 18, 2016

Owner

collaborate

👍 🎉 😄

There are a couple other "events" that need coverage, how could those fit into a flow like this?

  • begin_query / end_query -- run custom code at the start of execution and end of execution
  • third-party hooks? graphql-batch is the quintessential example: how can I run code around batch execution?
Owner

rmosolgo commented Sep 18, 2016

collaborate

👍 🎉 😄

There are a couple other "events" that need coverage, how could those fit into a flow like this?

  • begin_query / end_query -- run custom code at the start of execution and end of execution
  • third-party hooks? graphql-batch is the quintessential example: how can I run code around batch execution?
@cjoudrey

This comment has been minimized.

Show comment
Hide comment
@cjoudrey

cjoudrey Sep 18, 2016

Collaborator

For begin_query / end_query we might be able to keep a similar syntax as
you initially proposed in combination with an "attach?" method such that we
don't wrap resolvers that are not eligible.

On Saturday, 17 September 2016, Robert Mosolgo notifications@github.com
wrote:

collaborate

👍 🎉 😄

There are a couple other "events" that need coverage, how could those fit
into a flow like this?

  • begin_query / end_query -- run custom code at the start of execution
    and end of execution
  • third-party hooks? graphql-batch is the quintessential example: how
    can I run code around batch execution?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#186 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAXg9pzJEGqGujd2LXHrOHelbeNQvRwVks5qrIdRgaJpZM4JPwM3
.

Collaborator

cjoudrey commented Sep 18, 2016

For begin_query / end_query we might be able to keep a similar syntax as
you initially proposed in combination with an "attach?" method such that we
don't wrap resolvers that are not eligible.

On Saturday, 17 September 2016, Robert Mosolgo notifications@github.com
wrote:

collaborate

👍 🎉 😄

There are a couple other "events" that need coverage, how could those fit
into a flow like this?

  • begin_query / end_query -- run custom code at the start of execution
    and end of execution
  • third-party hooks? graphql-batch is the quintessential example: how
    can I run code around batch execution?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#186 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAXg9pzJEGqGujd2LXHrOHelbeNQvRwVks5qrIdRgaJpZM4JPwM3
.

@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

rmosolgo Sep 23, 2016

Owner

For the benefit of any watchers, we had some back-and-forth about API & implementation here:

https://gist.github.com/cjoudrey/1c85ed1312403e9d73a4cc714cafc8f9

And later, cjoudrey suggested a nice option for attaching instrumenters in a non-destructive way: you could maintain a schema-level map of { field_instance => [instrumenter, instrumenter ...] }, and add step to field resolution which checks the map for any instrumenters on that field.

I think that's a very promising option for a few reasons:

  • It adds very little overhead for non-instrumented fields
  • It doesn't modify the GraphQL::Field instance, so it can be reused or shared †

There's one more question, how to support "events" like :begin_query. Perhaps those event names could be entries in the same map; it could be an "arbitrary" store for instrumenters in that way.

{
  :begin_query => [...], 
  #<GraphQL::Field> => [...],
  :begin_batch => [...], 
  :end_batch => [...],
}

† This doesn't actually work yet, but I want it too, or at least to keep the door open to it. There are some cases (field.name, connection) where graphql-ruby modifies field instances that you give it.

Owner

rmosolgo commented Sep 23, 2016

For the benefit of any watchers, we had some back-and-forth about API & implementation here:

https://gist.github.com/cjoudrey/1c85ed1312403e9d73a4cc714cafc8f9

And later, cjoudrey suggested a nice option for attaching instrumenters in a non-destructive way: you could maintain a schema-level map of { field_instance => [instrumenter, instrumenter ...] }, and add step to field resolution which checks the map for any instrumenters on that field.

I think that's a very promising option for a few reasons:

  • It adds very little overhead for non-instrumented fields
  • It doesn't modify the GraphQL::Field instance, so it can be reused or shared †

There's one more question, how to support "events" like :begin_query. Perhaps those event names could be entries in the same map; it could be an "arbitrary" store for instrumenters in that way.

{
  :begin_query => [...], 
  #<GraphQL::Field> => [...],
  :begin_batch => [...], 
  :end_batch => [...],
}

† This doesn't actually work yet, but I want it too, or at least to keep the door open to it. There are some cases (field.name, connection) where graphql-ruby modifies field instances that you give it.

@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

rmosolgo Oct 3, 2016

Owner

I was thinking about this some more, let's say you want a robust API for opt-in instrumentation, what are the ways you can imagine attaching handlers? I can see something like this:

Schema.define do 
  # Attach based on properties of the field: 
  instrument_with(RescueHandler, field_return_type: PersonType)
  instrument_with(IdAccessLogger, field_name: "id")
  # Attach to named events: 
  instrument_with(TimingHandler, event_name: [:query_begin, :query_end])
  # Somehow support third-party events: 
  instrument_with(TimingHandler, event_name: [:batch_begin, :batch_end])
  # Attach based on arbitrary code: 
  instrument_with(ConnectionLogger) { |target| target.is_a?(GraphQL::Field) && target.metadata[:relay_connection_field] } 
end 

Perhaps under the hood, they're all implemented like the last example: you check the possible instrumentation targets and if the conditions return true, add an entry to the instrumentation table.

Owner

rmosolgo commented Oct 3, 2016

I was thinking about this some more, let's say you want a robust API for opt-in instrumentation, what are the ways you can imagine attaching handlers? I can see something like this:

Schema.define do 
  # Attach based on properties of the field: 
  instrument_with(RescueHandler, field_return_type: PersonType)
  instrument_with(IdAccessLogger, field_name: "id")
  # Attach to named events: 
  instrument_with(TimingHandler, event_name: [:query_begin, :query_end])
  # Somehow support third-party events: 
  instrument_with(TimingHandler, event_name: [:batch_begin, :batch_end])
  # Attach based on arbitrary code: 
  instrument_with(ConnectionLogger) { |target| target.is_a?(GraphQL::Field) && target.metadata[:relay_connection_field] } 
end 

Perhaps under the hood, they're all implemented like the last example: you check the possible instrumentation targets and if the conditions return true, add an entry to the instrumentation table.

@tmeasday tmeasday referenced this issue in apollographql/optics-agent-ruby Oct 7, 2016

Closed

Robust Instrumentation via new graphql-ruby API #13

@rmosolgo rmosolgo referenced this issue Oct 26, 2016

Merged

Instrumentation #354

2 of 2 tasks complete
@rmosolgo

This comment has been minimized.

Show comment
Hide comment
@rmosolgo

rmosolgo Nov 1, 2016

Owner

Basic setup in #354, lots of room to grow there. instrument(:field, instrumenter) provides an opt-in, ahead-of-time path for modifying field resolve functions with no query-time overhead.

Owner

rmosolgo commented Nov 1, 2016

Basic setup in #354, lots of room to grow there. instrument(:field, instrumenter) provides an opt-in, ahead-of-time path for modifying field resolve functions with no query-time overhead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment