New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GraphQL::Tracing #917

Merged
merged 75 commits into from Aug 30, 2017

Conversation

@rmosolgo
Owner

rmosolgo commented Aug 25, 2017

GraphQL::Tracing wraps gem internals so that subscribers can receive the metadata.

TODO

  • write a guide?
  • Use only one . in the active support key

esjee and others added some commits Jul 14, 2017

Merge pull request #883 from rmosolgo/fix-fields-will-merge-input-obj…
…ects

fix(FieldsWillMerge) handle input objects properly
Merge pull request #873 from rmosolgo/file-aware-parser
feat(GraphQL.parse_file) add filename-aware parsing
Merge pull request #885 from jpserra/fix-active_record-dependency
Check if ActiveRecord::Relation is defined
Merge pull request #888 from rmosolgo/fix-variable-usage-in-required-…
…list

fix(VariableUsagesAreAllowed) a required variable inside a literal list satisfies [T!]
Fix Sequel::Error when limit is 0
Sequel, contrarily to ActiveRecord, does not support `limit(0)` queries
and raises a `Sequel::Error` instead.

To fix this, we need to deal with this differently by using `where(false)`
instead when we are attempting to perform a query for zero rows in Sequel.
Merge pull request #890 from cjoudrey/fix-arguments-are-defined-node
Use argument node instead of parent in ArgumentsAreDefined errors
@rmosolgo

This comment has been minimized.

Owner

rmosolgo commented Aug 25, 2017

cc @matsimitsu how does this ActiveSupport::Notification implementation look?

I didn't put a lot of metadata in the hash, but you can access anything from context, for example:

context = metadata[:context]
context.path # => ["field1", "field2", ... ]
context.query # => GraphQL::Query 
context.ast_node # => GraphQL::Language::Nodes::Field 
context.irep_node # => GraphQL::InternalRepresentation::Node 

rmosolgo and others added some commits Aug 25, 2017

Merge pull request #916 from esjee/double-tap-non-null
Do not allow double non-null constraints
Merge pull request #842 from esjee/travis-yml-without-rails
Add a Travis build without Rails installed
update limits guide to include the context parameter
without this, I was seeing:

```
TypeError:
        no implicit conversion of nil into String
```
@matsimitsu

This comment has been minimized.

Contributor

matsimitsu commented Aug 26, 2017

Awesome! The amount of instrumentation is really nice and should help debug any performance issues easily.

As discussed on the hackday yesterday there’s an issue with the way activesupport notifications are emitted that it’s difficult to see exactly what part of the query is causing the performance issues. This is because the graph tree is resolved breadth first. This means that the instrumentation does not follow the tree branches.

If we have the following (multiplexed) queries:

query {
  viewer {
    id
    username
    email
  }
}
query userPageQuery($username: String!) {
  user(username: $username) {
    id
    name
    isViewer
    username
    trips {
      id
      title
      subtitle
      viewerCanEdit
      header {
        ratio
        id
        url
      }
    }
  }
}

It looks something like this in a performance monitoring tool.

waterfall

As you can see it’s very difficult to see what part of the query took the most time and looks very much like the second graph in @emilebosch ’s comment on issue #896 (comment)

In order to create a better visualisation this we can re-order our instrumentation back into a tree-like structure.

With the path method on the context that’s emitted with the notifications we can rebuild the tree by looping over the instrumentation and figuring out what the current path is.

This example middleware below will build a hash that should represent the query structure above, based on the path in the notification context.

class BetterInstrumentationMiddleware
  def initialize(app, options = {})
    @app = app
    @options = options
    @instrumentation = []
    ActiveSupport::Notifications.subscribe(/.*/) do |*args|
      @instrumentation << ActiveSupport::Notifications::Event.new(*args)
    end
  end

  def call(env)
    rack_result = @app.call(env)
    nested_instrumentation = {}
    current_path = []

    # Loop over the instrumentation and check if there's a path array
    # if there is one we can add it to our nested hash
    # if there isn't we can assume this event happened in the current path
    @instrumentation.each do |event|
      current_path = event.payload[:context].try(:path) || current_path

      # Set current branch to base
      current_branch = nested_instrumentation

      # Loop the paths and navigate to branch in our result hash
      # if the branch is new, create a new hash for that branch
      current_branch.each do |part|
        current_branch = current_branch[part] ||= {}
      end
      current_branch['events'] ||= []
      current_branch['events'].push(event)
    end

    @instrumentation = []

    # Send it off to 3rd party monitoring
    send_instrumentation(nested_instrumentation)
    return rack_result
  end
end

The resulting hash would look something like this:

{
  "viewer" => {
    "events"   => [(events that happened when viewer was computed)],
    "username" => {"events" => [(events that happened when viewer name was computed)]},
    "id"       => {"events" => [(events that happened when viewer id was computed)],
    "email"    => {"events" => [(events that happened when viewer email was computed)],
  },
  "user" => {
    "events" => [(events that happened when user was computed)],
    "id"     => {"events" => [(events that happened when user id was computed)],
    "name"   => {"events" => [(events that happened when user name was computed)],
    # and so on
  }
}

Adding this middleware, sending the nested_instrumentation and adding an event for each tree branch gives us the following event tree, as you can see it’s much easier to see where time is spent in the graphql query and it looks more like the first image in @emilebosch ’s comment on issue #896 (comment)

better

I’m not entirely sure if/how we should handle this in the gem. I don’t think it’s possible to create something that re-emits the instrumentation in the right order, as other tools might be subscribed and would get all events twice.

I think documenting these findings would be a good start, so anyone that wants to implement their own visualisation of the gem performance has an understanding of how to reproduce the tree.

Another option could be that we create a helper that emits the query with all nested instrumentation, something that @emilebosch has done with his rails engine https://github.com/emilebosch/graphql-metrics.

What do you think @rmosolgo

@emilebosch

This comment has been minimized.

Contributor

emilebosch commented Aug 26, 2017

@rmosolgo @matsimitsu Awesome work. I'd love to incorporate this tracing in the dashboard. This looks very nice @matsimitsu. Now its a bunched of hacked stuff together, but with this i think i can get quite far and make it look super pretty.

Maybe its also cool to think about what we actually want to see:

  • Slow fields (i.e. fields that need to be optimized)
  • Slow queries (queries that are generally slow)
  • Most used fields, most unused fields? (i.e. scary to refactor)

I'm also thinking about the graphs that you might want to see. What's the low hanging fruit?

Merge pull request #921 from codebender/fix-limits-guide
update limits guide to include the context parameter
@emilebosch

This comment has been minimized.

Contributor

emilebosch commented Aug 26, 2017

@matsimitsu @rmosolgo Cant we emit a query end notification, that has a method to get the query back in the right order?

else
@value = new_value
end
end

This comment has been minimized.

@rmosolgo

rmosolgo Aug 26, 2017

Owner

Oops, this diff was added by mistake! I'll remove it.

@rmosolgo

This comment has been minimized.

Owner

rmosolgo commented Aug 26, 2017

Yes, let me take a look at these events and make a few changes:

  • Add some event which encloses the whole query execution.
  • Change execute_field somehow:
    • Right now, it includes child field resolution in some cases (not for graphql-batch-based fields)
    • Maybe it should only include the resolve function call
@rmosolgo

This comment has been minimized.

Owner

rmosolgo commented Aug 28, 2017

I added execute_multiplex which is the umbrella event for a call to Schema#execute or Schema#multiplex. When it finishes, you know everything is done!

@bdubaut

This comment has been minimized.

bdubaut commented Aug 28, 2017

@rmosolgo do you think it would be possible in the trace to highlight fields with a deprecation warning? Would be pretty useful IMO.

@emilebosch

This comment has been minimized.

Contributor

emilebosch commented Aug 28, 2017

@rmosolgo

This comment has been minimized.

Owner

rmosolgo commented Aug 30, 2017

@rmosolgo rmosolgo changed the base branch from master to 1.7.x Aug 30, 2017

@rmosolgo rmosolgo merged commit c21d770 into 1.7.x Aug 30, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@rmosolgo rmosolgo deleted the tracing branch Aug 30, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment