Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IMPORTANT] Memory Leak issue with graphql 2.0.17 #4370

Closed
andytpp opened this issue Mar 7, 2023 · 11 comments
Closed

[IMPORTANT] Memory Leak issue with graphql 2.0.17 #4370

andytpp opened this issue Mar 7, 2023 · 11 comments

Comments

@andytpp
Copy link

andytpp commented Mar 7, 2023

Describe the bug

After updating to graphql 2.0.17, the ruby processes running graphql just keep increasing memory usage linearly until OOM.
When goes back to graphql 2.0.16, everything is fine.

Versions

graphql version: 2.0.17
rails: 6.0.6.1
ruby: 2.6.6

@rmosolgo
Copy link
Owner

rmosolgo commented Mar 7, 2023

Hi! Yikes, sorry to hear it. I'd love to track this down and release a patch, but I can't find it locally. Could you run MemoryProfiler on one of your queries, then share the result? For example:

gem "memory_profiler" # in your Gemfile

Then, in a script or console:

report = MemoryProfiler.report(allow_files: "graphql") do 
  MySchema.execute(example_query_string, ...) 
end 

report.pretty_print(to_file: "graphql_memory.txt")

In the meantime, I'll be taking a look myself!

@rmosolgo
Copy link
Owner

rmosolgo commented Mar 7, 2023

A couple of things by way of background:

  1. This library does leave some retained memory behind: A single, empty Hash in Thread.current[:__graphql_runtime_info]. This is to be expected (but shouldn't grow query-after-query, since it's a single reused object).

  2. There is a benchmark in this library that does show retained memory, bundle exec rake:profile_large_result. It looks to me like a whole query object, along with its associated context info and result. I haven't worried about it so far because if I port the benchmark to a stand-alone file, it doesn't show the retained query, so I assumed it was part of the rake or benchmark setup

Now that I look into it, I see this benchmark started retaining memory after #4311. But the trouble is, that PR changed the benchmark and the code, so it'll require some digging to figure out exactly what started that extra retained memory.

Another thing is, re-enabling GraphQL::Dataloader on that benchmark makes the retained memory go away 🤔

No answers yet, but I'm still digging!

@rmosolgo
Copy link
Owner

rmosolgo commented Mar 7, 2023

Update: if you comment out GraphQL::Datataloader in that benchmark, it retains memory in GraphQL 2.0.16, too.

Are you using GraphQL::Dataloader?

@rmosolgo
Copy link
Owner

rmosolgo commented Mar 7, 2023

I'm just not finding any likely candidates 😖 Here's the diff between versions:

v2.0.16...v2.0.17

Do any of the bits of changed code there sound like they might have to do with any features that you use in a special way? Maybe some change there had an unexpected effect on certain uses of the library that I didn't consider 😖 Even if the code changes themselves don't make sense, maybe it would give us a clue where to look further.

@andytpp
Copy link
Author

andytpp commented Mar 7, 2023

Hi @rmosolgo, thanks for looking into this.
Here is the profiler, hopefully it helps.
graphql_memory.txt

Are you using GraphQL::Dataloader?

No, I don't.

@rmosolgo
Copy link
Owner

rmosolgo commented Mar 7, 2023

Thanks for sharing, @andytpp. A couple of things after reviewing that profile:

  1. It looks like the schema was initialized during report { ... } block. To keep schema initialization out of the memory report, could you try again with a script like this:

    # Run the query once to make sure all initialization is finished: 
    MySchema.execute(example_query_string, ...) 
    
    # Run it again to see what's allocated by _just_ this query:
    report = MemoryProfiler.report(allow_files: "graphql") do 
      MySchema.execute(example_query_string, ...) 
    end 
    
    report.pretty_print(to_file: "graphql_memory.txt")
  2. It looks like you're also using graphql-client, is that right? How is graphql-client involved in your system? Is it possible to run a benchmark without graphql-client?

    This seems like the biggest clue to me -- I've definitely introduced changes not knowing how graphql-client uses GraphQL-Ruby. I'd like to investigate whether this leak somehow comes from the way graphql-client and GraphQL-Ruby do (or don't) work together. So, if it's possible to run a query without graphql-client and compare the two profile results, that might help. Or, if you can explain how you use graphql-client in your app, maybe I replicate the problem locally and investigate further.

@andytpp
Copy link
Author

andytpp commented Mar 13, 2023

@rmosolgo Sorry for the delay, I couldn't make time until now.
Let me answer your first point first.
I've tried again the script with the initialization before the report block. Here are the profiles for versions .16 and .17 respectively.
graphql_memory_17_1.txt
graphql_memory_16_1.txt

@rmosolgo
Copy link
Owner

Thanks for sharing those. Weirdly, graphql_memory_17_1.txt still shows GraphQL schema initialization: all the retained memory has to do with GraphQL::Schema::Loader creating a schema from a JSON response to an introspection query. It's like the schema is rebuilding itself over and over!

I wonder if some change to GraphQL-Ruby internals made graphql-client think that it needs to rebuild itself. Could you share any information about how you're using graphql-client? I don't work with that gem usually but I'll try to replicate the scenario if you could share anything about your setup.

@andytpp
Copy link
Author

andytpp commented Mar 28, 2023

Hi @rmosolgo, thank you for your patience.

Based on your suspicion, we have investigated further and discovered a bug that was causing the schema to be rebuilt with every call. To be specific, the schema loading is called in every query execution.
We have fixed the issue, and as a result, the number of retained objects has dropped to zero.
I am not sure if you want to investigate further on why this occurred only with version .17 and not .16.

Attached are the profilers before and after our fix.
graphql_17_1.txt
graphql_17_2.txt

Please let me know if you need any additional information.
Thanks for your support.

@rmosolgo
Copy link
Owner

Glad you got to the bottom of it! I would like to investigate a bit why it happened starting in 2.0.17, do you mind sharing any details about the bug you discovered or how to fix it? (I'm wondering if maybe it should have been supported by GraphQL-Ruby, or at least, if you don't mind sharing, someone else might benefit from your work if they run into the same trouble.)

@andytpp
Copy link
Author

andytpp commented Mar 28, 2023

Sure, I'll try replicating the main parts, and I believe you can easily do the same.

We use graphql-client to execute queries against Contentful, and we have a wrapper class for this purpose:

module Services
  class MyClient
   #...
    def execute(query, variables: {})
      schema = ::GraphQL::Client.dump_schema(http_executor, context: context)
      client = ::GraphQL::Client.new(
        schema: schema,
        execute: http_executor
      )
      client.allow_dynamic_queries = true
      parsed_query = client.parse(query)
      client.query(parsed_query, variables: variables, context: context)
    end
   #...
  end
end

When we ran profiling on this class with version 2.0.17, we encountered a memory issue, as shown in the following code:

client = Services::MyClient.new(config: config)
query = <<~QUERY
  some query
QUERY
client.execute(query) # First call
report = MemoryProfiler.report(allow_files: "graphql") do
  10.times do |i|
    client.execute(query)
  end
end

To resolve the issue, we ensured that schema loading does not occur inside the execute method, and that the client instance is reused instead of being created again.

I hope this information is helpful.

@andytpp andytpp closed this as completed Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants