
fix(core): Remove Python code node memory leak using Workers #13648

Open
wants to merge 2 commits into master

Conversation

@riseandignite commented on Mar 3, 2025

Summary

This PR addresses issue #7939, where users report significant memory growth when using Python in Code nodes. The solution runs each Python execution in a Worker, which completely resolves the memory leak.

Memory usage comparison charts: https://n8n-memory-git-master-nikitas-projects-2b098508.vercel.app/

[Screenshots: memory usage charts showing the growth pattern with the current implementation]

Problem

Users have documented persistent memory growth issues when running Python code nodes:

  • Memory increases with each execution and never fully recovers
  • Baseline memory usage continually grows over time
  • Eventually leads to high memory consumption and OOM errors

Looking at the memory charts from my testing, the pattern is clear: our current implementation shows steadily increasing memory usage that never fully recovers. This matches exactly what users have reported in issue #7939, where memory grows from around 200MB to over 1GB and continues climbing with each Python execution.

Investigation

The root cause is in how Pyodide (the WebAssembly-based Python runtime used by the Code node) manages memory: when Python code executes through Pyodide, it creates WebAssembly memory allocations that aren't fully released afterward. Even when JavaScript references are cleared and Python's garbage collector runs, portions of this WASM heap remain allocated.
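
For illustration, a minimal Node.js reproduction of this pattern might look like the sketch below (an assumed sketch, not code from n8n or this PR; the printed numbers vary by machine):

// Hypothetical reproduction sketch: load Pyodide, run a script that allocates a
// large list, drop every JavaScript reference, and compare the process RSS.
// The WebAssembly heap that Pyodide grew for the allocation is never shrunk,
// so the "after" number stays well above the "before" number.
import { loadPyodide, type PyodideInterface } from 'pyodide';

const rssMb = () => Math.round(process.memoryUsage().rss / 1e6);

async function main(): Promise<void> {
  console.log('before:', rssMb(), 'MB');

  let pyodide: PyodideInterface | undefined = await loadPyodide();
  pyodide.runPython('my_list = [0] * 1000000');
  pyodide = undefined; // drop the only reference; there is no API to tear the instance down

  (globalThis as { gc?: () => void }).gc?.(); // only has an effect with --expose-gc
  console.log('after:', rssMb(), 'MB'); // remains elevated: the WASM memory is retained
}

main();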

Solutions Tested

I tested three approaches:

  1. Removing the singleton pattern: This helped somewhat by creating a fresh Pyodide instance each time, but still left significant memory unreleased.
  2. Manual cleanup with sys.modules.clear() and gc.collect(): This provided minor improvements but couldn't reach all the memory being held (a sketch of this cleanup follows the list).
  3. Worker implementation: This completely solved the issue by isolating each execution in its own worker thread and terminating it afterward.
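
For reference, the manual cleanup from approach 2 looked roughly like this (a sketch, assuming access to the shared Pyodide instance):

import type { PyodideInterface } from 'pyodide';

// Approach 2 (sketch): clear Python's module cache and force a garbage
// collection pass after each execution. This frees some Python-side objects,
// but the WebAssembly heap Pyodide has already grown is not returned.
function cleanupAfterExecution(pyodide: PyodideInterface): void {
  pyodide.runPython('import sys, gc\nsys.modules.clear()\ngc.collect()');
}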

The charts clearly show that both removing the singleton and using Workers address the memory growth, but the Worker approach is significantly more effective at reclaiming memory: https://n8n-memory-git-master-nikitas-projects-2b098508.vercel.app/

The test workflow executed two Python scripts one after the other every 20 seconds for ~10-15 minutes, then was turned off for ~10 minutes.

# Allocate a list of one million integers and return it from the Code node
my_list = [0] * 1000000
return {"a": my_list}

[Screenshots: memory usage charts comparing the singleton removal and Worker approaches]

Why Workers Work Better

The Worker solution is superior because terminating a worker forces the JavaScript runtime to reclaim all resources associated with it, including the WebAssembly heap that normal garbage collection can't reach. This creates a complete reset between executions without relying on Pyodide's internal cleanup mechanisms.

This implementation should resolve the frustrating experience users like @pablorq and @merlinxcy have reported where their n8n instances continuously consume more memory until they're forced to restart.

The implementation creates a new worker for each Python execution (a simplified sketch follows the list):

  1. Launch worker with Pyodide environment
  2. Send Python code and context to worker
  3. Receive results from worker
  4. Terminate worker completely, releasing all memory
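
As a rough illustration of these steps (a simplified sketch under assumed shapes, not the PR's exact code; in practice the worker body would live in its own file):

// Each Python execution gets its own worker_threads Worker running Pyodide,
// and the worker is terminated afterwards so its entire WASM heap is released.
import { Worker } from 'node:worker_threads';

// Inline worker body for illustration only.
const workerScript = `
const { parentPort, workerData } = require('node:worker_threads');
const { loadPyodide } = require('pyodide');

(async () => {
  const pyodide = await loadPyodide();
  pyodide.globals.set('_context', pyodide.toPy(workerData.context));
  const result = pyodide.runPython(workerData.code);
  parentPort.postMessage({ result: result?.toJs ? result.toJs() : result });
})().catch((error) => parentPort.postMessage({ error: String(error) }));
`;

export async function runPythonInWorker(code: string, context: unknown): Promise<unknown> {
  // 1. Launch a worker with the Pyodide environment (code and context travel via workerData).
  const worker = new Worker(workerScript, { eval: true, workerData: { code, context } });
  try {
    // 2. + 3. Wait for the worker to post back the result of the Python code.
    return await new Promise((resolve, reject) => {
      worker.once('message', (msg) => ('error' in msg ? reject(new Error(msg.error)) : resolve(msg.result)));
      worker.once('error', reject);
    });
  } finally {
    // 4. Terminate the worker completely, releasing all of its memory.
    await worker.terminate();
  }
}

Terminating the worker in a finally block ensures the memory is released even when the Python code throws.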

Implementation Tradeoffs

This implementation creates some tradeoffs worth mentioning:

  1. Performance overhead: Creating a worker and initializing a new Pyodide environment for each execution adds up to ~150ms of latency compared to reusing an existing instance, depending on the machine. This is typically negligible for most workflows, especially since the Python code execution itself is usually the more time-consuming part.

  2. Resource usage: Each worker temporarily increases memory usage by approximately 40-60MB during initialization, though these resources are completely released afterward. This temporary spike is far preferable to the permanent memory growth pattern we were seeing.

However, these tradeoffs are well justified given:

  • The complete resolution of memory leaks
  • Prevention of OOM crashes in production systems
  • More consistent performance over time
  • Elimination of need for periodic restarts

In testing, the initialization overhead proved to be a worthwhile tradeoff for the stability benefits. For most Python Code node use cases, the slight performance impact will be negligible compared to the benefits of reliable memory management.

Future Optimizations

While this implementation completely solves the memory leak issue, there are potential optimizations we could explore in the future:

  1. Worker pooling: If Python execution becomes performance-critical, we could implement a small pool of workers that are reused for a limited number of executions before being recycled. This would balance memory management with startup performance.

  2. Selective serialization: We could optimize the data passing between the main thread and worker with smarter serialization that includes only the minimum required context (sketched below).

These optimizations aren't necessary for the current fix but represent potential future enhancements if needed.
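
As an illustration of the selective-serialization idea, something like the following could trim what crosses the thread boundary (the item shape and names are placeholders, not n8n's actual internal types):

// Hypothetical helper: send only the plain JSON payload of each item to the
// worker instead of cloning helpers, proxies, or binary buffers.
interface NodeItem {
  json: Record<string, unknown>;
  binary?: unknown;
}

function buildWorkerContext(items: NodeItem[]): Array<Record<string, unknown>> {
  return items.map((item) => item.json);
}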

Related Linear tickets, Github issues, and Community forum posts

Fixes: #7939

Review / Merge checklist

  • PR title and summary are descriptive. (conventions)
  • Docs updated or follow-up ticket created.
  • Tests included.
  • PR Labeled with release/backport (if the PR is an urgent fix that needs to be backported)

@CLAassistant commented on Mar 3, 2025

CLA assistant check
All committers have signed the CLA.

@riseandignite riseandignite changed the title fix(core): execute runCodeInPython in Worker to prevent memory leak fix(core): Remove Python code node memory leak using Web Workers Mar 3, 2025
@riseandignite riseandignite marked this pull request as ready for review March 3, 2025 16:12
@riseandignite riseandignite changed the title fix(core): Remove Python code node memory leak using Web Workers fix(core): Remove Python code node memory leak using Workers Mar 3, 2025
@n8n-assistant n8n-assistant bot added community Authored by a community member node/improvement New feature or request in linear Issue or PR has been created in Linear for internal review labels Mar 3, 2025
@Joffcom (Member) commented on Mar 3, 2025

Hey @riseandignite,

Thanks for the PR. We have created "GHC-1035" as the internal reference to get this reviewed.

One of us will be in touch if any changes are needed; in most cases this happens within a couple of weeks, but it depends on the team's current workload.

@netroy (Member) left a comment

I've experimented with something like this before, but had to drop the idea because loading Pyodide for every execution becomes very resource-intensive as instances scale up.

Maybe instead of completely removing the singleton, we should have a pool of workers that we can recycle after a fixed number of code executions, or after a certain amount of time.

That won't solve the memory leak properly, but every time a worker thread is recycled, its memory should be released.

I think it might make a lot more sense to migrate Python support to Task-Runners, switch to real Python, and move away from Pyodide completely.

@riseandignite (Author) commented

Thanks for the feedback @netroy! You raise an excellent point about scale.

Let me share some numbers to put this in context:

Current memory growth issue:

  • Users are reporting memory growth from ~200MB to over 1GB with standard Python usage
  • @pablorq had to restart their instance every few days due to this growth
  • The memory never decreases after Python code runs, forcing restarts

Worker initialization costs:

  • Each Pyodide initialization: ~40-60MB memory + ~150-200ms startup time
  • 10 simultaneous Python executions: ~400-600MB temporary memory. 100 executions: 4-6GB.
  • However, this memory is fully released afterward (unlike the current approach)

You're absolutely right that a worker pool would be a better compromise. I actually mentioned this as a future optimization in my PR description, but it makes sense to implement it now if scalability is a concern.

I can modify the PR to implement a pool that:

  • Maintains a small pool of workers (configurable, default 3-5)
  • Recycles each worker after X executions (configurable, default 10-20)
  • Gives us ~80-90% of the memory-leak-prevention benefit
  • Reduces resource usage by 80-95% compared to creating a worker per execution

This approach would significantly reduce the initialization overhead while still preventing the unbounded memory growth that's currently happening. Would that be a better approach? I can make these changes if you think this strikes the right balance.
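
A rough sketch of the pool shape I have in mind (names, defaults, and the worker file are placeholders, not final code):

// Hypothetical worker pool: reuse a small set of Pyodide workers and recycle
// each one after a fixed number of executions so its WASM heap is released.
import { Worker } from 'node:worker_threads';

const POOL_SIZE = 3;        // configurable, e.g. default 3-5
const MAX_EXECUTIONS = 20;  // configurable, e.g. default 10-20
const WORKER_FILE = new URL('./pythonWorker.js', import.meta.url); // assumed worker script

interface PooledWorker {
  worker: Worker;
  executions: number;
}

const pool: PooledWorker[] = [];

function acquireWorker(): PooledWorker {
  // Lazily fill the pool up to POOL_SIZE, then hand out the least-used entry.
  if (pool.length < POOL_SIZE) {
    const entry = { worker: new Worker(WORKER_FILE), executions: 0 };
    pool.push(entry);
    return entry;
  }
  return pool.reduce((a, b) => (a.executions <= b.executions ? a : b));
}

async function releaseWorker(entry: PooledWorker): Promise<void> {
  entry.executions += 1;
  if (entry.executions >= MAX_EXECUTIONS) {
    // Recycling: terminate the worker (freeing its Pyodide memory); the next
    // acquireWorker() call will create a fresh replacement.
    pool.splice(pool.indexOf(entry), 1);
    await entry.worker.terminate();
  }
}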

And I agree that migrating to real Python via Task-Runners would be the ideal long-term solution.
