OOMKill is hidden during docker-based next build, build appears successful but isn't. #67097

Open
henricook opened this issue Jun 21, 2024 · 3 comments · May be fixed by #67154
Labels
bug Issue was opened via the bug report template. Developer Experience Issues related to Next.js logs, Error overlay, etc.

Comments

@henricook

henricook commented Jun 21, 2024

Link to the code that reproduces this issue

https://github.com/henricook/next-build-silent-crash

To Reproduce

  • cd /path/to/next-build-silent-crash
  • npm i
  • docker build . --memory 128M --no-cache

Current vs. Expected behavior

The build crashes due to an OOMKill but exits with a success code (0).

As a result, the Docker build continues when it should fail.

Provide environment information

Operating System:
 Ubuntu 24.04 LTS

Node: 20.x

Next: 14.2.3

Which area(s) are affected? (Select all that apply)

Developer Experience

Which stage(s) are affected? (Select all that apply)

next build (local)

Additional context

I'm about to open an MR with a proposed fix for this.

Example 'successful failure' during docker build:

Step 5/6 : RUN NODE_OPTIONS="--max-old-space-size=64 --stack-trace-limit=100" npm run build
 ---> Running in 641eed592a9d

> next-build-silent-crash@0.1.0 build
> next build

Attention: Next.js now collects completely anonymous telemetry regarding usage.
This information is used to shape Next.js' roadmap and prioritize features.
You can learn more, including how to opt-out if you'd not like to participate in this anonymous program, by visiting the following URL:
https://nextjs.org/telemetry

  ▲ Next.js 14.2.4

   Creating an optimized production build ...

<--- Last few GCs --->

[31:0x70e542024680]      916 ms: Mark-Compact 61.2 (66.8) -> 59.8 (65.3) MB, 25.58 / 0.00 ms  (average mu = 0.349, current mu = 0.129) allocation failure; scavenge might not succeed
[31:0x70e542024680]      932 ms: Mark-Compact 62.2 (66.6) -> 61.3 (67.0) MB, 8.02 / 0.00 ms  (average mu = 0.396, current mu = 0.487) allocation failure; scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

Removing intermediate container 641eed592a9d
 ---> b49676438387
Step 6/6 : RUN echo "This shouldn't print out if memory was limited to 128M"
 ---> Running in 1eb21f3d9ce5
This shouldn't print out if memory was limited to 128M
Removing intermediate container 1eb21f3d9ce5
 ---> f445121d41d7
Successfully built f445121d41d7
henricook added the bug label Jun 21, 2024
github-actions bot added the Developer Experience label Jun 21, 2024
@henricook
Author

henricook commented Jun 21, 2024

I'm about to open an MR with a proposed fix for this.

Scratch that. I realised that the process.on('exit', ...) handler gets called for all exits, not just SIGKILLs. The code passed to the handler is 0 when an OOMKill happens, which has me stumped.
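To make the behaviour of the 'exit' handler concrete, here is a minimal sketch (not taken from the Next.js code base) that runs three tiny child Node processes and records what each handler sees. The helper name `runChild` is made up for illustration, and the SIGKILL case assumes a POSIX system, where the handler should get no chance to run at all.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run a snippet in a child Node process whose
// "exit" handler prints the code it receives.
function runChild(body) {
  const script =
    'process.on("exit", (code) => console.log("exit handler saw " + code));\n' + body;
  const result = spawnSync(process.execPath, ["-e", script], { encoding: "utf8" });
  return {
    status: result.status, // numeric exit code, or null if killed by a signal
    signal: result.signal, // signal name, or null if the process exited normally
    stdout: result.stdout.trim(),
  };
}

// 1. Normal completion: the handler fires with code 0.
const normal = runChild("");
// 2. Explicit failure: the handler fires with the code given to process.exit().
const failing = runChild("process.exit(3);");
// 3. SIGKILL (assumption: POSIX): the process is terminated without running handlers.
const killed = runChild('process.kill(process.pid, "SIGKILL");');

console.log({ normal, failing, killed });
```

This only shows that the handler fires on success as well as failure. It does not explain why the build in this issue reports 0.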

@deh-code

Hi, any updates?
I did some research online but couldn't find a standard way to intercept the heap out-of-memory error.

The best I came up with is a file-based approach, like the following:

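const fs = require("node:fs");
const path = require("node:path");

process.on("exit", (code) => {
    // If the server directory is missing from the build folder, assume something went wrong.
    // Only synchronous work can run here, hence existsSync.
    if (code === 0 && !fs.existsSync(path.resolve(".next/server/"))) {
        process.exit(1);
    }
    // Otherwise the process finishes exiting with the original code.
});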

Still, I don't know if this approach covers all cases, so I'd rather wait for maintainer feedback.
And even though this seems to work for my case, I don't know if performing file system operations while the process is running out of heap memory is a good idea.

Also, the fact that Node is ending its process with exit code 0 might be more of a Node.js issue than a Next.js one.

@henricook
Author

Thanks for your eyes @deh-code! As you might notice from my closed MR, I originally thought I'd solved it with just a process.on("exit", ...) handler before I realised that it also fires on success. Your twist would likely work for this specific example (although a 137 exit code might be more accurate if we're assuming this can only happen on SIGKILL), but I'm still left thinking that surely this must be accounted for by a language construct I don't know about. Or something somewhere is catching and disguising the error.

To that end maybe another example with just a basic node program getting OOMed and hiding the exit code in the same way would be useful.
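A basic reproduction along those lines might look like the sketch below. It is an assumption about how to set up the experiment, not a confirmed reproduction of the bug: it starts a child Node process with a deliberately small heap (the 32 MB value and the allocation loop are arbitrary choices) and reports how that child ended. The function name `runWithSmallHeap` is hypothetical.

```javascript
const { spawnSync } = require("node:child_process");

// Child script: keep allocating and holding references until the heap limit is hit.
const leak = "const hold = []; for (;;) hold.push(new Array(1e5).fill({}));";

// Hypothetical helper: run the leaking script under a small V8 old-space limit.
function runWithSmallHeap() {
  const result = spawnSync(
    process.execPath,
    ["--max-old-space-size=32", "-e", leak],
    { encoding: "utf8" }
  );
  return {
    status: result.status, // numeric exit code, or null if ended by a signal
    signal: result.signal, // signal name, or null
    sawHeapError: result.stderr.includes("heap out of memory"),
  };
}

const outcome = runWithSmallHeap();
console.log(outcome);
```

If a plain Node process like this one reports a non-zero status or a signal, that would suggest the 0 seen during next build comes from somewhere above the crashing process rather than from Node itself. Running the same script inside `docker build --memory 128M` would be the closer comparison.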

As a (rubbish) workaround, I'm currently just checking that this path exists after running next build in my Dockerfile.
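For anyone wanting the same stopgap, a post-build check could be sketched roughly as follows. This is not the exact check used above. It assumes the `.next/server` path from the earlier comment is a reliable marker of a completed build, and the function name `checkBuildOutput` is hypothetical. The demo runs against a throwaway temp directory so it can be tried without a real build.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Returns 0 if <projectDir>/.next/server exists, otherwise 1.
// Assumption: a missing server directory means the build did not finish.
function checkBuildOutput(projectDir) {
  const serverDir = path.join(projectDir, ".next", "server");
  return fs.existsSync(serverDir) ? 0 : 1;
}

// Demonstration against a temporary directory instead of a real project.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), "next-check-"));
const before = checkBuildOutput(tmp); // no build output yet
fs.mkdirSync(path.join(tmp, ".next", "server"), { recursive: true });
const after = checkBuildOutput(tmp); // simulated build output present
fs.rmSync(tmp, { recursive: true, force: true });

console.log({ before, after });
```

In a real project the script would end with something like `process.exitCode = checkBuildOutput(process.cwd())` and run in the same Dockerfile RUN step as the build, so that a missing directory fails the step.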
