Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QUERY] - EMR stuck at bootstrapping #24

Closed
awsjputro opened this issue Apr 11, 2023 · 2 comments
Closed

[QUERY] - EMR stuck at bootstrapping #24

awsjputro opened this issue Apr 11, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@awsjputro
Copy link

awsjputro commented Apr 11, 2023

Describe the bug
This may not be a bug because the root cause is not yet identified. This is more of a question in case this is a known problem/symptom.

I used the same exact package and version with the same code successfully a few months ago (Jan 2023) but I have been encountering this problem in the last few weeks. The EMR launch workflow failed with bootstrapping phase took an hour before it eventually failed with internal error. It is confirmed not the bootstrapping script, but most likely the next step after bootstrapping. I tried different EMR version (initially 6.6.0, tried 6.8.0-6.9.0) and region but gave same result.
Was there a similar behaviour encountered by anyone else?
Any suggestion on where/what to investigate?

To Reproduce
Steps to reproduce the behavior:
Deploy and launch the EMR with the workflow. Error message: "Status": {"State": "TERMINATING", "StateChangeReason": {"Message": "An internal error occurred.""

Expected behavior
Successful EMR launch.

Screenshots

@awsjputro awsjputro added the bug Something isn't working label Apr 11, 2023
@chamcca
Copy link
Contributor

chamcca commented Apr 12, 2023

The behavior you describe is common with bootstrap, steps, or configuration issues, but not necessarily something caused by the aws-emr-launch CDK library. This can sometimes occur if the bootstrap script is successful but inadvertently returns a non-zero exit code. Any non-zero exit code from the bootstrap script is considered a failure and the cluster will be terminated.

@awsjputro
Copy link
Author

awsjputro commented Apr 13, 2023

I have tried to ensure the bootstrap script return/exit with 0 (and confirmed in the output log) but the problem persists. I believe it is the next step right after that having the problem. But yeah I agree that it doesn't mean a problem related to aws-emr-launch package, so I will close this. Thank your for your advice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants