Command failed with exit code 10 on Glue Job #1176

@kev-dfs

Hi everyone,

I wrote a data-processing job in a Jupyter Notebook (SageMaker) using the awswrangler library. The code works perfectly in that environment, but when I try to run it on Glue it fails with the following error: Command failed with exit code 10. According to the Knowledge Center, this is a memory error. I then ran a memory profiler to check how much memory the process uses and found that it uses 25 GB in a "pandas.merge" call, because the DataFrames are very large (more than 10 GB each).
Next, I tried converting some columns to the "category" dtype to reduce memory use, but when the code runs the merge again, the categories are lost.
How can I improve this? Would it be better to rewrite everything as a Spark job?
I think someone must have had this problem before and been able to solve it.
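
For reference, here is a minimal sketch of the category loss described above, not the actual job. The column names and the tiny DataFrames are made up for illustration. As far as I understand, pandas only keeps a categorical merge key when both sides have exactly the same categorical dtype, so one possible workaround is to cast both key columns to a shared `CategoricalDtype` before merging:

```python
import pandas as pd

# Hypothetical toy frames standing in for the real (10 GB+) DataFrames.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [10, 20, 30]})

# Converting each side independently gives two different categorical dtypes
# (categories a,b,c on the left and b,c,d on the right).
left["key"] = left["key"].astype("category")
right["key"] = right["key"].astype("category")

# Because the categories differ, the merged key is no longer categorical.
lost = left.merge(right, on="key", how="inner")

# Build one dtype from the union of both category sets and apply it to both sides.
shared = pd.CategoricalDtype(
    left["key"].cat.categories.union(right["key"].cat.categories)
)
left["key"] = left["key"].astype(shared)
right["key"] = right["key"].astype(shared)

# With identical dtypes on both sides, the merged key stays categorical.
kept = left.merge(right, on="key", how="inner")

print(lost["key"].dtype)
print(kept["key"].dtype)
```

This only addresses the dtype of the key column. Whether it is enough to bring the merge under the Glue memory limit depends on the data, so it may still be necessary to merge in chunks or move to Spark.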

Please, I need some guidance.
Thank you.
