Peak Memory usage - PySpark 3 on Azure Synapse #40

jacwalte · 2022-08-02T17:07:10Z

Found an odd issue. We recently started running our jobs on Azure Synapse. On Azure HDInsight (HDI) we were able to record peakExecutionMemory, but on Azure Synapse all the values are 0.

We are using TaskMetrics to get the most information out of the run. In the generated CSV, every other column is populated, but the peakExecutionMemory values are all 0.
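For anyone who wants to confirm the same symptom on their own output, here is a minimal sketch that checks whether a column in a CSV export is zero in every row. Only the peakExecutionMemory column name comes from this report. The helper name `column_is_all_zero`, the other column names, and the sample rows are made up for illustration, and the real export may have a different layout.

```python
import csv
import io


def column_is_all_zero(csv_text, column="peakExecutionMemory"):
    """Return True if the CSV has at least one row and every row has 0 in `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [int(row[column]) for row in reader]
    # An empty export returns False so it is not mistaken for the all-zero symptom.
    return bool(values) and all(v == 0 for v in values)


# Hypothetical sample data: stageId and duration are illustrative column names only.
sample = """stageId,duration,peakExecutionMemory
0,120,0
0,95,0
1,210,0
"""

print(column_is_all_zero(sample))
```

Running the same check on a column that is known to be populated helps separate a genuinely missing metric from a parsing problem.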

Is this a known issue?

We are running Python 3.7, PySpark 3.2.1, and Scala 2.12, with spark-measure_2.12:0.18.jar.

jacwalte · 2022-08-02T22:03:38Z

It looks like this was caused by another issue. Peak memory is now being reported. Thanks!

LucaCanali · 2022-08-03T07:33:39Z

Good to know this works OK for you.
Cheers, Luca

jacwalte closed this as completed Aug 2, 2022
