Skip to content

Commit

Permalink
[SPARK-22043][PYTHON] Improves error message for show_profiles and du…
Browse files Browse the repository at this point in the history
…mp_profiles

## What changes were proposed in this pull request?

This PR proposes to improve error message from:

```
>>> sc.show_profiles()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/context.py", line 1000, in show_profiles
    self.profiler_collector.show_profiles()
AttributeError: 'NoneType' object has no attribute 'show_profiles'
>>> sc.dump_profiles("/tmp/abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/context.py", line 1005, in dump_profiles
    self.profiler_collector.dump_profiles(path)
AttributeError: 'NoneType' object has no attribute 'dump_profiles'
```

to

```
>>> sc.show_profiles()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/context.py", line 1003, in show_profiles
    raise RuntimeError("'spark.python.profile' configuration must be set "
RuntimeError: 'spark.python.profile' configuration must be set to 'true' to enable Python profile.
>>> sc.dump_profiles("/tmp/abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/context.py", line 1012, in dump_profiles
    raise RuntimeError("'spark.python.profile' configuration must be set "
RuntimeError: 'spark.python.profile' configuration must be set to 'true' to enable Python profile.
```

## How was this patch tested?

Unit tests added in `python/pyspark/tests.py` and manual tests.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19260 from HyukjinKwon/profile-errors.
  • Loading branch information
HyukjinKwon committed Sep 18, 2017
1 parent 6308c65 commit 7c72662
Show file tree
Hide file tree
Showing 2 changed files with 26 additions and 2 deletions.
12 changes: 10 additions & 2 deletions python/pyspark/context.py
Original file line number Diff line number Diff line change
Expand Up @@ -997,12 +997,20 @@ def runJob(self, rdd, partitionFunc, partitions=None, allowLocal=False):

def show_profiles(self):
""" Print the profile stats to stdout """
self.profiler_collector.show_profiles()
if self.profiler_collector is not None:
self.profiler_collector.show_profiles()
else:
raise RuntimeError("'spark.python.profile' configuration must be set "
"to 'true' to enable Python profile.")

def dump_profiles(self, path):
""" Dump the profile stats into directory `path`
"""
self.profiler_collector.dump_profiles(path)
if self.profiler_collector is not None:
self.profiler_collector.dump_profiles(path)
else:
raise RuntimeError("'spark.python.profile' configuration must be set "
"to 'true' to enable Python profile.")

def getConf(self):
conf = SparkConf()
Expand Down
16 changes: 16 additions & 0 deletions python/pyspark/tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -1296,6 +1296,22 @@ def heavy_foo(x):
rdd.foreach(heavy_foo)


class ProfilerTests2(unittest.TestCase):
def test_profiler_disabled(self):
sc = SparkContext(conf=SparkConf().set("spark.python.profile", "false"))
try:
self.assertRaisesRegexp(
RuntimeError,
"'spark.python.profile' configuration must be set",
lambda: sc.show_profiles())
self.assertRaisesRegexp(
RuntimeError,
"'spark.python.profile' configuration must be set",
lambda: sc.dump_profiles("/tmp/abc"))
finally:
sc.stop()


class InputFormatTests(ReusedPySparkTestCase):

@classmethod
Expand Down

0 comments on commit 7c72662

Please sign in to comment.