[SPARK-48650][PYTHON] Display correct call site from IPython Notebook
### What changes were proposed in this pull request?

This PR proposes to display correct call site information from IPython Notebook.

### Why are the changes needed?

We added `DataFrameQueryContext` to PySpark error messages in #45377, but it does not work well from IPython Notebook.
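The PR text does not spell out why the call site was wrong in a notebook. One plausible reading, offered here as an assumption, is that the earlier code reversed the stack and kept the outermost frames, which in a notebook belong to the kernel rather than to the user's cell. A minimal sketch of that difference, using only the standard library (the helper names `outermost_first` and `innermost_first` are hypothetical and not part of PySpark):

```python
import inspect
from typing import List


def outermost_first(depth: int) -> List[str]:
    """Mimic the pre-change ordering: reverse the stack, then keep `depth` frames."""
    stack = list(reversed(inspect.stack()))
    return [frame.function for frame in stack[:depth]]


def innermost_first(depth: int) -> List[str]:
    """Keep `depth` frames starting from the caller of this helper."""
    stack = inspect.stack()[1:]  # drop this helper's own frame
    return [frame.function for frame in stack[:depth]]


def user_function() -> None:
    # The outermost frame depends on the host process (script, test runner,
    # notebook kernel), so no fixed output is claimed for the first line.
    print("outermost first:", outermost_first(1))
    print("innermost first:", innermost_first(1))


if __name__ == "__main__":
    user_function()
```

Run as a plain script, the outermost frame is the script's own module-level frame. Under a host that wraps user code in its own frames, it is whatever frame that host started with.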

### Does this PR introduce _any_ user-facing change?

No API changes, but the user-facing error message from IPython Notebook will be improved:

**Before**
<img width="1124" alt="Screenshot 2024-06-18 at 5 15 56 PM" src="https://github.com/apache/spark/assets/44108233/3e3aee2c-5bb0-4858-b392-e845b7280d31">

**After**
<img width="1163" alt="Screenshot 2024-06-19 at 8 45 05 AM" src="https://github.com/apache/spark/assets/44108233/81741d15-cac9-41e7-815a-5d84f1176c73">

**NOTE:** This also works when a command is executed across multiple cells:

<img width="1175" alt="Screenshot 2024-06-19 at 8 42 29 AM" src="https://github.com/apache/spark/assets/44108233/d65fbf79-d621-4ae0-b220-2f7923cc3666">

### How was this patch tested?

Manually tested with IPython Notebook.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47009 from itholic/error_context_on_notebook.

Authored-by: Haejoon Lee <haejoon.lee@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
itholic authored and HyukjinKwon committed Jun 24, 2024
1 parent e972dae commit 88cc153
Showing 1 changed file with 23 additions and 2 deletions.
25 changes: 23 additions & 2 deletions python/pyspark/errors/utils.py
@@ -21,6 +21,7 @@
 import os
 import threading
 from typing import Any, Callable, Dict, Match, TypeVar, Type, Optional, TYPE_CHECKING
+import pyspark
 from pyspark.errors.error_classes import ERROR_CLASSES_MAP
 
 if TYPE_CHECKING:
@@ -164,9 +165,29 @@ def _capture_call_site(spark_session: "SparkSession", depth: int) -> str:
     The call site information is used to enhance error messages with the exact location
     in the user code that led to the error.
     """
-    stack = list(reversed(inspect.stack()))
+    # Filtering out PySpark code and keeping user code only
+    pyspark_root = os.path.dirname(pyspark.__file__)
+    stack = [
+        frame_info for frame_info in inspect.stack() if pyspark_root not in frame_info.filename
+    ]
+
     selected_frames = stack[:depth]
-    call_sites = [f"{frame.filename}:{frame.lineno}" for frame in selected_frames]
+
+    # We try import here since IPython is not a required dependency
+    try:
+        from IPython import get_ipython
+
+        ipython = get_ipython()
+    except ImportError:
+        ipython = None
+
+    # Identifying the cell is useful when the error is generated from IPython Notebook
+    if ipython:
+        call_sites = [
+            f"line {frame.lineno} in cell [{ipython.execution_count}]" for frame in selected_frames
+        ]
+    else:
+        call_sites = [f"{frame.filename}:{frame.lineno}" for frame in selected_frames]
     call_sites_str = "\n".join(call_sites)
 
     return call_sites_str
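
The filtering and formatting logic in this hunk can be tried without a Spark installation. The following is a minimal standalone sketch, not the PySpark implementation: `capture_call_site`, `current_cell_number`, `library_root`, and `cell_number` are hypothetical names introduced for illustration. The cell number is passed in explicitly so the notebook branch can be exercised from a plain interpreter.

```python
import inspect
from typing import Optional


def current_cell_number() -> Optional[int]:
    """Return the IPython execution count, or None outside IPython."""
    try:
        # IPython is optional, so the import is attempted lazily.
        from IPython import get_ipython
    except ImportError:
        return None
    shell = get_ipython()  # None when not running under IPython
    return shell.execution_count if shell is not None else None


def capture_call_site(
    depth: int, library_root: str, cell_number: Optional[int] = None
) -> str:
    """Format up to `depth` call sites, skipping files under `library_root`."""
    frames = [
        frame_info
        # This helper's own frame is skipped explicitly. The diff above does
        # not need this step because its helper lives inside the filtered
        # package.
        for frame_info in inspect.stack()[1:]
        if library_root not in frame_info.filename
    ]
    selected = frames[:depth]
    if cell_number is not None:
        sites = [f"line {f.lineno} in cell [{cell_number}]" for f in selected]
    else:
        sites = [f"{f.filename}:{f.lineno}" for f in selected]
    return "\n".join(sites)


if __name__ == "__main__":
    # "/path/to/library" is a placeholder root that matches no real frame here.
    print(capture_call_site(1, "/path/to/library", current_cell_number()))
    print(capture_call_site(1, "/path/to/library", cell_number=3))
```

As in the diff, the filter is a substring test on the filename, so a user file whose path happens to contain the library root would also be skipped.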