feat: add PDF export for Superset chart exports and introduce new PDF… #38677

Open
ozguryuksel wants to merge 3 commits into apache:master from ozguryuksel:dev

Conversation

@ozguryuksel

@ozguryuksel ozguryuksel commented Mar 16, 2026

User description

… field for email delivery

feat: add PDF export support for Superset charts and new PDF field for email sending

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

CodeAnt-AI Description

Add PDF export for charts and a new chart-data PDF report option

What Changed

  • Users can export charts as PDF from chart menus and the Explore export options; "Export to PDF" appears alongside CSV/XLSX and triggers a PDF download of chart data.
  • Scheduled reports can attach PDFs: existing screenshot-based PDFs remain, and a new "PDF NEW" report format attaches a PDF generated from chart data for chart reports.
  • Server now produces PDF files from chart data and from screenshots so exported PDFs and report attachments are returned with a PDF Content-Type and proper download headers.
  • Explore endpoints and export utilities accept pdf as an export type; export logs record PDF download actions and frontend menus/tests updated to cover PDF exports.
  • Alert/report UI hides the screenshot-width input when the new chart-data "PDF NEW" format is selected; unit tests added/updated for PDF behaviors.

Impact

✅ Export charts as PDF
✅ Attach chart-data PDF to scheduled reports
✅ Clearer PDF export options in chart menus

@github-actions github-actions bot added the api (Related to the REST API) label Mar 16, 2026
@dosubot dosubot bot added the alert-reports (Namespace | Anything related to the Alert & Reports feature), change:backend (Requires changing the backend), change:frontend (Requires changing the frontend), and viz:charts:export (Related to exporting charts) labels Mar 16, 2026
@codeant-ai-for-open-source codeant-ai-for-open-source bot added the size:XL (This PR changes 500-999 lines, ignoring generated files) label Mar 16, 2026
@codeant-ai-for-open-source
Contributor

Sequence Diagram

This PR adds PDF as a first-class chart export format in dashboard and Explore actions, and introduces a new PDF NEW report format for chart schedules. The new scheduled format generates the attachment from chart data instead of from a rendered screenshot.

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant PDFBuilder
    participant ReportScheduler
    participant EmailService

    User->>Frontend: Click Export to PDF
    Frontend->>Backend: Request chart export as pdf
    Backend->>PDFBuilder: Build PDF from chart data rows
    Backend-->>Frontend: Return downloadable PDF file

    User->>Frontend: Select PDF NEW for chart report
    Frontend->>ReportScheduler: Save schedule with PDF NEW format
    ReportScheduler->>Backend: Fetch chart data export as pdf
    Backend->>PDFBuilder: Build PDF from chart data rows
    ReportScheduler->>EmailService: Send email with PDF attachment

Generated by CodeAnt AI

@bito-code-review
Contributor

bito-code-review bot commented Mar 16, 2026

Code Review Agent Run #dfabaa

Actionable Suggestions - 0
Additional Suggestions - 4
  • superset/utils/pdf.py - 2
    • Missing translation for user-facing text · Line 197-197
      The string "No data available." appears in the generated PDF and should be translatable for internationalization. Use the translation function as done elsewhere in the codebase.
    • Try-except within loop · Line 90-94
      Move the try-except block outside the loop to improve performance. Collect font paths and attempt to load them outside the loop, or use a different approach.
      Code suggestion
       @@ -85,10 +85,11 @@
            font_candidates = [
                "/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf",
                "/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf",
                "/usr/share/fonts/dejavu/DejaVuSansMono.ttf",
            ]
      -    for font_path in font_candidates:
      -        try:
      -            return ImageFont.truetype(font_path, 15)
      -        except OSError:
      -            continue
      +    for font_path in font_candidates:
      +        if os.path.exists(font_path):
      +            try:
      +                return ImageFont.truetype(font_path, 15)
      +            except OSError:
      +                continue
            return ImageFont.load_default()
  • superset/commands/report/execute.py - 1
    • Inconsistent timeout exception handling · Line 486-486
      The timeout handling for PDF data generation raises ReportSchedulePdfFailedError with a timeout message, whereas CSV generation raises a specific ReportScheduleCsvTimeout exception. This inconsistency may lead to incorrect HTTP status codes (408 for timeout vs default for failure) and deviates from the established pattern for other data formats like CSV and screenshots.
  • superset-frontend/src/features/reports/ReportModal/index.tsx - 1
    • Inconsistent label format · Line 280-280
      The label for the new PDF option uses 'PDF NEW' which doesn't match the parentheses style of other labels in this list. Consider updating it to 'PDF (New) attached in email' for consistency.
      Code suggestion
       @@ -279,4 +279,4 @@
      -            {
      -              label: t('PDF NEW attached in email'),
      -              value: NotificationFormats.PDFNew,
      -            },
      +            {
      +              label: t('PDF (New) attached in email'),
      +              value: NotificationFormats.PDFNew,
      +            },
Review Details
  • Files reviewed - 27 · Commit Range: 7860119..7860119
    • docker/docker-bootstrap.sh
    • superset-frontend/src/dashboard/components/SliceHeader/index.tsx
    • superset-frontend/src/dashboard/components/SliceHeaderControls/SliceHeaderControls.test.tsx
    • superset-frontend/src/dashboard/components/SliceHeaderControls/index.tsx
    • superset-frontend/src/dashboard/components/SliceHeaderControls/types.ts
    • superset-frontend/src/dashboard/components/gridComponents/Chart/Chart.test.tsx
    • superset-frontend/src/dashboard/components/gridComponents/Chart/Chart.tsx
    • superset-frontend/src/dashboard/types.ts
    • superset-frontend/src/explore/components/useExploreAdditionalActionsMenu/index.tsx
    • superset-frontend/src/explore/exploreUtils/exportChart.test.ts
    • superset-frontend/src/explore/exploreUtils/getExploreUrl.test.ts
    • superset-frontend/src/explore/exploreUtils/getLegacyEndpointType.test.ts
    • superset-frontend/src/explore/exploreUtils/getURIDirectory.test.ts
    • superset-frontend/src/explore/exploreUtils/index.ts
    • superset-frontend/src/features/alerts/AlertReportModal.test.tsx
    • superset-frontend/src/features/alerts/AlertReportModal.tsx
    • superset-frontend/src/features/reports/ReportModal/index.tsx
    • superset-frontend/src/features/reports/types.ts
    • superset-frontend/src/logger/LogUtils.ts
    • superset/charts/data/api.py
    • superset/commands/report/execute.py
    • superset/common/chart_data.py
    • superset/reports/models.py
    • superset/utils/pdf.py
    • superset/views/core.py
    • tests/unit_tests/commands/report/execute_test.py
    • tests/unit_tests/utils/pdf_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • Eslint (Linter) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset. You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help


Comment on lines +470 to +502
    chart_data = get_chart_csv_data(chart_url=url, auth_cookies=auth_cookies)
    elapsed_seconds = (datetime.utcnow() - start_time).total_seconds()
    logger.info(
        "Chart data PDF generation from %s as user %s took %.2fs - execution_id: %s",  # noqa: E501
        url,
        username,
        elapsed_seconds,
        self._execution_id,
    )
except SoftTimeLimitExceeded as ex:
    elapsed_seconds = (datetime.utcnow() - start_time).total_seconds()
    logger.warning(
        "Chart data PDF generation timeout after %.2fs - execution_id: %s",
        elapsed_seconds,
        self._execution_id,
    )
    raise ReportSchedulePdfFailedError(
        "A timeout occurred while generating a pdf."
    ) from ex
except Exception as ex:
    elapsed_seconds = (datetime.utcnow() - start_time).total_seconds()
    logger.error(
        "Chart data PDF generation failed after %.2fs - execution_id: %s",
        elapsed_seconds,
        self._execution_id,
    )
    raise ReportSchedulePdfFailedError(
        f"Failed generating pdf {str(ex)}"
    ) from ex

if not chart_data:
    raise ReportSchedulePdfFailedError()
return chart_data

Suggestion: The chart-data fetch result is returned as PDF bytes without validating the payload format. For multi-query charts, the chart-data API returns a ZIP, and this code will attach ZIP bytes as a .pdf, producing corrupted attachments and runtime integration failures in downstream consumers. [possible bug]

Severity Level: Critical 🚨
- ❌ Multi-query PDF_NEW emails can attach invalid PDF files.
- ❌ Webhook uploads mislabeled ZIP bytes as application/pdf.
- ⚠️ Report run appears successful despite bad attachment.
Suggested change
    chart_data = get_chart_csv_data(chart_url=url, auth_cookies=auth_cookies)
    elapsed_seconds = (datetime.utcnow() - start_time).total_seconds()
    logger.info(
        "Chart data PDF generation from %s as user %s took %.2fs - execution_id: %s",  # noqa: E501
        url,
        username,
        elapsed_seconds,
        self._execution_id,
    )
...
if not chart_data or not chart_data.startswith(b"%PDF-"):
    raise ReportSchedulePdfFailedError(
        "Chart data export did not return a valid PDF payload."
    )
return chart_data
Steps of Reproduction ✅
1. Trigger scheduled `PDF_NEW` chart report execution via `reports.execute`
(`superset/tasks/scheduler.py:57-75`) so code reaches `_get_chart_data_pdf()`
(`execute.py:446-502`).

2. `_get_chart_data_pdf()` requests chart-data export URL with PDF format
(`execute.py:457`) and fetches raw bytes using `get_chart_csv_data()`
(`superset/utils/csv.py:15-29`), which does no content-type validation.

3. For charts producing multiple query results, chart-data API returns a ZIP response in
`_send_chart_response()` (`superset/charts/data/api.py:71-88`) even when `result_format ==
PDF`.

4. Current code only checks non-empty bytes (`execute.py:500-502`) and passes ZIP bytes as
`NotificationContent.pdf`; email/webhook attach them as `.pdf`
(`superset/reports/notifications/email.py:27-35`, `.../webhook.py:84`), yielding invalid
PDF attachments.
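The payload check suggested in this comment can be sketched in isolation. This is a minimal illustration rather than the PR's code: `classify_export_payload` and `require_pdf` are hypothetical helpers, and the sketch relies only on the fact that PDF files begin with `%PDF-` while ZIP archives begin with a `PK` signature.

```python
def classify_export_payload(payload: bytes) -> str:
    """Return 'pdf', 'zip', or 'unknown' based on the leading magic bytes."""
    if payload.startswith(b"%PDF-"):
        return "pdf"
    # ZIP signatures: local file header, empty archive, spanned archive.
    if payload.startswith((b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")):
        return "zip"
    return "unknown"


def require_pdf(payload: bytes) -> bytes:
    """Return the payload unchanged, or raise ValueError if it is not a PDF."""
    kind = classify_export_payload(payload)
    if kind != "pdf":
        raise ValueError(f"expected a PDF payload, got {kind}")
    return payload
```

In the report code the `ValueError` would presumably be replaced by `ReportSchedulePdfFailedError`, so that a ZIP returned for a multi-query chart fails the run instead of being attached as a `.pdf`.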
**Path:** superset/commands/report/execute.py
**Line:** 470:502

Comment on lines +684 to +685
else:
    error_text = "PDF NEW is only supported for chart reports"

Suggestion: When PDF_NEW is configured for a dashboard, the code only sets error_text and returns normal notification content instead of raising an exception. This makes the scheduler mark the run as successful even though report generation is invalid, causing silent misreporting of execution state. [logic error]

Severity Level: Major ⚠️
- ❌ Invalid dashboard PDF_NEW runs recorded as successful.
- ⚠️ Monitoring misses real report configuration failures.
- ⚠️ Operators receive error text, not failure state.
Suggested change
else:
    raise ReportSchedulePdfFailedError(
        "PDF NEW is only supported for chart reports"
    )
Steps of Reproduction ✅
1. Create or update a dashboard report through API `POST/PUT /api/v1/report`
(`superset/reports/api.py:315-395`) with `report_format=PDF_NEW`; schemas accept any enum
value (`superset/reports/schemas.py:228-231`, `:367-370`) without chart/dashboard
compatibility validation.

2. Scheduler executes it through `reports.execute` (`superset/tasks/scheduler.py:57-75`)
and `send()` (`execute.py:792-799`) calls `_get_notification_content()`
(`execute.py:653+`).

3. In `PDF_NEW` branch for dashboard (`execute.py:679-685`), code sets `error_text`
instead of raising.

4. Function returns `NotificationContent` with text, `send()` completes, and state machine
marks success (`execute.py:905-907` or `:1050-1051`) despite invalid report configuration.
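The difference between setting an error string and raising can be shown with a small stand-in. Everything below is hypothetical: `ReportFormatError` stands in for `ReportSchedulePdfFailedError`, and `run_report` only mimics a scheduler that maps exceptions to a failed state.

```python
class ReportFormatError(Exception):
    """Hypothetical stand-in for ReportSchedulePdfFailedError."""


def build_content(report_format: str, target: str) -> dict:
    """Return notification content, or raise for an unsupported combination."""
    if report_format == "PDF_NEW" and target != "chart":
        # Raising lets the caller record a failure instead of sending text.
        raise ReportFormatError("PDF NEW is only supported for chart reports")
    return {"format": report_format, "target": target}


def run_report(report_format: str, target: str) -> str:
    """Mimic a scheduler that turns exceptions into an error state."""
    try:
        build_content(report_format, target)
    except ReportFormatError:
        return "error"
    return "success"
```

With this shape, `run_report("PDF_NEW", "dashboard")` ends in the error state, whereas returning content with an `error_text` field would have been indistinguishable from a normal run.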
**Path:** superset/commands/report/execute.py
**Line:** 684:685

Comment on lines +206 to +208
table_width = sum(column_widths_px)
page_width = max(DEFAULT_PAGE_WIDTH, (PAGE_MARGIN * 2) + table_width)
page_height = DEFAULT_PAGE_HEIGHT

Suggestion: The computed page width grows directly with column count and can become extremely large, causing high memory allocation when creating PIL images and potentially crashing workers. Add an upper bound to the allowed page width and fail with a controlled error when input exceeds that limit. [possible bug]

Severity Level: Critical 🚨
- ❌ Wide chart PDFs can exhaust memory.
- ❌ Report workers may crash generating PDF_NEW attachments.
Suggested change
table_width = sum(column_widths_px)
required_page_width = (PAGE_MARGIN * 2) + table_width
max_page_width = DEFAULT_PAGE_WIDTH * 4
if required_page_width > max_page_width:
    raise ValueError("Too many columns to render safely as PDF")
page_width = max(DEFAULT_PAGE_WIDTH, required_page_width)
page_height = DEFAULT_PAGE_HEIGHT
Steps of Reproduction ✅
1. Trigger chart PDF export through `ChartDataRestApi` (`superset/charts/data/api.py:74`,
PDF handling at `:451` and `:466`) using a result set with many columns.

2. In `build_pdf_from_chart_data()` (`superset/utils/pdf.py:185`), each column gets pixel
width (`:202-205`), then all widths are summed (`:206`) and directly used for `page_width`
(`:207`) without limit.

3. PIL allocates the full canvas with `Image.new("RGB", (page_width, page_height),
"white")` (`superset/utils/pdf.py:221`), so very wide data creates very large memory
allocations.

4. The same path is used by scheduled chart-data PDFs
(`superset/commands/report/execute.py:446-470` calls chart data URL with
`result_format=PDF` at `:457`), so worker processes can OOM and fail requests/reports.
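A rough sketch of the bounded-width idea, with the memory cost made explicit. The constant values here are assumptions chosen for illustration and are not taken from `superset/utils/pdf.py`; `rgb_canvas_bytes` is a hypothetical helper that estimates the raw pixel buffer of an RGB canvas at 3 bytes per pixel.

```python
DEFAULT_PAGE_WIDTH = 1240  # assumed value, for illustration only
PAGE_MARGIN = 40           # assumed value, for illustration only
MAX_PAGE_WIDTH = DEFAULT_PAGE_WIDTH * 4


def bounded_page_width(column_widths_px: list[int]) -> int:
    """Return the page width, refusing tables wider than MAX_PAGE_WIDTH."""
    required = (PAGE_MARGIN * 2) + sum(column_widths_px)
    if required > MAX_PAGE_WIDTH:
        raise ValueError("Too many columns to render safely as PDF")
    return max(DEFAULT_PAGE_WIDTH, required)


def rgb_canvas_bytes(width: int, height: int) -> int:
    """Estimate the raw buffer size of an RGB canvas (3 bytes per pixel)."""
    return width * height * 3
```

Because the buffer grows linearly with width for a fixed height, capping the width also caps the allocation per page.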
**Path:** superset/utils/pdf.py
**Line:** 206:208

Comment on lines +236 to +246
@staticmethod
def _generate_pdf(viz_obj: BaseViz) -> FlaskResponse:
    payload = viz_obj.get_df_payload()
    df = payload.get("df")
    records = df.to_dict("records") if df is not None else []
    pdf_data = build_pdf_from_chart_data(records)
    return Response(
        pdf_data,
        headers=generate_download_headers("pdf"),
        mimetype="application/pdf",
    )

Suggestion: The new PDF export path ignores query execution errors and still returns a successful PDF (often with empty data) when get_df_payload reports failures. This masks real query errors and breaks the API contract compared with other response types that return a 400 payload on failure. Check viz_obj.has_error(payload) before building the PDF and return json_error_response when errors are present. [logic error]

Severity Level: Critical 🚨
- ❌ Explore PDF export hides query failures as success.
- ⚠️ Clients miss expected 400 error payload contract.
- ⚠️ Troubleshooting broken chart queries becomes significantly harder.
Suggested change
@staticmethod
def _generate_pdf(viz_obj: BaseViz) -> FlaskResponse:
    payload = viz_obj.get_df_payload()
    if viz_obj.has_error(payload):
        return json_error_response(payload=payload, status=400)
    df = payload.get("df")
    records = df.to_dict("records") if df is not None else []
    pdf_data = build_pdf_from_chart_data(records)
    return Response(
        pdf_data,
        headers=generate_download_headers("pdf"),
        mimetype="application/pdf",
    )
Steps of Reproduction ✅
1. In Explore legacy export flow, frontend sets PDF downloads to
`/superset/explore_json/?pdf=true` (verified in
`superset-frontend/src/explore/exploreUtils/index.ts:275-277`, and tests at
`.../exportChart.test.ts:40`).

2. Request reaches `Superset.explore_json()` at `superset/views/core.py:322-337`, which
recognizes `ChartDataResultFormat.PDF` and routes to `generate_json()`
(`core.py:192-205`), then `_generate_pdf()` (`core.py:237-246`).

3. Trigger a chart query failure (realistic path: bad query object/invalid column), where
`viz_obj.get_df_payload()` sets `status=FAILED`, populates `errors`, and leaves `df=None`
(`superset/viz.py:95-117`, payload fields at `viz.py:626-637`).

4. `_generate_pdf()` does not check `viz_obj.has_error(payload)` and still builds a PDF
from empty rows; `build_pdf_from_chart_data()` converts empty data into `"No data
available."` content (`superset/utils/pdf.py:195-197`) and returns HTTP 200.

5. This differs from existing error contract in same view: `get_raw_results()` returns
`json_error_response(..., 400)` when `has_error(payload)` is true
(`superset/views/core.py:170-173`).
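The contract described in these steps (a 400 with an error body on query failure, a PDF only on success) can be sketched without Flask. This is a framework-free illustration under assumptions: `export_pdf`, the payload keys `errors`, `status`, and `rows`, and the injected `build_pdf` callable are all hypothetical and do not mirror Superset's actual payload shape.

```python
import json
from typing import Callable


def export_pdf(
    payload: dict, build_pdf: Callable[[list], bytes]
) -> tuple[int, str, bytes]:
    """Return (status, content type, body), checking for errors first."""
    if payload.get("errors") or payload.get("status") == "failed":
        # Surface the failure instead of rendering an empty PDF.
        body = json.dumps({"errors": payload.get("errors", [])}).encode()
        return 400, "application/json", body
    rows = payload.get("rows") or []
    return 200, "application/pdf", build_pdf(rows)
```

The point of the ordering is that the PDF builder is never reached with the empty rows of a failed query, so "No data available." can only mean the query genuinely returned nothing.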
**Path:** superset/views/core.py
**Line:** 236:246

@sadpandajoe
Copy link
Copy Markdown
Member

@ozguryuksel can you fill out the summary in the PR?

]
table_width = sum(column_widths_px)
page_width = _calculate_page_width(table_width)
page_height = DEFAULT_PAGE_HEIGHT

Suggestion: The calculated page width is capped, but the column widths are not adjusted when the table is wider than that cap. This causes the right side of wide tables to be drawn outside the canvas and silently clipped in the exported PDF. After calculating page_width, scale or rebalance column_widths_px to fit within the printable width. [logic error]

Severity Level: Major ⚠️
- ❌ Wide-table PDF exports clip rightmost chart data columns.
- ⚠️ Explore `/explore_json` PDF downloads can omit exported columns.
- ⚠️ Chart-data API PDF exports risk truncated table content.
Suggested change
page_height = DEFAULT_PAGE_HEIGHT
available_table_width = page_width - (PAGE_MARGIN * 2)
if table_width > available_table_width and table_width > 0:
    scale = available_table_width / table_width
    min_cell_width = (CELL_PADDING_X * 2) + char_width
    column_widths_px = [
        max(int(width * scale), min_cell_width) for width in column_widths_px
    ]
    table_width = sum(column_widths_px)
Steps of Reproduction ✅
1. Run Superset with this PR code and create or open a chart whose underlying data has
many columns and/or extremely long text values (e.g. wide string columns), so that the
rendered table is very wide.

2. From the chart's Explore view, trigger a data PDF export via the `/explore_json/`
endpoint by requesting `ChartDataResultFormat.PDF` (handled in `Superset.explore_json` at
`superset/views/core.py:50-71` and passed to `Superset.generate_json` at
`superset/views/core.py:192-205`).

3. In `Superset._generate_pdf` at `superset/views/core.py:236-242`, the code calls
`build_pdf_from_chart_data(records)` from `superset/utils/pdf.py` with the chart's records
(a list of dict rows).

4. Inside `build_pdf_from_chart_data` at `superset/utils/pdf.py:247-260`,
`column_widths_px` are summed into `table_width`, then `page_width =
_calculate_page_width(table_width)` caps the canvas width at `MAX_PAGE_WIDTH = 4096`
(`superset/utils/pdf.py:39-41`), but the individual `column_widths_px` are not scaled
down; when `table_width` exceeds `page_width - 2 * PAGE_MARGIN`, the rightmost columns are
drawn beyond the image width and are silently clipped in the resulting PDF (only left
columns are visible).
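The proportional scaling proposed here can be tried on plain numbers. This is a sketch under assumptions: `fit_column_widths` is a hypothetical standalone function, and the minimum cell width is passed in rather than derived from `CELL_PADDING_X` and `char_width`.

```python
def fit_column_widths(
    column_widths_px: list[int], available_width: int, min_cell_width: int
) -> list[int]:
    """Scale column widths down proportionally to fit available_width.

    Each scaled width is floored at min_cell_width, so the result can
    still exceed available_width when many columns hit the floor.
    """
    total = sum(column_widths_px)
    if total <= available_width or total <= 0:
        return list(column_widths_px)
    scale = available_width / total
    return [max(int(width * scale), min_cell_width) for width in column_widths_px]
```

For example, `fit_column_widths([100, 200], 150, 10)` halves both columns, while `fit_column_widths([1000, 10], 505, 20)` returns widths summing to 520 because the narrow column is held at the floor. A caller that needs a hard guarantee would have to re-check the sum after scaling.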
**Path:** superset/utils/pdf.py
**Line:** 270:270

@netlify

netlify bot commented Apr 2, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 411ae31
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69ce64eb8a3a9200080acf40
😎 Deploy Preview https://deploy-preview-38677--superset-docs-preview.netlify.app

@bito-code-review
Contributor

bito-code-review bot commented Apr 2, 2026

Code Review Agent Run #cae939

Actionable Suggestions - 0
Review Details
  • Files reviewed - 3 · Commit Range: 7860119..411ae31
    • superset/charts/data/api.py
    • superset/utils/pdf.py
    • tests/unit_tests/utils/pdf_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Comment on lines +141 to +149
font_candidates = [
    "/app/.venv/lib/python3.11/site-packages/matplotlib/mpl-data/fonts/ttf/DejaVuSansMono.ttf",
    "/app/.venv/lib/python3.11/site-packages/matplotlib/mpl-data/fonts/ttf/DejaVuSans.ttf",
    "/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf",
    "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
    "/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf",
    "/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf",
    "/usr/share/fonts/dejavu/DejaVuSansMono.ttf",
    "/usr/share/fonts/dejavu/DejaVuSans.ttf",

Suggestion: The matplotlib font paths are hardcoded to a Python 3.11 site-packages directory, so on supported runtimes like Python 3.12 those candidates are always invalid. In environments without system DejaVu/Liberation fonts, this forces a fallback to the default bitmap font and breaks Unicode rendering in exported PDFs. Build the matplotlib font path dynamically from the active interpreter's site-packages path. [logic error]

Severity Level: Major ⚠️
- ⚠️ ChartDataRestApi PDF exports drop non-ASCII characters.
- ⚠️ Scheduled chart-data PDF email reports misrender localized text.
Suggested change
import sysconfig
purelib = sysconfig.get_paths().get("purelib", "")
font_candidates = [
    f"{purelib}/matplotlib/mpl-data/fonts/ttf/DejaVuSansMono.ttf" if purelib else "",
    f"{purelib}/matplotlib/mpl-data/fonts/ttf/DejaVuSans.ttf" if purelib else "",
    "/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf",
    "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
    "/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf",
    "/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf",
    "/usr/share/fonts/dejavu/DejaVuSansMono.ttf",
    "/usr/share/fonts/dejavu/DejaVuSans.ttf",
]
font_candidates = [path for path in font_candidates if path]
Steps of Reproduction ✅
1. Deploy Superset from this PR into a virtualenv whose Python version and site-packages
path do NOT match `/app/.venv/lib/python3.11/site-packages` and where DejaVu/Liberation
fonts are installed only via `matplotlib` under the active interpreter's site-packages
(not under `/usr/share/fonts/...`). In this environment `_load_table_font` at
`superset/utils/pdf.py:137-21` uses `font_candidates` starting with the hardcoded
`/app/.venv/lib/python3.11/...` paths (lines 141–149), which do not exist, and the
`/usr/share/...` fallbacks also do not exist.

2. In the same deployment, create or open a chart whose column labels or data include
non-ASCII characters (for example Turkish glyphs matching `TURKISH_GLYPH_PROBE =
"ĞÜŞİÖÇğüşıöç"` defined in `superset/utils/pdf.py:46`) and ensure the chart is saved so it
has a stored query context.

3. Trigger a chart-data PDF export by calling the Chart data API endpoint
`ChartDataRestApi.get_data` (exposed at `/<int:pk>/data/` in
`superset/charts/data/api.py:32-47`) with `format=pdf`, e.g. `GET
/api/v1/chart/<chart_id>/data/?format=pdf&type=full`. Inside `get_data`, when
`is_pdf_format` is true, it calls `build_pdf_from_chart_data(_normalize_pdf_rows(data))`
at `superset/charts/data/api.py:16-23 (offset 440 block)`, which in turn calls
`_load_table_font()` at `superset/utils/pdf.py:292`.

4. Because none of the hardcoded `font_candidates` paths (lines 141–149 in
`superset/utils/pdf.py`) exist in this environment, `_load_table_font` falls through the
loop and returns `ImageFont.load_default()`, a bitmap ASCII font. The generated PDF
returned by `ChartDataRestApi.get_data` with `mimetype="application/pdf"` at
`superset/charts/data/api.py:22-27 (offset 440 block)` will render non-ASCII characters
(e.g., Turkish glyphs) as `'?'`, silently degrading Unicode rendering in chart-data PDF
exports. The same degraded PDFs will be attached to scheduled chart-data PDF reports,
since `_get_chart_data_pdf` in `superset/commands/report/execute.py:17-42 (offset 430
block)` ultimately fetches the same PDF bytes from the chart data endpoint.
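Building the matplotlib font paths from the running interpreter, as this comment proposes, might look like the following. It is a sketch only: `matplotlib_font_candidates` and `first_existing` are hypothetical helpers, and the `mpl-data/fonts/ttf` layout is taken from the paths quoted in the PR rather than verified against a particular matplotlib release.

```python
import os
import sysconfig


def matplotlib_font_candidates(
    names: tuple[str, ...] = ("DejaVuSansMono.ttf", "DejaVuSans.ttf"),
) -> list[str]:
    """Build font paths under the active interpreter's site-packages."""
    purelib = sysconfig.get_paths().get("purelib", "")
    if not purelib:
        return []
    font_dir = os.path.join(purelib, "matplotlib", "mpl-data", "fonts", "ttf")
    return [os.path.join(font_dir, name) for name in names]


def first_existing(paths: list[str]):
    """Return the first path that exists on disk, or None."""
    for path in paths:
        if os.path.exists(path):
            return path
    return None
```

The system paths under `/usr/share/fonts` would be appended to the list returned by `matplotlib_font_candidates()` before calling `first_existing`, keeping the default bitmap font as the last resort.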
**Path:** superset/utils/pdf.py
**Line:** 141:149

@bito-code-review
Contributor

bito-code-review bot commented Apr 3, 2026

Code Review Agent Run #fd5368

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 411ae31..a341ec5
    • superset/utils/pdf.py
    • tests/unit_tests/utils/pdf_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Labels

  • alert-reports (Namespace | Anything related to the Alert & Reports feature)
  • api (Related to the REST API)
  • change:backend (Requires changing the backend)
  • change:frontend (Requires changing the frontend)
  • size/XL
  • size:XL (This PR changes 500-999 lines, ignoring generated files)
  • viz:charts:export (Related to exporting charts)

2 participants