Introduce print preview extraction #22612

darkdh · 2024-03-14T23:40:29Z

In a nutshell:
This PR makes AIChatPanel a printing::mojom::PrintPreviewUI so that it can initiate print preview. We then compose the print preview result into a PDF, convert each page of the PDF into an image, run OCR on each image, and finally concatenate the OCR results. This feature will initially be rolled out only on docs.google.com; we will introduce it as a general fallback in a future PR.

Since two PrintPreviewUI implementations share printing::mojom::PrintRenderFrame, we have to deal with:

- PrintPreviewUI ID and print preview request ID conflicts.
- PrintRenderFrame being double bound if AIChatUI is doing print preview extraction while the print dialog is open.
- The print dialog showing up when AIChatUI initiates print preview extraction, because PrintManagerHost::RequestPrintPreview calls PrintPreviewDialogController::PrintPreview.

Unlike the print dialog, AIChatPanel disconnects PrintRenderFrame when print preview is done or has failed.
We also introduce a new printing service (PdfToBitmapConverter) to convert the selected PDF page into an SkBitmap. Since large context will be truncated, we short-circuit the per-page OCR process once the context limit or the page limit is reached.

Submitter Checklist:

I confirm that no security/privacy review is needed and no other type of reviews are needed, or that I have requested them
There is a ticket for my issue
Used Github auto-closing keywords in the PR description above
Wrote a good PR/commit description
Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
Checked the PR locally:
- npm run test -- brave_browser_tests, npm run test -- brave_unit_tests
- npm run presubmit, npm run gn_check, npm run tslint
Ran git rebase master (if needed)

Reviewer Checklist:

A security review is not needed, or a link to one is included in the PR description
New files have MPL-2.0 license header
Adequate test coverage exists to prevent regressions
Major classes, functions and non-trivial code blocks are well-commented
Changes in component dependencies are properly reflected in gn
Code follows the style guide
Test plan is specified in PR before merging

After-merge Checklist:

The associated issue milestone is set to the smallest version that the changes have landed on
All relevant documentation has been updated, for instance:

Test Plan: (Windows and macOS only)

Regression test on previous Google Docs support

Open a Google Doc with only 1 page of content and summarize it
Summary should be relevant

Test on full page Google Docs support

Open a Google Doc with multiple pages of content and summarize it
Summary should be relevant

Test on page limit (20)

Open a Google Doc with 19 blank pages and 2 pages with completely different content
Summarize the page
Summary should be relevant only to the 20th page, not the 21st page

Compatibility with print dialog

Open a Google Doc and print it
While the print dialog is open, open the Leo panel and summarize the page
Summary should show up
Close the print dialog and trigger print again
Print dialog should still show up with print preview available

github-actions · 2024-03-14T23:55:29Z

browser/ui/webui/ai_chat/ai_chat_ui_page_handler.cc

+      pdf_to_bitmap_converter_.BindNewPipeAndPassReceiver());
+  pdf_to_bitmap_converter_.set_disconnect_handler(
+      base::BindOnce(&AIChatUIPageHandler::BitmapConverterDisconnected,
+                     base::Unretained(this)));


_{reported by reviewdog 🐶}
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated

Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml

Cc @thypon @goodov @iefremov

We own the mojo remote
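
The ownership argument in this reply can be illustrated with a toy model in plain C++. `ToyRemote` and `Handler` below are hypothetical stand-ins, not Chromium or Mojo classes, and the lambda capturing a raw `this` only plays the role of base::Unretained(this). The point of the sketch is that the callback holder is a member of the object the raw pointer refers to, so it is destroyed together with that object and cannot invoke the callback afterwards.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a callback owner such as a mojo remote.
class ToyRemote {
 public:
  void set_disconnect_handler(std::function<void()> handler) {
    assert(handler);  // The handler must be callable.
    handler_ = std::move(handler);
  }

  // Simulates the other end of the pipe going away.
  void SimulateDisconnect() {
    if (handler_) {
      handler_();
    }
  }

 private:
  std::function<void()> handler_;
};

class Handler {
 public:
  Handler() {
    // Capturing a raw `this` is the toy analogue of base::Unretained(this).
    remote_.set_disconnect_handler([this] { ++disconnect_count_; });
  }

  ToyRemote& remote() { return remote_; }
  int disconnect_count() const { return disconnect_count_; }

 private:
  int disconnect_count_ = 0;
  // Declared last, so it is destroyed first: the stored callback is gone
  // before any other member of Handler is torn down.
  ToyRemote remote_;
};
```

If the callback were instead handed to an object that can outlive `Handler`, the raw pointer could dangle, which is the case the semgrep rule is meant to flag.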

github-actions · 2024-03-14T23:55:29Z

browser/ui/webui/ai_chat/ai_chat_ui_page_handler.cc

+  pdf_to_bitmap_converter_->GetBitmap(
+      std::move(pdf_region.region),
+      base::BindOnce(&AIChatUIPageHandler::OnGetBitmaps,
+                     base::Unretained(this)));


_{reported by reviewdog 🐶}
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated

Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml

Cc @thypon @goodov @iefremov

We own the mojo remote

goodov · 2024-03-18T07:04:01Z

services/printing/pdf_to_bitmap_converter.cc

+    const SkImageInfo info =
+        SkImageInfo::Make(size.width(), size.height(), kBGRA_8888_SkColorType,
+                          kOpaque_SkAlphaType);
+    if (!bitmap.tryAllocPixels(info, info.minRowBytes())) {


what's the actual memory footprint of these bitmaps? A single A4 page @ 300dpi might take ~30MB in 8-bit RGBA. I'm afraid right now this will eat a lot of RAM, just think of 100+ page documents.

Please consider a queued conversion here, you're calling the text-recognition API on per-page basis anyway.

Ah yes, that is a good point. We currently impose a page limit of 20 on the text recognition process, so generating bitmaps beyond that limit would waste resources. I will add a max_pages parameter to the API and call it with the same limit.
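
A minimal sketch of the queued, capped conversion discussed here, in plain C++. It is a synchronous simplification, not the code from this PR: `ToyBitmap`, `ProcessPagesQueued`, and the two callbacks are hypothetical names, and the real flow is asynchronous over mojo. It only shows the memory argument: each page is rendered, consumed, and released before the next one is touched, and nothing is rendered past `max_pages`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a rendered page; this is not Skia's SkBitmap.
struct ToyBitmap {
  std::vector<unsigned char> pixels;
};

// Renders and consumes pages one at a time, stopping at max_pages, so at most
// one bitmap is alive at any point. Returns the number of pages processed.
std::size_t ProcessPagesQueued(
    std::size_t page_count,
    std::size_t max_pages,
    const std::function<ToyBitmap(std::size_t)>& render_page,
    const std::function<void(std::size_t, const ToyBitmap&)>& consume_page) {
  assert(render_page && consume_page);
  const std::size_t limit = std::min(page_count, max_pages);
  for (std::size_t i = 0; i < limit; ++i) {
    ToyBitmap bitmap = render_page(i);  // Only this page's pixels are held.
    consume_page(i, bitmap);            // For example, hand the page to OCR.
  }  // The bitmap goes out of scope before the next page is rendered.
  return limit;
}
```

With a limit of 20, a 100-page document renders 20 bitmaps in total and never holds more than one of them at a time.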

Addressed in cc93d16

Just profiled a nine-page doc: https://docs.google.com/document/d/1OhFiLobEhBthcoaEbPIe_oNNmIq19lCYIuA_YEnmJAQ

Recorded a new trace with the legacy UI.

At 300 dpi, rendering a 560px x 795px image takes 1.87in x 2.65in, so the size passed into chrome_pdf::RenderPDFPageToBitmap, gfx::Size size = gfx::ToCeiledSize(*page_size); (420pt x 596pt = 5.83in x 8.28in with CSS standard 96 dpi), should be large enough to contain the image.

okay, I think I figured out what's going on and why things seem to be working fine, but I wasn't able to connect the dots.

GetPDFPageSizeByIndex returns the size in points, NOT pixels. Each point is 1/72 of an inch (see chrome_pdf::CalculatePosition and printing::kPointsPerInch).

You allocate the bitmap using points you get from this call, NOT pixels.

You call chrome_pdf::RenderPDFPageToBitmap with the bitmap you allocated and a requested dpi of 300, but the renderer can't really fit the 300 dpi image into the bitmap you passed.

The renderer recalculates width/height and renders the image at pixel-to-point dpi (72), ignoring 300 dpi you pass.

So after these manipulations you get a 72 dpi image that happens to pass OCR. Will it work for documents with a smaller font size? Did you really want to run OCR on 72 dpi images?

Either way, please add comments on what's actually going on and maybe increase the dpi to cover documents with smaller fonts.

yep, we definitely need to upscale to 300 dpi, because when I changed the font size from 11 to 5, the OCR result was totally wrong.

Thanks for the summary. I upscaled the bitmap to 300 dpi in f942df0. The bitmap size for each page is now 16.58MB, which is also the maximum bitmap allocation for the utility process, because we do conversion and OCR on a per-page basis.
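
The point-versus-pixel arithmetic in this thread can be checked with a small standalone sketch. The helper names are hypothetical, the local `kPointsPerInch` only mirrors the 1/72 inch definition quoted above, and rounding up to whole pixels is an assumption about how the scaled size is computed.

```cpp
#include <cassert>
#include <cstdint>

// One PDF point is 1/72 of an inch (local constant for this sketch).
const std::int64_t kPointsPerInch = 72;

// Converts a length in points to pixels at the given dpi, rounding up.
inline std::int64_t PointsToPixels(std::int64_t points, std::int64_t dpi) {
  assert(points >= 0 && dpi > 0);
  return (points * dpi + kPointsPerInch - 1) / kPointsPerInch;
}

// Bytes needed for a 4-bytes-per-pixel bitmap (for example BGRA 8888) of a
// page whose size is given in points.
inline std::int64_t BitmapBytes(std::int64_t width_pt,
                                std::int64_t height_pt,
                                std::int64_t dpi) {
  return PointsToPixels(width_pt, dpi) * PointsToPixels(height_pt, dpi) * 4;
}
```

For the 420pt x 596pt page mentioned above, `BitmapBytes(420, 596, 300)` gives 1750 x 2484 pixels and 17,388,000 bytes, which is about 16.58 MiB and matches the reported per-page figure. At 72 dpi the same page is 420 x 596 pixels, or 1,001,280 bytes (roughly 0.95 MiB).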

awesome!

atuchin-m · 2024-03-18T09:50:54Z

browser/ui/webui/ai_chat/ai_chat_ui_page_handler.cc

+
+void AIChatUIPageHandler::OnGetBitmaps(
+    const std::optional<std::vector<SkBitmap>>& bitmaps) {
+  VLOG(3) << __func__ << ": bitmap size: " << (bitmaps ? bitmaps->size() : -1);


Do you manipulate bitmaps on the UI thread?
If the bitmaps are huge, even simple operations could result in a short UI hang.

Do you mean any huge vector in general, or a huge vector of SkBitmap specifically?
I can remove this VLOG though.

Also, these bitmaps will be OCRed on a different thread in PreviewPageTextExtractor.

with 4166f2a, we no longer have this function

goodov · 2024-04-12T07:03:50Z

browser/ui/webui/DEPS

@@ -1,4 +1,7 @@
 include_rules = [
+  "+brave/services/printing/public/mojom",


nit: this should be added into webui/ai_chat/DEPS

fixed and squashed.

goodov · 2024-04-12T07:07:28Z

browser/ui/webui/ai_chat/ai_chat_ui_page_handler.cc

@@ -85,6 +84,10 @@ AIChatUIPageHandler::AIChatUIPageHandler(

  favicon_service_ = FaviconServiceFactory::GetForProfile(
      profile_, ServiceAccessType::EXPLICIT_ACCESS);
+#if BUILDFLAG(ENABLE_PRINT_PREVIEW)
+  print_preview_extractor_ = std::make_unique<PrintPreviewExtractor>(


this should only be created when you will use it.

fixed and squashed.

goodov · 2024-04-12T09:16:51Z

browser/ui/webui/ai_chat/print_preview_extractor.cc

+    // Stop processing if we have reached the maximum number of pages or the
+    // maximum length of the content
+    if (current_page_index_ + 1 >= kMaxPreviewPages ||
+        preview_text_.str().length() >= max_page_content_length_) {


preview_text_.str() will always allocate a new string, please don't do that.

Do you really need to use std::stringstream? std::string and base::StrAppend should work just fine.

fixed and squashed.
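
A minimal sketch of the suggested approach in plain C++, not the code that landed. `std::stringstream::str()` returns a copy of the buffer on every call, while `std::string::size()` is a constant-time query with no allocation. `JoinPageTexts` and the newline separator are assumptions made for illustration, and `operator+=` stands in for base::StrAppend so that the example builds without Chromium.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends per-page text until the page limit or the content-length limit is
// reached. The length check reads std::string::size() and never copies.
std::string JoinPageTexts(const std::vector<std::string>& pages,
                          std::size_t max_pages,
                          std::size_t max_content_length) {
  assert(max_pages > 0 && max_content_length > 0);
  std::string text;
  for (std::size_t i = 0; i < pages.size(); ++i) {
    if (i >= max_pages || text.size() >= max_content_length) {
      break;  // Short-circuit: the remaining pages are not needed.
    }
    if (!text.empty()) {
      text += '\n';  // Separator choice is an assumption of this sketch.
    }
    text += pages[i];
  }
  return text;
}
```

Like the condition in the diff, the limits are checked between pages, so the last appended page may push the text past `max_content_length`; truncation is left to the caller.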

goodov · 2024-04-12T09:22:42Z

browser/ui/webui/ai_chat/print_preview_extractor.h

+  bool IsPrintPreviewUIBound() const;
+  void SetPreviewUIId();
+  void ClearPreviewUIId();
+  void OnPrintPreviewRequest(int request_id);


most of these methods should be private.

fixed and squashed.

github-actions · 2024-04-12T16:33:26Z

[puLL-Merge] - brave/brave-core@22612

Description

This PR adds print preview support to the AI Chat feature in Brave. It allows extracting text from PDF documents using OCR in the print preview flow. The main motivation is to enable AI Chat to provide assistance on document-like websites such as Google Docs.

Changes

browser/ai_chat/BUILD.gn, browser/ai_chat/ai_chat_ui_browsertest.cc: Added new browser tests for AI Chat print preview functionality.
browser/ai_chat/page_content_fetcher_browsertest.cc: Removed print preview related tests as they were moved to ai_chat_ui_browsertest.cc.
browser/ui/BUILD.gn, browser/ui/webui/ai_chat/ai_chat_ui_page_handler.cc, browser/ui/webui/ai_chat/ai_chat_ui_page_handler.h, browser/ui/webui/ai_chat/print_preview_extractor.cc, browser/ui/webui/ai_chat/print_preview_extractor.h: Implemented the print preview extractor which creates print previews, converts PDF pages to bitmaps, and extracts text using OCR.
Several patches to hook into Chromium's print preview flow and expose necessary interfaces.
components/ai_chat/content/browser/ai_chat_tab_helper.cc, components/ai_chat/content/browser/ai_chat_tab_helper.h, components/ai_chat/content/browser/page_content_fetcher.cc: Added logic to trigger print preview based text extraction for certain document hosts.
components/ai_chat/core/browser/conversation_driver.cc, components/ai_chat/core/browser/conversation_driver.h: Added OnPrintPreviewRequested() observer method.
components/ai_chat/core/browser/constants.cc, components/ai_chat/core/browser/constants.h, components/ai_chat/core/browser/utils.cc, components/ai_chat/core/browser/utils.h: Moved some common constants and OCR utility functions.
services/printing/*: Added mojo interfaces and implementation for a PDF to bitmap converter service.
test/data/leo/*: Added test HTML files for print preview testing.

Security Considerations

The print preview extractor runs in the browser process and converts potentially untrusted web page content to PDF. Need to ensure the PDF library handles untrusted input securely. Low risk as Chromium's print preview does this conversion already.
OCR is performed on PDF page bitmaps. The OCR library needs to handle arbitrary image input securely. Low-medium risk depending on robustness of OCR implementation.
There are several new IPC interfaces added (e.g. PdfToBitmapConverter). Need to validate that the IPC bindings are secure and cannot be abused by the renderer. Low risk if using standard mojo binding security practices.

Let me know if you have any other questions! The PR looks good overall with some important new functionality. The main areas to double-check are around the new mojo IPC interfaces and handling of untrusted PDFs and images in the print preview extractor.

darkdh self-assigned this Mar 14, 2024

darkdh force-pushed the preview-extraction branch from 355acdb to 4bc2f03 Compare March 14, 2024 23:50

darkdh requested review from petemill, bbondy and goodov March 14, 2024 23:51

darkdh marked this pull request as ready for review March 14, 2024 23:51

darkdh requested review from a team as code owners March 14, 2024 23:51

github-actions bot reviewed Mar 14, 2024

github-actions bot added the needs-security-review label Mar 14, 2024

github-actions bot assigned fmarier, goodov, iefremov and thypon Mar 14, 2024

darkdh force-pushed the preview-extraction branch from 4bc2f03 to 23cd9a8 Compare March 15, 2024 16:24

github-actions bot added the puLL-Merge label Mar 15, 2024

darkdh force-pushed the preview-extraction branch from 23cd9a8 to f57bc2b Compare March 15, 2024 17:39

fmarier removed the needs-security-review label Mar 15, 2024

fmarier removed their assignment Mar 15, 2024

darkdh force-pushed the preview-extraction branch from f57bc2b to 1322766 Compare March 15, 2024 20:37

goodov reviewed Mar 18, 2024

goodov reviewed Mar 18, 2024

atuchin-m reviewed Mar 18, 2024

darkdh force-pushed the preview-extraction branch 3 times, most recently from 6625979 to 3434c0f Compare March 18, 2024 21:30

darkdh requested a review from goodov March 19, 2024 20:19

github-actions bot reviewed Apr 11, 2024

darkdh force-pushed the preview-extraction branch 3 times, most recently from 42bb043 to 5eb9df9 Compare April 11, 2024 22:46

darkdh requested a review from goodov April 11, 2024 23:19

goodov approved these changes Apr 12, 2024

darkdh force-pushed the preview-extraction branch from 3f39328 to 6320b0b Compare April 12, 2024 16:32

darkdh added 14 commits April 12, 2024 11:55

Extract page content through print preview

9099778

Re-export PrintViewManager

b01ab76

Merge extracted texts after OCR

fff56eb

Impose page limit and content limit for preview page processing

d54d984

Reveal print preview extraction only on doc.google.com initially

6f1b518

Always reset PrintPreviewUI remote before binding to avoid double bind

60a3118

Fix deps and buildflag

9b931e1

Add AIChatUIBrowserTest and fix existing tests

3d35be8

Use PrintViewManager_BraveImpl to friend PrintViewManager

ee0f05b

Share kMaxPreviewPages between PreviewPageTextExtractor and PdfToBitmapConverter

d2c55fa

Use PreviewPageTextExtractor to convert every pdf page into image and OCR after conversion. Also decouple OCR logic from FetchPageContent.

cf0853f

Scale up bitmap size to fit in 300 dpi image to accomodate small fonts docs

957be1b

Handle ::prefs::kPrintPreviewDisabled is true situation

ef72686

Decoule print preview extraction logic from AIChat* classes into PrintPreviewExtractor

2797b68

darkdh force-pushed the preview-extraction branch from 6320b0b to 1d2a2bc Compare April 12, 2024 18:59

darkdh enabled auto-merge April 12, 2024 18:59

Initiate print preview extraction when requested

06125c3

darkdh force-pushed the preview-extraction branch from 1d2a2bc to 06125c3 Compare April 12, 2024 19:24

darkdh merged commit 9d4cfea into master Apr 12, 2024
19 checks passed

darkdh deleted the preview-extraction branch April 12, 2024 20:33

github-actions bot added this to the 1.67.x - Nightly milestone Apr 12, 2024

emerick mentioned this pull request Apr 15, 2024

Test failure: PrintBrowserTest.PdfWindowDotPrint brave/brave-browser#37572

Closed
