New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce print preview extraction #22612
Conversation
355acdb
to
4bc2f03
Compare
pdf_to_bitmap_converter_.BindNewPipeAndPassReceiver()); | ||
pdf_to_bitmap_converter_.set_disconnect_handler( | ||
base::BindOnce(&AIChatUIPageHandler::BitmapConverterDisconnected, | ||
base::Unretained(this))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:
- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated
Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml
Cc @thypon @goodov @iefremov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We own the mojo remote
pdf_to_bitmap_converter_->GetBitmap( | ||
std::move(pdf_region.region), | ||
base::BindOnce(&AIChatUIPageHandler::OnGetBitmaps, | ||
base::Unretained(this))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:
- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated
Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml
Cc @thypon @goodov @iefremov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We own the mojo remote
4bc2f03
to
23cd9a8
Compare
23cd9a8
to
f57bc2b
Compare
f57bc2b
to
1322766
Compare
const SkImageInfo info = | ||
SkImageInfo::Make(size.width(), size.height(), kBGRA_8888_SkColorType, | ||
kOpaque_SkAlphaType); | ||
if (!bitmap.tryAllocPixels(info, info.minRowBytes())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the actual memory footprint of these bitmaps? A single A4 page @ 300dpi might take ~30MB in 8-bit RGBA. I'm afraid right now this will eat a lot of RAM, just think of 100+ page documents.
Please consider a queued conversion here, you're calling the text-recognition API on per-page basis anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, that is a good point. We currently impose page limit of 20 on text recognition process so it would be wasting resources to generate bitmaps more than that limit. I will add a max_pages to the API and call it with the same limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in cc93d16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just did a profiling on a nine page doc https://docs.google.com/document/d/1OhFiLobEhBthcoaEbPIe_oNNmIq19lCYIuA_YEnmJAQ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With 300 dpi, the dimension for rendering 560px x 795px image is 1.87in x 2.65in and the size gfx::Size size = gfx::ToCeiledSize(*page_size);
(420pt x 596pt = 5.83in x 8.28in with CSS standard 96 dpi) passed into chrome_pdf::RenderPDFPageToBitmap
should be sufficient enough to contain the image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, I think I figured what's going on and why things seems to be working fine, but I wasn't able to connect the dots.
GetPDFPageSizeByIndex
returns the size in points, NOT pixels. Each point is 1/72 of an inch (seechrome_pdf::CalculatePosition
andprinting::kPointsPerInch
).- You allocate the bitmap using points you get from this call, NOT pixels.
- You call
chrome_pdf::RenderPDFPageToBitmap
with the bitmap you allocated and a requested dpi of 300, but the renderer can't really fit the 300 dpi image into the bitmap you passed. - The renderer recalculates width/height and renders the image at pixel-to-point dpi (72), ignoring 300 dpi you pass.
So after these manipulations you get 72 dpi image that happens to pass OCR. Will it work in documents with a smaller font size? Did you really want to run the OCR with 72 dpi images?
Either way, please add comments on what's actually going on and maybe increase the dpi to cover documents with smaller fonts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, we definitely need upscale to 300 dpi because when I changed font size from 11 to 5, the OCR result is totally wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the summary, upscale bitmap to 300 dpi in f942df0. And now the bitmap size for each page is 16.58MB which is the maximum bitmap allocation for the utility process because we do conversion and OCR on per page basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the summary, upscale bitmap to 300 dpi in f942df0. And now the bitmap size for each page is 16.58MB which is the maximum bitmap allocation for the utility process because we do conversion and OCR on per page basis.
awesome!
|
||
void AIChatUIPageHandler::OnGetBitmaps( | ||
const std::optional<std::vector<SkBitmap>>& bitmaps) { | ||
VLOG(3) << __func__ << ": bitmap size: " << (bitmaps ? bitmaps->size() : -1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you manipulate with bitmaps
on UI thread?
If bitmaps
is huge, even simple operations could result in a short UI hang.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean any huge vectors in general or a huge vector with SkiBitmap specifically?
I can remove this VLOG though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also these bitmaps will be OCR on different thread in PreviewPageTextExtractor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with 4166f2a, we no longer have this function
6625979
to
3434c0f
Compare
42bb043
to
5eb9df9
Compare
browser/ui/webui/DEPS
Outdated
@@ -1,4 +1,7 @@ | |||
include_rules = [ | |||
"+brave/services/printing/public/mojom", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this should be added into webui/ai_chat/DEPS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed and squashed.
@@ -85,6 +84,10 @@ AIChatUIPageHandler::AIChatUIPageHandler( | |||
|
|||
favicon_service_ = FaviconServiceFactory::GetForProfile( | |||
profile_, ServiceAccessType::EXPLICIT_ACCESS); | |||
#if BUILDFLAG(ENABLE_PRINT_PREVIEW) | |||
print_preview_extractor_ = std::make_unique<PrintPreviewExtractor>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should only be created when you will use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed and squashed.
// Stop processing if we have reached the maximum number of pages or the | ||
// maximum length of the content | ||
if (current_page_index_ + 1 >= kMaxPreviewPages || | ||
preview_text_.str().length() >= max_page_content_length_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preview_text_.str()
will always allocate a new string, please don't do that.
Do you really need to use std::stringstream
? std::string
and base::StrAppend
should work just fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed and squashed.
bool IsPrintPreviewUIBound() const; | ||
void SetPreviewUIId(); | ||
void ClearPreviewUIId(); | ||
void OnPrintPreviewRequest(int request_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of these methods should be private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed and squashed.
3f39328
to
6320b0b
Compare
[puLL-Merge] - brave/brave-core@22612 DescriptionThis PR adds print preview support to the AI Chat feature in Brave. It allows extracting text from PDF documents using OCR in the print preview flow. The main motivation is to enable AI Chat to provide assistance on document-like websites such as Google Docs. ChangesChanges
Security Considerations
Let me know if you have any other questions! The PR looks good overall with some important new functionality. The main areas to double-check are around the new mojo IPC interfaces and handling of untrusted PDFs and images in the print preview extractor. |
… OCR after conversion. Also decouple OCR logic from FetchPageContent.
…tPreviewExtractor
6320b0b
to
1d2a2bc
Compare
1d2a2bc
to
06125c3
Compare
Resolves brave/brave-browser#36649
In nutshell:
This PR make AIChatPanel a
printing::mojom::PrintPreviewUI
in order to initiate print preview then we compose print preview result into a pdf and we convert each page of pdf into image and do OCR for each image, finally we concat the results from OCR. This feature will only be rolled out ondocs.google.com
initially, we will introduce it as general fallback in future PR.Since there are two
PrintPreviewUI
sharingprinting::mojom::PrintRenderFrame
, we have to deal withPrintRenderFrame
will be double bound if AIChatUI is doing print preview extraction with print dialog open.PrintManagerHost::RequestPrintPreview
callingPrintPreviewDialogController::PrintPreview
Unlike print dialog, AIChatPanel will disconnect
PrintRenderFrame
when print preview is done or failed.We also introduce a new printing service(PdftoBitmapConverter) to convert the selected pdf page into a SkBitmap and
since large context will be truncated, we short-circuit the per page OCR process if context limit is reached or page limit is reached.
Submitter Checklist:
QA/Yes
orQA/No
;release-notes/include
orrelease-notes/exclude
;OS/...
) to the associated issuenpm run test -- brave_browser_tests
,npm run test -- brave_unit_tests
wikinpm run presubmit
wiki,npm run gn_check
,npm run tslint
git rebase master
(if needed)Reviewer Checklist:
gn
After-merge Checklist:
changes has landed on
Test Plan: (Windows and MacOS only)
Regression test on previous google doc support
Test on full page google doc support
Test on page limit (20)
Compatibility with print dialog