Conversation
- Introduce OpenContentKind.PdfDocument and PDF extension helpers
- Support PDFs in context menu and "Open With" registration
- Add FileUtilities.GetVisualDocumentFilter for images/PDFs
- Route PDF files to new PdfDocumentRenderer for native text and OCR
- Refactor to treat images and PDFs as visual documents
- No breaking changes to existing image workflows

- Introduce PdfTextLineOverlay and PdfTextCanvas for PDF text selection
- Add PDF page navigation controls to GrabFrame UI
- Support loading, rendering, and extracting text from PDFs
- Enable selection, search, and copy of PDF text lines
- Update pan/zoom logic for PDF overlays and spacebar-panning
- Refactor event handlers and utilities for PDF support
- Update dialogs, menus, and help text to include PDFs
- Improve XAML formatting and documentation consistency
Updated GrabFrame logic to accept both image and PDF files by using IoUtilities.IsVisualDocumentFile. Added PdfPig NuGet package for PDF handling. Improved debug message to reflect support for PDFs.
Expanded FilesIoTests to cover file type and filter logic. Added PdfDocumentRendererTests for rendering, coordinate mapping, line grouping, and OCR overlap handling.
Added a new "Open File..." menu item with a document icon to the NotifyIconWindow context menu. Implemented its click handler to asynchronously open the file picker using App.OpenFileWithPickerAsync().
Introduce FileUtilities.GetOpenDocumentFilter() for unified open file dialog filters, replacing hardcoded strings and supporting images, PDFs, spreadsheets, markdown, and text files. Refactor GrabFrame to generalize and enhance spacebar-based pan/zoom logic, including new state tracking, improved event handling, and better user experience for both images and PDFs.
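For readers unfamiliar with WPF dialog filters, here is a minimal sketch of how a unified filter string could be assembled from extension groups. The class name `OpenFilterSketch`, the group labels, and the extension lists are hypothetical and are not taken from this PR; the real `FileUtilities.GetOpenDocumentFilter()` may be organized differently. Only the `Label|pattern;pattern` pair format expected by `Microsoft.Win32.OpenFileDialog.Filter` is standard.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OpenFilterSketch
{
    // Hypothetical groups for illustration; the PR's actual lists may differ.
    private static readonly (string Label, string[] Extensions)[] Groups =
    {
        ("Images", new[] { ".png", ".jpg", ".jpeg", ".bmp", ".gif", ".tiff" }),
        ("PDF documents", new[] { ".pdf" }),
        ("Text files", new[] { ".txt", ".md", ".csv" }),
    };

    // Turns [".png", ".jpg"] into "*.png;*.jpg".
    private static string Patterns(IEnumerable<string> extensions)
    {
        return string.Join(";", extensions.Select(e => "*" + e));
    }

    // Builds "Label|patterns|Label|patterns|..." with a combined first entry
    // and a catch-all last entry.
    public static string BuildFilter()
    {
        List<string> parts = new List<string>();
        parts.Add("All supported files|" + Patterns(Groups.SelectMany(g => g.Extensions)));

        foreach (var group in Groups)
            parts.Add(group.Label + "|" + Patterns(group.Extensions));

        parts.Add("All files|*.*");
        return string.Join("|", parts);
    }
}
```

Keeping the groups in one table means a new document type only has to be added in one place, which is the apparent motivation for replacing the hardcoded strings.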
Refactored ZoomBorder to use PreviewMouseDown/Up/Move for panning, improving modifier key support and event robustness. Added isPanning state, IsSpacePanModifierPressed, and RequireSpaceToPan for flexible pan activation. Middle mouse now resets zoom/pan. Improved mouse capture/release and removed obsolete handlers.
Added static helpers for file picker and drag-and-drop file handling in App.xaml.cs. Renamed TryToOpenFile to TryToOpenFilePathAsync and updated usages. Ensured EditTextWindow is activated after opening a file.
Expanded FilesIoTests to cover GetOpenDocumentFilter and drag-and-drop file handling. Added tests for document type filters, dropped file path extraction, and drag-drop effects. Included necessary using directives for System.IO and System.Windows.
Added a 300ms grace period after releasing Space before disabling pan mode, making panning with Space+mouse smoother. Ensured pan/zoom is always enabled for PDFs and moved focus away from buttons to prevent accidental activation during panning. Refactored event handling and focus logic for more robust and user-friendly pan/zoom interactions. Also improved mouse capture logic and code clarity in ZoomBorder.
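The grace-period idea can be sketched roughly as follows. This is not the code from the PR: `SpacePanGrace` and its members are hypothetical names, and the injectable clock exists only to make the sketch testable. The one detail taken from the commit message is the 300 ms window after Space is released.

```csharp
using System;

public sealed class SpacePanGrace
{
    private readonly TimeSpan grace;
    private readonly Func<DateTime> clock;
    private bool spaceDown;
    private DateTime? releasedAt;

    public SpacePanGrace(TimeSpan grace)
        : this(grace, () => DateTime.UtcNow)
    {
    }

    // The clock is injectable so the timing logic can be tested without sleeping.
    public SpacePanGrace(TimeSpan grace, Func<DateTime> clock)
    {
        this.grace = grace;
        this.clock = clock;
    }

    public void OnSpaceDown()
    {
        spaceDown = true;
        releasedAt = null;
    }

    public void OnSpaceUp()
    {
        spaceDown = false;
        releasedAt = clock();
    }

    // Panning stays allowed while Space is held, and for the grace window
    // after it is released.
    public bool IsPanAllowed
    {
        get
        {
            if (spaceDown)
                return true;
            return releasedAt is DateTime t && clock() - t <= grace;
        }
    }
}
```

A caller would construct it with `TimeSpan.FromMilliseconds(300)` and consult `IsPanAllowed` in the mouse handlers, so that releasing Space slightly before the mouse button does not cut a drag short.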
Replaces individual Markdig extension calls with UseAdvancedExtensions for a cleaner and more maintainable pipeline configuration. This enables a broad set of advanced features in one step.
Pull request overview
This PR extends Text Grab’s document-handling flow so Grab Frame and related open/OCR paths can work with PDFs in addition to images, while also adding spreadsheet paste/find-replace improvements and a few supporting infrastructure updates.
Changes:
- Adds PDF loading/rendering support to Grab Frame, including page navigation, PDF text overlays, and OCR/native-text extraction paths.
- Expands shared file/open/context-menu utilities so PDFs are treated as “visual documents” alongside images across the app.
- Improves Edit Text spreadsheet workflows with table-aware paste and spreadsheet find/replace support, plus adds unit tests for the new utility logic.
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| Text-Grab/Views/GrabFrame.xaml.cs | Adds PDF state, loading, page switching, overlay selection, and zoom/pan behavior in Grab Frame. |
| Text-Grab/Views/GrabFrame.xaml | Adds PDF page controls and a dedicated PDF text overlay canvas. |
| Text-Grab/Views/FirstRunWindow.xaml | Updates first-run help text to mention PDFs in Grab Frame. |
| Text-Grab/Views/EditTextWindow.xaml.cs | Adds spreadsheet paste/search/replace helpers and shared open-filter usage. |
| Text-Grab/Views/EditTextWindow.xaml | Wires spreadsheet key handling and minor XAML cleanup. |
| Text-Grab/Utilities/PdfDocumentRenderer.cs | New PDF rendering/text extraction utility for native text and OCR fallback. |
| Text-Grab/Utilities/OcrUtilities.cs | Routes PDF files through the new PDF extraction path. |
| Text-Grab/Utilities/MarkdownDocumentUtilities.cs | Refactors regex declarations and markdown pipeline setup. |
| Text-Grab/Utilities/IoUtilities.cs | Classifies PDFs as a supported open-content kind and visual document type. |
| Text-Grab/Utilities/ImplementAppOptions.cs | Extends “Open with” registration to include PDF support. |
| Text-Grab/Utilities/FileUtilities.cs | Adds reusable open-file filters for visual documents and broader document opening. |
| Text-Grab/Utilities/ContextMenuUtilities.cs | Extends Explorer context-menu registration from images to visual documents including PDFs. |
| Text-Grab/Utilities/ClipboardUtilities.cs | Adds HTML-table-to-tab-separated clipboard conversion for spreadsheet pasting. |
| Text-Grab/Text-Grab.csproj | Adds the PDF parsing dependency. |
| Text-Grab/Styles/TextBoxStyles.xaml | Formatting-only XAML cleanup. |
| Text-Grab/Styles/ListViewScrollFix.xaml | Formatting-only XAML cleanup. |
| Text-Grab/Styles/DataGridStyles.xaml | Formatting-only XAML cleanup. |
| Text-Grab/Styles/ButtonStyles.xaml | Formatting-only XAML cleanup. |
| Text-Grab/Pages/GeneralSettings.xaml | Updates settings text to mention PDF support. |
| Text-Grab/Models/FindResult.cs | Adds spreadsheet cell location metadata for find results. |
| Text-Grab/Enums.cs | Adds PdfDocument as an open-content enum value. |
| Text-Grab/Controls/ZoomBorder.cs | Reworks pan handling for space-to-pan and PDF overlay interaction. |
| Text-Grab/Controls/PdfTextLineOverlay.cs | New selectable overlay control for PDF text lines. |
| Text-Grab/Controls/NotifyIconWindow.xaml.cs | Adds tray-menu file-open command. |
| Text-Grab/Controls/NotifyIconWindow.xaml | Adds tray-menu “Open File…” UI. |
| Text-Grab/Controls/FindAndReplaceWindow.xaml.cs | Adds spreadsheet-aware search, navigation, and replace/delete operations. |
| Text-Grab/Controls/FindAndReplaceWindow.xaml | Updates result list layout to show spreadsheet locations. |
| Text-Grab/App.xaml.cs | Adds picker-based file opening and shared dropped-file helpers; routes PDFs to Grab Frame. |
| Tests/PdfDocumentRendererTests.cs | Adds unit tests for PDF geometry and line-grouping helpers. |
| Tests/FilesIoTests.cs | Adds tests for content-kind classification, filters, and dropped-file helpers. |
| Tests/ClipboardUtilitiesTests.cs | Adds tests for HTML table clipboard conversion. |
| .github/workflows/buildDev.yml | Updates build workflow action versions. |
| .github/workflows/Release.yml | Updates release workflow action versions. |
| .codetesting/AnalysisReport_20260125_220624_678.md | Removes a stale analysis report artifact. |
Comment on lines +882 to +911

    <StackPanel
        x:Name="PdfPagePanel"
        Margin="0,0,6,0"
        Orientation="Horizontal"
        Visibility="Collapsed">
        <Button
            x:Name="PreviousPdfPageButton"
            Width="30"
            Height="30"
            Padding="0"
            Click="PreviousPdfPageButton_Click"
            ToolTip="Previous PDF page">
            <ui:SymbolIcon Symbol="ChevronLeft24" />
        </Button>
        <TextBlock
            x:Name="PdfPageTextBlock"
            Margin="6,0"
            VerticalAlignment="Center"
            Text="Page 1 / 1" />
        <Button
            x:Name="NextPdfPageButton"
            Width="30"
            Height="30"
            Margin="6,0,0,0"
            Padding="0"
            Click="NextPdfPageButton_Click"
            ToolTip="Next PDF page">
            <ui:SymbolIcon Symbol="ChevronRight24" />
        </Button>
    </StackPanel>

Comment on lines +3588 to +3590

    _loadedPdfDocument = await PdfDocumentRenderer.LoadAsync(path);
    _currentImagePath = Path.GetFullPath(path);
    await ShowPdfPageAsync(0);

Comment on lines +577 to +586

    private async Task ShowPdfPageAsync(int pageIndex)
    {
        if (_loadedPdfDocument is null)
            return;

        reDrawTimer.Stop();
        ResetGrabFrame();
        await Task.Delay(300);

        _currentPdfPageContent = await _loadedPdfDocument.GetPageContentAsync(pageIndex);

Comment on lines +158 to +165

    List<PdfPageTextLine> combinedLines = [.. pageContent.NativeLines];
    IReadOnlyList<PdfPageTextLine> imageOcrLines = await GetOcrLinesAsync(
        pageContent.RenderedPage,
        resolvedLanguage,
        sourceRect => ShouldIncludeOcrLine(sourceRect, pageContent.ImageRegions));

    combinedLines.AddRange(imageOcrLines);
    return SortLines(combinedLines);

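One plausible reading of the snippet above is that OCR lines are kept only where they fall on embedded image regions, so that text already available natively is not duplicated. The sketch below illustrates that kind of overlap test under that assumption. `Box`, `OcrOverlapSketch`, and the 0.5 coverage threshold are all hypothetical; the actual `ShouldIncludeOcrLine` in the PR may use different types and a different rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical axis-aligned rectangle; the PR presumably uses its own rect type.
public readonly struct Box
{
    public Box(double x, double y, double width, double height)
    {
        X = x;
        Y = y;
        Width = width;
        Height = height;
    }

    public double X { get; }
    public double Y { get; }
    public double Width { get; }
    public double Height { get; }
    public double Area => Width * Height;
}

public static class OcrOverlapSketch
{
    // Area shared by two boxes, or 0 when they do not overlap.
    public static double IntersectionArea(Box a, Box b)
    {
        double w = Math.Min(a.X + a.Width, b.X + b.Width) - Math.Max(a.X, b.X);
        double h = Math.Min(a.Y + a.Height, b.Y + b.Height) - Math.Max(a.Y, b.Y);
        return w > 0 && h > 0 ? w * h : 0;
    }

    // Keeps an OCR line only if at least minCoverage of its area lies inside
    // a single image region (assumed rule, chosen for illustration).
    public static bool ShouldInclude(Box ocrLine, IReadOnlyList<Box> imageRegions, double minCoverage = 0.5)
    {
        if (ocrLine.Area <= 0)
            return false;

        foreach (Box region in imageRegions)
        {
            if (IntersectionArea(ocrLine, region) / ocrLine.Area >= minCoverage)
                return true;
        }

        return false;
    }
}
```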
Comment on lines 2222 to 2225

    FrameText = "";
    wordBorders.Clear();
    pdfTextLineOverlays.Clear();
    UpdateFrameText();

    private const double DefaultRenderScale = 2.0;
    private readonly WinPdfDocument renderDocument;
    private readonly PigPdfDocument textDocument;
    private readonly Dictionary<int, PdfPageContent> pageCache = [];

    if (IoUtilities.IsPdfFileExtension(Path.GetExtension(absolutePath)))
    {
        PdfDocumentRenderer pdfDocument = await PdfDocumentRenderer.LoadAsync(absolutePath);

    ocrText = await GrabTemplateExecutor.ExecuteTemplateOnBitmapAsync(grabTemplate, bmp, selectedLanguage);
    if (IoUtilities.IsPdfFileExtension(Path.GetExtension(path)))
    {
        PdfDocumentRenderer pdfDocument = await PdfDocumentRenderer.LoadAsync(path);

Comment on lines +2491 to +2499

    var updates = targets
        .Where(r => r.RowIndex.HasValue && r.ColumnIndex.HasValue)
        .GroupBy(r => (r.RowIndex!.Value, r.ColumnIndex!.Value))
        .Select(g =>
        {
            int row = g.Key.Item1, col = g.Key.Item2;
            string oldValue = row < tableDocument.Rows.Count && col < tableDocument.Rows[row].Count
                ? tableDocument.Rows[row][col] ?? string.Empty : string.Empty;
            return (RowIndex: row, ColumnIndex: col, Value: pattern.Replace(oldValue, replaceWith));

Comment on lines +257 to +278

    int cellStart;
    string endTag;
    if (tdPos >= 0 && (thPos < 0 || tdPos <= thPos))
    {
        cellStart = tdPos;
        endTag = "</td>";
    }
    else
    {
        cellStart = thPos;
        endTag = "</th>";
    }

    int openEnd = rowHtml.IndexOf('>', cellStart);
    if (openEnd < 0) break;

    int contentStart = openEnd + 1;
    int contentEnd = rowHtml.IndexOf(endTag, contentStart, StringComparison.OrdinalIgnoreCase);
    contentEnd = contentEnd >= 0 ? contentEnd : rowHtml.Length;

    cells.Add(CleanHtmlCellContent(rowHtml[contentStart..contentEnd]));
    pos = contentEnd + endTag.Length;

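For context on what this parsing is for, the following is a simplified, self-contained sketch of converting an HTML table fragment into tab-separated text. It swaps in regular expressions for the index-based scanning shown above, ignores colspan/rowspan and nested tables, and uses the hypothetical name `HtmlTableSketch`; it is not the implementation in `ClipboardUtilities`.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTableSketch
{
    private static readonly Regex RowPattern = new Regex(
        @"<tr\b[^>]*>(.*?)</tr>", RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches both <td> and <th> cells; \b keeps <thead> from matching.
    private static readonly Regex CellPattern = new Regex(
        @"<t[dh]\b[^>]*>(.*?)</t[dh]>", RegexOptions.IgnoreCase | RegexOptions.Singleline);

    private static readonly Regex TagPattern = new Regex(@"<[^>]+>");
    private static readonly Regex WhitespacePattern = new Regex(@"\s+");

    // Returns one line per table row, with cells separated by tabs.
    public static string ToTabSeparated(string html)
    {
        List<string> lines = new List<string>();

        foreach (Match row in RowPattern.Matches(html))
        {
            List<string> cells = new List<string>();

            foreach (Match cell in CellPattern.Matches(row.Groups[1].Value))
            {
                // Strip inline markup, decode entities, then collapse whitespace
                // so a cell never contains a stray tab or newline.
                string text = TagPattern.Replace(cell.Groups[1].Value, string.Empty);
                text = WebUtility.HtmlDecode(text);
                cells.Add(WhitespacePattern.Replace(text, " ").Trim());
            }

            if (cells.Count > 0)
                lines.Add(string.Join("\t", cells));
        }

        return string.Join(Environment.NewLine, lines);
    }
}
```

A regex pass like this is shorter but stricter than the scanning approach: a cell with a missing closing tag is skipped here, whereas the reviewed code falls back to the end of the row.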
@copilot apply changes based on the comments in this thread
…replace, colspan/rowspan Agent-Logs-Url: https://github.com/TheJoeFin/Text-Grab/sessions/c5acdc4a-84dd-4662-b5d3-335886f0b193 Co-authored-by: TheJoeFin <7809853+TheJoeFin@users.noreply.github.com>
Agent-Logs-Url: https://github.com/TheJoeFin/Text-Grab/sessions/c5acdc4a-84dd-4662-b5d3-335886f0b193 Co-authored-by: TheJoeFin <7809853+TheJoeFin@users.noreply.github.com>
Contributor
Applied all changes from the review thread (commit
Warning: Firewall rules blocked me from connecting to one or more addresses. I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
Expanded file filtering to include PDFs for OCR, updated UI messages to reference "files" instead of just "images," clarified variable typing in spreadsheet undo logic, and set StaysOpenOnClick for relevant menu items to enhance user experience.
Replaced custom Copy with standard Cut/Copy/Paste commands in the spreadsheet context menu. Added command bindings and keyboard shortcut handling for these actions. Implemented TryCutSpreadsheetCellValues for cut logic with clipboard failure detection. Updated clipboard helper to return success status. Added unit tests for cut scenarios.
Closes #641