From 1418b04ed258452e50c71d76ee1945d5d934698d Mon Sep 17 00:00:00 2001
From: KB Bot
Date: Wed, 15 Jan 2025 07:37:53 +0000
Subject: [PATCH 1/3] Added new kb article convert-pdf-table-to-datatable

---
 .../convert-pdf-table-to-datatable.md | 62 +++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 knowledge-base/convert-pdf-table-to-datatable.md

diff --git a/knowledge-base/convert-pdf-table-to-datatable.md b/knowledge-base/convert-pdf-table-to-datatable.md
new file mode 100644
index 00000000..45961f7d
--- /dev/null
+++ b/knowledge-base/convert-pdf-table-to-datatable.md
@@ -0,0 +1,62 @@
+---
+title: Converting PDF Table Content to DataTable
+description: Learn how to transform a table from a PDF file into a DataTable object using the Telerik Document Processing libraries.
+type: how-to
+page_title: How to Convert PDF Table to DataTable with Telerik Document Processing
+slug: convert-pdf-table-to-datatable
+tags: document, processing, table, datatable, convert
+res_type: kb
+ticketid: 1675626
+---
+
+## Environment
+
+| Version | Product | Author |
+| ---- | ---- | ---- |
+| 2024.4.1106| Telerik Document Processing Libraries|[Desislava Yordanova](https://www.telerik.com/blogs/author/desislava-yordanova)|
+
+## Description
+
+Learn how to convert a specific table from a PDF file into a DataTable object using Telerik Document Processing libraries.
+
+## Solution
+
+Telerik Document Processing libraries do not offer a direct method to convert PDF table to a DataTable object. However, a feasible workaround is available. This method involves utilizing MS Excel or [RadSpreadsheet](https://docs.telerik.com/devtools/winforms/controls/spreadsheet/overview) for the intermediary conversion step.
+
+1. Select and copy the desired table's content from the PDF file.
+2. Paste the copied content into MS Excel or RadSpreadsheet. This step converts the PDF table into an Excel format.
+3. Save the document into XLSX with [RadSpreadProcessing]({%slug radspreadprocessing-overview%}).
+4. Use the RadSpreadProcessing library to convert the Excel document into a DataTable. Utilize the [DataTableFormatProvider]({%slug radspreadprocessing-formats-and-conversion-using-data-table-format-provider%}) from RadSpreadProcessing for this conversion.
+
+Here is a code snippet demonstrating the conversion of an XLSX document to a DataTable using RadSpreadProcessing:
+
+```csharp
+using Telerik.Windows.Documents.Spreadsheet.FormatProviders.OpenXml.Xlsx;
+using Telerik.Windows.Documents.Spreadsheet.Model;
+using System.Data;
+using System.IO;
+using Telerik.Windows.Documents.Spreadsheet.FormatProviders;
+
+// Load the XLSX file
+Workbook workbook;
+using (FileStream input = new FileStream("path_to_your_xlsx_file.xlsx", FileMode.Open))
+{
+    IWorkbookFormatProvider formatProvider = new XlsxFormatProvider();
+    workbook = formatProvider.Import(input);
+}
+
+// Convert the first worksheet to DataTable
+Worksheet worksheet = workbook.Worksheets[0];
+
+DataTableFormatProvider dataTableFormatProvider = new DataTableFormatProvider();
+DataTable dataTable = dataTableFormatProvider.Export(worksheet);
+```
+
+This solution provides a way to parse PDF table content and use it as a DataTable, leveraging the powerful features of Telerik Document Processing libraries.
+
+## See Also
+
+- [RadWordsProcessing Overview](https://docs.telerik.com/devtools/document-processing/libraries/radwordsprocessing/overview)
+- [RadSpreadProcessing Overview](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/overview)
+- [Using DataTable Format Provider](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider)
+- [Import and Export to Excel File Formats](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/formats-and-conversion/import-and-export-to-excel-file-formats/xlsx/xlsx)

From 0903f4a6aa561fc9a3ca97e624cd13e3d8308390 Mon Sep 17 00:00:00 2001
From: Desislava Yordanova
Date: Mon, 27 Jan 2025 13:18:58 +0200
Subject: [PATCH 2/3] new KB - Converting PDF Table Content to DataTable

---
 knowledge-base/convert-pdf-table-to-datatable.md | 16 ++++++++--------
 .../using-data-table-format-provider.md | 3 ++-
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/knowledge-base/convert-pdf-table-to-datatable.md b/knowledge-base/convert-pdf-table-to-datatable.md
index 45961f7d..5cfdae21 100644
--- a/knowledge-base/convert-pdf-table-to-datatable.md
+++ b/knowledge-base/convert-pdf-table-to-datatable.md
@@ -4,7 +4,7 @@ description: Learn how to transform a table from a PDF file into a DataTable obj
 type: how-to
 page_title: How to Convert PDF Table to DataTable with Telerik Document Processing
 slug: convert-pdf-table-to-datatable
-tags: document, processing, table, datatable, convert
+tags: document, processing, table, datatable, convert, pdf, excel
 res_type: kb
 ticketid: 1675626
 ---
@@ -17,14 +17,14 @@ ticketid: 1675626

 ## Description

-Learn how to convert a specific table from a PDF file into a DataTable object using Telerik Document Processing libraries.
+Learn how to convert a specific table from a PDF file into a DataTable object using **Telerik Document Processing** libraries.

 ## Solution

-Telerik Document Processing libraries do not offer a direct method to convert PDF table to a DataTable object. However, a feasible workaround is available. This method involves utilizing MS Excel or [RadSpreadsheet](https://docs.telerik.com/devtools/winforms/controls/spreadsheet/overview) for the intermediary conversion step.
+Telerik Document Processing libraries **do not** offer a **direct** method to convert a PDF table to a DataTable object. However, a feasible workaround is available. This method involves utilizing MS Excel or [RadSpreadsheet](https://docs.telerik.com/devtools/winforms/controls/spreadsheet/overview) for the intermediary conversion step.

 1. Select and copy the desired table's content from the PDF file.
-2. Paste the copied content into MS Excel or RadSpreadsheet. This step converts the PDF table into an Excel format.
+2. Paste the copied content into **MS Excel** or **RadSpreadsheet**. This step converts the PDF table into an Excel format.
 3. Save the document into XLSX with [RadSpreadProcessing]({%slug radspreadprocessing-overview%}).
 4. Use the RadSpreadProcessing library to convert the Excel document into a DataTable. Utilize the [DataTableFormatProvider]({%slug radspreadprocessing-formats-and-conversion-using-data-table-format-provider%}) from RadSpreadProcessing for this conversion.

@@ -56,7 +56,7 @@ This solution provides a way to parse PDF table content and use it as a DataTabl

 ## See Also

-- [RadWordsProcessing Overview](https://docs.telerik.com/devtools/document-processing/libraries/radwordsprocessing/overview)
-- [RadSpreadProcessing Overview](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/overview)
-- [Using DataTable Format Provider](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider)
-- [Import and Export to Excel File Formats](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/formats-and-conversion/import-and-export-to-excel-file-formats/xlsx/xlsx)
+- [RadWordsProcessing Overview]({%slug radwordsprocessing-overview%})
+- [RadSpreadProcessing Overview]({%slug radspreadprocessing-overview%})
+- [Using DataTable Format Provider]({%slug radspreadprocessing-formats-and-conversion-using-data-table-format-provider%})
+- [Import and Export to Excel File Formats]({%slug radspreadprocessing-formats-and-conversion-xlsx-xlsxformatprovider%})
diff --git a/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider.md b/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider.md
index 12945027..e7b8d865 100644
--- a/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider.md
+++ b/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider.md
@@ -62,4 +62,5 @@ Example 3 demonstrates how you can export an existing Worksheet to a DataTable.

 # See Also

-* [Settings]({%slug radspreadprocessing-formats-and-conversion-data-table-formatprovider-settings%})
\ No newline at end of file
+* [Settings]({%slug radspreadprocessing-formats-and-conversion-data-table-formatprovider-settings%})
+* [Converting PDF Table Content to DataTable]({%slug convert-pdf-table-to-datatable%})
\ No newline at end of file

From 4126686a968ef97db3bcc1d8f504fb4f2fb74e22 Mon Sep 17 00:00:00 2001
From: Desislava Yordanova
Date: Tue, 25 Feb 2025 07:48:06 +0200
Subject: [PATCH 3/3] review and apply the feedback

---
 knowledge-base/convert-pdf-table-to-datatable.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/knowledge-base/convert-pdf-table-to-datatable.md b/knowledge-base/convert-pdf-table-to-datatable.md
index 5cfdae21..ad1bba30 100644
--- a/knowledge-base/convert-pdf-table-to-datatable.md
+++ b/knowledge-base/convert-pdf-table-to-datatable.md
@@ -1,6 +1,6 @@
 ---
-title: Converting PDF Table Content to DataTable
-description: Learn how to transform a table from a PDF file into a DataTable object using the Telerik Document Processing libraries.
+title: Converting PDF Table Content to DataTable using RadSpreadProcessing and UI interaction
+description: Learn how to transform a table from a PDF file into a DataTable object using the Telerik Document Processing libraries and UI interaction.
 type: how-to
 page_title: How to Convert PDF Table to DataTable with Telerik Document Processing
 slug: convert-pdf-table-to-datatable
@@ -17,7 +17,7 @@ ticketid: 1675626

 ## Description

-Learn how to convert a specific table from a PDF file into a [DataTable](https://learn.microsoft.com/en-us/dotnet/api/system.data.datatable?view=net-5.0) object using **Telerik Document Processing** libraries.
+Learn how to convert a specific table from a PDF file into a [DataTable](https://learn.microsoft.com/en-us/dotnet/api/system.data.datatable?view=net-5.0) object using **Telerik Document Processing** libraries.

 ## Solution
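
The XLSX-to-DataTable step that the article describes could be wrapped in a small reusable helper. The following is only a minimal sketch, not part of the patch series: the class name `PdfTableImport` and the methods `LoadFirstSheetAsDataTable` and `PrintTable` are hypothetical names chosen for illustration. It uses only the Telerik calls that already appear in the article's snippet (`XlsxFormatProvider`, `IWorkbookFormatProvider.Import`, `DataTableFormatProvider.Export`), and it assumes the pasted table was saved to the first worksheet of the workbook.

```csharp
using System;
using System.Data;
using System.IO;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders.OpenXml.Xlsx;
using Telerik.Windows.Documents.Spreadsheet.Model;

// Hypothetical helper class; the name is illustrative and not part of the Telerik API.
public static class PdfTableImport
{
    // Loads the XLSX file produced in step 3 and exports its first worksheet to a DataTable.
    public static DataTable LoadFirstSheetAsDataTable(string xlsxPath)
    {
        Workbook workbook;
        using (FileStream input = new FileStream(xlsxPath, FileMode.Open, FileAccess.Read))
        {
            IWorkbookFormatProvider formatProvider = new XlsxFormatProvider();
            workbook = formatProvider.Import(input);
        }

        // Assumption: the table copied from the PDF lives in the first worksheet.
        if (workbook.Worksheets.Count == 0)
        {
            throw new InvalidOperationException("The workbook contains no worksheets.");
        }

        DataTableFormatProvider dataTableFormatProvider = new DataTableFormatProvider();
        return dataTableFormatProvider.Export(workbook.Worksheets[0]);
    }

    // Writes every row of the table to the console, which helps verify the pasted content.
    public static void PrintTable(DataTable table)
    {
        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(string.Join(" | ", row.ItemArray));
        }
    }
}
```

A caller would pass the path of the saved workbook, for example `PdfTableImport.PrintTable(PdfTableImport.LoadFirstSheetAsDataTable("path_to_your_xlsx_file.xlsx"));`. Opening the stream with `FileAccess.Read` is a design choice made here so that the file is never modified during the conversion.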