Proof of concept for training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF document page as one of title, text, list, table, or image.
classifier
pdf
machine-learning
csharp
lightgbm
pdf-document
document-layout
layout-analysis
pdf-document-processor
document-layout-analysis
ml-net
pdfpig
publaynet
Updated Mar 16, 2020 - C#