Step by Step Guide.

This guide will show you how to run the Threshold Optimizer. You will first need to have a project that can run the Extraction Benchmark successfully.
See Anonymizing Zones from Documents.md for an example of how to configure the Extraction Benchmark.

Configure the Threshold Optimizer

Add the following script to the project-level script. The script below is for KTA. If you are using KTT, KTM or RPA, use the event Document_AfterProcess instead of Document_AfterExtract. Document_AfterProcess runs after the locators have run and after the fields have been both formatted and validated. This is important because the script looks at the valid value on each field. KTA is different because validation happens outside of KT.

'#Language "WWB-COM"
Option Explicit
' Class script: Document

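Private Sub Document_AfterExtract(ByVal pXDoc As CASCADELib.CscXDocument)
   Dim F As Long, Field As CscXDocField, TruthDoc As New CscXDocument, Truth As CscXDocField
   ' Load the saved copy of this document from disk; its field values are used as the truth
   TruthDoc.Load(pXDoc.FileName)
   ' Open for Append so that the rows from every document accumulate in one file
   Open "c:\temp\parascript_alpha.txt" For Append As #1
   For F=0 To pXDoc.Fields.Count-1
      Set Field=pXDoc.Fields(F)
      ' Only compare fields that were found on a page and that also exist in the truth document
      If Field.PageIndex>-1 And TruthDoc.Fields.Exists(Field.Name) Then
         Set Truth=TruthDoc.Fields.ItemByName(Field.Name)
         ' Skip fields that have no truth value
         If Truth.Text<>"" Then
            ' Tab-separated row: file name, empty column, field name, truth text, extracted text, confidence, edit distance
            Print #1, FileName_WithoutPath(pXDoc.FileName)   & vbTab & vbTab & Field.Name & vbTab & Truth.Text & vbTab & Field.Text;
            Print #1, vbTab;
            Print #1, Format(Field.Confidence,"0.00%") & vbTab & Format(String_LevenshteinDistance(Field.Text,Truth.Text,IgnoreCase:=True))
         End If
      End If
   Next
   Close #1
End Sub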
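
The last column written by the script is the Levenshtein distance between the extracted text and the truth text, compared without regard to case. To illustrate what that number means, here is a minimal Python sketch of the metric. It is not the String_LevenshteinDistance helper that the script calls (that one must be added to the project script in WinWrap Basic), and the function and parameter names below are hypothetical.

```python
def levenshtein_distance(a: str, b: str, ignore_case: bool = False) -> int:
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from the first i characters of a to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]
```

A distance of 0 means the extracted text matches the truth text (ignoring case when `ignore_case` is set); any larger value counts the character edits that separate the two.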

Add these functions as well: String_LevenshteinDistance, Min, Max and FileName_WithoutPath.
Replace the filename in the script with a name that makes sense for your configuration. I have parascript_alpha.txt because I am testing Parascript's Alphabetic OCR profile.
Select all of your documents (CTRL-A) in the Test Window.
Run "Extract Documents" (F6) to test all of your documents.
Open the text file in Visual Studio Code. While the documents are being extracted you can watch the results file update live.
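
Each line of the results file holds seven tab-separated columns; the second is empty because the script writes two tabs after the file name. As a rough illustration of that layout, the following Python sketch parses one line into a record. The `ResultRow` type and `parse_result_line` function are hypothetical names, and the sketch assumes that neither the truth text nor the extracted text contains a tab or a line break.

```python
from typing import NamedTuple


class ResultRow(NamedTuple):
    file_name: str
    field_name: str
    truth_text: str
    extracted_text: str
    confidence: float  # fraction between 0.0 and 1.0
    distance: int      # Levenshtein distance to the truth text


def parse_result_line(line: str) -> ResultRow:
    """Parse one tab-separated line written by the script."""
    columns = line.rstrip("\r\n").split("\t")
    if len(columns) != 7:
        raise ValueError(f"expected 7 columns, found {len(columns)}")
    (file_name, _empty, field_name, truth_text,
     extracted_text, confidence, distance) = columns
    # The script formats confidence as a percentage such as "95.00%".
    # A decimal comma is accepted too, in case regional settings produce one.
    percent = float(confidence.rstrip("%").replace(",", "."))
    return ResultRow(file_name, field_name, truth_text,
                     extracted_text, percent / 100.0, int(distance))
```

A line with a different number of columns raises `ValueError`, which is a quick way to spot field values that contain tabs or line breaks.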

Copy data to Excel

When Extraction has finished, copy the data from the output file (CTRL-A, CTRL-C).
Duplicate the Worksheet in the Excel Document. Rename it. Paste data into A7 (CTRL-V).
Add your data to the Summary page by just changing the reference of the cell in column A to point to the first cell in your new Worksheet.
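Go to Visual Studio Code, delete the contents of the file (CTRL-A, Delete) and then save it (CTRL-S). The script opens the file for Append, so clearing it ensures that the next run starts with an empty file.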
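
To give a feel for the trade-off that the confidence and distance columns expose, here is a small Python sketch that counts, for a few candidate thresholds, how many fields would be accepted and how many of those are wrong. It is only an illustration under stated assumptions: it does not reproduce the formulas in the Excel document, the function name `sweep_thresholds` is hypothetical, and it treats a field as wrong whenever its distance is greater than 0.

```python
from typing import Iterable, List, Tuple


def sweep_thresholds(
    rows: Iterable[Tuple[float, int]],
    thresholds: Iterable[float],
) -> List[Tuple[float, int, int]]:
    """For each threshold return (threshold, accepted, accepted_but_wrong).

    rows holds (confidence, distance) pairs, with confidence as a
    fraction between 0.0 and 1.0.
    """
    rows = list(rows)
    summary = []
    for threshold in thresholds:
        # A field is accepted when its confidence reaches the threshold.
        accepted = [distance for confidence, distance in rows
                    if confidence >= threshold]
        # Assumption: any non-zero distance counts as a wrong value.
        wrong = sum(1 for distance in accepted if distance > 0)
        summary.append((threshold, len(accepted), wrong))
    return summary


if __name__ == "__main__":
    sample = [(0.99, 0), (0.90, 1), (0.80, 0), (0.50, 3)]  # made-up rows
    for threshold, accepted, wrong in sweep_thresholds(sample, [0.5, 0.85, 0.95]):
        print(f"{threshold:.2f}: accepted={accepted} wrong={wrong}")
```

Raising the threshold accepts fewer fields but lets fewer wrong values through; where to set it depends on how costly a wrong value is compared with a manual check.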
