Post‐processing flow

Mustafa Batuhan Ceylan edited this page Oct 31, 2023 · 1 revision

Base64.ai Post Processing Documentation

This document explains and demonstrates how to do common workflows inside Base64.ai's Post-Processing flow.

1. Code node

A Code node changes the document output by modifying the result object.

A newly added Code node contains the following code:

// Loop over input items and add a new field called 'myNewField' to the JSON of each one
for (const item of $input.all()) {
  const result = item.json.body.results[0];
}

return $input.all();

Any changes made to the result object here are reflected in the output, provided the Code node is connected, directly or through other nodes, to the Respond to Webhook node.

1.1 Adding a new field

As shown above, the Code node works on the result object. To add a new field, add an entry to the result.fields object, which holds the extracted fields.

This code snippet shows adding a new field using constant values:

// Loop over input items and add a new field called 'fizz' to the first result of each one
for (const item of $input.all()) {
  const result = item.json.body.results[0];

  result.fields['fizz'] = {
    key: 'Fizz',
    value: 'Buzz',
    confidence: 0.99, // optional
    isValid: true // optional
  };
}

return $input.all();
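
A new field can also be derived from fields that already exist. The following is a minimal sketch, assuming the result contains firstName and lastName fields; these field names, the fullName field, and the addFullName helper are hypothetical and not part of Base64.ai's API. The sample object stands in for item.json.body.results[0] so that the sketch runs outside a Code node.

```javascript
// Hypothetical helper: builds a 'fullName' field from two existing fields.
// The field names 'firstName', 'lastName' and 'fullName' are assumptions.
function addFullName(result) {
  const first = result.fields?.firstName?.value;
  const last = result.fields?.lastName?.value;
  if (!first || !last) {
    return result; // leave the result untouched when a source field is missing
  }
  result.fields.fullName = {
    key: 'Full name',
    value: `${first} ${last}`,
  };
  return result;
}

// Stand-in for item.json.body.results[0]
const sampleResult = {
  fields: {
    firstName: { key: 'First name', value: 'Jane' },
    lastName: { key: 'Last name', value: 'Doe' },
  },
};
addFullName(sampleResult);
console.log(sampleResult.fields.fullName.value); // prints "Jane Doe"
```

Inside a Code node, the helper would be called as addFullName(item.json.body.results[0]) within the loop over $input.all().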

1.2 Validating a field

Similar to adding a new field, validating an existing field is as simple as setting its isValid key to true (or to false to mark it as invalid).

This code snippet marks the expiration date as valid only if it is earlier than the current date and time:

// Loop over input items and set isValid on the expiration date of the first result of each one
for (const item of $input.all()) {
  const result = item.json.body.results[0];

  const expirationDate = result.fields.expirationDate?.value;
  if (expirationDate) {
    const expirationDateObj = new Date(expirationDate);
    const today = new Date();
    result.fields.expirationDate.isValid = expirationDateObj < today;
  }
}

return $input.all();
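
When the extracted value cannot be parsed, new Date(...) produces an invalid date, and every comparison against it evaluates to false. The sketch below makes that case explicit and takes the reference date as a parameter so the check is repeatable. It assumes the field value is a string that Date can parse (an ISO date here) and that unparseable values should be marked invalid; validateDateBefore is a hypothetical helper.

```javascript
// Hypothetical helper: marks a date field valid only if it parses
// and falls before `reference`. Returns null when there is nothing to validate.
function validateDateBefore(field, reference) {
  if (!field || !field.value) {
    return null;
  }
  const parsed = new Date(field.value);
  if (Number.isNaN(parsed.getTime())) {
    field.isValid = false; // assumption: unparseable dates count as invalid
    return false;
  }
  field.isValid = parsed < reference;
  return field.isValid;
}

// Stand-ins for result.fields.expirationDate
const reference = new Date('2023-10-31T00:00:00Z');
const expired = { key: 'Expiration date', value: '2020-01-15' };
const garbled = { key: 'Expiration date', value: 'not a date' };

console.log(validateDateBefore(expired, reference)); // prints true
console.log(validateDateBefore(garbled, reference)); // prints false
console.log(validateDateBefore(undefined, reference)); // prints null
```

Inside a Code node, the call would be validateDateBefore(result.fields.expirationDate, new Date()).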

1.3 Changing review status

Base64.ai's Human-In-The-Loop (HITL) system allows documents to be reviewed. The review status can also be changed programmatically during the post-processing step.

To do this, set result.status to one of the following values:

  • approved
  • rejected
  • needsReview (This is the default value if reviewing is enabled)
  • autoApproved (This is the default value if reviewing is disabled)

This code snippet rejects a result if the document type is not a driver license:

// Loop over input items and reject the first result of each one unless it is a driver license
for (const item of $input.all()) {
  const result = item.json.body.results[0];
  if (!result.model.type.startsWith('driver_license')) {
    result.status = 'rejected';
  }
}

return $input.all();
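
The same approach can route uncertain results to a reviewer. The following is a minimal sketch, assuming fields carry the optional confidence value shown in section 1.1; the flagLowConfidence helper and the 0.8 threshold are hypothetical, and whether a needsReview status takes effect may depend on reviewing being enabled for the flow.

```javascript
// Hypothetical helper: sets the status to 'needsReview' when any field
// reports a confidence below `threshold`. Fields without a numeric
// confidence are ignored.
function flagLowConfidence(result, threshold) {
  const fields = Object.values(result.fields ?? {});
  const hasLowConfidence = fields.some(
    (field) => typeof field.confidence === 'number' && field.confidence < threshold
  );
  if (hasLowConfidence) {
    result.status = 'needsReview';
  }
  return result;
}

// Stand-in for item.json.body.results[0]; the values are made up for illustration
const sampleResult = {
  status: 'autoApproved',
  fields: {
    name: { key: 'Name', value: 'Jane Doe', confidence: 0.98 },
    expirationDate: { key: 'Expiration date', value: '2020-01-15', confidence: 0.42 },
  },
};
flagLowConfidence(sampleResult, 0.8);
console.log(sampleResult.status); // prints "needsReview"
```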

1.4 Reading and writing table values

Tables are stored in two locations in the result object:

  • result.features.tables - holds the table data without any location info or complex structure
  • result.features.dom.pages[i].tables - holds the table data with its location information and complex structure of rows and cells

In post-processing, we recommend working with the more complex structure: it carries more information, and our review system is based on it. The simpler structure can also be used, but it does not tie into the review system, so use it only if your workflow specifically needs it.

Changes made to result.features.tables are not reflected on our HITL review page and are not preserved after review. For that reason, the simpler structure is better suited to the Export flow.

This code snippet loops over the existing table structure and overwrites every cell value with a constant:

// Loop over input items and overwrite the text of every table cell in every result
for (const item of $input.all()) {
  const results = item.json.body.results;
  for (const result of results) {
    // Fall back to an empty list when a result has no DOM pages or a page has no tables
    for (const page of result?.features?.dom?.pages ?? []) {
      for (const table of page.tables ?? []) {
        for (const row of table.rows) {
          for (const cell of row.cells) {
            cell.text = 'your text';
          }
        }
      }
    }
  }
}

return $input.all();
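
Reading table values follows the same traversal. The following is a minimal sketch that collects the text of every cell into nested arrays (tables, then rows, then cell strings). It assumes only the pages, tables, rows, cells, and text keys used in the snippet above; readTables is a hypothetical helper, and the sample object stands in for one entry of item.json.body.results.

```javascript
// Hypothetical helper: returns one array per table, each holding
// one array of cell texts per row.
function readTables(result) {
  const tables = [];
  for (const page of result?.features?.dom?.pages ?? []) {
    for (const table of page.tables ?? []) {
      tables.push(
        (table.rows ?? []).map((row) => (row.cells ?? []).map((cell) => cell.text))
      );
    }
  }
  return tables;
}

// Stand-in for a result; the table content is made up for illustration
const sampleResult = {
  features: {
    dom: {
      pages: [
        {
          tables: [
            {
              rows: [
                { cells: [{ text: 'Item' }, { text: 'Price' }] },
                { cells: [{ text: 'Pen' }, { text: '1.50' }] },
              ],
            },
          ],
        },
      ],
    },
  },
};
console.log(JSON.stringify(readTables(sampleResult)));
// prints [[["Item","Price"],["Pen","1.50"]]]
```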

1.5 Reading signatures and faces from result

Base64.ai's document understanding suite can detect signatures and faces in documents. Signatures are stored in the result.features.signatures array, and faces are stored in the result.features.faces array.

Each array holds objects that contain the bounding box of the signature or face and the cropped image itself, as a PNG image in Data URL format.

Elements of both arrays share the same structure, meaning they have the same keys.

Here is an example of a signature object under result.features.signatures[i]:

{
  "topLeft": {
    "x": 124,
    "y": 427
  },
  "topRight": {
    "x": 245,
    "y": 357
  },
  "bottomLeft": {
    "x": 136,
    "y": 449
  },
  "bottomRight": {
    "x": 258,
    "y": 379
  },
  "pageNumber": 0,
  "left": 134,
  "top": 432,
  "width": 140,
  "height": 25,
  "right": 274,
  "bottom": 457,
  "confidence": 0.85,
  "image": "..."
}
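
To read these values in a Code node, the array can be filtered by the keys shown above. The following is a minimal sketch that selects the signatures on a given page whose confidence meets a threshold; findSignatures is a hypothetical helper, the 0.8 threshold is an assumption, and the second sample entry is made up for illustration. Because faces share the same keys, the same filter would apply to result.features.faces.

```javascript
// Hypothetical helper: returns the signatures found on `pageNumber`
// whose confidence is at least `minConfidence`.
function findSignatures(result, pageNumber, minConfidence) {
  return (result?.features?.signatures ?? []).filter(
    (signature) =>
      signature.pageNumber === pageNumber && signature.confidence >= minConfidence
  );
}

// Stand-in for a result, reduced to the keys used here
const sampleResult = {
  features: {
    signatures: [
      { pageNumber: 0, left: 134, top: 432, width: 140, height: 25, confidence: 0.85, image: '...' },
      { pageNumber: 1, left: 40, top: 600, width: 120, height: 30, confidence: 0.4, image: '...' },
    ],
  },
};

const found = findSignatures(sampleResult, 0, 0.8);
console.log(found.length); // prints 1
console.log(found[0].width * found[0].height); // prints 3500 (bounding box area)
```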