Examples for TRP.js

This folder contains example projects that use the Amazon Textract Response Parser for JavaScript/TypeScript from a range of build environments, to help you get started.

⚠️ Note: While all of the example projects reference local API response JSON files, some also make Amazon Textract API calls by default - so running them may incur (typically very small) charges. See Amazon Textract Pricing for details.

Pre-requisites for running the examples

Local builds of TRP.js

The projects use the local build of the library for pre-publication testing, so you'll need to run npm run build in the parent src-js folder before they'll work.

To instead switch to published TRP.js versions (if you're using an example as a skeleton for your own project):

  • For NodeJS projects, replace the package.json relative path in "amazon-textract-response-parser": "file:../.." with a normal version spec like "amazon-textract-response-parser": "^0.4.0", and re-run npm install
  • For browser IIFE projects, edit the <script> tag in the HTML to point to your chosen CDN or downloaded trp.min.js location

API credentials for Amazon Textract

For the example projects that demonstrate actual integration with Amazon Textract, we create a TextractClient with empty configuration. This assumes that your AWS IAM credentials and default region are pre-configured for access through e.g. environment variables.

If you're new to setting up AWS credentials for CLI and SDK access in general, refer to the credentials guidance in the AWS SDK for JavaScript (v3) Developer Guide and/or the AWS CLI user guide.

Working with multi-page documents or many documents at once

The 'synchronous' request/response APIs used in these examples generally only support images or single-page documents. Multi-page documents need the asynchronous Amazon Textract APIs instead. Since asynchronous APIs like StartDocumentAnalysis return a job ID rather than an immediate result, applications need to wait and then call GetDocumentAnalysis to retrieve the result once it's ready. You'll also need to upload the source document to Amazon S3 rather than passing it directly in the API request.
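As a rough illustration of the "wait, then retrieve" step, the sketch below polls until a job leaves the IN_PROGRESS state and then follows NextToken to collect every page of results. It is not part of these examples or of TRP.js: the getPage adapter, the fetchAnalysisBlocks name, and the poll interval and limit are all hypothetical, and the response shape only mirrors the GetDocumentAnalysis fields the sketch actually reads. Polling like this is convenient for quick experiments, but it consumes API quota on every call.

```typescript
// Subset of the GetDocumentAnalysis response fields used by this sketch.
type JobStatus = "IN_PROGRESS" | "SUCCEEDED" | "FAILED" | "PARTIAL_SUCCESS";

interface AnalysisPage {
  JobStatus?: JobStatus;
  StatusMessage?: string;
  Blocks?: unknown[];
  NextToken?: string;
}

// Hypothetical adapter: wrap your own SDK call (for example, sending a
// GetDocumentAnalysis request for the job ID) in a function of this shape.
type GetPage = (jobId: string, nextToken?: string) => Promise<AnalysisPage>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fetchAnalysisBlocks(
  getPage: GetPage,
  jobId: string,
  pollIntervalMs = 5000, // assumed default, tune for your quota
  maxPolls = 60, // assumed default, gives up after this many checks
): Promise<unknown[]> {
  // 1. Poll until the job is no longer in progress.
  let page = await getPage(jobId);
  for (let polls = 1; page.JobStatus === "IN_PROGRESS"; polls++) {
    if (polls >= maxPolls) {
      throw new Error(`Job ${jobId} still running after ${maxPolls} polls`);
    }
    await sleep(pollIntervalMs);
    page = await getPage(jobId);
  }

  // This sketch treats anything other than SUCCEEDED (including
  // PARTIAL_SUCCESS) as an error; your application may choose differently.
  if (page.JobStatus !== "SUCCEEDED") {
    const reason = page.StatusMessage ?? "no status message";
    throw new Error(`Job ${jobId} ended as ${page.JobStatus}: ${reason}`);
  }

  // 2. Results can be paginated: follow NextToken until it is absent.
  const blocks = [...(page.Blocks ?? [])];
  while (page.NextToken) {
    page = await getPage(jobId, page.NextToken);
    blocks.push(...(page.Blocks ?? []));
  }
  return blocks;
}
```

Passing the API call in as a function keeps the sketch independent of any particular SDK version and makes the loop easy to exercise with a fake in unit tests.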

Furthermore, Amazon Textract applies quota limits on these APIs.

As a result, applications processing multi-page documents will generally need to orchestrate uploading the source file to S3; starting the analysis job; and resuming the processing flow once notified via Amazon SNS that the analysis is ready (which is much more quota-efficient than polling the GetDocumentAnalysis API). Particularly spiky workflows (where many documents are submitted at once) may also want to implement queuing to manage inbound request rates.
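To make the "resume once notified" step more concrete, here is a minimal sketch of how a subscriber (such as an AWS Lambda function) might unpack the completion messages delivered through Amazon SNS. It assumes the notification body is a JSON string carrying the JobId, Status, API, JobTag, Timestamp and DocumentLocation fields described in the Amazon Textract developer guide; the SnsEvent type is a trimmed-down stand-in for the real event, and splitNotifications is a hypothetical helper name, so check both against your own payloads.

```typescript
// Assumed shape of the completion message published by Amazon Textract.
interface TextractJobNotification {
  JobId: string;
  Status: string; // e.g. "SUCCEEDED" or "FAILED"
  API: string; // e.g. "StartDocumentAnalysis"
  JobTag?: string;
  Timestamp: number;
  DocumentLocation: { S3ObjectName: string; S3Bucket: string };
}

// Trimmed-down stand-in for the SNS event a subscriber receives:
// each record carries the notification as a JSON string in Sns.Message.
interface SnsEvent {
  Records: Array<{ Sns: { Message: string } }>;
}

interface SplitResult {
  succeeded: TextractJobNotification[];
  other: TextractJobNotification[];
}

// Hypothetical helper: separate jobs that are ready to fetch from the rest.
function splitNotifications(event: SnsEvent): SplitResult {
  const result: SplitResult = { succeeded: [], other: [] };
  for (const record of event.Records) {
    const note = JSON.parse(record.Sns.Message) as TextractJobNotification;
    if (note.Status === "SUCCEEDED") {
      result.succeeded.push(note);
    } else {
      result.other.push(note);
    }
  }
  return result;
}
```

Each entry in succeeded carries the JobId needed to call GetDocumentAnalysis, so results are fetched only once they are known to be ready, while entries in other can be logged or routed to error handling.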

A full end-to-end solution for this involves deploying cloud infrastructure like AWS Lambda functions and Amazon SNS topics, so is outside the scope of these TRP samples. Instead, refer to: