(C) JP.Cordeiro for CATS
This repo contains AWS SAM templates that deploy a proof of value for a serverless ECM application. The application uses Amazon ML services such as Comprehend and Rekognition to analyze documents and images, and then sends the results to Amazon DynamoDB for indexing.
This application uses an event-based architecture, with Amazon EventBridge as the serverless event bus.
.
├── README.MD                   <-- This instructions file
├── template.yaml               <-- SAM template for the application
├── analyzers                   <-- Source code for Lambda functions
│   ├── analyzeText             <-- Text analyzer
│   ├── analyzeImage            <-- Image analyzer
│   └── package.json            <-- NodeJS dependencies and scripts
├── converters                  <-- Source code for Lambda functions
│   ├── processDOCX             <-- Converts DOCX files into text
│   ├── processPDF              <-- Converts PDF files into text
│   └── package.json            <-- NodeJS dependencies and scripts
├── loaders                     <-- Source code for Lambda functions
│   ├── loadToDB                <-- Loads indexing info into DB and target bucket
│   └── package.json            <-- NodeJS dependencies and scripts
└── parser                      <-- Source code for a Lambda function
    ├── parserFunction          <-- Parses input bucket
    └── package.json            <-- NodeJS dependencies and scripts
- AWS CLI and AWS SAM CLI configured with Administrator permissions
- NodeJS 12.x installed
- You can use GitPod (https://www.gitpod.io/) as your development environment with all of these requirements.
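The requirements above can be checked before deploying. The following is a minimal sketch and is not part of this repo: the function names have_cmd, node_major and check_prereqs are made up for illustration, and the script only checks that the tools are on the PATH, not that the AWS credentials carry Administrator permission.

```shell
#!/usr/bin/env bash
# Sketch: check that the tools listed in the requirements are available.

# Succeeds if the given command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints the major version from a string such as "v12.22.1".
node_major() {
  v=${1#v}
  printf '%s\n' "${v%%.*}"
}

# Reports each missing tool; returns non-zero if any tool is missing.
check_prereqs() {
  missing=0
  for tool in aws sam node; do
    if ! have_cmd "$tool"; then
      echo "missing: $tool"
      missing=1
    fi
  done
  if have_cmd node; then
    major=$(node_major "$(node --version)")
    if [ "$major" != "12" ]; then
      echo "note: NodeJS major version is $major, this README expects 12.x"
    fi
  fi
  return "$missing"
}

if check_prereqs; then
  echo "All required tools were found."
else
  echo "Install the missing tools before deploying."
fi
```

To see which AWS identity the CLI is configured with, aws sts get-caller-identity can be run separately.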
- Clone the repo onto your local development machine using git clone.
- Then run:
sam build -u
sam deploy --guided --capabilities CAPABILITY_NAMED_IAM
Follow the prompts in the deploy process to set the stack name, AWS Region, unique bucket names, DynamoDB domain endpoint, and other parameters.
Once completed, run
aws s3 sync ACQsite/. s3://myecm-s3-acq
pushd viewsite
myURL=$(aws ssm get-parameter --name ScanAPIUrl | jq -r '.Parameter.Value')
sed -i "/.get(/c\ .get('${myURL}')" src/FeaturedDocuments.js
npm install
npm run build
popd
aws s3 sync viewsite/build/. s3://myecm-s3-view
to install the acquisition and viewer websites.
The Acquisition URL will be: https://myecm-s3-acq.s3-eu-west-1.amazonaws.com/index.html
The Viewer URL will be: https://myecm-s3-view.s3-eu-west-1.amazonaws.com/index.html
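Because bucket names must be unique, the bucket names and Region in your deployment may differ from the URLs above. The following is a small sketch of a helper that rebuilds the URL from a bucket name and Region. The name site_url is hypothetical and not part of this repo, and the host pattern is simply copied from the URLs above, so it may not apply to every Region.

```shell
#!/usr/bin/env bash
# Prints the website URL for a bucket, following the URL pattern shown above.
# $1 = bucket name, $2 = AWS Region (defaults to eu-west-1, as in this README).
site_url() {
  bucket=$1
  region=${2:-eu-west-1}
  printf 'https://%s.s3-%s.amazonaws.com/index.html\n' "$bucket" "$region"
}

# Bucket names taken from the sync commands; replace them with the
# names chosen during the deploy prompts.
echo "Acquisition: $(site_url myecm-s3-acq)"
echo "Viewer:      $(site_url myecm-s3-view)"
```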
You can also use the provision and decommission scripts to fully automate the process.
- Upload PDF, DOCX, JPG or PNG files to the TP or BATCH buckets.
- After a few seconds, the index in DynamoDB is updated with labels and entities for the object, and the files are moved to the ECS bucket.
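The two steps above can also be tried from the command line. The following is a sketch under stated assumptions: BATCH_BUCKET and INDEX_TABLE are placeholders for the names chosen at deploy time, the helper names are hypothetical, and the update is detected by comparing the item count before and after the upload, which is only reasonable for a small proof-of-value table.

```shell
#!/usr/bin/env bash
# Placeholders (assumptions): set these to the names chosen at deploy time.
BATCH_BUCKET=${BATCH_BUCKET:-my-batch-bucket}
INDEX_TABLE=${INDEX_TABLE:-my-index-table}

# Runs a command up to $1 times, sleeping $2 seconds between attempts.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Prints the number of items currently in the index table.
index_count() {
  aws dynamodb scan --table-name "$INDEX_TABLE" \
    --select COUNT --query Count --output text
}

# Succeeds once the table holds more items than the given count.
index_grew_past() {
  c=$(index_count) || return 1
  [ "$c" -gt "$1" ]
}

# Uploads one file, then polls the index for up to about 30 seconds.
upload_and_check() {
  before=$(index_count) || return 1
  aws s3 cp "$1" "s3://$BATCH_BUCKET/" || return 1
  retry 10 3 index_grew_past "$before"
}
```

With the placeholders set, a call such as upload_and_check ./sample.pdf uploads the file and returns success once a new item appears in the table.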
==============================================
This application's features are derived from the Serverless Document Repository repo provided by Amazon.
SPDX-License-Identifier: MIT-0