AWS Lambda to tag pictures in S3

1. Overview

AWS S3 Service supports the tagging feature described here. The best practices for tagging schemas are described here. This Lambda take part from an application according to the below architecture where it is highlighted with red: manage tags.

2. Application functionality

Tags for the pictures are calculated by many Lambda microservices (orange squares in diagram), using invocations of AWS Machine Learning Rekognition service (black squares in diagram), and inserted into DynamoDB database (blue squares in diagram). Each Lambda calculate different tags, specified in the above diagram. These tags and the results of the ML invocations are stored in DynamoDB database. When each record with tags is inserted into the database, DynamoDB sends an event to DynamoDB Stream. The events from this stream are forwarded to this Lambda. This Lambda waits for all the workers to complete, take all tags for one Picture, and tags the Picture in S3.

The red squares from diagram are topics, from Simple Notification Service, that distribute and filter the events to Lambda.

The green squares from diagram are buckets, from Simple Storage Service, that store the initial Picture and the output of the Lambda workers (image and text files).

The application is a Proof Of Concept for AWS ML Rekognition Service and also for a data concept: from unstructured to structured data. The unstructured data are the initial pictures that are uploaded by the User into S3. All the above lambda are triggered when each picture is uploaded into S3. The structured data are the output JSON structures of the AWS ML service, and the tags calculated by Lambdas.

3. AWS SDK features used

Lambda handler supports messages from multiple sources that are detected in JSON: DynamoDB and SQS
Lambda handler messages are received as JSON and converted to DynamoDB events
Lambda handler output is returned to Lambda service as graph of POJO that are converted to JSON by Lambda service; when input event contains multiple records, the output contain one Response object for each record
The log is done in all application using Log4j2, that sends output to AWS CloudWatch service
Each AWS client used is encapsulated in a POJO application service
The application uses a static client (connection) for each AWS service used; this connection is used by all the instances of the Lambda handler generated by Lambda service (AWS client is thread safe)
All the Lambdas from the application use a common archive that contains the Tags Schema and Domain POJO (in libs folder: tagscommons-1.0.1.jar)
Use DynamoDb Enhanced Client to query DynamoDB on Partition Key
Use S3 Client to perform tags operations on S3 objects
The JUnit4 tests use AWS messages from JSON files

4. Tag Schema

The Tag Schema used to tag one picture with the structured data from the same picture, was defined particularly for the business of this application: pictures taken by me in travels and exhibitions. An extensive example of how to define tags schemas could be found here

The content and versions of Tag Schema are presented in TagSchema.md file. S3 tagging system is limited to 10 tags for one Picture. The pictures are tagging automatically and consistent due to the schema and the application. This is a base condition for a valuable categorisation and query of the structured data. Below is the last version of the schema. The naming of the tags were chosen according to AWS best-practices, using a structured approach (hierarchy of concepts) and a strategy according to the business of personal data. The Use Case of the schema is Data Classification and Query. Other Use Cases could be found here

Version 3_05.08.2023_9tags

Tag	Possible values
personal:data:content:has_faces	1,2,3,4,5_max
personal:data:content:has_faces:is_me	true
personal:data:content:has_faces:is_me:face_occluded	true
personal:data:content:has_text	true
personal:data:content:labels:has_persons	1,2,3,...
personal:data:content:labels:has_landmarks	true
personal:data:content:labels:dominant_category	{AWS Rekognition category: 41 categories}
personal:data:content:labels:dominant_label	{AWS Rekognition label: 3000+ labels}
personal:schema:version	3_05.08.2023_9tags

5. API documentation

The project has API documentation here (javadoc.html)

6. Test in Eclipse

The project has JUnit4 tests that can run using Eclipse feature Run As -> JUnit Test, on file /src/test/java/com/amazonaws/lambda/mihai/tagpicture/test/AllTests.java

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
javadoc		javadoc
libs/com/amazonaws/lambda/mihai/tagscommons		libs/com/amazonaws/lambda/mihai/tagscommons
src		src
README.md		README.md
TagSchema.md		TagSchema.md
aws_tag_pictures_v3_b2_w2.png		aws_tag_pictures_v3_b2_w2.png
javadoc.html		javadoc.html
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

javadoc

javadoc

libs/com/amazonaws/lambda/mihai/tagscommons

libs/com/amazonaws/lambda/mihai/tagscommons

src

src

README.md

README.md

TagSchema.md

TagSchema.md

aws_tag_pictures_v3_b2_w2.png

aws_tag_pictures_v3_b2_w2.png

javadoc.html

javadoc.html

pom.xml

pom.xml

Repository files navigation

AWS Lambda to tag pictures in S3

1. Overview

2. Application functionality

3. AWS SDK features used

4. Tag Schema

Version 3_05.08.2023_9tags

5. API documentation

6. Test in Eclipse

About

Releases

Packages

Languages

mihai1git/TagS3Pictures

Folders and files

Latest commit

History

Repository files navigation

AWS Lambda to tag pictures in S3

1. Overview

2. Application functionality

3. AWS SDK features used

4. Tag Schema

Version 3_05.08.2023_9tags

5. API documentation

6. Test in Eclipse

About

Resources

Stars

Watchers

Forks

Languages