A simple file obfuscation/ingestion pipeline that runs on AWS with automated infrastructure deployment and execution.

tigstep/file_obfuscator_importer_pipeline

file_obfuscator_importer_pipeline

Diagram

alt text

Requirements

This project requires Terraform, Python3 and an AWS account.

Tools/Services Used

  • Python3
  • Terraform
  • AWS
    • EC2
    • S3
    • Lambda
    • RDS
    • Step Functions
    • SNS

Short Description

A simple file obfuscation/ingestion pipeline that runs on AWS with automated infrastructure deployment and execution

Process Description

  • A file is uploaded to an S3 bucket from an EC2 instance.
  • The put event generates a notification that triggers the sfn_triggerer Lambda, which in turn starts the state machine.
  • Inside the state machine:
    • The File Obfuscator Lambda queries the configuration table (created during deployment) for the columns that must be obfuscated for the given file name, obfuscates those columns in the source file, and writes the result to a separate S3 prefix.
    • Next, the rds_inserter Lambda inserts the original and the obfuscated files into two separate RDS MySQL tables.
    • Finally, the notifier Lambda publishes a Success/Failure SNS notification to its topic, based on the outcome of the previous states of the state machine. The published notification is then available for future consumption.
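
The obfuscation step can be illustrated with a minimal sketch. It is not the repository's actual Lambda code: the `CONFIG` dictionary stands in for the MySQL configuration table, the file and column names are hypothetical, the input is assumed to be CSV, and replacing values with a truncated SHA-256 digest is an assumed obfuscation scheme.

```python
import csv
import hashlib
import io

# Hypothetical stand-in for the configuration table:
# file name -> names of the columns that must be obfuscated.
CONFIG = {"customers.csv": ["email", "phone"]}


def obfuscate_value(value: str) -> str:
    """Replace a value with a truncated SHA-256 hex digest (assumed scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def obfuscate_csv(file_name: str, csv_text: str) -> str:
    """Return csv_text with the configured columns for file_name obfuscated."""
    columns = set(CONFIG.get(file_name, []))
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in columns:
            # Only touch configured columns that actually exist in this file.
            if row.get(col) is not None:
                row[col] = obfuscate_value(row[col])
        writer.writerow(row)
    return out.getvalue()
```

In a Lambda, the input text would come from the uploaded S3 object and the returned text would be written under the separate output prefix; files without an entry in the configuration pass through unchanged.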

Execution

To execute the pipeline, run the wrapper.py script.

Execution Process Description

  • wrapper.py executes three steps in sequence:
    • lambda_deployer.py - Looks into /src/lambda and creates the AWS Lambda deployment packages
    • terraform apply - Deploys the AWS infrastructure using the included Terraform script and the deployment packages created in the previous step
    • sql_executor.py - Creates and populates the configuration table in the RDS instance that was created during the terraform apply step
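
The sequence above can be sketched as follows. This is an assumption about how such a wrapper might be structured, not the contents of the actual wrapper.py: the function names `build_steps` and `run_steps` are hypothetical, and the commands are assumed to be run from the repository root.

```python
import subprocess
import sys


def build_steps(python=sys.executable):
    """Return the three commands in the order described above."""
    return [
        [python, "lambda_deployer.py"],  # build the Lambda deployment packages
        ["terraform", "apply"],          # deploy the AWS infrastructure
        [python, "sql_executor.py"],     # create and populate the config table
    ]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; stop at the first failure.

    With check=True, subprocess.run raises CalledProcessError on a
    non-zero exit code, so later steps never run after a failed one.
    """
    completed = []
    for cmd in steps:
        runner(cmd, check=True)
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    run_steps(build_steps())
```

Stopping on the first failure matters here because each step depends on the previous one: Terraform needs the deployment packages, and the SQL step needs the RDS instance that Terraform creates.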

To Do

  • Implement better logging
  • Use Redis for configuration lookup instead of MySQL
  • Improve the security by making NACLs and SGs stricter

Warnings

  • The current configuration of this project uses AWS services that are beyond the Free Tier!
