An automated, event-driven data processing pipeline built on AWS. This project uses a serverless architecture to automatically convert CSV files uploaded to an S3 bucket into JSON format.
The entire cloud infrastructure is defined using Infrastructure as Code (IaC) with the AWS Serverless Application Model (SAM) framework.
This application follows a simple, powerful, and common serverless pattern:
- A user uploads a `.csv` file (e.g., `data.csv`) to a designated "source" S3 bucket.
- The S3 `ObjectCreated` event automatically triggers an AWS Lambda function.
- The Python-based Lambda function reads the CSV file, parses its contents, and converts the data into a JSON array (a sketch of such a handler follows this list).
- The function then saves the new JSON file (e.g., `data.json`) to a separate "destination" S3 bucket.
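The function's source isn't reproduced here, but a minimal sketch of such a handler might look like the following. The `DEST_BUCKET` environment variable and the handler name are illustrative assumptions, not necessarily this project's actual code:

```python
import csv
import io
import json
import os
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")

# Assumption: the SAM template passes the destination bucket name
# to the function as an environment variable.
DEST_BUCKET = os.environ["DEST_BUCKET"]


def lambda_handler(event, context):
    # Each S3 ObjectCreated notification can carry one or more records.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event payloads.
        key = unquote_plus(record["s3"]["object"]["key"])

        # Read and parse the uploaded CSV into a list of row dicts.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        rows = list(csv.DictReader(io.StringIO(body)))

        # Write the JSON array under the original name with the
        # extension swapped (e.g., data.csv -> data.json).
        dest_key = key.rsplit(".", 1)[0] + ".json"
        s3.put_object(
            Bucket=DEST_BUCKET,
            Key=dest_key,
            Body=json.dumps(rows, indent=2),
            ContentType="application/json",
        )
```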
- Fully Serverless: No servers to provision or manage. The application scales automatically with demand.
- Event-Driven: The processing pipeline is triggered by real-time events, making it efficient and responsive.
- Infrastructure as Code (IaC): The entire stack (S3 buckets, Lambda function, and IAM permissions) is defined in the `template.yaml` file (see the excerpt after this list), allowing for repeatable and automated deployments.
- Decoupled: The source and destination buckets are separate, following best practices for data processing pipelines.
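For illustration, the wiring between the source bucket and the function in a SAM template generally looks like the excerpt below. The resource and handler names here are assumptions; refer to the project's actual `template.yaml` for the real definitions:

```yaml
Resources:
  SourceBucket:
    Type: AWS::S3::Bucket

  CsvToJsonFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.11
      Handler: app.lambda_handler  # assumed module/function names
      Events:
        CsvUpload:
          Type: S3
          Properties:
            Bucket: !Ref SourceBucket  # must reference a bucket defined in this template
            Events: s3:ObjectCreated:*
```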
This project showcases proficiency in modern cloud-native development:
- Cloud: Amazon Web Services (AWS)
- Serverless: AWS Lambda, AWS SAM
- Storage: Amazon S3
- Programming: Python 3.11, Boto3 (AWS SDK)
- Infrastructure as Code (IaC): AWS CloudFormation, YAML
- Concepts: Event-Driven Architecture, Serverless Patterns, IAM Roles & Policies
Before you begin, ensure you have the following tools installed and configured (a quick way to verify them is shown after this list):
- AWS CLI: Configured with your AWS credentials (`aws configure`).
- AWS SAM CLI: The framework used to build and deploy.
- Docker: Required by SAM CLI to build the Lambda deployment package locally.
- Python 3.11
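To confirm each tool is available before deploying, every CLI can report its version:

```bash
aws --version
sam --version
docker --version
python3 --version
```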
You can deploy this entire application to your own AWS account in two commands:
1. Build the Application
This command packages the Lambda function and prepares it for deployment.
```bash
sam build
```

2. Deploy the Application
This command will guide you through the deployment process, prompting for a "Stack Name" (e.g., csv-processor) and other parameters. It will automatically create the S3 buckets and Lambda function for you.
```bash
sam deploy --guided
```

After the deployment succeeds, the SAM CLI will output the names of the two new S3 buckets.
To test the pipeline end to end:

1. Create a sample CSV file named `test.csv`:

   ```csv
   id,name,role
   1,Alice,Engineer
   2,Bob,Manager
   3,Charlie,Analyst
   ```

2. Find your source bucket name. You can find this in the output of the `sam deploy` command or in the AWS CloudFormation console's "Outputs" tab for your stack (or query it from the command line; see the example after this list).

3. Upload the file to the source S3 bucket:

   ```bash
   aws s3 cp test.csv s3://<your-source-bucket-name>/
   ```

4. Check the destination bucket. Within a few seconds, a new file named `test.json` should appear in your destination S3 bucket. You can check this in the AWS S3 console or by running:

   ```bash
   aws s3 ls s3://<your-destination-bucket-name>/
   ```

   The contents of `test.json` will be:

   ```json
   [
     {
       "id": "1",
       "name": "Alice",
       "role": "Engineer"
     },
     {
       "id": "2",
       "name": "Bob",
       "role": "Manager"
     },
     {
       "id": "3",
       "name": "Charlie",
       "role": "Analyst"
     }
   ]
   ```
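If you prefer the command line to the console, the stack outputs (including both bucket names) can be listed with the AWS CLI. The stack name below is whatever you chose during `sam deploy --guided`:

```bash
aws cloudformation describe-stacks \
  --stack-name csv-processor \
  --query "Stacks[0].Outputs" \
  --output table
```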
To completely remove all resources created by this project, run the `sam delete` command. This will delete the CloudFormation stack, both S3 buckets, the Lambda function, and all associated IAM roles. Note that CloudFormation cannot delete a bucket that still contains objects, so empty both buckets first.

```bash
sam delete
```