Pipeline is an implementation of an event pipeline built on Amazon Web Services (AWS) Simple Storage Service (S3) and Simple Queue Service (SQS). S3 can be configured to publish a notification to the Simple Notification Service (SNS) whenever a new object is created in a bucket, and SNS can then deliver that notification to an SQS queue. Pipeline consumes messages from the SQS queue, downloads the referenced object from S3, optionally decompresses it, and emits each line of data over a server-sent events (SSE) HTTP endpoint.
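The last two steps (optional decompression, then one event per line) can be sketched as follows. This is an illustration only, not the project's actual code: the class and method names (`LineEmitterSketch`, `toSseFrames`, `gzip`) are hypothetical, gzip is assumed as the compression format, and each line is assumed to be sent as an unnamed `data:` frame.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; names and framing are assumptions, not Pipeline's API.
public class LineEmitterSketch {

    /** Reads an object body line by line and formats each line as one SSE frame. */
    public static List<String> toSseFrames(InputStream body, boolean gzipped)
            throws IOException {
        // Wrap the stream only when the object is compressed (gzip assumed).
        InputStream in = gzipped ? new GZIPInputStream(body) : body;
        List<String> frames = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // An SSE frame is a "data:" field terminated by a blank line.
                frames.add("data: " + line + "\n\n");
            }
        }
        return frames;
    }

    /** Demo helper: gzip a string in memory to stand in for an S3 object. */
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] object = gzip("{\"id\":1}\n{\"id\":2}\n");
        for (String frame : toSseFrames(new ByteArrayInputStream(object), true)) {
            System.out.print(frame);
        }
    }
}
```

Reading the object as a stream rather than loading it fully into memory keeps the sketch usable for large files; a real service would write each frame to the HTTP response as it is read instead of collecting frames in a list.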
To build this code locally, clone the repository then use Maven to build the jar:
    git clone https://github.com/smoketurner/pipeline.git
    cd pipeline
    mvn package
    cd pipeline-application
    java -jar target/pipeline-application/pipeline-application-1.0.0-SNAPSHOT.jar server pipeline.yml
The Pipeline service should now be listening on port 8080 for API requests, and Dropwizard's administrative interface is available on port 8180 (both of these ports can be changed in the pipeline.yml configuration file).
In production, the Pipeline service can safely sit behind any HTTP-based load balancer (nginx, haproxy, F5, etc.).
NOTE: The Pipeline service provides no authentication or authorization of requests. It is recommended to use a separate service such as Kong or the Amazon API Gateway to authenticate and authorize users.
The Pipeline service provides RESTful URLs for consuming events.
API documentation is also available via Swagger at
    curl -X GET localhost:8080/v1/events -i
    HTTP/1.1 200 OK
    Date: Thu, 03 Dec 2015 20:22:25 GMT
    Content-Type: text/event-stream
    Transfer-Encoding: chunked

    event: ping
    data: ping
As new files are uploaded to S3 and the corresponding messages are published into the SQS queue, Pipeline consumes the SQS messages, downloads the S3 files, and publishes the events over the open HTTP connection.
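For consumers that prefer a programmatic client over curl, the sketch below reads the same stream with the JDK's built-in `java.net.http.HttpClient` (Java 11 or later). The class name `EventStreamClientSketch` and the `dispatch` helper are hypothetical, the URL assumes the default local port, and the parser is deliberately simplified: it handles only the `event:` and `data:` fields and ignores `id:`, `retry:`, and comment lines.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

// Hypothetical client sketch; not part of the Pipeline project.
public class EventStreamClientSketch {

    /** Returns the field value, dropping the single optional space after the colon. */
    private static String value(String line, int prefixLength) {
        String v = line.substring(prefixLength);
        return v.startsWith(" ") ? v.substring(1) : v;
    }

    /** Calls handler(eventName, data) once per frame; frames end at a blank line. */
    public static void dispatch(Stream<String> lines, BiConsumer<String, String> handler) {
        String[] event = {"message"}; // default SSE event name
        StringBuilder data = new StringBuilder();
        lines.forEach(line -> {
            if (line.isEmpty()) {
                // Blank line: the frame is complete, so hand it to the caller.
                if (data.length() > 0) {
                    handler.accept(event[0], data.toString());
                }
                event[0] = "message";
                data.setLength(0);
            } else if (line.startsWith("event:")) {
                event[0] = value(line, 6);
            } else if (line.startsWith("data:")) {
                // Multiple data lines in one frame are joined with newlines.
                if (data.length() > 0) {
                    data.append('\n');
                }
                data.append(value(line, 5));
            }
        });
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:8080/v1/events")) // assumed local default
                .header("Accept", "text/event-stream")
                .build();
        // ofLines() exposes the response body lazily, one line at a time.
        HttpResponse<Stream<String>> response =
                client.send(request, HttpResponse.BodyHandlers.ofLines());
        dispatch(response.body(), (event, data) -> System.out.println(event + " -> " + data));
    }
}
```

With this sketch, the ping frame shown in the curl output above reaches the handler as event `ping` with data `ping`. The `main` method blocks for as long as the server keeps the connection open.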
Please file bug reports and feature requests in GitHub issues.
Copyright (c) 2018 Smoke Turner, LLC
This library is licensed under the Apache License, Version 2.0.