
This repo dockerizes the Streamsets tutorials (https://github.com/streamsets/tutorials). The tutorials have been used as the starting point for a major project.


Vannazux/dockerizing-streamsets-tutorials

 
 


Dockerizing Streamsets tutorials

This branch sets up tutorial 2 of the Streamsets Data Collector tutorials with Docker.

Requirements:

  • Docker
  • Docker Compose
  • Streamsets sample data, placed in the folder streamsets/data/tutorial_data
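The requirements above can be checked before starting anything. The following is a minimal sketch, not part of this repo: the helper names `missing_tools` and `has_sample_data` are hypothetical, and the only facts taken from this README are the tool names and the data folder path.

```shell
#!/bin/sh
# Print the name of every tool that is NOT found on PATH, one per line.
# Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed only if the given folder exists and contains at least one entry.
has_sample_data() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Example usage from the repo root (commented out so the helpers can be sourced):
# missing_tools docker docker-compose
# has_sample_data streamsets/data/tutorial_data || echo "sample data is missing"
```

If `missing_tools` prints nothing and `has_sample_data` succeeds, the prerequisites listed above are likely in place.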

Instructions:

Once this repo has been cloned and the sample data has been downloaded, open a command line in the repo folder and start the Docker containers with: $ docker-compose up. This can take a while.
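Because startup can take a while, it may help to poll until the web UI answers instead of guessing. The sketch below is an assumption-laden illustration rather than part of this repo: `retry` is a hypothetical helper, the attempt count and delay are arbitrary, and using `curl` against http://localhost:18630 as a readiness probe is an assumption about how the service responds.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1   # give up after the last allowed attempt
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example usage (commented out; assumes curl is installed):
# docker-compose up -d
# retry 30 5 curl -fsS -o /dev/null http://localhost:18630 \
#     && echo "Streamsets UI is reachable"
```

Running the containers detached with `-d` is a choice made for this sketch so the same terminal can run the probe; the plain `docker-compose up` from the instructions works equally well with the probe in a second terminal.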

Once the containers are up and running:

1. Import the producer pipeline and the consumer pipeline (producer.json and consumer.json in the repo root) into Streamsets by going to http://localhost:18630

You should see the following pipelines (the filesystem is used instead of AWS S3):

[Screenshots: consumer pipeline, producer pipeline]

Preview or start the pipelines right away.

[Screenshot: start-pipeline]

Access Kibana via http://localhost:5601.

This is what the folder structure should look like after adding the necessary data and executing the pipelines:

.
├── build
│   ├── Dockerfile
│   └── start.sh
├── consumer.json
├── docker-compose.yml
├── images
│   ├── ...
├── producer.json
├── readme.md
├── streamsets
│   ├── Dockerfile
│   └── data
│       ├── pipelines
│       │   └── ...
│       ├── runInfo
│       │   └── ...
│       ├── sdc.id
│       └── tutorial_data
│           ├── ccsample
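The layout above can be compared against a local checkout with a small script. This is a minimal sketch under stated assumptions: `missing_paths` is a hypothetical helper, and it only checks the entries that are fully named in the tree, skipping the elided `...` entries and the runtime folders that appear only after a pipeline has run.

```shell
#!/bin/sh
# Print every expected path that is missing under the given repo root,
# one per line. Prints nothing when all listed paths exist.
missing_paths() {
    root=$1
    for path in \
        build/Dockerfile \
        build/start.sh \
        consumer.json \
        producer.json \
        docker-compose.yml \
        streamsets/Dockerfile \
        streamsets/data/tutorial_data
    do
        [ -e "$root/$path" ] || printf '%s\n' "$path"
    done
}

# Example usage from the repo root (commented out):
# missing_paths .
```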
