This is an example Apache Beam pipeline that counts word frequencies in tweets.

souvikg10/beam-pipeline

Apache Beam example

Pre-requisites

  • Java 8 (JDK)
  • Maven

Installation

$ java -version
$ brew update
$ brew install maven

Execution Direct Runner

mvn compile exec:java -Dexec.mainClass=tweetAnalysis.App \
     -Dexec.args="--inputFile=pom.xml --output=counts" -Pdirect-runner

Execution Spark Local mode

mvn compile exec:java -Dexec.mainClass=tweetAnalysis.App \
     -Dexec.args="--runner=SparkRunner --inputFile=pom.xml --output=data/counts" -Pspark-runner

Execution Flink Cluster

mvn package exec:java -Dexec.mainClass=tweetAnalysis.App \
     -Dexec.args="--runner=FlinkRunner --flinkMaster=<flink master> --filesToStage=target/tweet-Analysis-App-0.1.jar \
                  --inputFile=pom.xml --output=/data/counts" -Pflink-runner
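
The three commands above launch the same main class (tweetAnalysis.App) with the same input file; only the runner and output path change. As a rough, Beam-free illustration of what a word-frequency job computes, here is a minimal plain-Java sketch. It is not the repository's code: the class name WordFrequencySketch, the countWords helper, and the tokenization rule (lower-casing, then splitting on anything that is not a letter, digit, or apostrophe) are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordFrequencySketch {

    // Hypothetical helper (not from the repository): counts how often each
    // word occurs across all input lines, ignoring case.
    public static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            // Assumed tokenization: split on runs of characters that are
            // not letters, digits, or apostrophes.
            for (String token : line.toLowerCase().split("[^\\p{L}\\p{N}']+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for tweets.
        List<String> tweets = Arrays.asList(
            "Apache Beam runs on Flink",
            "Beam also runs on Spark");
        countWords(tweets).forEach((word, n) -> System.out.println(word + ": " + n));
    }
}
```

In a Beam pipeline the same idea is normally expressed as transforms over a PCollection rather than a loop over a list, which is what lets the job run unchanged on the Direct, Spark, and Flink runners.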
