Example applications for use with PNDA
flink-batch-java-hashtagcount-metrics
flink-streaming-host-network-data-usage
flink-streaming-word-count
flink-wordcount-python-app
jupyter-notebooks
kafka-spark-opentsdb
literary-word-count-app
spark-batch-python
spark-batch
spark-streaming-python
spark-streaming
spark2-streaming-python
traffic-loss-analysis-app
.gitignore
CHANGELOG.md
LICENSE
README.md

Example Applications

This repository contains a number of example applications that can be built and run on PNDA. Each application directory contains more detailed information.

Spark Streaming

Examples of simple Scala-based Spark Streaming applications that consume data from Kafka and populate both HBase and OpenTSDB.
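
As a rough illustration of the Kafka-to-OpenTSDB step, the sketch below converts decoded messages into the JSON body accepted by OpenTSDB's HTTP `/api/put` endpoint. It is written in plain Python rather than Scala and is not taken from the example applications. The input field names (`timestamp`, `value`, `host`) and the metric name `example.packets` are assumptions made for illustration only.

```python
import json


def to_opentsdb_point(record, metric="example.packets"):
    """Map one decoded message (a dict) to an OpenTSDB data point.

    The record layout and the default metric name are hypothetical.
    """
    return {
        "metric": metric,
        "timestamp": int(record["timestamp"]),  # seconds since the epoch
        "value": record["value"],
        "tags": {"host": record["host"]},  # OpenTSDB requires at least one tag
    }


def to_put_body(records, metric="example.packets"):
    """Serialize a batch of records as a JSON array of data points."""
    return json.dumps([to_opentsdb_point(r, metric) for r in records])


if __name__ == "__main__":
    sample = [{"timestamp": 1500000000, "value": 42, "host": "router1"}]
    print(to_put_body(sample))
```

In a streaming job, a function like `to_put_body` would typically be applied to each micro-batch before posting the result to OpenTSDB.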

Spark

Example of consuming data ingested by Gobblin on a batch basis and producing Parquet datasets, optimized for consumption by Impala.

Jupyter

Example of a notebook for manipulating network data.

H2O

An application that runs the H2O data science platform on PNDA.

Flink Streaming

  • Count Words (Scala): counts the words read from a socket.
  • Flink Windows (Java): the host-network-data-usage example, illustrating Flink windows, triggers and event processing.
  • Count Hashtags (Java): counts specific words from an input file, illustrating metrics, counters and accumulators.
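
To make the windowing idea behind the host-network-data-usage example concrete, here is a minimal sketch of a tumbling-window aggregation in plain Python. It does not use the Flink API and is not the example's implementation. The event layout `(timestamp_seconds, host, bytes)` and the 60-second window size are assumptions, and Flink features such as watermarks, triggers and late-data handling are deliberately left out.

```python
from collections import defaultdict


def tumbling_window_usage(events, window_seconds=60):
    """Sum bytes per host within fixed, non-overlapping time windows.

    events: iterable of (timestamp_seconds, host, bytes) tuples
            (a hypothetical layout chosen for this sketch).
    Returns a dict keyed by (window_start, host).
    """
    totals = defaultdict(int)
    for ts, host, nbytes in events:
        # Each event belongs to exactly one window, identified by its start time.
        window_start = (ts // window_seconds) * window_seconds
        totals[(window_start, host)] += nbytes
    return dict(totals)


if __name__ == "__main__":
    events = [(0, "a", 100), (30, "a", 50), (61, "a", 10), (59, "b", 5)]
    for key, total in sorted(tumbling_window_usage(events).items()):
        print(key, total)
```

A streaming engine computes the same per-window totals incrementally and emits each result when its window closes, instead of processing a finished list as this sketch does.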

Compound Packages

An example of a package containing multiple application component types, in this case a Spark app and related Jupyter notebook.