ReadMe Improvements and Github Workflows (#9)
abhimanyugupta07 committed Aug 17, 2020
1 parent d8ebc74 commit 7abdce6
Showing 4 changed files with 154 additions and 23 deletions.
49 changes: 49 additions & 0 deletions .github/workflows/codeql-analysis.yml
@@ -0,0 +1,49 @@
name: "Code scanning - action"

on:
  pull_request:
  schedule:
    - cron: '0 0 * * 4'

jobs:
  CodeQL-Build:

    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2
      with:
        # We must fetch at least the immediate parents so that if this is
        # a pull request then we can checkout the head.
        fetch-depth: 2

    # If this run was triggered by a pull request event, then checkout
    # the head of the pull request instead of the merge commit.
    - run: git checkout HEAD^2
      if: ${{ github.event_name == 'pull_request' }}

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v1
      with:
        languages: java

    # Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
      uses: github/codeql-action/autobuild@v1

    # ℹ️ Command-line programs to run using the OS shell.
    # 📚 https://git.io/JvXDl

    # ✏️ If the Autobuild fails above, remove it and uncomment the following three lines
    #    and modify them (or add more) to build your code if your project
    #    uses a compiled language

    #- run: |
    #   make bootstrap
    #   make release

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v1
31 changes: 31 additions & 0 deletions .github/workflows/main.yml
@@ -0,0 +1,31 @@
name: Build

on: [pull_request]

jobs:
  test:
    name: Package and run all tests
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
      with:
        fetch-depth: 0
    - name: Init Coveralls
      shell: bash
      run: |
        COVERALLS_TOKEN=${{ secrets.COVERALLS_REPO_TOKEN }}
        if [[ -z "${COVERALLS_TOKEN}" ]];
        then
          echo "Coveralls token not available"
          COVERALLS_SKIP=true
        else
          echo "Coveralls token available"
          COVERALLS_SKIP=false
        fi
        echo ::set-env name=COVERALLS_SKIP::${COVERALLS_SKIP}
    - name: Set up JDK
      uses: actions/setup-java@v1
      with:
        java-version: 8
    - name: Run Maven Targets
      run: mvn package jacoco:report coveralls:report --batch-mode --show-version --activate-profiles coveralls -Dcoveralls.skip=$COVERALLS_SKIP -DrepoToken=${{ secrets.COVERALLS_REPO_TOKEN }}
95 changes: 73 additions & 22 deletions README.md
@@ -1,16 +1,28 @@
# Drone Fly
A service which allows Hive metastore (HMS) `MetaStoreEventListener` implementations to be deployed in a separate context to the metastore's own.

# Overview
## Overview
[![Maven Central](https://maven-badges.herokuapp.com/maven-central/com.expediagroup/drone-fly-app/badge.svg?subject=com.expediagroup:drone-fly-app)](https://maven-badges.herokuapp.com/maven-central/com.expediagroup/drone-fly-app)
[![Build Status](https://github.com/ExpediaGroup/drone-fly/workflows/Build/badge.svg)](https://github.com/ExpediaGroup/drone-fly/actions?query=workflow:"Build")
[![Coverage Status](https://coveralls.io/repos/github/ExpediaGroup/drone-fly/badge.svg?branch=master)](https://coveralls.io/github/ExpediaGroup/drone-fly?branch=master)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Docker](https://img.shields.io/badge/docker-drone--fly-blue)](https://hub.docker.com/r/expediagroup/drone-fly-app)

Drone Fly is a distributed Hive metastore events forwarder service that allows users to deploy metastore listeners outside the Hive metastore service.

With the advent of event driven systems, the number of listeners that a user needs to install in the metastore is ever increasing. These listeners can be both internal or can be provided by third party tools for integration purposes. More and more processing is being added to these listeners to address various business use cases.
With the advent of event-driven systems, the number of listeners that a user needs to install in the metastore is ever increasing. These listeners can be internal or provided by third-party tools for integration purposes. More and more processing is being added to these listeners to address various business use cases.

Adding these listeners directly to the classpath of your Hive metastore couples them with it and can lead to performance degradation or, in the worst case, take down the entire metastore (e.g. by running out of memory, thread starvation, etc.). Drone Fly decouples your HMS from the event listeners by providing a virtual Hive context. The event listeners can be provided on Drone Fly's classpath, and Drone Fly then forwards the events received from the [Kafka metastore Listener](https://github.com/ExpediaGroup/apiary-extensions/tree/master/apiary-metastore-events/kafka-metastore-events/kafka-metastore-listener) on to the respective listeners.

## Start using

A Terraform module for Kubernetes deployment is available [here](https://github.com/ExpediaGroup/apiary-drone-fly).

Docker images can be found in Expedia Group's [dockerhub](https://hub.docker.com/search/?q=expediagroup%2Fdrone-fly&type=image).

## System architecture

The diagram below shows a typical Hive metastore setup without using Drone Fly. In this example there are a number of HiveMetastoreListeners installed which send Hive events to other systems like Apache Atlas, AWS SNS, Apache Kafka and other custom implementations.
The diagram below shows a typical Hive metastore setup without using Drone Fly. In this example, there are several HiveMetastoreListeners installed which send Hive events to other systems like Apache Atlas, AWS SNS, Apache Kafka and other custom implementations.

![Hive Metastore setup without Drone Fly.](drone-fly-before.png "Multiple Hive metastore listeners are deployed in HMS context.")

@@ -20,39 +32,78 @@ With Drone Fly, the setup gets modified as shown in the diagram below. The only

Drone Fly can be set up to run in dockerized containers where each instance is initiated with one listener to get even further decoupling.

# Using with Docker
## Usage
### Using with Docker

To install a new HMS listener within the Drone Fly context, it is recommended that you build your Docker image using the Drone Fly base [Docker image](https://hub.docker.com/r/expediagroup/drone-fly-app).

A sample image to install the [Apiary-SNS-Listener](https://github.com/ExpediaGroup/apiary-extensions/tree/master/apiary-metastore-events/sns-metastore-events/apiary-metastore-listener) would be as follows:

```
FROM expediagroup/drone-fly-app:0.0.1
ENV APIARY_EXTENSIONS_VERSION 6.0.1
ENV AWS_REGION us-east-1
RUN cd /app/libs && \
    wget -q "https://search.maven.org/remotecontent?filepath=com/expediagroup/apiary/apiary-metastore-listener/${APIARY_EXTENSIONS_VERSION}/apiary-metastore-listener-${APIARY_EXTENSIONS_VERSION}-all.jar" -O "apiary-metastore-listener-${APIARY_EXTENSIONS_VERSION}-all.jar"
```

#### Running Drone Fly Docker image

docker run --env APIARY_BOOTSTRAP_SERVERS="localhost:9092" \
--env APIARY_LISTENER_LIST="com.expediagroup.sampleListener1,com.expediagroup.sampleListener2" \
--env APIARY_KAFKA_TOPIC_NAME="dronefly" \
expediagroup/drone-fly-app:<image-version>

The [Drone Fly Terraform](https://github.com/ExpediaGroup/apiary-drone-fly) module can then be used to install your Docker image in a Kubernetes container.


The Drone Fly image will be used as a base image by downstream projects which need a Hive Listener.
### Using Uber Jar

Drone Fly uses the Jib plugin which will build a docker image during the `package` phase. The image can also be built directly:
The Drone Fly build also produces an [uber jar](https://mvnrepository.com/artifact/com.expediagroup/drone-fly-app), so it can be started as a stand-alone Java service.

mvn compile jib:dockerBuild -pl drone-fly-app

# Running DroneFly
#### Running Drone Fly Jar

java -Dloader.path=lib/ -jar drone-fly-app-<version>-exec.jar \
--apiary.bootstrap.servers=localhost:9092 \
--apiary.kafka.topic.name=apiary \
--apiary.listener.list="com.expediagroup.sampleListener1,com.expediagroup.sampleListener2"

# Running DroneFly Docker image

docker run --env APIARY_BOOTSTRAP_SERVERS="localhost:9092" \
--env APIARY_LISTENER_LIST="com.expediagroup.sampleListener1,com.expediagroup.sampleListener2" \
--env APIARY_KAFKA_TOPIC_NAME="dronefly" \
expediagroup/drone-fly-app:<image-version>
--apiary.listener.list="com.expediagroup.sampleListener1,com.expediagroup.sampleListener2"

The properties `instance.name`, `apiary.bootstrap.servers`, `apiary.kafka.topic.name` and `apiary.listener.list` can also be provided in a Spring properties file.

java -Dloader.path=lib/ -jar drone-fly-app-<version>-exec.jar --spring.config.location=file:///dronefly.properties

# Terraform
The parameter `-Dloader.path` is the path where Drone Fly will search for configured HMS listeners.
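
As a minimal sketch, a properties file for the command above could be created as follows. Only the four property keys come from this README; the temporary directory, the instance name `drone-fly-sample`, and the listener class names are placeholders. The launch command is printed rather than executed, so the sketch works without the jar being present.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical location for the properties file; any readable path works.
config_dir="$(mktemp -d)"
config_file="${config_dir}/dronefly.properties"

# Write the four properties named above. All values are placeholders.
cat > "${config_file}" <<'EOF'
instance.name=drone-fly-sample
apiary.bootstrap.servers=localhost:9092
apiary.kafka.topic.name=apiary
apiary.listener.list=com.expediagroup.sampleListener1,com.expediagroup.sampleListener2
EOF

# Print (rather than run) the launch command that points Spring at the file.
launch_cmd="java -Dloader.path=lib/ -jar drone-fly-app-<version>-exec.jar --spring.config.location=file://${config_file}"
echo "${launch_cmd}"
```

Replace `<version>` with the version of the jar you downloaded, and put the listener jars in the directory given by `-Dloader.path`.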

A Terraform module for kubernetes deployment is available [here](https://github.com/ExpediaGroup/apiary-drone-fly).
## Configuring Drone Fly

# Legal
This project is available under the [Apache 2.0 License](http://www.apache.org/licenses/LICENSE-2.0.html).
### Drone Fly configuration reference
The table below describes all the available configuration values for Drone Fly.

Copyright 2020 Expedia, Inc.
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| apiary.bootstrap.servers | Kafka bootstrap servers that receive Hive metastore events. | `string` | n/a | yes |
| apiary.kafka.topic.name | Kafka topic name that receives Hive metastore events. | `string` | n/a | yes |
| apiary.listener.list | Comma-separated list of Hive metastore listeners to load from the classpath, e.g. `com.expedia.HMSListener1,com.expedia.HMSListener2` | `string` | `"com.expediagroup.dataplatform.dronefly.app.service.listener.LoggingMetastoreListener"` | no |
| instance.name | Instance name for a Drone Fly instance. `instance.name` is also used to derive the Kafka consumer group. Therefore, in a multi-instance deployment, a unique `instance.name` for each Drone Fly instance needs to be provided to avoid all instances ending up in the same Kafka consumer group. | `string` | `drone-fly` | no |
| endpoint.port | Port on which Drone Fly Spring Boot app will start. | `string` | `8008` | no |
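
The note on `instance.name` matters most when each Drone Fly instance hosts a single listener. The sketch below prints one `docker run` command per listener, each with its own instance name. It rests on assumptions that this README does not confirm: the `INSTANCE_NAME` environment variable relies on Spring Boot's relaxed binding of environment variables to properties, in the same way `APIARY_LISTENER_LIST` maps to `apiary.listener.list`, and the helper name `drone_fly_run_cmd` and the naming scheme are purely illustrative.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a `docker run` command for one Drone Fly instance hosting one listener.
# INSTANCE_NAME is an assumption (Spring Boot relaxed binding: instance.name -> INSTANCE_NAME).
drone_fly_run_cmd() {
  local listener_class="$1"
  # Derive a unique instance name from the listener's simple class name.
  local simple_name="${listener_class##*.}"
  local instance_name
  instance_name="drone-fly-$(printf '%s' "${simple_name}" | tr '[:upper:]' '[:lower:]')"
  printf 'docker run --env INSTANCE_NAME="%s" --env APIARY_BOOTSTRAP_SERVERS="localhost:9092" --env APIARY_KAFKA_TOPIC_NAME="dronefly" --env APIARY_LISTENER_LIST="%s" expediagroup/drone-fly-app:<image-version>\n' \
    "${instance_name}" "${listener_class}"
}

# One instance per listener, so each instance gets its own Kafka consumer group.
for listener in com.expediagroup.sampleListener1 com.expediagroup.sampleListener2; do
  drone_fly_run_cmd "${listener}"
done
```

Because each instance then sits in its own consumer group, every listener receives the full stream of metastore events instead of sharing partitions with the other instances.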


## Metrics

Drone Fly exposes standard [JVM and Kafka metrics](https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#production-ready-metrics-meter) using [Prometheus on Spring Boot Actuator](https://docs.spring.io/spring-boot/docs/current/reference/html/production-ready-features.html#production-ready-metrics-export-prometheus) endpoint `/actuator/prometheus`.

### Some of the useful metrics to track are:

```
system_cpu_usage
kafka_consumer_records_consumed_total_records_total
jvm_memory_committed_bytes
```
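
To check these metrics by hand, the endpoint output can be filtered down to the three names listed above. This is a sketch only: the function name `filter_drone_fly_metrics` is illustrative, and the usage line assumes a local instance serving the Actuator endpoint on the default `endpoint.port` of `8008`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Keep only the sample lines for the three metrics listed above.
# Reads Prometheus text-format output on stdin. Comment lines (# HELP, # TYPE)
# do not start with a metric name, so the anchored pattern drops them.
# `|| true` keeps the function from failing when nothing matches.
filter_drone_fly_metrics() {
  grep -E '^(system_cpu_usage|kafka_consumer_records_consumed_total_records_total|jvm_memory_committed_bytes)([{ ]|$)' || true
}

# Hypothetical usage against a local instance (host and port are assumptions):
#   curl -s http://localhost:8008/actuator/prometheus | filter_drone_fly_metrics
```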


## Legal
This project is available under the [Apache 2.0 License](http://www.apache.org/licenses/LICENSE-2.0.html).

Copyright 2020 Expedia, Inc.
2 changes: 1 addition & 1 deletion pom.xml
@@ -4,7 +4,7 @@
<parent>
<groupId>com.expediagroup</groupId>
<artifactId>eg-oss-parent</artifactId>
<version>1.3.1</version>
<version>2.0.0</version>
</parent>

<artifactId>drone-fly-parent</artifactId>