Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-2366: Copycat #99

Closed
wants to merge 29 commits into from
Closed
Show file tree
Hide file tree
Changes from 21 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
11981d2
Add copycat-data and copycat-api
ewencp Jul 24, 2015
0233456
Add copycat-avro and copycat-runtime
ewencp Jul 24, 2015
e14942c
Add Copycat file connector.
ewencp Jul 25, 2015
31cd1ca
Add CLI tools for Copycat.
ewencp Jul 26, 2015
4a9b4f3
Add some helpful Copycat-specific build and test targets that cover a…
ewencp Jul 26, 2015
dec1379
Switch to using new consumer coordinator instead of manually assignin…
ewencp Jul 29, 2015
e849e10
Remove duplicated TopicPartition implementation.
ewencp Jul 29, 2015
5a618c6
Remove offset serializers, instead reusing the existing serializers a…
ewencp Jul 29, 2015
1243a7c
Merge remote-tracking branch 'origin/trunk' into copycat
ewencp Jul 29, 2015
220e42d
Replace Avro serializer with JSON serializer.
ewencp Jul 30, 2015
0aefe21
Add log4j settings for Copycat.
ewencp Jul 30, 2015
25b5739
Fix sink task offset commit concurrency issue by moving it to the wor…
ewencp Jul 30, 2015
4674d13
Address review comments, clean up some code styling.
ewencp Jul 31, 2015
6ba87de
Remove most of the Avro-based mock runtime data API, only preserving …
ewencp Jul 31, 2015
122423e
Style cleanup
ewencp Jul 31, 2015
be5c387
Minor cleanup
ewencp Aug 1, 2015
e345142
Remove Copycat reflection utils, use existing Utils and ConfigDef fun…
ewencp Aug 1, 2015
0b5a1a0
Normalize naming to use partition for both source and Kafka, adjustin…
ewencp Aug 1, 2015
b194c73
Split Copycat converter option into two options for key and value.
ewencp Aug 1, 2015
6787a85
Make Converter generic to match serializers since some serialization …
ewencp Aug 2, 2015
d713a21
Address Gwen's review comments.
ewencp Aug 12, 2015
b29cb2c
Merge remote-tracking branch 'origin/trunk' into copycat
ewencp Aug 13, 2015
d55d31e
Reorganize Copycat code to put it all under one top-level directory.
ewencp Aug 13, 2015
0fa7a36
Mark Copycat classes as unstable and reduce visibility of some classe…
ewencp Aug 13, 2015
c0e5fdc
Merge remote-tracking branch 'origin/trunk' into copycat
ewencp Aug 13, 2015
656a003
Clarify and expand the explanation of the Copycat Coordinator interface.
ewencp Aug 13, 2015
7bf8075
Make Copycat CLI speific to standalone mode, clean up some config and…
ewencp Aug 13, 2015
8c108b0
Rename Coordinator to Herder to avoid confusion with the consumer coo…
ewencp Aug 14, 2015
a3a47a6
Simplify Copycat exceptions, make them a subclass of KafkaException.
ewencp Aug 14, 2015
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions bin/copycat.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
#!/bin/sh
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

base_dir=$(dirname $0)

if [ "x$KAFKA_LOG4J_OPTS" = "x" ]; then
export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/copycat-log4j.properties"
fi

exec $(dirname $0)/kafka-run-class.sh org.apache.kafka.copycat.cli.Copycat "$@"
8 changes: 8 additions & 0 deletions bin/kafka-run-class.sh
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,14 @@ do
CLASSPATH=$CLASSPATH:$file
done

for pkg in "copycat-data" "copycat-api" "copycat-runtime" "copycat-file" "copycat-json"
do
for file in $base_dir/${pkg}/build/libs/${pkg}*.jar $base_dir/${pkg}/build/dependant-libs/*.jar;
do
CLASSPATH=$CLASSPATH:$file
done
done

# classpath addition for release
for file in $base_dir/libs/*.jar;
do
Expand Down
269 changes: 257 additions & 12 deletions build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,11 @@ buildscript {
}

def slf4jlog4j='org.slf4j:slf4j-log4j12:1.7.6'
def slf4japi="org.slf4j:slf4j-api:1.7.6"
def junit='junit:junit:4.6'
def easymock='org.easymock:easymock:3.3.1'
def powermock='org.powermock:powermock-module-junit4:1.6.2'
def powermock_easymock='org.powermock:powermock-api-easymock:1.6.2'

allprojects {
apply plugin: 'idea'
Expand Down Expand Up @@ -59,7 +64,7 @@ rat {
// And some of the files that we have checked in should also be excluded from this check
excludes.addAll([
'**/.git/**',
'build/**',
'**/build/**',
'CONTRIBUTING.md',
'gradlew',
'gradlew.bat',
Expand Down Expand Up @@ -204,20 +209,25 @@ for ( sv in ['2_10_5', '2_11_7'] ) {
}
}

tasks.create(name: "jarAll", dependsOn: ['jar_core_2_10_5', 'jar_core_2_11_7', 'clients:jar', 'examples:jar', 'contrib:hadoop-consumer:jar', 'contrib:hadoop-producer:jar', 'log4j-appender:jar', 'tools:jar']) {
def copycatPkgs = ['copycat-data', 'copycat-api', 'copycat-runtime', 'copycat-json', 'copycat-file']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick comment on packaging. Could we put all copycat projects under a single directory (like what we did contrib/)? Also, is there a need to have different jars for api, data and runtime? It seems that they all need to be used together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data + api can definitely be combined. I personally like using this separation to enforce clean layering, but there's not much risk of messing that up with those two.

The reason for splitting api and runtime is that connector developers should only need the api jar. Again, this is a way to really force proper layering, be clear about what we want to expose as public API to connector developers, etc.

def pkgs = ['clients', 'examples', 'contrib:hadoop-consumer', 'contrib:hadoop-producer', 'log4j-appender', 'tools'] + copycatPkgs

tasks.create(name: "jarCopycat", dependsOn: copycatPkgs.collect { it + ":jar" }) {}
tasks.create(name: "jarAll", dependsOn: ['jar_core_2_10_5', 'jar_core_2_11_7'] + pkgs.collect { it + ":jar" }) {
}

tasks.create(name: "srcJarAll", dependsOn: ['srcJar_2_10_5', 'srcJar_2_11_7', 'clients:srcJar', 'examples:srcJar', 'contrib:hadoop-consumer:srcJar', 'contrib:hadoop-producer:srcJar', 'log4j-appender:srcJar', 'tools:srcJar']) { }
tasks.create(name: "srcJarAll", dependsOn: ['srcJar_2_10_5', 'srcJar_2_11_7'] + pkgs.collect { it + ":srcJar" }) { }

tasks.create(name: "docsJarAll", dependsOn: ['docsJar_2_10_5', 'docsJar_2_11_7', 'clients:docsJar', 'examples:docsJar', 'contrib:hadoop-consumer:docsJar', 'contrib:hadoop-producer:docsJar', 'log4j-appender:docsJar', 'tools:docsJar']) { }
tasks.create(name: "docsJarAll", dependsOn: ['docsJar_2_10_5', 'docsJar_2_11_7'] + pkgs.collect { it + ":docsJar" }) { }

tasks.create(name: "testAll", dependsOn: ['test_core_2_10_5', 'test_core_2_11_7', 'clients:test', 'log4j-appender:test', 'tools:test']) {
tasks.create(name: "testCopycat", dependsOn: copycatPkgs.collect { it + ":test" }) {}
tasks.create(name: "testAll", dependsOn: ['test_core_2_10_5', 'test_core_2_11_7'] + pkgs.collect { it + ":test" }) {
}

tasks.create(name: "releaseTarGzAll", dependsOn: ['releaseTarGz_2_10_5', 'releaseTarGz_2_11_7']) {
}

tasks.create(name: "uploadArchivesAll", dependsOn: ['uploadCoreArchives_2_10_5', 'uploadCoreArchives_2_11_7', 'clients:uploadArchives', 'examples:uploadArchives', 'contrib:hadoop-consumer:uploadArchives', 'contrib:hadoop-producer:uploadArchives', 'log4j-appender:uploadArchives', 'tools:uploadArchives']) {
tasks.create(name: "uploadArchivesAll", dependsOn: ['uploadCoreArchives_2_10_5', 'uploadCoreArchives_2_11_7'] + pkgs.collect { it + ":uploadArchives" }) {
}

project(':core') {
Expand All @@ -239,8 +249,8 @@ project(':core') {
compile 'org.scala-lang.modules:scala-parser-combinators_2.11:1.0.4'
}

testCompile 'junit:junit:4.6'
testCompile 'org.easymock:easymock:3.0'
testCompile "$junit"
testCompile "$easymock"
testCompile 'org.objenesis:objenesis:1.2'
testCompile "org.scalatest:scalatest_$baseScalaVersion:2.2.5"

Expand Down Expand Up @@ -371,11 +381,11 @@ project(':clients') {
archivesBaseName = "kafka-clients"

dependencies {
compile "org.slf4j:slf4j-api:1.7.6"
compile "$slf4japi"
compile 'org.xerial.snappy:snappy-java:1.1.1.7'
compile 'net.jpountz.lz4:lz4:1.2.0'

testCompile 'junit:junit:4.6'
testCompile "$junit"
testRuntime "$slf4jlog4j"
}

Expand Down Expand Up @@ -423,7 +433,7 @@ project(':tools') {
compile 'com.fasterxml.jackson.core:jackson-databind:2.5.4'
compile "$slf4jlog4j"

testCompile 'junit:junit:4.6'
testCompile "$junit"
testCompile project(path: ':clients', configuration: 'archives')
}

Expand Down Expand Up @@ -471,7 +481,7 @@ project(':log4j-appender') {
compile project(':clients')
compile "$slf4jlog4j"

testCompile 'junit:junit:4.6'
testCompile "$junit"
testCompile project(path: ':clients', configuration: 'archives')
}

Expand All @@ -496,3 +506,238 @@ project(':log4j-appender') {
}
test.dependsOn('checkstyleMain', 'checkstyleTest')
}

project(':copycat-data') {
apply plugin: 'checkstyle'
archivesBaseName = "copycat-data"

dependencies {
compile project(':clients')
compile "$slf4japi"

testCompile "$junit"
testRuntime "$slf4jlog4j"
}

task testJar(type: Jar) {
classifier = 'test'
from sourceSets.test.output
}

test {
testLogging {
events "passed", "skipped", "failed"
exceptionFormat = 'full'
}
}

javadoc {
include "**/org/apache/kafka/copycat/data/*"
}

artifacts {
archives testJar
}

configurations {
archives.extendsFrom (testCompile)
}

/* FIXME Re-enable this with KAFKA-2367 when the placeholder data API is replaced
checkstyle {
configFile = new File(rootDir, "checkstyle/checkstyle.xml")
}
test.dependsOn('checkstyleMain', 'checkstyleTest') */
}

project(':copycat-api') {
apply plugin: 'checkstyle'
archivesBaseName = "copycat-api"

dependencies {
compile project(':copycat-data')
compile "$slf4japi"

testCompile "$junit"
testRuntime "$slf4jlog4j"
}

task testJar(type: Jar) {
classifier = 'test'
from sourceSets.test.output
}

test {
testLogging {
events "passed", "skipped", "failed"
exceptionFormat = 'full'
}
}

javadoc {
include "**/org/apache/kafka/copycat/*"
}

artifacts {
archives testJar
}

configurations {
archives.extendsFrom (testCompile)
}

checkstyle {
configFile = new File(rootDir, "checkstyle/checkstyle.xml")
}
test.dependsOn('checkstyleMain', 'checkstyleTest')
}

project(':copycat-json') {
apply plugin: 'checkstyle'
archivesBaseName = "copycat-json"

dependencies {
compile project(':copycat-api')
compile "$slf4japi"
compile 'com.fasterxml.jackson.core:jackson-databind:2.5.4'

testCompile "$junit"
testCompile "$easymock"
testCompile "$powermock"
testCompile "$powermock_easymock"
testRuntime "$slf4jlog4j"
}

task testJar(type: Jar) {
classifier = 'test'
from sourceSets.test.output
}

test {
testLogging {
events "passed", "skipped", "failed"
exceptionFormat = 'full'
}
}

javadoc {
include "**/org/apache/kafka/copycat/*"
}

artifacts {
archives testJar
}

configurations {
archives.extendsFrom(testCompile)
}

checkstyle {
configFile = new File(rootDir, "checkstyle/checkstyle.xml")
}
test.dependsOn('checkstyleMain', 'checkstyleTest')

tasks.create(name: "copyDependantLibs", type: Copy) {
from (configurations.runtime) {
exclude('kafka-clients*')
exclude('copycat-*')
}
into "$buildDir/dependant-libs"
}

jar {
dependsOn copyDependantLibs
}
}

project(':copycat-runtime') {
apply plugin: 'checkstyle'
archivesBaseName = "copycat-runtime"

dependencies {
compile project(':copycat-api')
compile project(':clients')
compile "$slf4japi"

testCompile "$junit"
testCompile "$easymock"
testCompile "$powermock"
testCompile "$powermock_easymock"
testRuntime "$slf4jlog4j"
testRuntime project(":copycat-json")
}

task testJar(type: Jar) {
classifier = 'test'
from sourceSets.test.output
}

test {
testLogging {
events "passed", "skipped", "failed"
exceptionFormat = 'full'
}
}

javadoc {
include "**/org/apache/kafka/copycat/*"
}

artifacts {
archives testJar
}

configurations {
archives.extendsFrom(testCompile)
}

checkstyle {
configFile = new File(rootDir, "checkstyle/checkstyle.xml")
}
test.dependsOn('checkstyleMain', 'checkstyleTest')
}

project(':copycat-file') {
apply plugin: 'checkstyle'
archivesBaseName = "copycat-file"

dependencies {
compile project(':copycat-api')
compile "$slf4japi"

testCompile "$junit"
testCompile "$easymock"
testCompile "$powermock"
testCompile "$powermock_easymock"
testRuntime "$slf4jlog4j"
}

task testJar(type: Jar) {
classifier = 'test'
from sourceSets.test.output
}

test {
testLogging {
events "passed", "skipped", "failed"
exceptionFormat = 'full'
}
}

javadoc {
include "**/org/apache/kafka/copycat/*"
}

artifacts {
archives testJar
}

configurations {
archives.extendsFrom(testCompile)
}

checkstyle {
configFile = new File(rootDir, "checkstyle/checkstyle.xml")
}
test.dependsOn('checkstyleMain', 'checkstyleTest')
}
Loading