Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-1880: Add support for checking binary/source compatibility #5620

Open
wants to merge 17 commits into
base: trunk
Choose a base branch
from

Conversation

viktorsomogyi
Copy link
Contributor

This PR aims to extend the gradle build with a task that can generate a report for API/ABI compatibility.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@viktorsomogyi
Copy link
Contributor Author

@asasvari @andrasbeni @smurakozi @akatona84 would you please review this?

@ijuma ijuma added this to the 2.1.0 milestone Sep 7, 2018
@ijuma ijuma self-requested a review September 7, 2018 22:23
@ijuma
Copy link
Contributor

ijuma commented Sep 7, 2018

Thanks for picking this up. Does this fail PRs if there is a compatibility break? And does it only check public APIs?

README.md Outdated Show resolved Hide resolved
@viktorsomogyi
Copy link
Contributor Author

@ijuma at this point it is only a report that can be run manually, it's not part of the build task. The reason for this is that the compare tool (japicmp) requires two sets compiled jars (old and new) and it is quite lengthy to compile both: it takes about 3-4 minutes and I think it should be enough if only Jenkins bounces the commit back. Improvement idea 1. should reduce it to the half of this time. Let me know if you think we should add this to the local build as well.
There is the failOnSemanticIncompatibility option that I added which fails the task if there binary incompatible changes on a minor/maintenance release branch. Just propagated the corresponding cli option from here: http://siom79.github.io/japicmp/CliTool.html. Alternatively the failOnBinaryIncompatibility option could be used too on minor and maintenance branches to achieve the same effect.
For now I specified 3 packages as API to check: org.apache.clients.*, org.apache.common.* and org.apache.connect.*. I'm hoping you can give me more input on how to refine this list, I can easily imagine that this list is by far not complete (for instance shall we include streams too?).

Adding it to the Jenkins jobs however can absolutely be done with some extra steps. For instance in case of trunk: since the previous release was 2.0.0 we need to specify that to the Jenkins build as well (unfortunately "specifying the previous release" can't be automated, we need to figure it out manually for each job).
I've created a test Jenkins job on which I've done something like this:

# do the build/test before this report
# 
# branches must be local for now, we need to check them out
git checkout $BASE_BRANCH #2.0.0
git checkout $BRANCH #trunk (2.1.0.dev)
# this is to check out the PR
git fetch origin pull/5620/head:api-compatibility-report
git checkout api-compatibility-report
#
# This can be specified in an "Invoke Gradle Script" build step but copied the command line launch here
./gradlew apiCompatibilityReport -PbaseBranch=$BASE_BRANCH -PnewBranch=$BRANCH -PfailOnSemanticIncompatibility=true

So if you're on the 2.1.x development branch, you need to set 2.0 as the base branch
I didn't want to add it to the standard build because generating this report usually takes 3-4 minutes as it practically checks out the two branch to compare and jars them.

Some possible ways to improve this:

  1. Use the currently built artifacts if they are available. For instance in the case above we wouldn't need to compile trunk once again as during the build process all the required jars were built. This would spare 1-2 minutes for the build.
  2. Checking dependencies: binary incompatible dependencies might be an issue in case of minor and maintenance releases. For instance we upgrade the Jackson dependency to one which is binary incompatible with the previous one could break a Kafka user when they want to upgrade by swapping out the client jar.

gradle/compatibility.gradle Outdated Show resolved Hide resolved
gradle/compatibility.gradle Outdated Show resolved Hide resolved
@viktorsomogyi viktorsomogyi changed the title KAFKA-1880: Add support for checking binary/source compatibility (WIP) KAFKA-1880: Add support for checking binary/source compatibility Sep 12, 2018
@viktorsomogyi
Copy link
Contributor Author

viktorsomogyi commented Sep 12, 2018

Added WIP to the PR title as I'm actually pretty close on implementing idea #1 in my previous comment. It might take 1-2 more days with testing it out.

@ijuma
Copy link
Contributor

ijuma commented Sep 13, 2018

Some answers:

Public packages are specified in the gradle build in the following way (confusing, I know):

javadoc {
include "/org/apache/kafka/clients/admin/*"
include "
/org/apache/kafka/clients/consumer/"
include "**/org/apache/kafka/clients/producer/
"
include "/org/apache/kafka/common/*"
include "
/org/apache/kafka/common/acl/"
include "**/org/apache/kafka/common/annotation/
"
include "/org/apache/kafka/common/errors/*"
include "
/org/apache/kafka/common/header/"
include "**/org/apache/kafka/common/resource/
"
include "/org/apache/kafka/common/serialization/*"
include "
/org/apache/kafka/common/config/"
include "**/org/apache/kafka/common/config/provider/
"
include "/org/apache/kafka/common/security/auth/*"
include "
/org/apache/kafka/common/security/plain/"
include "**/org/apache/kafka/common/security/scram/
"
include "/org/apache/kafka/common/security/token/delegation/*"
include "
/org/apache/kafka/common/security/oauthbearer/"
include "**/org/apache/kafka/server/policy/
"
include "**/org/apache/kafka/server/quota/*"
}

Generally, the goal is to compare the APIs with a released version. Can we not simply rely on the released artifacts instead of compiling manually?

@viktorsomogyi
Copy link
Contributor Author

@ijuma yea I was thinking about adding that feature. Basically I would need to create dependencies out of the provided versions and resolve them. I think it can be done but last time I looked it didn't seem trivial. Let me look into it and come back with what I'll find. Thanks for the public packages.

@viktorsomogyi
Copy link
Contributor Author

@ijuma A couple of questions to clarify the use-case. So I was presuming that generally there are 5 use-cases.

  • The developer wants to check if they are compliant, therefore they check their changes against trunk
  • We want to enable a check in the PR Jenkins jobs that check PRs if they are compliant (so enforce binary compatibility rule in minor and patch/maintenance releases)
  • We want to compare random releases to see how did they change from version to version to evaluate how risky it is for users to migrate (or just compare releases just for fun).
  • When creating patches it is a good thing to see if the patch is good and doesn't break binary compatibility, so let's check against a released version that is being patched.
  • When upgrading dependencies we'd like to see if we're breaking binary compatibility in them. (I don't see this generally as a big problem but might be an issue though.)

I'm curious if you think we'd aim for these changes or do we need something else?

Also I managed to figure out a way for downloading dependencies (it doesn't seem to be too complex afterall), but it requires a bit more time to implement a well-working solution. Also I'll limit the minimum version to be 0.11.0.0 for version downloads. The reason is that certain projects are added during the life of Kafka and going back too much would overcomplicate the whole checker (as I need to add exceptions for certain projects etc.) and I don't see the value in those situations. I think it is not often a use-case when such an old version is asked. And if it is, we can fall back to "compile then compare" instead of "download and compare". If needed I can do it after more important things are ready.

@ijuma
Copy link
Contributor

ijuma commented Sep 14, 2018

  • The developer wants to check if they are compliant, therefore they check their changes against trunk

Yes.

  • We want to enable a check in the PR Jenkins jobs that check PRs if they are compliant (so enforce binary compatibility rule in minor and patch/maintenance releases)

Yes.

  • We want to compare random releases to see how did they change from version to version to evaluate how risky it is for users to migrate (or just compare releases just for fun).

Nice to have, but could be done separately.

  • When creating patches it is a good thing to see if the patch is good and doesn't break binary compatibility, so let's check against a released version that is being patched.

Is this different from the Jenkins PR point?

  • When upgrading dependencies we'd like to see if we're breaking binary compatibility in them. (I don't see this generally as a big problem but might be an issue though.)

Helpful, but one would hope that the dependencies do this on their own.

Also I managed to figure out a way for downloading dependencies (it doesn't seem to be too complex afterall), but it requires a bit more time to implement a well-working solution. Also I'll limit the minimum version to be 0.11.0.0 for version downloads.

I'd say we could even just do 2.0 and trunk.

@viktorsomogyi
Copy link
Contributor Author

viktorsomogyi commented Sep 18, 2018

@ijuma answeing your question: yes it is a subclass of the PR point technically not different.

I've reworked this pretty much and accomodated for most of these use cases (except the dependency checking).
Does the following by default.

  • It compares the last release with the current branch that we're building on. This is good for a generic development activity and doesn't require any extra time as it downloads the last release from the configured maven repos.
  • it is part of the check lifecycle step, so whenever a user runs a gradle build or gradle check it'll run after the assemble task.
  • You can run it against two arbitrary branches / tags and even released versions. The git references must be local, otherwise it won't work.
  • The release download only works for Kafka 2.0.0+. If somebody adds a new project which is not released yet, then the download will fail (and therefore it fails the build). The build detects though if the source or target references are released versions or not (and therefore has to be built) but if a released version doesn't contain a project (such as 0.7 didn't contain connect) it'll fail when downloading that. This is an unsolved problem as it might be nicer to maybe ignore these but I didn't spend too much time with this as projects aren't added that often and I tried to exclude any "examples" and "system-test" project (that might be added more frequently). Anyway, I think this is a bridge that we need to cross when we're there.
  • By default it only shows incompatibilities but this can be set through the onlyIncompatible gradle property.
  • By default it doesn't fail the build in any circumstances but it can be set. There are three flags for this: failOnBinaryIncompatibility, failOnSourceIncompatibility and failOnSemanticIncompatibility. The first two I think straight enough but the third triggers the mode which we actually need. So it will enable failing the build if there is a binary incompatible change between the source and the target if they are minor or patch (maintenance) releases. The reason why I didn't enable this by default is that there is already a bunch of incompatibilities in 2.1 and it would immediately fail the build. I think it would make sense to keep it this way until releasing 2.1 and then switching the default from false to true on this when bumping the version.
  • I've added an extra gradle property called lastRelease that must be updated whenever the version is bumped after a release. Although it is a manual step, I didn't see any easy way to automate this. The only way would be to query the maven poms for the "last release" which could be a bit of a magic.

@viktorsomogyi viktorsomogyi changed the title (WIP) KAFKA-1880: Add support for checking binary/source compatibility KAFKA-1880: Add support for checking binary/source compatibility Sep 18, 2018
@viktorsomogyi
Copy link
Contributor Author

@TolerableCoder would you please also review it?

Copy link
Contributor

@akatona84 akatona84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general, just some minor stuff

build.gradle Outdated Show resolved Hide resolved
gradle/compatibility.gradle Outdated Show resolved Hide resolved
@viktorsomogyi
Copy link
Contributor Author

viktorsomogyi commented Feb 25, 2019

@ijuma is this still needed? What can I do to make this eligible for merging? (Besides rebasing it :) )

@ijuma
Copy link
Contributor

ijuma commented Feb 25, 2019

Yes! Maybe @omkreddy can help review.

@viktorsomogyi
Copy link
Contributor Author

Thanks :)
It's pretty much the end of the day for me but tomorrow I'll whip this into shape, rebase it, review it once more.

@viktorsomogyi
Copy link
Contributor Author

@omkreddy @ijuma updated this PR, please check it if it's good from a usability perspective and otherwise.
I only have one doubt which is that the report generator by default prints the report to stdout. It might be a bit annoying. Unfortunately in the JApiCmp lib itself there is no way to work this around except reusing some of it's code (25-30 lines). If that's OK to do I can push a change with that.
As of its current state . it builds before the build task but after assemble, therefore part of the build pipeline. Not a huge overhead as by default it downloads the artifacts of the latest release from maven and builds a report against that. The description I wrote about general behavior in #5620 (comment) is still valid.

@viktorsomogyi
Copy link
Contributor Author

In fact I pushed it so it can be compared if needed.

@viktorsomogyi
Copy link
Contributor Author

restest this please

@viktorsomogyi
Copy link
Contributor Author

retest this please

@viktorsomogyi
Copy link
Contributor Author

@omkreddy would you please review this once you get some time?

@viktorsomogyi
Copy link
Contributor Author

retest this please

@viktorsomogyi viktorsomogyi force-pushed the compatibility-report branch 2 times, most recently from 639f5c2 to b58d89c Compare June 25, 2019 13:19
@viktorsomogyi
Copy link
Contributor Author

@cmccabe thanks a lot for reviewing this PR. I've updated it considering your suggestions. Now I use the S3 bucket to download the released artefacts instead of resolving them from maven repositories. The user can use -ParchivesLocation=http://whatever-i-want.com/my-kafka-packages to specify their own location. If a custom local branch is specified then the code will attempt to check it out to a temp folder and build it.

@viktorsomogyi
Copy link
Contributor Author

retest this please

@viktorsomogyi
Copy link
Contributor Author

@ijuma @cmccabe any opinions if this can get merged in if I rebase it?
I'm trying to clean up my open PR list so if you guys think this one is useful and something that the project needs I can rebase it, otherwise I'll close this.

@cmccabe
Copy link
Contributor

cmccabe commented May 1, 2020

Sorry, I'm really overloaded right now, but I do think this is important. Can I check back in a week? Mention me on the PR then?

@viktorsomogyi
Copy link
Contributor Author

@cmccabe just checking back, would you please let me know if you have time ro review this or shall we postpone it?

@viktorsomogyi
Copy link
Contributor Author

@cmccabe just checking back, please let me know if you can set aside some time to review this.

@viktorsomogyi
Copy link
Contributor Author

@cmccabe just checking back if you have some bandwidth for a review?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants